
pulibrary / bibdata / f04bc944-f9b4-4a42-8b26-dcacd0e3e688

11 Mar 2025 10:27PM UTC coverage: 34.017% (-58.1%) from 92.162%

Pull #2653

circleci

christinach
Add new lc_subject_facet field.
Supports the vocabulary work in https://github.com/pulibrary/orangelight/pull/3386
This new field indexes only the LC subject heading and its subdivisions,
so that when a user searches from the Details section, Solr can be queried for
all of the subject headings and their subdivisions.

This is needed for the Subject browse vocabulary work.
Example:
"lc_subject_facet": [
  "Booksellers and bookselling—Italy—Directories",
  "Booksellers and bookselling—Italy",
  "Booksellers and bookselling"
]
Pull Request #2653: Add new lc_subject_facet field.
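
The example values in the commit message follow a prefix pattern: the full heading with all subdivisions, then each shorter prefix down to the bare heading. A minimal sketch of that expansion, assuming the heading has already been split into its parts; `subject_hierarchy` and `SEPARATOR` are hypothetical names for illustration and are not taken from the pull request:

```ruby
# Hypothetical helper, not the code from the pull request.
# Assumption: the separator between a heading and its subdivisions is the
# dash character shown in the commit message example.
SEPARATOR = '—'

# Expand a heading and its subdivisions into every prefix, longest first.
# parts: an Array of Strings, heading first, then subdivisions in order.
def subject_hierarchy(parts)
  parts.length.downto(1).map { |n| parts.first(n).join(SEPARATOR) }
end

values = subject_hierarchy(['Booksellers and bookselling', 'Italy', 'Directories'])
puts values
```

Indexing every prefix is what would let a query on a broader heading also match records that carry a more specific subdivision.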

1 of 3 new or added lines in 1 file covered. (33.33%)

2215 existing lines in 93 files now uncovered.

1294 of 3804 relevant lines covered (34.02%)

0.99 hits per line

Source File

/marc_to_solr/lib/cache_map.rb
File coverage: 20.37%. Only load-time lines were hit (once each): the require statements, the class definition, `private`, and the method definitions. No line inside a method body was executed.

require 'faraday'
require 'json'
require 'logger'
require 'active_support/core_ext/string'

# Cached mapping of ARKs to Bib IDs
# Retrieves and stores paginated Solr responses containing the ARKs and BibIDs
class CacheMap
  # Build a cache-safe key from an ARK by replacing ':' and '/' with '_'
  # @param ark [String] the ARK
  # @return [String] the cache key
  def self.cache_key_for(ark:)
    ark.gsub(%r{[:/]}, '_')
  end

  # Constructor
  # @param cache [ActiveSupport::Cache::Store, CacheAdapter] Low-level cache
  # @param host [String] the host for the Blacklight endpoint
  # @param path [String] the path for the Blacklight endpoint
  # @param rows [Integer] the number of rows for each Solr response
  # @param logger [Logger] the logger (must respond to debug, info, warn and error)
  def initialize(cache:, host:, path: '/catalog.json', rows: 1_000_000, logger: Logger.new(STDOUT))
    @cache = cache
    @host = host
    @path = path
    @rows = rows
    @logger = logger
  end

  # Seed the cache
  # @param page [Integer] the page number at which to start the caching
  def seed!(page: 1)
    @logger.info "Seeding the cache for #{@host} using Solr..."
    # Determine if the values from the Solr response have been cached
    @cached_values = @cache.fetch(cache_key)
    return if page == 1 && !@cached_values.nil?

    response = query(page:)
    if response.empty?
      @logger.warn "No response could be retrieved from Solr for #{@host}"
      return
    end

    pages = response.fetch('pages')

    cache_page(response)

    # Recurse if there are more pages to cache
    if pages.fetch('last_page?') == false
      seed!(page: page + 1)
    else
      # Otherwise, mark within the cache that a thread has populated all of the ARK/BibID pairs
      @cache.write(cache_key, cache_key)
    end
  end

  # Fetch a BibID from the cache
  # @param ark [String] the ARK mapped to the BibID
  # @return [String, nil] the BibID (or nil if it has not been mapped)
  def fetch(ark)
    # Attempt to retrieve this from the cache
    value = @cache.fetch(self.class.cache_key_for(ark:))

    if value.nil?
      @logger.warn "Failed to resolve #{ark}" if URI::ARK.princeton_ark?(url: ark)
    else
      @logger.debug "Resolved #{ark} for #{value}"
    end
    value
  end

  private

    # Cache a page
    # @param page [Hash] Solr response page
    def cache_page(page)
      docs = page.fetch('docs')
      docs.each do |doc|
        arks = doc.fetch('identifier_ssim', [])
        bib_ids = doc.fetch('source_metadata_identifier_ssim', [])
        id = doc.fetch('id')
        # Grab the human readable type (nil when neither field is present)
        resource_types = doc.fetch('internal_resource_ssim', nil) || doc.fetch('has_model_ssim', nil)
        resource_type = resource_types&.first

        ark = arks.first
        bib_id = bib_ids.first

        # Write this to the file cache
        key_for_ark = self.class.cache_key_for(ark:)
        # Handle collisions by refusing to overwrite the first value
        unless @cache.exist?(key_for_ark)
          @cache.write(key_for_ark, id:, source_metadata_identifier: bib_id, internal_resource: resource_type)
          @logger.debug "Cached the mapping for #{ark} to #{bib_id}"
        end
      end
    end

    # Query the service using the endpoint
    # @param page [Integer] the page parameter for the query
    # @return [Hash] the 'response' section of the JSON body (empty if the request fails)
    def query(page: 1)
      url = URI::HTTPS.build(host: @host, path: @path, query: "q=&rows=#{@rows}&page=#{page}&f[identifier_tesim][]=ark")
      http_response = Faraday.get(url)
      values = JSON.parse(http_response.body)
      values.fetch('response')
    rescue StandardError => e
      @logger.error "Failed to seed the ARK cache from Solr: #{e}"
      {}
    end

    # Generate the unique key for the cache from the hostname and path for Solr
    # @return [String] the cache key
    def cache_key
      [@host.gsub(%r{[./]}, '_'), @path.gsub(%r{[./]}, '_')].join('_')
    end
end
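
Since every method body in this file is uncovered, it may help to see the seeding flow exercised without Solr. The sketch below is an illustration only, not the project's test suite: `MemoryCache`, `PAGES`, the `seed` helper, and all identifiers in the sample documents are made up, and the pages are walked with a loop rather than the recursion `CacheMap#seed!` uses. It mirrors two behaviors of the class: key normalization (`:` and `/` become `_`) and the first-write-wins handling of ARK collisions.

```ruby
# Hypothetical in-memory stand-in for the cache handed to CacheMap.
# It implements only the three calls the class makes: fetch, exist?, write.
class MemoryCache
  def initialize
    @store = {}
  end

  def fetch(key)
    @store[key]
  end

  def exist?(key)
    @store.key?(key)
  end

  def write(key, value)
    @store[key] = value
  end
end

# Same normalization as CacheMap.cache_key_for: ':' and '/' become '_'.
def cache_key_for(ark:)
  ark.gsub(%r{[:/]}, '_')
end

# Made-up pages shaped like the 'response' hash the class reads:
# a 'docs' array plus 'pages' => { 'last_page?' => true/false }.
# Both documents deliberately share one ARK to force a collision.
PAGES = [
  { 'docs' => [{ 'id' => 'a1',
                 'identifier_ssim' => ['ark:/88435/abc123'],
                 'source_metadata_identifier_ssim' => ['991234'] }],
    'pages' => { 'last_page?' => false } },
  { 'docs' => [{ 'id' => 'b2',
                 'identifier_ssim' => ['ark:/88435/abc123'],
                 'source_metadata_identifier_ssim' => ['995678'] }],
    'pages' => { 'last_page?' => true } }
].freeze

# Walk the pages in order and keep the first value written for each ARK.
def seed(cache, pages)
  pages.each do |response|
    response.fetch('docs').each do |doc|
      ark = doc.fetch('identifier_ssim', []).first
      next if ark.nil? # extra guard in this sketch; skips docs without an ARK

      key = cache_key_for(ark: ark)
      next if cache.exist?(key) # first write wins

      bib_id = doc.fetch('source_metadata_identifier_ssim', []).first
      cache.write(key, { id: doc.fetch('id'), source_metadata_identifier: bib_id })
    end
    break if response.fetch('pages').fetch('last_page?')
  end
end

cache = MemoryCache.new
seed(cache, PAGES)
puts cache.fetch(cache_key_for(ark: 'ark:/88435/abc123')).inspect
```

Because the second document maps the same ARK, its values are skipped and the entry written from the first page is the one that remains in the cache.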