
pulibrary / pymarc_dedupe
Coverage: 100%
DEFAULT BRANCH: main
Repo Added 21 Dec 2024 08:02PM UTC
Files 20

LAST BUILD ON BRANCH main

27 May 2025 10:43PM UTC coverage: 100.0%. Remained the same
447d729c-fa48-403a-a433-ffb3be0b5c85

Type: push · Via: circleci · Committer: web-flow
Use postgres database version for very large data sets (#24)

* Green locally - connect to Postgres DB

- Still need to connect to DB in other environments, including CI
- Need to clear DB between tests
- Need to make sure not to make tons of duplicate records (try to find before creating?)

* Set up DB connection for CircleCI

* Linting fixes

* Ensure that the same records are not re-created

* Try setting up environment using dynaconf

* Fix test DB connection

* Try to fix connection to DB in CI

* Remove pyproject.toml for now

* Checkpoint - green, reads and writes to/from database

* Put blocking in marc_to_db.py

* Try to increase test coverage

* Linting fixes

* Try to increase test coverage

* Test for empty input directory

* Make table creation a class method, work on mapping records to DB

* Use streaming JSON for memory performance

* Linting fixes

* Finish writing to CSV

* Linting, change to cluster_id

* More consistent naming

* Small fixes

* Linting fixes

* Increase test coverage for scoring

* rescue if there is no "a" field in author

* Use threading, override xml_reader for error handling

* Add output of comparison experiment - uses data set from Mark Z

* Setup for db comparison

* Add print statement

* Lint

* Increase coverage, lint

* Increase coverage

* Increase test coverage again

* Try using python orb for easier CI caching

* Fix edition mapping, test coverage

* Add pyproject.toml, ignore cache for ruff

* Formatting

* Re-organize into folders

* Add Goldrush to db & report

* Update data for comparison

* Ensure everything is linted, start adding module comments

* Remove unneeded comments in circleci config

291 of 291 new or added lines in 13 files covered. (100.0%)

842 of 842 relevant lines covered (100.0%)

1.0 hits per line
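
The figures above follow from simple ratios. The sketch below reproduces them, assuming that coverage is covered lines divided by relevant lines and that hits per line is the mean execution count per relevant line. The function names and the total hit count of 842 are illustrative assumptions, not values or code taken from the report or the repository.

```python
def coverage_percent(covered: int, relevant: int) -> float:
    """Line coverage as a percentage, rounded to one decimal place."""
    if relevant == 0:
        # Assumption: a build with no relevant lines counts as fully covered.
        return 100.0
    return round(100.0 * covered / relevant, 1)


def hits_per_line(total_hits: int, relevant: int) -> float:
    """Mean number of executions per relevant line, rounded to one decimal."""
    if relevant == 0:
        return 0.0
    return round(total_hits / relevant, 1)


if __name__ == "__main__":
    # Counts quoted in the build summary.
    print(coverage_percent(842, 842))  # whole project
    print(coverage_percent(291, 291))  # new or added lines in this build
    # Hypothetical total: 842 hits would be consistent with 1.0 hits per line.
    print(hits_per_line(842, 842))
```

Rounding to one decimal place mirrors how the percentages are displayed in this report; the underlying service may round differently.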


Recent builds

Build | Branch | Commit | Type | Ran | Committer | Via | Coverage
447d729c... | main | Use postgres database version for very large data sets (#24) ... | push | 27 May 2025 10:45PM UTC | web-flow | circleci | 100.0
9579bfa7... | add_postgres | Remove unneeded comments in circleci config | Pull #24 | 27 May 2025 05:52PM UTC | maxkadel | circleci | 100.0
7aee2d0b... | add_postgres | Ensure everything is linted, start adding module comments | Pull #24 | 27 May 2025 05:38PM UTC | maxkadel | circleci | 100.0
06f7f345... | add_postgres | Ensure everything is linted, start adding module comments | Pull #24 | 27 May 2025 05:33PM UTC | maxkadel | circleci | 100.0
2bc718ed... | add_postgres | Update data for comparison | Pull #24 | 27 May 2025 05:12PM UTC | maxkadel | circleci | 100.0
7da669a4... | add_postgres | Add Goldrush to db & report | Pull #24 | 27 May 2025 04:15PM UTC | maxkadel | circleci | 100.0
9ae3ef12... | add_postgres | Re-organize into folders | Pull #24 | 26 May 2025 07:56PM UTC | maxkadel | circleci | 100.0
1dc1bd2c... | add_postgres | Re-organize into folders | Pull #24 | 26 May 2025 07:55PM UTC | maxkadel | circleci | 100.0
75cf1a85... | add_postgres | Formatting | Pull #24 | 26 May 2025 06:33PM UTC | maxkadel | circleci | 100.0
53ae21ff... | add_postgres | Add pyproject.toml, ignore cache for ruff | Pull #24 | 26 May 2025 06:30PM UTC | maxkadel | circleci | 100.0
See All Builds (113)