03 Jul 08:26

USM-CHU-FGuyon

v0.5.1

d99fc64

v0.5.1 Latest

🐛 Bugfix:
Fixed an error in the medication processing of eICU (see #32).

26 Jun 14:57

USM-CHU-FGuyon

v0.5.0

c40f454

v0.5.0

⚡Speedups:

Support for time resampling, min-max clipping, and pivoting data to wide format is temporarily dropped. These steps will be moved later in the pipeline; removing them from this stage provided major speedups.
Harmonization now exploits polars' lazy evaluation, which makes it fast and keeps memory pressure low.
Removed processing by patient chunk and individual patient files.

The full OMOP-ization pipeline can be run for all 5 databases in a single day :)

🐛 Bugfix:

ICU stays with missing length-of-stay data were previously dropped from the database. All patients are now preserved.
Drug exposures that could not be OMOP-ized are now kept with drug_concept_id = 0; previously they were dropped.

🏃 Getting further:

Drug dosages are partially OMOP-ized: dosages and routes are extracted. Some units are OMOP-ized; routes are not harmonized yet.
The observation period table was added.
The drug strength table is still a work in progress. Contributions are welcome, especially for eICU.

28 May 06:12

USM-CHU-FGuyon

v0.4.2

af362da

v0.4.2

Changes:
⚡ Speedup: MIMIC-III, MIMIC-IV and Amsterdam's csv.gz files are now converted to parquet in step 1. This conversion is done only once and speeds up the following step.

03 May 12:37

USM-CHU-FGuyon

v0.4.1

f621248

v0.4.1

Changes:

🐛 Bugfix: visit_occurrence_id is no longer missing from the condition_occurrence table.
⚡ Speedup: eICU's csv.gz files are now converted to parquet in step 1. This makes re-running 1_extract_eicu.py 3 times faster.

24 Apr 13:28

USM-CHU-FGuyon

v0.4.0

10ac733

v0.4.0

Started to speed up some operations using polars.

12 Mar 06:58

USM-CHU-FGuyon

v0.3.2

7843836

v0.3.2

Corrected variables and dtypes in the final OMOP tables.

Bugfix:

Removed visit_start_date from the measurement table, and removed string values from the care_site table's place_of_service_concept_id column.
All OMOP tables are now saved to parquet, and wrong dtypes on some tables were corrected.
Times are now rounded to the second. This avoids an error caused by high time precision when writing some records to the parquet OMOP tables.
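
Rounding to the second can be sketched with pandas as follows. The sample timestamps and values are invented for illustration, and the project's own code may round at a different point in the pipeline.

```python
import pandas as pd

# Hypothetical measurement rows with sub-second (nanosecond) timestamps.
measurement = pd.DataFrame(
    {
        "measurement_datetime": pd.to_datetime(
            ["2024-03-12 06:58:01.123456789", "2024-03-12 06:58:02.500000001"]
        ),
        "value_as_number": [1.0, 2.0],
    }
)

# Round every timestamp to the nearest second before writing the table out.
measurement["measurement_datetime"] = measurement["measurement_datetime"].dt.round("s")
```

After this step the column carries no fractional seconds, so writing it to a format with coarser time resolution does not lose information.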

Minor changes:

Refactored timeseriespreprocessing to timeseriesprocessor
Option to skip reset_dir() when starting 2_{dataset}.py

07 Mar 08:24

USM-CHU-FGuyon

v0.3.1

d6f6be7

v0.3.1

Major changes:

Generated a numeric patient id for OMOP-standardization (Issue #15).
Added some insight into the running time of each script (as suggested in Issue #24).
Simplified the structure of paths.json
Fixed an inconsistency in the datetimes of OMOP tables: some datetime columns contained only the date, others only the time of day. They now all contain the full datetime (Issue #26).
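
To illustrate the datetime fix, the sketch below builds a full datetime from separate date and time-of-day values. It assumes the two parts are available as string columns; the column names `visit_date` and `visit_time` and the sample values are hypothetical, and the actual fix in the project may have been implemented differently.

```python
import pandas as pd

# Hypothetical table where the date and the time of day are stored separately.
visits = pd.DataFrame(
    {
        "visit_date": ["2024-03-07", "2024-03-08"],
        "visit_time": ["08:24:00", "23:15:30"],
    }
)

# Concatenate both parts and parse them into a single full datetime column.
visits["visit_start_datetime"] = pd.to_datetime(
    visits["visit_date"] + " " + visits["visit_time"],
    format="%Y-%m-%d %H:%M:%S",
)
```

Passing an explicit `format` makes parsing fail loudly on malformed values instead of guessing.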

Minor changes:

Added unit_concept_id to auxillary_files/user_input/timeseries_variables.csv
Fixed harmless SettingWithCopy warning happening in database_processing/dataprocessor.py

Thanks to @mostafaalishahi and @xinyuejohn for their contributions to the project.