Some fixes to enable training the model:
- committed the missing script infer.py
- changed the config default BERT model to camembert-base
- put config.cfg as a dependency, not params
- default to CPU training
- allow for missing metadata (i.e. OMOP's note_class_source_value)
Many fixes accompanying the publication of our article:
- Tests for the rule-based components
- Code documentation and cleaning
- Experiment and analysis scripts
- Charts and tables in the Results page of our documentation
Inception! 🎉
- spaCy project for pseudonymisation
- Pseudonymisation-specific pipelines:
  - pseudonymisation-rules for rule-based pseudonymisation
  - pseudonymisation-dates for date detection and normalisation
  - structured-data-matcher for structured data detection (e.g. first and last name, available in the information system)
- Evaluation methodology
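
To illustrate the idea behind structured data detection mentioned above, here is a minimal sketch in plain Python. It is not the project's structured-data-matcher component: the function name `match_structured_data`, the label names and the sample note are all hypothetical, and it only assumes that the information system provides known values (such as a first and last name) to look for in the text.

```python
import re


def match_structured_data(text, known_values):
    """Return sorted (start, end, label) spans where known values occur in text.

    known_values maps a hypothetical label (e.g. "LASTNAME") to the value
    recorded in the information system. Missing values are skipped.
    """
    spans = []
    for label, value in known_values.items():
        if not value:
            continue  # tolerate missing structured fields
        # Escape the value so it is matched literally, on word boundaries.
        pattern = re.compile(r"\b" + re.escape(value) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


# Hypothetical note and patient record, for illustration only.
text = "Mr Jean Dupont was seen today. Dupont reports no pain."
known = {"FIRSTNAME": "Jean", "LASTNAME": "Dupont", "BIRTHCITY": None}
spans = match_structured_data(text, known)

for start, end, label in spans:
    print(label, text[start:end])
```

Exact, case-insensitive matching is the simplest possible choice here; a real pipeline would likely also need to handle accents, hyphenation and spelling variants.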