EDS-Pseudonymisation

This project aims to detect identifying entities in documents from AP-HP's Clinical Data Warehouse. The following entity labels are covered:

| Label | Description |
| --- | --- |
| ADRESSE | Street address, e.g. 33 boulevard de Picpus |
| DATE | Any absolute date other than a birthdate |
| DATE_NAISSANCE | Birthdate |
| HOPITAL | Hospital name, e.g. Hôpital Rothschild |
| IPP | Internal AP-HP identifier for patients, displayed as a number |
| MAIL | Email address |
| NDA | Internal AP-HP identifier for visits, displayed as a number |
| NOM | Any last name (patients, doctors, third parties) |
| PRENOM | Any first name (patients, doctors, etc.) |
| SECU | Social security number |
| TEL | Any phone number |
| VILLE | Any city |
| ZIP | Any zip code |
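
To make the labels concrete, here is a toy Python sketch of what detecting and masking a few of them could look like. It is not the EDS-Pseudo API or its method: the regular expressions, the function names `find_entities` and `pseudonymise`, and the `<LABEL>` replacement format are all assumptions made up for illustration, and they only cover three of the simpler labels (MAIL, TEL, ZIP).

```python
import re

# Toy patterns for three of the labels above. These are illustrative
# assumptions, not the rules or the model used by EDS-Pseudo.
PATTERNS = {
    "MAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "TEL": re.compile(r"\b0\d(?:[ .]?\d{2}){4}\b"),  # French 10-digit numbers
    "ZIP": re.compile(r"\b\d{5}\b"),
}


def find_entities(text):
    """Return (start, end, label, value) tuples sorted by position."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label, match.group()))
    return sorted(spans)


def pseudonymise(text):
    """Replace each detected span with its label in angle brackets."""
    pieces, cursor = [], 0
    for start, end, label, _ in find_entities(text):
        if start < cursor:
            continue  # skip a span that overlaps one already replaced
        pieces.append(text[cursor:start])
        pieces.append(f"<{label}>")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    sample = "Contact : jean.dupont@example.org, 01 23 45 67 89, 75012 Paris."
    print(pseudonymise(sample))
```

Labels such as NOM, PRENOM or HOPITAL depend on context and cannot be captured reliably by patterns this simple, so the sketch deliberately leaves them out.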

Publication

Please find our arXiv preprint at the following link: https://arxiv.org/pdf/2303.13451.pdf.

If you use EDS-Pseudo, please cite us as follows:

@article{tannier2023development,
  title={Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse},
  author={Tannier, Xavier and Wajsb{\"u}rt, Perceval and Calliger, Alice and Dura, Basile and Mouchet, Alexandre and Hilka, Martin and Bey, Romain},
  journal={arXiv preprint arXiv:2303.13451},
  year={2023}
}

Documentation

Visit the documentation for more information!

Acknowledgement

We would like to thank Assistance Publique – Hôpitaux de Paris and the AP-HP Foundation for funding this project.