RelEx is a clinical relation extraction framework that identifies relations between two entities. The framework has two main components: co-location-based (rule-based) relation extraction and deep-learning-based (CNN) relation extraction.
The system considers two entities in a sentence and determines whether a relation exists between them. The rule-based approach relies on co-location information: the position of each drug mention relative to a non-drug entity determines which drug that entity refers to. The deep-learning-based approach is further divided according to the representation that is fed into the Convolutional Neural Network (CNN). It has two major components: Sentence-CNN and Segment-CNN. Sentence-CNN is further divided into single-label Sentence-CNN and multi-label Sentence-CNN.
For example, the sentence
Once her hematocrit stabilized, she was started on a heparin gtt with coumadin overlap
contains a non-drug entity, gtt (Route), and two drugs, Heparin and Coumadin. The non-drug entity has a relation with the closest drug occurrence, Heparin, but not with the other drug, Coumadin.
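The closest-drug rule in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code shipped in RelEx: the `Entity` type and the helpers `find_entity`, `span_distance` and `closest_drug` are hypothetical names, and measuring distance as the character gap between spans is an assumption.

```python
from typing import List, NamedTuple, Optional


class Entity(NamedTuple):
    """Hypothetical entity record: surface text, label and character span."""
    text: str
    label: str
    start: int
    end: int


def find_entity(sentence: str, text: str, label: str) -> Entity:
    """Locate the first case-insensitive occurrence of `text` in `sentence`."""
    start = sentence.lower().find(text.lower())
    if start < 0:
        raise ValueError(f"{text!r} not found in sentence")
    return Entity(text, label, start, start + len(text))


def span_distance(a: Entity, b: Entity) -> int:
    """Number of characters between two spans (0 if they touch or overlap)."""
    if a.end <= b.start:
        return b.start - a.end
    if b.end <= a.start:
        return a.start - b.end
    return 0


def closest_drug(entity: Entity, drugs: List[Entity]) -> Optional[Entity]:
    """Return the drug mention nearest to `entity`, or None if there are no drugs."""
    if not drugs:
        return None
    return min(drugs, key=lambda drug: span_distance(entity, drug))


sentence = ("Once her hematocrit stabilized, she was started on a "
            "heparin gtt with coumadin overlap")
route = find_entity(sentence, "gtt", "Route")
drugs = [find_entity(sentence, "heparin", "Drug"),
         find_entity(sentence, "coumadin", "Drug")]

# Assumed rule: relate the non-drug entity to its nearest drug mention only.
related = closest_drug(route, drugs)
print(f"{route.text} ({route.label}) -> {related.text}")
```

Here the Route entity is paired with heparin because it sits one character away, while coumadin is farther along in the sentence.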
Create a Python 3.6 virtual environment and install the requirements for the approach you want to use. For the rule-based approach:
pip install -r Colocation_requirements.txt
For the deep-learning-based approach:
pip install -r CNN_requirements.txt
A sample dataset (some files from the N2C2 2018 and i2b2 2010 corpora) and sample scripts are provided for both approaches (/RelEx_Colocation/, /relex/). The sample scripts take the path to the data folder (the relative path of the sample dataset) and predict relations using the respective algorithm.
The two methods are documented separately. The following links lead to their respective documentation.
- Samantha Mahendran - Main author - SamMahen
- Cora Lewis
- Bridget T McInnes
This package is licensed under the GNU General Public License.