Chiyu Zhang, Muhammad Abdul-Mageed, Ganesh Jawahar
Published in Findings of ACL 2023
Illustration of our proposed InfoDCL framework. We exploit distant/surrogate labels (i.e., emojis) to supervise two contrastive losses: a corpus-aware contrastive loss (CCL) and a light label-aware contrastive loss (LCL-LiT). Sequence representations from our model should keep the cluster of each class distinguishable and preserve the semantic relationships between classes.

Our pre-trained model checkpoints are available on the Hugging Face Hub (a minimal usage sketch follows the list):
- InfoDCL-RoBERTa trained with TweetEmoji-EN: https://huggingface.co/UBC-NLP/InfoDCL-emoji
- InfoDCL-RoBERTa trained with TweetHashtag-EN: https://huggingface.co/UBC-NLP/InfoDCL-hashtag
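As a minimal usage sketch, the checkpoints can be loaded with the Hugging Face transformers library (this assumes the checkpoints follow the standard RoBERTa format; the mean pooling below is just one simple way to obtain sentence representations, not necessarily the pooling used in our experiments):

```python
# Minimal sketch: encode tweets with the InfoDCL-RoBERTa (emoji) checkpoint.
# Assumes the checkpoint follows the standard Hugging Face RoBERTa format;
# mean pooling is used here only for illustration.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/InfoDCL-emoji")
model = AutoModel.from_pretrained("UBC-NLP/InfoDCL-emoji")
model.eval()

sentences = ["congrats on the new job!!", "stuck in traffic again..."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state        # (batch, seq_len, hidden)

# Mask out padding tokens, then mean-pool to one vector per sentence.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)                               # (2, hidden_size)
```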
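For intuition about the label-aware component described above, the sketch below shows a generic supervised (SupCon-style) contrastive loss computed over surrogate emoji labels. It is only an illustration of the general idea, not our exact LCL-LiT or CCL formulation; please see the paper for those.

```python
# Illustrative sketch of a label-aware contrastive loss over surrogate (emoji) labels.
# This is a generic SupCon-style objective, NOT the exact LCL-LiT/CCL losses from the paper.
import torch
import torch.nn.functional as F

def label_aware_contrastive_loss(embeddings, labels, temperature=0.05):
    """embeddings: (batch, dim) sequence representations; labels: (batch,) surrogate label ids."""
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))      # exclude self-pairs
    # Positives: other in-batch examples sharing the same surrogate label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0                               # anchors with at least one positive
    per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return (per_anchor[valid] / pos_counts[valid]).mean()
```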
We develop our models based on the scripts of SimCSE (Gao et al., 2021). To install InfoDCL from source:

```bash
git clone https://github.com/UBC-NLP/infodcl
cd infodcl
python setup.py install
```
If you use our code or models, please cite our paper:
```bibtex
@inproceedings{zhang-2023-infodcl,
  author    = {Chiyu Zhang and
               Muhammad Abdul-Mageed and
               Ganesh Jawahar},
  title     = {Contrastive Learning of Sociopragmatic Meaning in Social Media},
  booktitle = {Findings of the Association for Computational Linguistics: {ACL} 2023},
  year      = {2023},
}
```
If you have any questions related to the code or the paper, feel free to email Chiyu Zhang ([email protected]).