Abdellatif, M., & Elgammal, A. (2020). ULMFiT replication. Proceedings of the 12th Language Resources and Evaluation Conference, 5579–5587. https://www.aclweb.org/anthology/2020.lrec-1.685
António Rodrigues, J., Branco, R., Silva, J., & Branco, A. (2020). Reproduction and revival of the argument reasoning comprehension task. Proceedings of the 12th Language Resources and Evaluation Conference, 5055–5064. https://www.aclweb.org/anthology/2020.lrec-1.622
Arhiliuc, C., Mitrović, J., & Granitzer, M. (2020). Language proficiency scoring. Proceedings of the 12th Language Resources and Evaluation Conference, 5624–5630. https://www.aclweb.org/anthology/2020.lrec-1.690
Bestgen, Y. (2020). Reproducing monolingual, multilingual and cross-lingual CEFR predictions. Proceedings of the 12th Language Resources and Evaluation Conference, 5595–5602. https://www.aclweb.org/anthology/2020.lrec-1.687
Born, L., Bacher, M., & Markert, K. (2020). Dataset reproducibility and IR methods in timeline summarization. Proceedings of the 12th Language Resources and Evaluation Conference, 1763–1771. https://www.aclweb.org/anthology/2020.lrec-1.218
Branco, A. (2018, May). We are depleting our research subject as we are investigating it: In language technology, more replication and diversity are needed. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). https://www.aclweb.org/anthology/L18-1022
Branco, A., Calzolari, N., Vossen, P., Noord, G. van, Uytvanck, D. van, Silva, J., Gomes, L., Moreira, A., & Elbers, W. (2020). A shared task of a new, collaborative type to foster reproducibility: A first exercise in the area of language science and technology with REPROLANG2020. Proceedings of the 12th Language Resources and Evaluation Conference, 5539–5545. https://www.aclweb.org/anthology/2020.lrec-1.680
Caines, A., & Buttery, P. (2020). REPROLANG 2020: Automatic proficiency scoring of Czech, English, German, Italian, and Spanish learner essays. Proceedings of the 12th Language Resources and Evaluation Conference, 5614–5623. https://www.aclweb.org/anthology/2020.lrec-1.689
Cohen, K. B., Xia, J., Zweigenbaum, P., Callahan, T., Hargraves, O., Goss, F., Ide, N., Névéol, A., Grouin, C., & Hunter, L. E. (2018, May). Three dimensions of reproducibility in natural language processing. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). https://www.aclweb.org/anthology/L18-1025
Cooper, M., & Shardlow, M. (2020). CombiNMT: An exploration into neural text simplification models. Proceedings of the 12th Language Resources and Evaluation Conference, 5588–5594. https://www.aclweb.org/anthology/2020.lrec-1.686
Crane, M. (2018). Questionable answers in question answering research: Reproducibility and variability of published results. Transactions of the Association for Computational Linguistics, 6, 241–252. https://doi.org/10.1162/tacl_a_00018
Dakota, D., & Kübler, S. (2017). Towards replicability in parsing. Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, 185–194. https://doi.org/10.26615/978-954-452-049-6_026
Fares, M., Kutuzov, A., Oepen, S., & Velldal, E. (2017). Word vectors, reuse, and replicability: Towards a community repository of large-text resources. Proceedings of the 21st Nordic Conference on Computational Linguistics, 271–276. https://www.aclweb.org/anthology/W17-0237
Fokkens, A., Erp, M. van, Postma, M., Pedersen, T., Vossen, P., & Freire, N. (2013). Offspring from reproduction problems: What replication failure teaches us. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1691–1701. https://www.aclweb.org/anthology/P13-1166
Fortuna, P., Soler-Company, J., & Nunes, S. (2019). Stop PropagHate at SemEval-2019 tasks 5 and 6: Are abusive language classification results reproducible? Proceedings of the 13th International Workshop on Semantic Evaluation, 745–752. https://doi.org/10.18653/v1/S19-2131
Garneau, N., Godbout, M., Beauchemin, D., Durand, A., & Lamontagne, L. (2020). A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings: Making the method robustly reproducible as well. Proceedings of the 12th Language Resources and Evaluation Conference, 5546–5554. https://www.aclweb.org/anthology/2020.lrec-1.681
Gärtner, M., Hahn, U., & Hermann, S. (2018, May). Preserving workflow reproducibility: The RePlay-DH client as a tool for process documentation. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). https://www.aclweb.org/anthology/L18-1089
Horsmann, T., & Zesch, T. (2017). Do LSTMs really work so well for PoS tagging? – a replication study. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 727–736. https://doi.org/10.18653/v1/D17-1076
Horsmann, T., & Zesch, T. (2018, May). DeepTC – an extension of DKPro text classification for fostering reproducibility of deep learning experiments. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). https://www.aclweb.org/anthology/L18-1403
Htut, P. M., Cho, K., & Bowman, S. (2018a). Grammar induction with neural language models: An unusual replication. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, 371–373. https://doi.org/10.18653/v1/W18-5452
Htut, P. M., Cho, K., & Bowman, S. (2018b). Grammar induction with neural language models: An unusual replication. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 4998–5003. https://doi.org/10.18653/v1/D18-1544
Huber, E., & Çöltekin, Ç. (2020). Reproduction and replication: A case study with automatic essay scoring. Proceedings of the 12th Language Resources and Evaluation Conference, 5603–5613. https://www.aclweb.org/anthology/2020.lrec-1.688
Khoe, Y. H. (2020). Reproducing a morphosyntactic tagger with a meta-BiLSTM model over context sensitive token encodings. Proceedings of the 12th Language Resources and Evaluation Conference, 5563–5568. https://www.aclweb.org/anthology/2020.lrec-1.683
Mieskes, M., Fort, K., Névéol, A., Grouin, C., & Cohen, K. (2019). Community perspective on replicability in natural language processing. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), 768–775. https://doi.org/10.26615/978-954-452-056-4_089
Millour, A., Fort, K., & Magistry, P. (2020). Répliquer et étendre pour l’alsacien “Étiquetage en parties du discours de langues peu dotées par spécialisation des plongements lexicaux” (Replicating and extending for Alsatian: “POS tagging for low-resource languages by adapting word embeddings”). Actes de la 6e Conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). 2e Atelier Éthique et TRaitemeNt Automatique des Langues (ETeRNAL), 29–37. https://www.aclweb.org/anthology/2020.jeptalnrecital-eternal.4
Miltenburg, E. van, Kerkhof, M. van de, Koolen, R., Goudbeek, M., & Krahmer, E. (2019). On task effects in NLG corpus elicitation: A replication study using mixed effects modeling. Proceedings of the 12th International Conference on Natural Language Generation, 403–408. https://doi.org/10.18653/v1/W19-8649
Moore, A., & Rayson, P. (2018). Bringing replication and reproduction together with generalisability in NLP: Three reproduction studies for target dependent sentiment analysis. Proceedings of the 27th International Conference on Computational Linguistics, 1132–1144. https://www.aclweb.org/anthology/C18-1097
Morey, M., Muller, P., & Asher, N. (2017). How much progress have we made on RST discourse parsing? A replication study of recent results on the RST-DT. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 1319–1324. https://doi.org/10.18653/v1/D17-1136
Névéol, A., Cohen, K., Grouin, C., & Robert, A. (2016). Replicability of research in biomedical natural language processing: A pilot evaluation for a coding task. Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis, 78–84. https://doi.org/10.18653/v1/W16-6110
Pluciński, K., Lango, M., & Zimniewicz, M. (2020). A closer look on unsupervised cross-lingual word embeddings mapping. Proceedings of the 12th Language Resources and Evaluation Conference, 5555–5562. https://www.aclweb.org/anthology/2020.lrec-1.682
Rim, K., Tu, J., Lynch, K., & Pustejovsky, J. (2020). Reproducing neural ensemble classifier for semantic relation extraction in scientific papers. Proceedings of the 12th Language Resources and Evaluation Conference, 5569–5578. https://www.aclweb.org/anthology/2020.lrec-1.684
Schwartz, L. (2010). Reproducible results in parsing-based machine translation: The JHU shared task submission. Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, 177–182. https://www.aclweb.org/anthology/W10-1726
Wieling, M., Rawee, J., & Noord, G. van. (2018). Squib: Reproducibility in computational linguistics: Are we willing to share? Computational Linguistics, 44(4), 641–649. https://doi.org/10.1162/coli_a_00330
Wu, T., Ribeiro, M. T., Heer, J., & Weld, D. (2019). Errudite: Scalable, reproducible, and testable error analysis. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 747–763. https://doi.org/10.18653/v1/P19-1073
Zhang, X., & Duh, K. (2020). Reproducible and efficient benchmarks for hyperparameter optimization of neural machine translation systems. Transactions of the Association for Computational Linguistics, 8, 393–408. https://doi.org/10.1162/tacl_a_00322