Datasets of trait-concept pairs for specific trait types in English and Spanish, derived from the McRae and Norms datasets [1,2].
The datasets cover five trait types: colours, components, materials, size & shape, and tactile. For details on how the datasets were constructed from the original datasets and on the translation to Spanish, see the original paper:
@inproceedings{and22-dist-hyp,
title = "Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: {T}rait-based Relational Knowledge and the Impact of Co-occurrences",
author = "Anderson, Mark and Camacho Collados, Jose",
booktitle = "To appear in proceedings of *SEM 2022: The Eleventh Joint Conference on Lexical and Computational Semantics",
month = jul,
year = "2022",
address = "Seattle",
publisher = "Association for Computational Linguistics",
}
For English, each trait type has datasets derived from both McRae and Norms, each in single-labelled and multi-labelled versions. For Spanish, only a single-labelled dataset derived from McRae is available.
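As a rough illustration of the combinations described here, the following sketch enumerates them in Python. The language codes, label strings, and the function name `available_configurations` are hypothetical and do not reflect the actual file or configuration names in this repository. The sketch also assumes that the Spanish data covers each of the five trait types, which this description does not state explicitly.

```python
from itertools import product

# Trait types as listed in this README; the strings are illustrative labels only.
TRAIT_TYPES = ["colours", "components", "materials", "size & shape", "tactile"]


def available_configurations():
    """Enumerate (language, source, labelling, trait) combinations.

    Hypothetical helper: the tuple layout is an assumption made for this sketch.
    """
    configs = []
    # English: both source datasets and both labelling schemes per trait type.
    for source, labelling, trait in product(
        ["McRae", "Norms"], ["single", "multi"], TRAIT_TYPES
    ):
        configs.append(("en", source, labelling, trait))
    # Spanish: McRae, single-labelled only (assumed to cover every trait type).
    for trait in TRAIT_TYPES:
        configs.append(("es", "McRae", "single", trait))
    return configs


configs = available_configurations()
for config in configs:
    print(config)
```

A listing like this can help when checking that a requested combination, for example a multi-labelled Spanish dataset, is not expected to exist before trying to load it.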
[1] Ken McRae, George S. Cree, Mark S. Seidenberg, and Chris McNorgan (2005) Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37:547–559.
[2] Barry Devereux, Lorraine K. Tyler, Jeroen Geertzen, and Billi Randall (2014) The Centre for Speech, Language and the Brain (CSLB) concept property norms. Behavior Research Methods, 46:1119–1127.