
Updating the German Psycholinguistic Word Toolbox with AI-Generated Estimates of Concreteness, Valence, Arousal, Age of Acquisition, and Familiarity

  • Identifying data

    Identifier:  imarina:9484717
    Authors:  Conde, Javier; Martinez, Gonzalo; Grandury, Maria; Arriaga, Carlos; Haro, Juan; Schroeder, Sascha; Hintz, Florian; Reviriego, Pedro; Brysbaert, Marc
    Abstract:
    This article presents AI-generated estimates for five characteristics of German words: concreteness, valence, arousal, age of acquisition (AoA), and word familiarity. The estimates were generated using GPT-4o-mini, which was selected for its good performance in previous studies. Validation studies compared the AI-generated estimates with both human ratings and previously generated AI data to ensure their usefulness for research applications. The main results are as follows. The GPT estimates of word concreteness, valence, and arousal correlate strongly with human ratings but are not better than the best available AI-generated estimates based on semantic vectors. The GPT estimates of AoA are good approximations of human ratings and outperform other available alternatives (except for human ratings), especially after the model was fine-tuned on 2,000 human ratings. Fine-tuned AI-generated estimates of word familiarity have better predictive value than word frequency for word recognition in lexical decision tasks and vocabulary tests. Estimates for concreteness, valence, arousal, and AoA are available for 167,000 words, which are likely to be known to more than 90% of participants in typical adult studies. Word familiarity estimates are presented for 928,000 word forms. All data and code, including newly collected human familiarity ratings for 11,000 words, are publicly available at https://osf.io/ghjd2/. The data may be freely used for research purposes, but not for commercial purposes.
  • Other:

    Original source link: https://journalofcognition.org/articles/10.5334/joc.482
    Item reference in APA style: Conde, Javier; Martinez, Gonzalo; Grandury, Maria; Arriaga, Carlos; Haro, Juan; Schroeder, Sascha; Hintz, Florian; Reviriego, Pedro; Brysbaert, Marc (2026). Updating the German Psycholinguistic Word Toolbox with AI-Generated Estimates of Concreteness, Valence, Arousal, Age of Acquisition, and Familiarity. Journal Of Cognition, 9(1), 9-. DOI: 10.5334/joc.482
    Article reference according to the original source: Journal Of Cognition. 9 (1): 9-
    Article DOI: 10.5334/joc.482
    Journal publication year: 2026-01-01
    Entity: Universitat Rovira i Virgili
    Deposited article version: info:eu-repo/semantics/publishedVersion
    Record creation date: 2026-02-09
    URV author(s): Haro Rodriguez, Juan
    Department: Psychology
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Publication type: Journal Publications
    Author according to the article: Conde, Javier; Martinez, Gonzalo; Grandury, Maria; Arriaga, Carlos; Haro, Juan; Schroeder, Sascha; Hintz, Florian; Reviriego, Pedro; Brysbaert, Marc
    Access to the license of use: https://creativecommons.org/licenses/by/3.0/es/
    Subject areas: Social statistics and informatics, Psychology, experimental, Interdisciplinary research in the humanities, Experimental and cognitive psychology, Social sciences
    Author's email address: juan.haro@urv.cat
  • Keywords:

    Valence
    Ratings
    German language
    Familiarity
    Concreteness
    Arousal
    AI-generated word norms
    Age of acquisition
    Affective norms
    Experimental and Cognitive Psychology
    Psychology
    Experimental
    Social statistics and informatics
    Interdisciplinary research in the humanities
    Social sciences