
Updating the German Psycholinguistic Word Toolbox with AI-Generated Estimates of Concreteness, Valence, Arousal, Age of Acquisition, and Familiarity

  • Identification data

    Identifier:  imarina:9499590
    Authors:  Conde, Javier; Martinez, Gonzalo; Grandury, Maria; Arriaga, Carlos; Haro, Juan; Schroeder, Sascha; Hintz, Florian; Reviriego, Pedro; Brysbaert, Marc
    Abstract:
    This article presents AI-generated estimates for five characteristics of German words: concreteness, valence, arousal, age of acquisition (AoA), and word familiarity. The estimates were generated using GPT-4o-mini, which was selected due to its good performance in previous studies. Validation studies were conducted comparing the AI-generated estimates with both human ratings and previously generated AI data to ensure their usefulness for research applications. The main results are as follows. The GPT estimates of word concreteness, valence, and arousal show a strong correlation with human ratings but are not better than the best available AI-generated estimates based on semantic vectors. The GPT estimates of AoA are good approximations of human ratings and outperform other available alternatives (except for human ratings), especially after the model was fine-tuned based on 2,000 human ratings. Fine-tuned AI-generated estimates of word familiarity have better predictive value than word frequency for word recognition in lexical decision tasks and vocabulary tests. Estimates for concreteness, valence, arousal, and AoA are available for 167,000 words, which are likely to be known to more than 90% of participants in typical adult studies. Word familiarity estimates are presented for 928,000 word forms. All data and code, including newly collected human familiarity ratings for 11,000 words, are publicly available at https://osf.io/ghjd2/. The data may be freely used for research purposes, but not for commercial purposes.
  • Others:

    Link to the original source: https://journalofcognition.org/articles/10.5334/joc.482
    APA: Conde, Javier; Martinez, Gonzalo; Grandury, Maria; Arriaga, Carlos; Haro, Juan; Schroeder, Sascha; Hintz, Florian; Reviriego, Pedro; Brysbaert, Marc (2026). Updating the German Psycholinguistic Word Toolbox with AI-Generated Estimates of Concreteness, Valence, Arousal, Age of Acquisition, and Familiarity. Journal Of Cognition, 9(1), 9-. DOI: 10.5334/joc.482
    Paper original source: Journal Of Cognition. 9 (1): 9-
    Article's DOI: 10.5334/joc.482
    Journal publication year: 2026-01-01
    Entity: Universitat Rovira i Virgili
    Paper version: info:eu-repo/semantics/publishedVersion
    Record's date: 2026-02-09
    URV's Author/s: Haro Rodriguez, Juan
    Department: Psicologia
    Licence document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Publication Type: Journal Publications
    Authors, as they appear in the article: Conde, Javier; Martinez, Gonzalo; Grandury, Maria; Arriaga, Carlos; Haro, Juan; Schroeder, Sascha; Hintz, Florian; Reviriego, Pedro; Brysbaert, Marc
    Licence for use: https://creativecommons.org/licenses/by/3.0/es/
    Thematic Areas: Social sciences, Experimental and cognitive psychology, Interdisciplinary research in the humanities, Psychology, experimental, Social statistics and informatics
    Author's email: juan.haro@urv.cat
  • Keywords:

    Affective norms
    Age of acquisition
    AI-generated word norms
    Arousal
    Concreteness
    Familiarity
    German language
    Ratings
    Valence
    Experimental and Cognitive Psychology
    Psychology, experimental
    Social sciences
    Interdisciplinary research in the humanities
    Social statistics and informatics