About us

Aim of the project

The increasing importance of corpus data in linguistics creates a need for appropriate methods for retrieving semantic information from corpora. In this project, existing computational methods of distributional corpus semantics are further developed in the form of a meaning detection approach based on token clouds, i.e. clusters of distributionally similar attestations of words or expressions in a multidimensional vector space. The first phase of the project has a methodological orientation, focusing on the fine-tuning of such a ‘nephological’ method for detecting linguistic meanings in corpus data. In the second phase of the project, the method is put to use in two descriptive research lines: lectometrical research into the relationship between language varieties, and variationist grammar research.

Project members

  • Dirk Geeraerts

  • Dirk Speelman

  • Stefania Marzo

  • Benedikt Szmrecsanyi

  • Karlien Franco

  • Kris Heylen

  • Stefano De Pascale

  • Mariana Montes

  • Weiwei Zhang

Publications

The theoretical framework and methodology followed in the project were presented by Mariana Montes and Karlien Franco at the II Jornadas de Lingüística y Gramática Española on October 1, 2021. You can watch the presentation in English or dubbed into Spanish.

Publications using this code

De Pascale, Stefano. 2019. Token-based vector space models as semantic control in lexical lectometry. Leuven: KU Leuven PhD Dissertation. (8 November, 2019).

De Pascale, Stefano & Weiwei Zhang. 2021. Scoring with Token-based Models. A Distributional Semantic Replication of Sociolectometric Analyses in Geeraerts, Grondelaers, and Speelman (1999). In Gitte Kristiansen, Karlien Franco, Stefano De Pascale, Laura Rosseel & Weiwei Zhang (eds.), Cognitive Sociolinguistics Revisited, 186–199. De Gruyter. https://doi.org/10.1515/9783110733945-021.

Montes, Mariana. 2021. Cloudspotting: visual analytics for distributional semantics. Leuven: KU Leuven PhD Dissertation.

Montes, Mariana, Karlien Franco & Kris Heylen. 2021. Indestructible Insights. A Case Study in Distributional Prototype Semantics. In Gitte Kristiansen, Karlien Franco, Stefano De Pascale, Laura Rosseel & Weiwei Zhang (eds.), Cognitive Sociolinguistics Revisited, 251–263. De Gruyter. https://doi.org/10.1515/9783110733945-021.

Montes, Mariana & Kris Heylen. 2022. Visualizing Distributional Semantics. In Dennis Tay & Molly Xie Pan (eds.), Data Analytics in Cognitive Linguistics. Methods and Insights. Mouton De Gruyter.