About us
========

Aim of the project
------------------

The increasing importance of corpus data in linguistics creates a need
for appropriate methods for retrieving semantic information from corpora.
In this project, existing computational methods of distributional corpus semantics
are further developed in the form of a meaning detection approach based on token clouds,
i.e. clusters of distributionally similar attestations of words or expressions in a
multidimensional vector space. The first phase of the project has a methodological
orientation, focusing on the fine-tuning of such a 'nephological' method for detecting
linguistic meanings in corpus data. In the second phase of the project, the method is
put to use in two descriptive research lines: lectometrical research into the
relationship between language varieties, and variationist grammar research.

Project members
---------------

* Dirk Geeraerts
* Dirk Speelman
* Stefania Marzo
* Benedikt Szmrecsanyi
* Karlien Franco
* Kris Heylen
* Stefano De Pascale
* Mariana Montes
* Weiwei Zhang

Publications
------------

The theoretical framework and methodology followed in the project
were presented by Mariana Montes and Karlien Franco at the
II Jornadas de Lingüística y Gramática Española on October 1, 2021.
You can watch the presentation in `English <https://www.youtube.com/watch?v=BZnTXSf6heY&t=2508s>`_
or `dubbed into Spanish <https://www.youtube.com/watch?v=lpqgBXZfuPc>`_.

.. _publications:

Publications using this code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

De Pascale, Stefano. 2019. *Token-based vector space models as semantic control in lexical lectometry*.
Leuven: KU Leuven PhD Dissertation. (8 November, 2019).

De Pascale, Stefano & Weiwei Zhang. 2021. Scoring with Token-based Models.
A Distributional Semantic Replication of Sociolectometric Analyses in Geeraerts, Grondelaers, and Speelman (1999).
In Gitte Kristiansen, Karlien Franco, Stefano De Pascale, Laura Rosseel & Weiwei Zhang (eds.),
*Cognitive Sociolinguistics Revisited*, 186–199. De Gruyter. https://doi.org/10.1515/9783110733945-021.

Montes, Mariana. 2021. *Cloudspotting: visual analytics for distributional semantics*.
Leuven: KU Leuven PhD Dissertation.

Montes, Mariana, Karlien Franco & Kris Heylen. 2021. Indestructible Insights.
A Case Study in Distributional Prototype Semantics.
In Gitte Kristiansen, Karlien Franco, Stefano De Pascale, Laura Rosseel & Weiwei Zhang (eds.),
*Cognitive Sociolinguistics Revisited*, 251–263. De Gruyter. https://doi.org/10.1515/9783110733945-021.

Montes, Mariana & Kris Heylen. 2022. Visualizing Distributional Semantics.
In Dennis Tay & Molly Xie Pan (eds.), *Data Analytics in Cognitive Linguistics. Methods and Insights*.
Mouton De Gruyter.

Related publications
^^^^^^^^^^^^^^^^^^^^

Heylen, Kris, Dirk Speelman & Dirk Geeraerts. 2012. Looking at word meaning.
An interactive visualization of Semantic Vector Spaces for Dutch synsets.
In *Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH*, 16–24. Avignon.

Heylen, Kris, Thomas Wielfaert, Dirk Speelman & Dirk Geeraerts. 2015.
Monitoring polysemy: Word space models as a tool for large-scale lexical semantic analysis.
*Lingua* 157. 153–172.

Speelman, Dirk, Stefan Grondelaers, Benedikt Szmrecsanyi & Kris Heylen. 2020.
Schaalvergroting in het syntactische alternantieonderzoek:
Een nieuwe analyse van het presentatieve er met automatisch gegenereerde predictoren.
*Nederlandse Taalkunde* 25(1). 101–123. https://doi.org/10.5117/NEDTAA2020.1.005.SPEE.