Nephological Semantics

Welcome! This is the current website of the Nephological Semantics Project, developed in the QLVL research group at KU Leuven. You can learn more about the project here.

One of the main products of our project is Nephosem, a Python package with functions to create type- and token-level distributional models, with both bag-of-words and dependency information. On this site you can find the full reference as well as tutorials.

This package has been used in lexical semantics and lectometry studies within the Nephological Semantics project; the derived publications are listed here.

Specific applications

Semasiological workflow

The semasiological workflow looks at the internal structure of individual words based on the contexts of their occurrences. For each word, it creates multiple token-level models (vector representations of each of its instances), each combining different parameter settings (i.e. ways of defining context). Then it selects representative models and visualizes them in an interactive tool. A more or less technical explanation of the procedure is available here.
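To make the idea of "multiple token-level models" more concrete, here is a minimal sketch in plain Python. It does not use the Nephosem API: the function `token_vectors`, the toy stopword list and the two parameters (window size and stopword filtering) are assumptions chosen for illustration only. Each combination of parameter values yields one model, and each model holds one context vector per occurrence of the target word.

```python
from collections import Counter
from itertools import product

# Toy stopword list, made up for this example.
STOPWORDS = {"the", "on", "we", "while"}


def token_vectors(tokens, target, window, drop_stopwords):
    """Return one bag-of-words vector (a Counter) per occurrence of `target`."""
    vectors = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        # Context = up to `window` words on each side of this occurrence.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if drop_stopwords:
            context = [w for w in context if w not in STOPWORDS]
        vectors.append(Counter(context))
    return vectors


corpus = (
    "the bank raised rates while we sat on the river bank near the old bank"
).split()

# Hypothetical parameter grid: every combination defines one model.
windows = [2, 4]
stopword_settings = [False, True]

models = {
    (window, drop): token_vectors(corpus, "bank", window, drop)
    for window, drop in product(windows, stopword_settings)
}

for settings, vectors in models.items():
    print(settings, [dict(v) for v in vectors])
```

The same three occurrences of "bank" receive different representations in each of the four models, which is why a later step is needed to compare models and select representative ones.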

The Nephosem package is at the core of this workflow, but is then expanded with other tools:

The semasioFlow Python package, which organizes and compacts Nephosem functions in a way specific to the semasiological workflow.
The semcloud R package, which takes the output of semasioFlow and prepares the data for visualization, running dimensionality reduction and clustering and generating annotated concordances [1].
The NephoVis interactive visualization tool (see link above) for exhaustive, qualitative exploration of the models.
The Level 3 ShinyApp for deeper exploration of individual models.
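As a rough illustration of what a model-related concordance annotation can look like, the sketch below marks the target word, highlights the context words captured by a model and appends their weights as superscripts. This is not the output format of semcloud: the function `annotate`, the HTML tags and the weight values are all assumptions made for the example.

```python
def annotate(tokens, target_index, weights, window):
    """Render one concordance line for a single token.

    `weights` maps context words selected by a (hypothetical) model to their
    weighting values; only words inside the window are highlighted.
    """
    parts = []
    for i, tok in enumerate(tokens):
        if i == target_index:
            # The target token itself.
            parts.append(f"<em>{tok}</em>")
        elif abs(i - target_index) <= window and tok in weights:
            # A context word captured by the model, with its weight.
            parts.append(f"<strong>{tok}</strong><sup>{weights[tok]:.2f}</sup>")
        else:
            parts.append(tok)
    return " ".join(parts)


tokens = "we sat on the river bank near the old mill".split()
weights = {"river": 3.1, "mill": 1.4}  # made-up weighting values

line = annotate(tokens, target_index=5, weights=weights, window=2)
print(line)
```

In this toy case "mill" has a weight but falls outside the window of two words, so it is left unmarked: the annotation reflects what a given model captures, not what a human annotator would consider relevant.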

To start, you can take a look at this notebook, which shows the main steps using semasioFlow and Nephosem, starting with a corpus in CoNLL format (one token per line, columns for different features) and ending with token-by-token distance matrices as well as a number of metadata registers.
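For readers unfamiliar with the input and output mentioned above, the following sketch goes from a tiny CoNLL-style corpus to a token-by-token cosine distance matrix using only the Python standard library. It is not how semasioFlow or Nephosem implement these steps; the column layout (id, form, lemma, pos), the token ID scheme and the helper names `read_conll`, `token_vectors` and `cosine_distance` are assumptions for illustration.

```python
import math
from collections import Counter

# Toy corpus: one token per line, tab-separated columns, blank line between
# sentences. The column layout is an assumption for this sketch.
CONLL_TEXT = "\n".join([
    "1\tThe\tthe\tDET", "2\tbank\tbank\tNOUN",
    "3\traised\traise\tVERB", "4\trates\trate\tNOUN",
    "",
    "1\tThe\tthe\tDET", "2\triver\triver\tNOUN",
    "3\tbank\tbank\tNOUN", "4\tflooded\tflood\tVERB",
    "",
    "1\tThe\tthe\tDET", "2\tbank\tbank\tNOUN",
    "3\traised\traise\tVERB", "4\tfees\tfee\tNOUN",
])


def read_conll(text, columns=("id", "form", "lemma", "pos")):
    """Parse the text into a list of sentences, each a list of row dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(dict(zip(columns, line.split("\t"))))
    if current:
        sentences.append(current)
    return sentences


def token_vectors(sentences, target_lemma, window):
    """Return token IDs and bag-of-words vectors for each target occurrence."""
    ids, vectors = [], []
    for s, sentence in enumerate(sentences, start=1):
        lemmas = [row["lemma"] for row in sentence]
        for i, lemma in enumerate(lemmas):
            if lemma == target_lemma:
                context = lemmas[max(0, i - window):i] + lemmas[i + 1:i + 1 + window]
                # Hypothetical ID scheme: lemma/sentence number/token id.
                ids.append(f"{target_lemma}/s{s}/{sentence[i]['id']}")
                vectors.append(Counter(context))
    return ids, vectors


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


sentences = read_conll(CONLL_TEXT)
ids, vectors = token_vectors(sentences, "bank", window=2)
matrix = [[cosine_distance(a, b) for b in vectors] for a in vectors]

for token_id, row in zip(ids, matrix):
    print(token_id, [round(d, 2) for d in row])
```

The two "bank raised ..." tokens end up closer to each other than to the "river bank" token, which is the kind of structure the later visualization steps are meant to expose.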

Lectometric workflow

Coming soon!

Footnotes

1: These annotations are not semantic but model-related: context words captured by a given model are highlighted, and weighting values may be included as superscripts.