Creating a concordance with TokenHandler
The type2toks attribute of the nephosem.TokenHandler class is a dictionary with type names as keys and nephosem.TypeNode objects as values. Each TypeNode object has a tokens attribute, which is a list of nephosem.TokenNode objects with information on each collected token. From these we can create a concordance, for instance with nephosem.utils.save_concordance(), as used below.
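The nesting described above can be sketched with stand-in classes. The snippet does not use nephosem: StubTypeNode and StubTokenNode, and the fields token_id, left, target and right, are hypothetical names chosen for illustration, and the real TokenNode attributes may differ. It only shows how a dictionary of types, each holding a list of tokens, flattens into one concordance row per token.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StubTokenNode:
    # Hypothetical fields; the real nephosem.TokenNode may store context differently.
    token_id: str
    left: str
    target: str
    right: str


@dataclass
class StubTypeNode:
    # Mirrors only the documented `tokens` attribute: a list of token objects.
    tokens: List[StubTokenNode] = field(default_factory=list)


def flatten_concordance(
    type2toks: Dict[str, StubTypeNode],
) -> List[Tuple[str, str, str, str]]:
    """Return one (token_id, left, target, right) row per token, type by type."""
    rows = []
    for type_node in type2toks.values():
        for tok in type_node.tokens:
            rows.append((tok.token_id, tok.left, tok.target, tok.right))
    return rows


# Toy data shaped like the description: type name -> node holding its tokens.
type2toks = {
    "girl/N": StubTypeNode(tokens=[
        StubTokenNode("girl/N/StanfDepSents.1/3", "The", "girl", "looks healthy"),
        StubTokenNode("girl/N/StanfDepSents.10/13", "The", "girl", "sits down"),
    ])
}

rows = flatten_concordance(type2toks)
for row in rows:
    print("\t".join(row))
```

With a real TokenHandler, the dictionary would come from tokhan.type2toks after retrieve_tokens() has run, and the context strings would have to be built from whatever the TokenNode objects actually expose.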
[9]:
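import sys
nephosemdir = "../../nephosem/"  # location of the local nephosem repository
sys.path.append(nephosemdir)     # make the nephosem package importable
mydir = "./"
from nephosem import ConfigLoader, Vocab, TokenHandler
from nephosem.utils import save_concordance
conf = ConfigLoader()
settings = conf.update_config('config.ini')  # default settings updated from config.ini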
Collect tokens
[2]:
query = Vocab({'girl/N' : 0}) # dummy query just for illustration
# alternatively, if you already have a vocabulary, vocab.subvocab(['girl/N'])
[3]:
tokhan = TokenHandler(query, settings=settings)
tokens = tokhan.retrieve_tokens()
tokens
WARNING: Not provide the temporary path!
WARNING: Use the default tmp directory: '~/tmp'!
Scanning tokens of queries in corpus...
[3]:
[21, 39] be/V what/W that/I a/D ,/, ask/V and/C ...
girl/N/StanfDepSents.1/3 NaN NaN NaN NaN NaN NaN NaN ...
girl/N/StanfDepSents.1/13 NaN NaN NaN NaN NaN NaN NaN ...
girl/N/StanfDepSents.1/20 NaN NaN NaN NaN NaN NaN NaN ...
girl/N/StanfDepSents.2/29 -4 NaN NaN NaN NaN NaN NaN ...
girl/N/StanfDepSents.8/3 NaN NaN NaN NaN NaN NaN NaN ...
girl/N/StanfDepSents.8/15 NaN NaN NaN NaN NaN NaN NaN ...
girl/N/StanfDepSents.8/25 NaN NaN NaN NaN NaN NaN -2 ...
... ... ... ... ... ... ... ... ...
[10]:
outputfile = 'output/concordance.tsv'
save_concordance(outputfile, tokhan.type2toks, colloc_fmt='word')
Read concordance
nephosem.utils.save_concordance() writes the concordance directly to outputfile as a tab-separated table without a header row, so the column names have to be supplied when reading it back.
[12]:
import pandas as pd
# The file has no header row, so the column names are passed explicitly.
pd.read_csv(outputfile, sep='\t', names=['token_id', 'left', 'target', 'right'])
[12]:
| token_id | left | target | right |
---|---|---|---|---|
0 | girl/N/StanfDepSents.1/3 | The | girl | looks healthy |
1 | girl/N/StanfDepSents.1/13 | boy looks at the | girl | as she eats |
2 | girl/N/StanfDepSents.1/20 | The | girl | eats less healthy food |
3 | girl/N/StanfDepSents.2/29 | are eaten by the | girl | NaN |
4 | girl/N/StanfDepSents.8/3 | The | girl | sat on the apple |
5 | girl/N/StanfDepSents.8/15 | boy looked at the | girl | 's apple |
6 | girl/N/StanfDepSents.8/25 | the boys and the | girls | eat apples |
7 | girl/N/StanfDepSents.4/7 | boy says that the | girl | should eat the apple |
8 | girl/N/StanfDepSents.4/15 | The | girl | eats the apple that |
9 | girl/N/StanfDepSents.9/14 | The older | girl | looks at a boy |
10 | girl/N/StanfDepSents.5/19 | What the | girl | eats was given by |
11 | girl/N/StanfDepSents.11/3 | The | girl | looks at the boy |
12 | girl/N/StanfDepSents.11/19 | the apple which the | girl | gave him |
13 | girl/N/StanfDepSents.11/28 | This year , the | girl | looked at a boy |
14 | girl/N/StanfDepSents.3/21 | The boy and the | girl | eat a healthy and |
15 | girl/N/StanfDepSents.6/6 | The boy gives the | girl | a tasty healthy apple |
16 | girl/N/StanfDepSents.6/21 | The | girl | does n't eat |
17 | girl/N/StanfDepSents.10/13 | The | girl | sits down |
18 | girl/N/StanfDepSents.10/19 | The | girl | eats about ten apples |
19 | girl/N/StanfDepSents.7/7 | old boy gives the | girl | a baby apple |
20 | girl/N/StanfDepSents.7/25 | The boy asked the | girl | about eating apples |
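
For reading on screen, a concordance is often printed in keyword-in-context layout, with the target aligned in one column. The following is a minimal sketch, assuming the four-column layout shown above. format_kwic is a hypothetical helper, not part of nephosem, and the three rows are copied from the table rather than read from outputfile (the empty last field in the third row is an assumption about how a missing right context is stored).

```python
import csv
import io

# Three rows copied from the table above, in a headerless tab-separated layout.
sample = (
    "girl/N/StanfDepSents.1/3\tThe\tgirl\tlooks healthy\n"
    "girl/N/StanfDepSents.1/13\tboy looks at the\tgirl\tas she eats\n"
    "girl/N/StanfDepSents.2/29\tare eaten by the\tgirl\t\n"
)

# QUOTE_NONE keeps quotation marks in the corpus text as literal characters.
reader = csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
rows = [tuple(row) for row in reader]


def format_kwic(rows):
    """Right-align the left context so that all targets start in the same column."""
    width = max(len(left) for _, left, _, _ in rows)
    return [
        f"{left:>{width}}  {target}  {right}".rstrip()
        for _, left, target, right in rows
    ]


lines = format_kwic(rows)
print("\n".join(lines))
```

To apply the same helper to the real file, the StringIO object can be replaced by an open file handle on outputfile.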