nephosem.deprel package¶

Submodules¶

nephosem.deprel.basic module¶

class nephosem.deprel.basic.DiGraph¶

Bases: object

add_edge(e_id, from_v, to_v, e_label)¶: Add an edge to graph.

add_node(v_id, v_label)¶: Add a node with id and label (optional) to graph.

property edges¶

in_degree(v)¶

property istree¶

property nodes¶

out_degree(v)¶

predcessors(v)¶

successors(v)¶
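The method names above suggest a conventional directed-graph interface. The sketch below is a hypothetical stand-in (`TinyDiGraph` is not part of nephosem) that mirrors the documented method names, including the `predcessors` spelling, to show how such an interface is typically used. The tree test in `istree` is an assumed definition, not the library's.

```python
from collections import defaultdict


class TinyDiGraph:
    """Hypothetical stand-in for DiGraph; not nephosem's implementation."""

    def __init__(self):
        self.nodes = {}                 # v_id -> label
        self.edges = {}                 # e_id -> (from_v, to_v, label)
        self._succ = defaultdict(list)  # v_id -> successor ids
        self._pred = defaultdict(list)  # v_id -> predecessor ids

    def add_node(self, v_id, v_label=None):
        self.nodes[v_id] = v_label

    def add_edge(self, e_id, from_v, to_v, e_label):
        self.edges[e_id] = (from_v, to_v, e_label)
        self._succ[from_v].append(to_v)
        self._pred[to_v].append(from_v)

    def in_degree(self, v):
        return len(self._pred[v])

    def out_degree(self, v):
        return len(self._succ[v])

    def successors(self, v):
        return list(self._succ[v])

    def predcessors(self, v):  # spelling follows the documented method name
        return list(self._pred[v])

    @property
    def istree(self):
        # Assumed definition: exactly one root (in-degree 0), every other
        # node has in-degree 1, and all nodes are reachable from the root.
        roots = [v for v in self.nodes if self.in_degree(v) == 0]
        if len(roots) != 1:
            return False
        if any(self.in_degree(v) != 1 for v in self.nodes if v != roots[0]):
            return False
        seen, stack = set(), [roots[0]]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(self._succ[v])
        return seen == set(self.nodes)


g = TinyDiGraph()
g.add_node(0, "eat/VB")
g.add_node(1, "apple/NN")
g.add_node(2, "red/JJ")
g.add_edge(0, 0, 1, "dobj")
g.add_edge(1, 1, 2, "amod")
print(g.in_degree(0), g.out_degree(0), g.istree)
```

The node labels and relation names are invented for illustration only.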

class nephosem.deprel.basic.FeatureGraph(template, target=-1, feature_filter={})¶

Bases: nephosem.deprel.basic.TemplateGraph

Class representing a feature graph, inherited from the class TemplateGraph, so it has the same structure as the template from which it is generated. A feature object is generated as follows:

1. Replicate the (tree) structure of the template.
2. Set the target node index.
3. Set the feature properties of each node (except the target) and each edge.

The feature properties (i.e. True or False) are stored as attributes of the nodes and edges.
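The three generation steps can be sketched with plain dictionaries. This is only an illustration under stated assumptions: the dict layout, the `make_feature` helper and the shape of `feature_filter` (ids mapped to True) are all hypothetical and do not reflect how FeatureGraph stores its data.

```python
import copy

# Hypothetical template: a two-edge tree stored as plain dicts.
template = {
    "nodes": {0: {"label": "VB"}, 1: {"label": "NN"}, 2: {"label": "JJ"}},
    "edges": {(0, 1): {"rel": "dobj"}, (1, 2): {"rel": "amod"}},
}


def make_feature(template, target, feature_filter):
    # 1. Replicate the template structure (the template stays untouched).
    feature = copy.deepcopy(template)
    # 2. Set the target node index.
    feature["target"] = target
    # 3. Flag every non-target node and every edge as feature or not.
    for v_id, attrs in feature["nodes"].items():
        if v_id != target:
            attrs["feature"] = bool(feature_filter.get(v_id, False))
    for e_id, attrs in feature["edges"].items():
        attrs["feature"] = bool(feature_filter.get(e_id, False))
    return feature


feature = make_feature(template, target=1, feature_filter={0: True, (0, 1): True})
print(feature["nodes"])
print(feature["edges"])
```

Copying the template first keeps one template reusable for several features that differ only in target or filter.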

add_match(matched_nodes, matched_edges)¶

Add matched nodes and edges

Parameters

matched_nodes (dict) – mapping from node index to item string
matched_edges (dict) – mapping from edge index to relation string

set_feature(feature_filter={})¶

set_target(target)¶

show(v_label='label', e_label='rel', figsize=(5.0, 5.0))¶

show_match(index=1, v_label='label', e_label='rel', figsize=(5.0, 5.0))¶

property size¶

class nephosem.deprel.basic.Graph(sentence=None, id2node=None)¶

Bases: object

add_edge(e_from_node, e_to_node, e_label)¶: Add an edge to graph.

add_node(v_id, v_label=None)¶: Add a node with id and label (optional) to graph.

build_graph(sentence=None, id2node=None)¶

Build a graph

Parameters

sentence (iterable) – A list of dependency relations.
id2node (dict) – Node id to node string mapping.

build_graph_raw(sentence)¶

Build a graph from raw text (of a sentence)

Parameters: sentence (iterable) – A list of strings

property edges¶

match(path)¶

Match a graph with path.

Parameters: path (PathTemplate) –
Returns: valid matches
Return type: iterable

property nodes¶

class nephosem.deprel.basic.Path(template, matches=None)¶

Bases: object

Class storing path matches found in corpus

add_path(match)¶

Add a match

Parameters: match (iterable) – A list of str

property len¶: size of the template, e.g. ‘:NN:amod:VB:’ has a size of one

classmethod load(filename)¶

save(filename, encoding='utf-8')¶

property size¶: number of matches
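The difference between `len` (a property of the template) and `size` (a count of matches) can be illustrated as follows. The counting rule is an assumption inferred from the single documented example: tokens are taken to alternate node, relation, node, and the length is the number of relations. `template_len` and the sample matches are hypothetical.

```python
def template_len(template):
    # Assumed rule: tokens alternate node, relation, node, ...
    # and the template length is the number of relations.
    tokens = [t for t in template.split(":") if t]
    return (len(tokens) - 1) // 2


# Hypothetical matches; each match is a list of str, as documented.
matches = [["apple/NN", "amod", "red/JJ"], ["car/NN", "amod", "fast/JJ"]]

length = template_len(":NN:amod:VB:")  # one relation
size = len(matches)                    # number of stored matches
print(length, size)
```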

class nephosem.deprel.basic.PathTemplate(nodes, edges)¶

Bases: object

Class representing a path template

property len¶

match_edge(rel, index=0, u=0, v=0)¶

match_node(item, index=0)¶

class nephosem.deprel.basic.Sentence(s)¶

Bases: object

get_content()¶

parse(s=None)¶: parse sentence text (raw string from corpus file)

class nephosem.deprel.basic.SentenceGraph(nodes=None, edges=None, sentence=None)¶

Bases: nephosem.deprel.basic.DiGraph

build_graph(sentence)¶

Build a graph from raw text (of a sentence)

Parameters: sentence (iterable) – A list of strings

generate_graph(nodes, edges)¶

match_feature(feature)¶: Match a sentence with a feature (and target pair)

match_target_feature(feature)¶

Match a graph with a path (a tree/graph object).

Parameters: feature (FeatureGraph) –
Returns: valid matches
Return type: iterable

show(v_label='label', e_label='rel', figsize=(5.0, 5.0))¶

class nephosem.deprel.basic.TemplateGraph(nodes=None, edges=None, graph=None)¶

Bases: nephosem.deprel.basic.DiGraph

Class representing a dependency template tree/graph

static islinear(template)¶

match_edge(rel, idx=0)¶

match_node(item, idx=0)¶

show(v_label='label', e_label='rel', figsize=(5.0, 5.0))¶

nephosem.deprel.basic.get_depth(gx)¶: Get the depth of a tree

nephosem.deprel.basic.get_depth_of_node(gx, v)¶: Get the depth of a node

nephosem.deprel.basic.get_root(gx)¶: Get the root of a tree
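The three tree helpers above can be illustrated on a plain child-list dictionary. This sketch does not use nephosem's graph objects. The function names are hypothetical, and depth is assumed to be counted in edges with the root at depth 0, which may differ from the library's convention.

```python
# Hypothetical tree as a child-list dict: node -> list of children.
children = {0: [1, 2], 1: [3], 2: [], 3: []}


def root_of(children):
    # The root is the only node that never appears as a child.
    all_children = {c for cs in children.values() for c in cs}
    roots = [v for v in children if v not in all_children]
    if len(roots) != 1:
        raise ValueError("not a tree: expected exactly one root")
    return roots[0]


def depth_of_node(children, v):
    # Walk up the parent links, counting edges (root has depth 0).
    parent = {c: p for p, cs in children.items() for c in cs}
    depth = 0
    while v in parent:
        v = parent[v]
        depth += 1
    return depth


def depth_of_tree(children):
    # Depth of the tree is the depth of its deepest node.
    return max(depth_of_node(children, v) for v in children)


print(root_of(children), depth_of_node(children, 3), depth_of_tree(children))
```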

nephosem.deprel.basic.match_level(sentence, feature, currmap)¶

Match the next level based on the index mapping (feature index -> sentence index) of current level

Parameters

sentence (SentenceGraph) –
feature (FeatureGraph) –
currmap (dict) – Index mapping of the current level, from feature node idx to sentence node idx.

Returns

A list of index mappings, each from feature node idx to sentence node idx.

Return type

list of dicts

nephosem.deprel.basic.match_sub_template(sentence=None, feature=None, valid_nodes=None, valid_edges=None)¶

nephosem.deprel.basic.match_successors(sentence, scur, feature, fcur)¶

Match sentence successors with feature successors based on current sentence node and feature node.

Parameters

sentence (SentenceGraph) –
scur (int) – Current node index of sentence
feature (FeatureGraph) –
fcur (int) – Current node index of feature

Returns

A list of index mappings, each from feature node idx to sentence node idx.

Return type

list of dicts
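A minimal sketch of successor matching, assuming a feature successor matches a sentence successor when the edge relation is equal and the sentence node's part-of-speech suffix equals the feature node label. The data layout, the matching rule and the helper names are hypothetical. Only the result shape (a list of dicts from feature node idx to sentence node idx) follows the description above.

```python
from itertools import product

# Hypothetical graphs: node -> label, and (parent, child) -> relation.
sent_nodes = {0: "eat/VB", 1: "apple/NN", 2: "boy/NN", 3: "red/JJ"}
sent_edges = {(0, 1): "dobj", (0, 2): "nsubj", (1, 3): "amod"}
feat_nodes = {0: "VB", 1: "NN", 2: "NN"}
feat_edges = {(0, 1): "dobj", (0, 2): "nsubj"}


def successors(edges, v):
    return [c for (p, c) in edges if p == v]


def match_successors_sketch(scur, fcur):
    # Collect candidate sentence nodes for each feature successor.
    candidates = {}
    for fchild in successors(feat_edges, fcur):
        candidates[fchild] = [
            schild
            for schild in successors(sent_edges, scur)
            if sent_edges[(scur, schild)] == feat_edges[(fcur, fchild)]
            and sent_nodes[schild].endswith("/" + feat_nodes[fchild])
        ]
    # Combine candidates, keeping only one-to-one assignments.
    fkeys = list(candidates)
    results = []
    for combo in product(*(candidates[k] for k in fkeys)):
        if len(set(combo)) == len(combo):
            results.append(dict(zip(fkeys, combo)))
    return results


print(match_successors_sketch(scur=0, fcur=0))
```

If any feature successor has no candidate, the product is empty and no mapping is returned, so a partial match never leaks into the result.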

nephosem.deprel.basic.subtree_match(sentence=None, feature=None, lmatches=None)¶

Parameters

sentence (SentenceGraph) –
feature (FeatureGraph) –
lmatches (collections.deque) – Queue of possible matches. Each match is a list of per-level mappings, one per level of the feature, so a complete match has length feature.depth. Each mapping goes from feature node idx to sentence node idx.

nephosem.deprel.basic.tree_match(sentence, feature)¶

Match the sentence with the feature.

Parameters

sentence (SentenceGraph) –
feature (FeatureGraph) –

nephosem.deprel.corpus module¶

nephosem.deprel.dephandler module¶

class nephosem.deprel.dephandler.DepRelManager(settings)¶

Bases: nephosem.core.handler.BaseHandler

Handler Class for processing dependency relations

build_dep_rel(fnames=None, multicore=True)¶

The function will treat all different word types as possible target or context words.

Parameters

fnames (str, optional) – Name of a file that lists the corpus file names the user wants to process. Format: corpus_name + settings["fnames-ext"]
row_vocab (Vocab) – Target words (types) vocabulary. If a non-empty vocabulary is passed, only target words (types) in this vocab should be processed. Otherwise all possible words (types) should be processed.
col_vocab (Vocab) – Context features vocabulary. If a non-empty vocabulary is passed, only context features in this vocab should be processed. Otherwise all possible contexts should be processed.
multicore (bool) – Whether to use the multicore version of the method.

do_job_single(fnames, **superkwargs)¶

Method doing job for handler class.

Parameters: fnames (iterable) – A list of filenames

merge_results()¶: Merge sub-process matrices into one final matrix. Filename format of sub-process matrices: …/matches.sub.pid

process(fnames, queue_factor=2)¶

Parameters

fnames –
queue_factor (int, optional) – Multiplier for size of queue -> size = number of workers * queue_factor.
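The queue sizing rule can be shown with the standard library. Whether nephosem uses `queue.Queue`, `multiprocessing.Queue` or something else is not stated here, so `make_job_queue` is a hypothetical helper that only demonstrates the arithmetic.

```python
import queue


def make_job_queue(n_workers, queue_factor=2):
    # size = number of workers * queue_factor
    return queue.Queue(maxsize=n_workers * queue_factor)


q = make_job_queue(n_workers=4)
print(q.maxsize)
```

A bounded queue makes the producer block once it is that many jobs ahead of the workers, which limits memory use.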

read_features(fname=None, features=None, encoding='utf-8')¶

read_template(fname=None, features=None, encoding='utf-8')¶: Read paths from file

nephosem.deprel.dephandler.update_dep_rel_caller(fnames, tmpdir=None, settings=None)¶

Save the path template matches of a sub-process. Filename format of sub-process objects:

matrix: paths.sub.pid

Parameters

fnames (iterable) – A list of filenames
tmpdir (str) – Temporary folder
settings (dict) –

nephosem.deprel.deputils module¶

nephosem.deprel.deputils.cartesian_product(mapping)¶: Expand a mapping from keys to candidate values into a list of dicts with one value per key.

{1: (1, 2), 2: (3, 4)} -> [{1: 1, 2: 3}, {1: 2, 2: 3}, {1: 1, 2: 4}, {1: 2, 2: 4}]

Should perform: {1: (1, 2), 2: (1, 2)} -> [{1: 1, 2: 2}] (if the target is not in (1, 2)) [{1: 1, 2: 2}, {1: 2, 2: 1}]
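The expansion in the examples can be sketched with `itertools.product`. This is not the library's implementation: `cartesian_product_sketch` and its `distinct` flag are hypothetical. The flag reflects one reading of the second example, in which no value may be assigned to two keys, and the order of the returned dicts may differ from the order shown in the docstring.

```python
from itertools import product


def cartesian_product_sketch(mapping, distinct=False):
    keys = list(mapping)
    combos = []
    for values in product(*(mapping[k] for k in keys)):
        # With distinct=True, skip assignments that reuse a value.
        if distinct and len(set(values)) != len(values):
            continue
        combos.append(dict(zip(keys, values)))
    return combos


plain = cartesian_product_sketch({1: (1, 2), 2: (3, 4)})
one_to_one = cartesian_product_sketch({1: (1, 2), 2: (1, 2)}, distinct=True)
print(plain)
print(one_to_one)
```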

nephosem.deprel.deputils.group(nodemap)¶

nephosem.deprel.deputils.judgeIn(sLine)¶

nephosem.deprel.deputils.judgeIn_v1(sLine)¶

nephosem.deprel.deputils.judgeOut(sLine)¶

nephosem.deprel.deputils.outMap(item)¶

nephosem.deprel.deputils.process_sentence(sLine)¶

Process a sentence line of the form ‘sid:edge1;edge2;edge3…’

Parameters: sLine –
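A minimal sketch of splitting a line in this format into a sentence id and its edge strings. The internal format of each edge is not specified here, so edges are kept as opaque strings. `split_sentence_line` is a hypothetical helper, not the library function.

```python
def split_sentence_line(line):
    # Split once on the first ':' so that colons inside edges survive.
    sid, _, rest = line.strip().partition(":")
    # Edges are separated by ';'; drop empty pieces from a trailing ';'.
    edges = [e for e in rest.split(";") if e]
    return sid, edges


sid, edges = split_sentence_line("s1:edge1;edge2;edge3")
print(sid, edges)
```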

nephosem.deprel.deputils.read_nodes(in_nodes, encoding='utf-8')¶: Read nodes from file, e.g.:

a/DT 4 an/DT 21 …
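Assuming the file holds one node per line, with the node string followed by whitespace and an integer count (the sample above does not make the separator explicit), reading it could look like this. `read_node_counts` is a hypothetical helper, and an in-memory file stands in for the real one.

```python
import io


def read_node_counts(handle):
    counts = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace: item first, count last.
        item, freq = line.rsplit(None, 1)
        counts[item] = int(freq)
    return counts


sample = io.StringIO("a/DT 4\nan/DT 21\n")
counts = read_node_counts(sample)
print(counts)
```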

nephosem.deprel.deputils.split_large_file(filename, encoding='utf-8')¶: Split a large corpus file into smaller ones for multicore processing
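How the library chooses chunk boundaries and names the output files is not documented here. As a rough illustration of the idea, the hypothetical helper below splits a file into fixed-size groups of lines and writes them next to the original with a numeric suffix. Both the `lines_per_chunk` parameter and the naming scheme are assumptions.

```python
import os
import tempfile


def split_by_lines(filename, lines_per_chunk, encoding="utf-8"):
    """Write consecutive chunks to filename.0, filename.1, ... and return the paths."""
    paths, chunk = [], []

    def flush():
        path = f"{filename}.{len(paths)}"
        with open(path, "w", encoding=encoding) as out:
            out.writelines(chunk)
        paths.append(path)
        chunk.clear()

    with open(filename, encoding=encoding) as src:
        for line in src:
            chunk.append(line)
            if len(chunk) == lines_per_chunk:
                flush()
    if chunk:  # write the final, possibly shorter, chunk
        flush()
    return paths


with tempfile.TemporaryDirectory() as tmp:
    corpus = os.path.join(tmp, "corpus.txt")
    with open(corpus, "w", encoding="utf-8") as f:
        f.write("l1\nl2\nl3\nl4\nl5\n")
    parts = split_by_lines(corpus, lines_per_chunk=2)
    part_names = [os.path.basename(p) for p in parts]
    part_lines = []
    for p in parts:
        with open(p, encoding="utf-8") as f:
            part_lines.append(f.read().splitlines())

print(part_names)
print(part_lines)
```

A real corpus splitter would likely cut on sentence or document boundaries instead of raw line counts, so that no sentence is divided between workers.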

nephosem.deprel.tmp module¶

Module contents¶