cnt.rulebase.workflow package

Submodules

cnt.rulebase.workflow.basic_workflow module

Basic Workflow.

class cnt.rulebase.workflow.basic_workflow.BasicConfig[source]

Bases: object

Configuration that can be accessed by LabelProcessor and OutputGenerator.

class cnt.rulebase.workflow.basic_workflow.BasicLabelProcessor(input_sequence, index_labels_generator, config)[source]

Bases: object

Define the interface of LabelProcessor.

Parameters

result()[source]

A label processor may return a value of any type. Derived classes must override this method.

Return type

Any

class cnt.rulebase.workflow.basic_workflow.BasicOutputGenerator(input_sequence, label_processor_result, config)[source]

Bases: object

Define the interface of OutputGenerator.

Parameters
  • input_sequence (str) – The input sequence.

  • label_processor_result (Any) – The result of BasicLabelProcessor.

result()[source]

An output generator may return a value of any type. Derived classes must override this method.

Return type

Any

class cnt.rulebase.workflow.basic_workflow.BasicSequentialLabeler(input_sequence, config)[source]

Bases: object

Define the interface of SequentialLabeler.

Parameters

input_sequence (str) – The input sequence.

label(index)[source]

Return the boolean label for self.input_sequence[index]. Derived classes must override this method.

Parameters

index (int) – The index of self.input_sequence.

Return type

bool
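A derived labeler might look like the following minimal sketch. To keep it self-contained, SketchSequentialLabeler is a hypothetical stand-in that only mirrors the documented constructor and label(index) signature; it is not the library class, and storing config on the instance is an assumption. DigitLabeler is likewise an illustrative name.

```python
class SketchSequentialLabeler:
    """Hypothetical stand-in mirroring the documented BasicSequentialLabeler interface."""

    def __init__(self, input_sequence, config):
        self.input_sequence = input_sequence
        self.config = config  # assumption: the config is kept on the instance

    def label(self, index):
        # Derived classes must override this method.
        raise NotImplementedError


class DigitLabeler(SketchSequentialLabeler):
    """Illustrative derived labeler: True for ASCII digits, False otherwise."""

    def label(self, index):
        return "0" <= self.input_sequence[index] <= "9"


labeler = DigitLabeler("a1b22", config=None)
# One boolean label per character of the input sequence.
digit_labels = [labeler.label(i) for i in range(len(labeler.input_sequence))]
```

Each labeler answers a single yes/no question per character, so several labelers can be run side by side over the same input.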

class cnt.rulebase.workflow.basic_workflow.BasicWorkflow(sequential_labeler_classes, label_processor_class, output_generator_class)[source]

Bases: object

Define the basic workflow. Uses the composite pattern to organize the steps of rule-based processing.

Parameters
  • sequential_labeler_classes (Iterable[Type[BasicSequentialLabeler]]) – For char-level sequential labeling.

  • label_processor_class (Type[BasicLabelProcessor]) – Label post-processing. Typically, this step generates new labels based on the results of the labelers in sequential_labeler_classes.

  • output_generator_class (Type[BasicOutputGenerator]) – Generates output based on the input sequence and the labels.

result(input_sequence, config=None)[source]

Execute the workflow.

Parameters

input_sequence (str) – The input sequence.

Return type

Any
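The three steps described above could be wired together roughly as in the sketch below. All Sketch* classes are hypothetical stand-ins that mirror only the documented constructor signatures. In particular, the shape of index_labels_generator (pairs of an index and a tuple with one boolean per labeler) is an assumption, not something the documentation confirms.

```python
class SketchLabeler:
    def __init__(self, input_sequence, config):
        self.input_sequence = input_sequence
        self.config = config

    def label(self, index):
        raise NotImplementedError


class SketchLabelProcessor:
    def __init__(self, input_sequence, index_labels_generator, config):
        self.input_sequence = input_sequence
        self.index_labels_generator = index_labels_generator
        self.config = config

    def result(self):
        raise NotImplementedError


class SketchOutputGenerator:
    def __init__(self, input_sequence, label_processor_result, config):
        self.input_sequence = input_sequence
        self.label_processor_result = label_processor_result
        self.config = config

    def result(self):
        raise NotImplementedError


class SketchWorkflow:
    def __init__(self, sequential_labeler_classes, label_processor_class,
                 output_generator_class):
        self.sequential_labeler_classes = tuple(sequential_labeler_classes)
        self.label_processor_class = label_processor_class
        self.output_generator_class = output_generator_class

    def result(self, input_sequence, config=None):
        labelers = [cls(input_sequence, config)
                    for cls in self.sequential_labeler_classes]

        def index_labels():
            # Assumed format: (index, (label from each labeler, ...)).
            for index in range(len(input_sequence)):
                yield index, tuple(lb.label(index) for lb in labelers)

        processed = self.label_processor_class(
            input_sequence, index_labels(), config).result()
        return self.output_generator_class(
            input_sequence, processed, config).result()


class SpaceLabeler(SketchLabeler):
    def label(self, index):
        return self.input_sequence[index] == " "


class SpaceIndexProcessor(SketchLabelProcessor):
    def result(self):
        # Collect the indices that the first (only) labeler marked True.
        return [i for i, labels in self.index_labels_generator if labels[0]]


class SplitOutputGenerator(SketchOutputGenerator):
    def result(self):
        # Cut the input at every labeled index, dropping empty pieces.
        pieces, start = [], 0
        for index in self.label_processor_result:
            if index > start:
                pieces.append(self.input_sequence[start:index])
            start = index + 1
        if start < len(self.input_sequence):
            pieces.append(self.input_sequence[start:])
        return pieces


workflow = SketchWorkflow([SpaceLabeler], SpaceIndexProcessor, SplitOutputGenerator)
tokens = workflow.result("ab cd  e")
```

Passing classes rather than instances lets the workflow build fresh labelers, a processor, and a generator for every input sequence.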

cnt.rulebase.workflow.exact_match_labeler module

class cnt.rulebase.workflow.exact_match_labeler.ExactMatchLabeler(input_sequence, config)[source]

Bases: cnt.rulebase.workflow.interval_labeler.IntervalLabeler

Helper to label exact match strings.

AC_AUTOMATION: Any = None

classmethod build_ac_automation_from_strings(keys)[source]

Return type

Any

classmethod build_and_bind_ac_automation_from_strings(keys)[source]

Return type

None

intervals_generator()[source]

Return type

Generator[Tuple[int, int], None, None]
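To illustrate what an exact-match interval generator can produce, the sketch below yields (start, end) pairs for every occurrence of every key. It deliberately swaps the Aho-Corasick automaton suggested by the AC_AUTOMATION attribute for a naive str.find loop, so it does not reflect the library's implementation. The function name and the half-open interval convention are assumptions.

```python
def sketch_exact_match_intervals(input_sequence, keys):
    """Yield half-open (start, end) intervals for each occurrence of each key.

    Naive O(len(keys) * len(input_sequence)) search, standing in for an
    Aho-Corasick automaton, which would find all keys in a single pass.
    """
    for key in keys:
        if not key:
            continue  # an empty key would match everywhere; skip it
        start = input_sequence.find(key)
        while start != -1:
            yield start, start + len(key)
            # Restart one character later so overlapping matches are found.
            start = input_sequence.find(key, start + 1)


matches = sorted(sketch_exact_match_intervals("abcabc", ["bc", "a"]))
```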

cnt.rulebase.workflow.interval_labeler module

class cnt.rulebase.workflow.interval_labeler.IntervalLabeler(input_sequence, config)[source]

Bases: cnt.rulebase.workflow.basic_workflow.BasicSequentialLabeler

Helper to label intervals.

Parameters

input_sequence (str) – The input sequence.

ITV_RE_PATTERN: Optional[re] = None

classmethod initialize_by_intervals(intervals)[source]

Convert intervals to a regular expression pattern.

Parameters

intervals (List[Tuple[int, int]]) – Unicode codepoint intervals.

Return type

None

classmethod initialize_by_regular_expression(pattern)[source]

Return type

None

intervals_generator()[source]

Return type

Generator[Tuple[int, int], None, None]

label(index)[source]

Return the boolean label for self.input_sequence[index]. Derived classes must override this method.

Parameters

index (int) – The index of self.input_sequence.

Return type

bool
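One way to turn generated intervals into per-index labels is sketched below. SketchIntervalLabeler is a hypothetical stand-in, not the library class. The PATTERN attribute, the half-open (start, end) convention, and the precomputed label list are all assumptions made for illustration.

```python
import re


class SketchIntervalLabeler:
    """Hypothetical stand-in: labels every index covered by a matched interval."""

    # Illustrative pattern; the documented class binds ITV_RE_PATTERN instead.
    PATTERN = re.compile(r"[0-9]+")

    def __init__(self, input_sequence, config=None):
        self.input_sequence = input_sequence
        self.config = config
        # Precompute one boolean per character so label() is a list lookup.
        self._labels = [False] * len(input_sequence)
        for start, end in self.intervals_generator():
            for i in range(start, end):
                self._labels[i] = True

    def intervals_generator(self):
        # Yield half-open (start, end) spans of every pattern match.
        for match in self.PATTERN.finditer(self.input_sequence):
            yield match.span()

    def label(self, index):
        return self._labels[index]


interval_labeler = SketchIntervalLabeler("a12b3")
interval_labels = [interval_labeler.label(i) for i in range(5)]
```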

cnt.rulebase.workflow.interval_labeler.build_re_pattern_from_intervals(intervals)[source]

Convert intervals to a regular expression pattern.

Parameters

intervals (List[Tuple[int, int]]) – Unicode codepoint intervals.

Return type

re

cnt.rulebase.workflow.type_annotations module

Shared type annotations.

Module contents

Classes for defining a rule-based processing workflow.