Python: module text_complexity_analyzer_cm.coh_metrix_indices.lexical_diversity

text_complexity_analyzer_cm.coh_metrix_indices.lexical_diversity_indices

/home/hans/Proyectos/Python/TextComplexityAnalyzerCM/text_complexity_analyzer_cm/coh_metrix_indices/lexical_diversity_indices.py

Modules

multiprocessing
spacy
string

Classes

builtins.object

LexicalDiversityIndices

class LexicalDiversityIndices(builtins.object)

LexicalDiversityIndices(nlp, language: str = 'es') -> None
This class handles all operations needed to obtain the lexical diversity indices of a text according to Coh-Metrix.

Methods defined here:

__init__(self, nlp, language: str = 'es') -> None
The constructor initializes this object, which calculates the lexical diversity indices for one of the available languages.
Parameters:
    nlp: The spaCy model that corresponds to the language.
    language (str): The language of the texts to process.
Returns:
    None.

get_type_token_ratio_between_all_words(self, text: str, workers=-1) -> float
Returns the type-token ratio computed over all words of a text.
Parameters:
    text (str): The text to be analyzed.
    workers (int): Number of threads that will complete this operation. If it is -1, all CPU cores are used.
Returns:
    float: The type-token ratio over all words of the text.
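The type-token ratio is the number of distinct word forms (types) divided by the total number of words (tokens). The following is a minimal, standalone sketch of that arithmetic, not the library's implementation: the function name type_token_ratio, the regex tokenizer, and the case-insensitive comparison are all assumptions made for illustration, whereas the method documented above works on a spaCy pipeline and accepts a workers argument that is not modeled here.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct word forms divided by total words (0.0 for empty text)."""
    # Assumption: a word is a run of Unicode letters, compared case-insensitively.
    tokens = [token.lower() for token in re.findall(r"[^\W\d_]+", text)]
    if not tokens:
        # Avoid dividing by zero when the text contains no words.
        return 0.0
    return len(set(tokens)) / len(tokens)


# Seven words, five distinct forms ("el" and "come" repeat).
sample = "El gato come y el perro come"
ratio = type_token_ratio(sample)
print(f"{ratio:.3f}")
```

A value near 1.0 means almost every word is new, while lower values indicate more repetition. Longer texts tend to score lower simply because common words recur.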

get_type_token_ratio_of_content_words(self, text: str, workers=-1) -> float
Returns the type-token ratio of the content words of a text. Content words are nouns, verbs, adjectives and adverbs.
Parameters:
    text (str): The text to be analyzed.
    workers (int): Number of threads that will complete this operation. If it is -1, all CPU cores are used.
Returns:
    float: The type-token ratio over the content words of the text.
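Restricting the ratio to content words only changes which tokens are counted. The sketch below assumes the text has already been tagged with Universal POS labels (the coarse tag set spaCy exposes) and takes (word, tag) pairs as input. The name content_word_ttr and the CONTENT_POS set are hypothetical, and whether the library also counts proper nouns is not stated in this page.

```python
from typing import Iterable, Tuple

# Assumption: nouns, verbs, adjectives and adverbs as Universal POS tags.
CONTENT_POS = {"NOUN", "VERB", "ADJ", "ADV"}


def content_word_ttr(tagged: Iterable[Tuple[str, str]]) -> float:
    """Return the type-token ratio over content words only."""
    # Keep only words whose tag marks them as content words.
    words = [word.lower() for word, pos in tagged if pos in CONTENT_POS]
    if not words:
        # No content words: report 0.0 instead of dividing by zero.
        return 0.0
    return len(set(words)) / len(words)


# Hand-tagged sample; in practice the tags would come from a tagger.
tagged_sample = [
    ("El", "DET"), ("gato", "NOUN"), ("negro", "ADJ"), ("come", "VERB"),
    ("y", "CCONJ"), ("el", "DET"), ("perro", "NOUN"), ("come", "VERB"),
    ("rápido", "ADV"),
]
content_ratio = content_word_ttr(tagged_sample)
print(f"{content_ratio:.3f}")
```

Function words such as determiners and conjunctions repeat heavily in any text, so dropping them usually yields a higher ratio than the all-words variant for the same input.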

Data descriptors defined here:

__dict__

dictionary for instance variables (if defined)

__weakref__

list of weak references to the object (if defined)

Data

ACCEPTED_LANGUAGES = {'es': 'es_core_news_lg'}
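
ACCEPTED_LANGUAGES appears to map a language code to the name of the spaCy model used for it, with Spanish as the only entry. As a hedged illustration of how such a mapping might be consulted before building the class, here is a small sketch. The helper model_name_for is hypothetical and not part of the module, and the page does not say how the constructor reacts to an unsupported language.

```python
# Copied from the Data section above.
ACCEPTED_LANGUAGES = {"es": "es_core_news_lg"}


def model_name_for(language: str = "es") -> str:
    """Return the model name registered for a language code."""
    try:
        return ACCEPTED_LANGUAGES[language]
    except KeyError:
        # Assumption: an unknown code is treated as an error in this sketch.
        supported = ", ".join(sorted(ACCEPTED_LANGUAGES))
        raise ValueError(
            f"Unsupported language {language!r}; supported: {supported}"
        ) from None


print(model_name_for("es"))
```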