- builtins.object
  - TextComplexityAnalyzer
class TextComplexityAnalyzer(builtins.object)
TextComplexityAnalyzer(language: str = 'es') -> None
This class groups all of the indices in order to calculate them in one go. It works for a specific language.
To use this class, instantiate an object with it. For example:
tca = TextComplexityAnalyzer('es')
Note that the language is passed as a short code. The only language currently available is 'es'.
To calculate the implemented Coh-Metrix indices for a text, do the following:
m1, m2, m3, m4, m5, m6, m7, m8 = tca.calculate_all_indices_for_one_text(text='Example text', workers=-1)
Here, all available cores will be used to analyze the text passed as a parameter.
To predict the category of a text, do the following:
prediction = tca.predict_text_category(text='Example text', workers=-1)
The example uses the default classifier shipped with the library.
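Working with eight separate dictionaries can be awkward, so it may help to merge them into one. The following is a minimal sketch, not part of the library: merge_index_dicts and FakeAnalyzer are hypothetical names, and FakeAnalyzer only imitates the documented return shape so that the example runs without the library installed. The source does not say which dictionary holds which family of indices, so the sketch merges them without naming them. It also assumes index names are unique across dictionaries and raises an error otherwise.

```python
from typing import Any, Dict, Iterable


def merge_index_dicts(index_dicts: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge several index dictionaries into one, refusing duplicate names."""
    merged: Dict[str, Any] = {}
    for group in index_dicts:
        for name, value in group.items():
            if name in merged:
                # Assumption: an index name should appear in only one dictionary.
                raise KeyError(f"Index name {name!r} appears in more than one dictionary")
            merged[name] = value
    return merged


class FakeAnalyzer:
    """Hypothetical stand-in for TextComplexityAnalyzer, used only so the sketch runs."""

    def calculate_all_indices_for_one_text(self, text: str, workers: int = -1):
        # Eight dictionaries, mirroring the documented return shape.
        # The keys and values are placeholders, not real index names.
        return tuple({f"index_{i}": float(i)} for i in range(8))


tca = FakeAnalyzer()
all_indices = merge_index_dicts(
    tca.calculate_all_indices_for_one_text(text="Example text", workers=-1)
)
```

With the real class, the same helper would be called on the tuple returned by tca.calculate_all_indices_for_one_text.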
Methods defined here:
- __init__(self, language: str = 'es') -> None
- This constructor initializes the analyzer for a specific language.
Parameters:
language(str): The language that the texts are in.
Returns:
None.
- calculate_all_indices_for_one_text(self, text: str, workers: int = -1) -> (typing.Dict, typing.Dict, typing.Dict, typing.Dict, typing.Dict, typing.Dict, typing.Dict, typing.Dict)
- This method calculates all of the implemented indices and stores them in dictionaries.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
Returns:
(Dict, Dict, Dict, Dict, Dict, Dict, Dict, Dict): The dictionaries with all the indices.
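The workers convention described above (-1 means every CPU core) can be sketched as follows. This is only an illustration of the documented behavior: resolve_workers is a hypothetical helper, and the source does not show how the library resolves the value internally.

```python
import os


def resolve_workers(workers: int = -1) -> int:
    """Turn the documented `workers` value into a concrete worker count."""
    if workers == -1:
        # os.cpu_count() can return None, so fall back to a single worker.
        return os.cpu_count() or 1
    if workers < 1:
        # Assumption: values other than -1 must be positive.
        raise ValueError("workers must be -1 or a positive integer")
    return workers
```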
- calculate_connective_indices_for_one_text(self, text: str, workers: int = -1, word_count: int = None) -> Dict
- This method calculates the connectives indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
word_count(int): The number of words in the text, used to calculate the incidence.
Returns:
Dict: The dictionary with the connectives indices.
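The word_count parameter exists so that raw counts can be turned into incidence scores. A minimal sketch of that calculation follows, assuming the Coh-Metrix convention of occurrences per 1000 words. The function name and the per_words default are hypothetical, and the source does not state the scale the library actually uses.

```python
def incidence(count: int, word_count: int, per_words: int = 1000) -> float:
    """Return how often something occurs per `per_words` words of text."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    # Multiply before dividing to keep the result as exact as possible.
    return count * per_words / word_count


# Example: 3 connectives found in a 150-word text.
connective_incidence = incidence(count=3, word_count=150)
```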
- calculate_descriptive_indices_for_one_text(self, text: str, workers: int = -1) -> Dict
- This method calculates the descriptive indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
Returns:
Dict: The dictionary with the descriptive indices.
- calculate_lexical_diversity_indices_for_one_text(self, text: str, workers: int = -1) -> Dict
- This method calculates the lexical diversity indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
Returns:
Dict: The dictionary with the lexical diversity indices.
- calculate_readability_indices_for_one_text(self, text: str, workers: int = -1, mean_syllables_per_word: int = None, mean_words_per_sentence: int = None) -> Dict
- This method calculates the readability indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
mean_syllables_per_word(int): The mean number of syllables per word in the text.
mean_words_per_sentence(int): The mean number of words per sentence in the text.
Returns:
Dict: The dictionary with the readability indices.
- calculate_referential_cohesion_indices_for_one_text(self, text: str, workers: int = -1) -> Dict
- This method calculates the referential cohesion indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
Returns:
Dict: The dictionary with the referential cohesion indices.
- calculate_syntactic_complexity_indices_for_one_text(self, text: str, workers: int = -1) -> Dict
- This method calculates the syntactic complexity indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
Returns:
Dict: The dictionary with the syntactic complexity indices.
- calculate_syntactic_pattern_density_indices_for_one_text(self, text: str, workers: int = -1, word_count: int = None) -> Dict
- This method calculates the syntactic pattern density indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
word_count(int): The number of words in the text, used to calculate the incidence.
Returns:
Dict: The dictionary with the syntactic pattern density indices.
- calculate_word_information_indices_for_one_text(self, text: str, workers: int = -1, word_count: int = None) -> Dict
- This method calculates the word information indices and stores them in a dictionary.
Parameters:
text(str): The text to be analyzed.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
word_count(int): The number of words in the text, used to calculate the incidence.
Returns:
Dict: The dictionary with the word information indices.
- predict_text_category(self, text: str, workers: int = -1, classifier=None, scaler=None, indices: List = None) -> int
- This method receives a text and predicts its category using a trained classification model.
Parameters:
text(str): The text whose category will be predicted.
workers(int): Number of threads used for this operation. If -1, all CPU cores are used.
classifier: Optional. A supervised learning model that implements the 'predict' method. If None, the default classifier is used.
scaler: Optional. An object that implements the 'transform' method and scales the indices of the text to analyze. It must be the same scaler that was used to train the classifier, if one was used. Pass None if no scaler was used when training the custom classifier.
indices(List): Optional. Ignored if the default classifier is used. The names of the indices the classifier was trained with. They must be the same indices, in the same order, as those used in training.
Returns:
int: The category of the text, represented as a number.
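The contract for a custom classifier, scaler and indices list can be illustrated with plain Python objects. This is a sketch under stated assumptions, not the library's implementation: RangeScaler, ThresholdClassifier, build_feature_row and predict_category are hypothetical names, the index names are placeholders, and how predict_text_category combines these objects internally is not shown in the source. The sketch only demonstrates that the classifier needs a 'predict' method, the scaler needs a 'transform' method, and the order of indices decides the order of the feature values.

```python
from typing import Dict, List, Optional, Sequence


class RangeScaler:
    """Hypothetical scaler: any object with a `transform` method fits the contract."""

    def __init__(self, minimums: Sequence[float], maximums: Sequence[float]) -> None:
        self.minimums = list(minimums)
        self.maximums = list(maximums)

    def transform(self, rows: Sequence[Sequence[float]]) -> List[List[float]]:
        # Rescale each value to the 0..1 range seen during (imaginary) training.
        return [
            [(v - lo) / (hi - lo) if hi != lo else 0.0
             for v, lo, hi in zip(row, self.minimums, self.maximums)]
            for row in rows
        ]


class ThresholdClassifier:
    """Hypothetical classifier: any object with a `predict` method fits the contract."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, rows: Sequence[Sequence[float]]) -> List[int]:
        # Category 1 if the mean feature value exceeds the threshold, else 0.
        return [1 if sum(row) / len(row) > self.threshold else 0 for row in rows]


def build_feature_row(all_indices: Dict[str, float], indices: Sequence[str]) -> List[float]:
    """Pick the named indices in exactly the order used at training time."""
    return [all_indices[name] for name in indices]  # KeyError if a name is missing


def predict_category(all_indices: Dict[str, float], classifier,
                     scaler: Optional[object], indices: Sequence[str]) -> int:
    rows = [build_feature_row(all_indices, indices)]
    if scaler is not None:
        rows = scaler.transform(rows)
    return int(classifier.predict(rows)[0])


# Placeholder index values; real names come from the calculated dictionaries.
example_indices = {"index_b": 10.0, "index_a": 2.0, "unused": 99.0}
training_order = ["index_a", "index_b"]

scaled = predict_category(example_indices, ThresholdClassifier(0.6),
                          RangeScaler([0.0, 0.0], [4.0, 20.0]), training_order)
unscaled = predict_category(example_indices, ThresholdClassifier(0.6),
                            None, training_order)
```

The two calls give different categories for the same text, which is why the scaler passed at prediction time has to match the one used in training.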
Data descriptors defined here:
- __dict__
- dictionary for instance variables (if defined)
- __weakref__
- list of weak references to the object (if defined)