module documentation

Undocumented

Function clean_paragraph Cleans the given text by removing special characters, punctuation, and stopwords.
Function clean_text Clean the given text by removing square brackets and their contents, removing non-alphanumeric characters except for important symbols and spaces, and removing extra spaces.
Function label_from_normalized_value Assigns a label based on the given normalized value.
Function normalize_value Normalize a value between 0 and 1 based on the given minimum and maximum values.
Function set_vars Set environment variables for Reddit authentication.
def clean_paragraph(text: str) -> List[str]:

Cleans the given text by removing special characters, punctuation, and stopwords.

Args:
    text (str): The text to be cleaned.

Returns:
    List[str]: The cleaned text as a list.
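The page does not show the implementation, so the following is only a minimal sketch of the behaviour described above. The name `clean_paragraph_sketch`, the lowercasing step, and the small `STOPWORDS` set are assumptions made for illustration; the real function may rely on a library-provided stopword list instead.

```python
import re
from typing import List

# Assumption: a tiny illustrative stopword set, not the list the module uses.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def clean_paragraph_sketch(text: str) -> List[str]:
    """Hypothetical stand-in for clean_paragraph."""
    # Lowercase (an assumption), then replace every character that is not a
    # letter, digit, or whitespace with a space.
    stripped = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    # Split on whitespace and drop stopwords.
    return [token for token in stripped.split() if token not in STOPWORDS]


print(clean_paragraph_sketch("The quick, brown fox is in the box!"))
# → ['quick', 'brown', 'fox', 'box']
```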

def clean_text(text: str) -> str:

Clean the given text by removing square brackets and their contents, removing non-alphanumeric characters except for important symbols and spaces, and removing extra spaces.

Args:
    text (str): The text to be cleaned.

Returns:
    str: The cleaned text.
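The three steps described above can be sketched with regular expressions. This is an illustration rather than the module's code: the name `clean_text_sketch` is hypothetical, and the set of "important symbols" kept here (`. , ! ? ' % $ -`) is an assumption, since the page does not list them.

```python
import re


def clean_text_sketch(text: str) -> str:
    """Hypothetical stand-in for clean_text."""
    # 1. Remove square brackets together with whatever they enclose.
    text = re.sub(r"\[[^\]]*\]", "", text)
    # 2. Drop characters that are neither alphanumeric, whitespace, nor one of
    #    the assumed "important" symbols.
    text = re.sub(r"[^A-Za-z0-9\s.,!?'%$-]", "", text)
    # 3. Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


print(clean_text_sketch("Hello [edit]   world #1 @ 50%!"))
# → Hello world 1 50%!
```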

def label_from_normalized_value(normalized_value: float) -> int:

Assigns a label based on the given normalized value.

    0 - 0.3: Low similarity
    0.3 - 0.6: Medium similarity
    0.6 - 0.8: High similarity
    0.8 - 1: Very high similarity

Args:
    normalized_value (float): The normalized value to assign a label to.

Returns:
    int: The label assigned to the normalized value.
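One way to map the four ranges to an integer is a threshold lookup. The sketch below makes two assumptions the page does not confirm: the labels are the integers 0 to 3 in order of increasing similarity, and a value that falls exactly on a boundary (0.3, 0.6, 0.8) belongs to the higher range. The name `label_sketch` is hypothetical.

```python
from bisect import bisect_right

# Upper bounds of the Low, Medium, and High ranges from the description.
THRESHOLDS = [0.3, 0.6, 0.8]


def label_sketch(normalized_value: float) -> int:
    """Hypothetical stand-in for label_from_normalized_value."""
    # bisect_right counts how many thresholds are <= the value, which is the
    # index of the range it falls into:
    # 0 = low, 1 = medium, 2 = high, 3 = very high (assumed numbering).
    return bisect_right(THRESHOLDS, normalized_value)


for value in (0.1, 0.3, 0.7, 1.0):
    print(value, label_sketch(value))
```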

def normalize_value(value: float, min_value: float, max_value: float) -> float:

Normalize a value between 0 and 1 based on the given minimum and maximum values.

Args:
    value (float): The value to be normalized.
    min_value (float): The minimum value of the range.
    max_value (float): The maximum value of the range.

Returns:
    float: The normalized value between 0 and 1.
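The description matches ordinary min-max scaling, (value - min_value) / (max_value - min_value). The sketch below assumes that formula; whether the real function clamps out-of-range inputs or guards against an empty range is not stated, so the `ValueError` here is an illustrative choice, as is the name `normalize_value_sketch`.

```python
def normalize_value_sketch(value: float, min_value: float, max_value: float) -> float:
    """Hypothetical stand-in for normalize_value using min-max scaling."""
    span = max_value - min_value
    if span == 0:
        # Assumption: an empty range is treated as an error rather than
        # silently returning 0.
        raise ValueError("max_value must differ from min_value")
    # No clamping: inputs outside [min_value, max_value] map outside [0, 1].
    return (value - min_value) / span


print(normalize_value_sketch(2.5, 0.0, 10.0))
# → 0.25
```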

def set_vars() -> RedditAuth:

Set environment variables for Reddit authentication.

This function gets the environment variables required for Reddit authentication. Check `config.py` to set the user credentials for the Reddit API.

Returns:
    RedditAuth: An instance of the RedditAuth class with the client secret, user agent, and client ID set.
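A rough sketch of reading the three credentials from the environment follows. Everything named here is an assumption: the page does not show the `RedditAuth` definition or the variable names, so `RedditAuthSketch`, `set_vars_sketch`, and the `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, and `REDDIT_USER_AGENT` keys are hypothetical placeholders.

```python
import os
from dataclasses import dataclass


@dataclass
class RedditAuthSketch:
    """Hypothetical stand-in for RedditAuth; field names follow the docstring."""
    client_id: str
    client_secret: str
    user_agent: str


def set_vars_sketch() -> RedditAuthSketch:
    """Build the auth object from environment variables (names are assumed)."""
    try:
        return RedditAuthSketch(
            client_id=os.environ["REDDIT_CLIENT_ID"],
            client_secret=os.environ["REDDIT_CLIENT_SECRET"],
            user_agent=os.environ["REDDIT_USER_AGENT"],
        )
    except KeyError as exc:
        # Fail with a clear message when a credential is not configured.
        raise RuntimeError(f"Missing environment variable: {exc.args[0]}") from exc
```

Failing early on a missing variable keeps a misconfigured environment from surfacing later as an opaque authentication error.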