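Description: The Automated Readability Index (ARI) is a readability formula that gauges how understandable English text is. It is computed from characters per word and words per sentence, and the result approximates the US grade level needed to follow the text. A higher ARI indicates text that is harder to read.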
Example: For a sentence like “The cat sat on the mat.”, the ARI might be relatively low, indicating easy readability.
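To make the calculation concrete, here is a minimal sketch of the standard ARI formula, 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43. The function name and the regex-based tokenization are assumptions made for illustration; real readability tools may split words and sentences differently and so produce slightly different scores.

```python
import re

def automated_readability_index(text: str) -> float:
    """Approximate ARI using simple regex tokenization (an assumption of this sketch)."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # Count only letters and digits as characters.
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43

ari = automated_readability_index("The cat sat on the mat.")
print(round(ari, 2))
```

With 17 characters, 6 words, and 1 sentence, this short sentence scores below zero, which is consistent with very easy text.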
Description: Estimates the grade level a person must have reached to understand the text. The formula considers the sentence length and the number of syllables per word.
Example: A passage from a children’s book might have a Flesch-Kincaid Grade Level of around 3, indicating it’s suitable for third graders.
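The grade level can be sketched with the standard Flesch-Kincaid formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The function names below are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed for this sketch; it miscounts some English words, so treat the output as an approximation.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): count vowel groups, drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text: str) -> float:
    """Approximate Flesch-Kincaid Grade Level with regex tokenization."""
    words = re.findall(r"[A-Za-z]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

grade = flesch_kincaid_grade("The cat sat on the mat.")
print(round(grade, 2))
```

Short sentences made of one-syllable words land at or below the lowest grades, while longer sentences and polysyllabic words push the score up.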
Description: Quantifies how well a probability distribution predicts a sample. In NLP, it represents how well a language model predicts a piece of text. A lower perplexity indicates better predictive performance.
Example: If a model assigns a high probability to “blue” as the next word in the sentence “The sky is ___”, and “blue” is the word that actually follows, the model has low perplexity on that text.
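Perplexity can be computed as the exponential of the average negative log-probability that a model assigns to each actual token. The sketch below assumes those per-token probabilities are already available; the function name and the probability values are made up for illustration and do not come from any real model.

```python
import math

def perplexity(token_probs):
    """Exponential of the mean negative log-probability of the observed tokens."""
    if not token_probs or any(p <= 0 or p > 1 for p in token_probs):
        raise ValueError("probabilities must be non-empty and in the range (0, 1]")
    avg_neg_log_likelihood = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_likelihood)

# Hypothetical probabilities assigned to the true next tokens.
confident_model = perplexity([0.9, 0.8, 0.95, 0.9])
uncertain_model = perplexity([0.2, 0.1, 0.25, 0.1])
print(confident_model < uncertain_model)
```

A useful sanity check: a model that spreads probability uniformly over four options at every step has a perplexity of 4.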
Description: Represents the degree of harmful or inappropriate content in a text. A higher value indicates a higher likelihood of the text being toxic.
Example: For the phrase “I hate you”, a model might assign a higher toxicity level compared to “I love ice cream”.
Description: BLEU evaluates how closely machine-generated text matches one or more human-written reference texts by measuring their n-gram overlap. It is commonly used in machine translation, and a higher score means a closer match.
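Example: If a system translates “Bonjour” as “Hello” and the human reference translation is also “Hello”, the output matches the reference exactly, which is the case BLEU rewards most. In practice, BLEU is computed over whole sentences or corpora rather than single words.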
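The core of BLEU, clipped n-gram precision combined with a brevity penalty, can be sketched as follows. This is a simplified single-reference, sentence-level version with no smoothing, written for illustration; the function names are hypothetical, and established implementations handle multiple references, corpus-level aggregation, and smoothing.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU against one reference, without smoothing."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngram_counts(cand, n), ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any empty precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    brevity_penalty = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

print(simple_bleu("Hello", "Hello", max_n=1))
print(simple_bleu("the cat sat on the mat", "the cat sat on the mat"))
```

Note that a one-word candidate has no bigrams, so it only receives a non-zero score here when `max_n` is set to 1. That is one reason BLEU is normally reported on longer texts.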
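Description: Measures the cosine of the angle between two non-zero vectors. When two pieces of text are represented as vectors (for example, word counts or embeddings), this value indicates how similar they are: values near 1 mean the vectors point in nearly the same direction.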
Example: The words “cat” and “kitten” might have a high cosine similarity when compared through their embeddings, because both are related to felines.
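As a small illustration, cosine similarity is the dot product of two vectors divided by the product of their lengths. The three-dimensional vectors below are hypothetical toy values chosen for this sketch, not real word embeddings, which typically have hundreds of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors standing in for embeddings.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.9, 0.2]
car = [0.1, 0.2, 0.95]

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))
```

Because the measure depends only on direction, scaling a vector by a positive constant does not change the result.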
Description: Measures the likeness of meaning or semantics between two pieces of text.
Example: “Sofa” and “couch” would have high semantic similarity as they refer to the same piece of furniture.
Description: Compares the similarity and diversity of sample sets. It’s the size of the intersection divided by the size of the union of two sets.
Example: For sets {A, B} and {B, C}, the intersection is {B} and the union is {A, B, C}, so the Jaccard similarity is 1/3, or about 0.33.
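The same calculation can be sketched with Python sets. The function name is chosen for illustration, and returning 1.0 when both sets are empty is a convention assumed here, since the ratio is otherwise undefined in that case.

```python
def jaccard_similarity(a, b) -> float:
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets are treated as identical
    return len(a & b) / len(union)

print(round(jaccard_similarity({"A", "B"}, {"B", "C"}), 2))
```

For text, the sets are often the unique words or character n-grams of each document.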
Description: Represents the fraction of relevant instances among the retrieved instances. It is a measure of exactness.
Example: If a search yields 5 relevant results out of 10, the precision is 0.5.
Description: Represents the fraction of all relevant instances that were actually retrieved. It is a measure of completeness.
Example: If there are 5 relevant results in the entire dataset and a search retrieves 4 of them, the recall is 0.8.
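Precision and recall can be computed together from the set of retrieved items and the set of relevant items. The sketch below uses hypothetical document IDs; returning 0.0 when a denominator is empty is an assumption of this example, not a universal convention.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items and a set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0  # assumed fallback
    recall = hits / len(relevant) if relevant else 0.0       # assumed fallback
    return precision, recall

# Hypothetical document IDs: 5 relevant documents, 8 retrieved, 4 of them relevant.
relevant_docs = {1, 2, 3, 4, 5}
retrieved_docs = {1, 2, 3, 4, 6, 7, 8, 9}

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(precision, recall)
```

Here 4 of the 8 retrieved documents are relevant (precision 0.5), and 4 of the 5 relevant documents were found (recall 0.8).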
Description: The harmonic mean of precision and recall. It provides a balance between the two when their values diverge.
Example: Given a precision of 0.5 and a recall of 0.8, the F1-Score is 2 × 0.5 × 0.8 / (0.5 + 0.8) ≈ 0.62.
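The harmonic mean can be written out directly as 2 × precision × recall / (precision + recall). In this minimal sketch the function name is illustrative, and returning 0.0 when both inputs are zero is an assumed convention to avoid division by zero.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention when both values are zero
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(0.5, 0.8), 3))
```

Because the harmonic mean is pulled toward the smaller value, a system cannot reach a high F1-Score by excelling at only one of the two measures.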