falcon-evaluate.github.io

📚 Falcon Evaluate - Evaluation Metrics Guide


1. 📖 Readability and Complexity

📘 ARI (Automated Readability Index)

📙 Flesch-Kincaid Grade Level
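
Both readability scores are closed-form formulas over simple text counts. The sketch below uses the standard published coefficients. The function names and the regex tokenization are assumptions made for illustration, not Falcon Evaluate's own API. The Flesch-Kincaid helper takes precomputed counts because syllable counting needs a heuristic or a dictionary, which is left out here.

```python
import re


def automated_readability_index(text):
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    # Assumption: words are runs of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Assumption: sentences end at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    # Only letters and digits count as characters.
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def flesch_kincaid_grade(word_count, sentence_count, syllable_count):
    """FK grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    if word_count == 0 or sentence_count == 0:
        return 0.0
    return (0.39 * word_count / sentence_count
            + 11.8 * syllable_count / word_count
            - 15.59)
```

Very short, simple sentences can produce negative scores. Both formulas are calibrated for running prose, so scores for tiny inputs are not meaningful grade levels.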


2. 🧠 Language Modeling Performance

💡 Perplexity
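
Perplexity is the exponential of the average negative log-likelihood a model assigns to each token. The sketch below assumes the per-token natural-log probabilities have already been obtained from some language model. The function name and input format are illustrative and not taken from Falcon Evaluate.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean(log p(token))) for a sequence of natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    avg_negative_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_negative_log_likelihood)
```

A model that assigns probability 0.25 to every token has a perplexity of 4, the same as guessing uniformly among four options. Lower values mean the model found the text less surprising.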


3. ⚠️ Text Toxicity

☣️ Toxicity Level


4. 🧲 Text Similarity and Relevance

🌐 BLEU (Bilingual Evaluation Understudy) Score

📏 Cosine Similarity

🤝 Semantic Similarity

✂️ Jaccard Similarity
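
The two lexical measures in this group can be sketched with the standard library alone. The example assumes lowercase whitespace tokenization and raw word counts as vectors. The function names are hypothetical, and the guide's own implementation may use TF-IDF or embeddings instead. BLEU and embedding-based semantic similarity need n-gram clipping or a trained model, so they are not sketched here.

```python
import math
from collections import Counter


def jaccard_similarity(text_a, text_b):
    """Size of the shared token set divided by the size of the combined token set."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    dot = sum(counts_a[token] * counts_b[token] for token in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Jaccard ignores how often a word repeats, while the count-based cosine weights repeated words more heavily, so the two can rank the same pair of texts differently.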


5. 🕵️ Information Retrieval

🎯 Precision

📌 Recall

⚖️ F1-Score
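
All three retrieval metrics follow from counting which retrieved items are actually relevant. The sketch below assumes results and ground truth are given as collections of item identifiers. The function name and the choice to return 0.0 for empty inputs are illustrative conventions, not Falcon Evaluate's documented behavior.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, f1) for retrieved items against relevant items."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    true_positives = len(retrieved_set & relevant_set)
    # Precision: share of retrieved items that are relevant.
    precision = true_positives / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = true_positives / len(relevant_set) if relevant_set else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For example, retrieving `d1`, `d2`, `d3`, `d4` when `d1`, `d2`, `d5` are relevant gives two true positives, so precision is 2/4, recall is 2/3, and F1 is 4/7.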