F1 score

PulseAugur coverage of F1 score: every cluster mentioning F1 score across labs, papers, and developer communities, ranked by signal.

Total · 30d: 5 (5 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 4 (4 over 90d)

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 5 TOTAL

TOOL · CL_62868 · Jun 1 · 04:00

LLM judges outperform traditional metrics in extractive QA evaluations

Researchers have evaluated the effectiveness of using large language models (LLMs) as judges for extractive question-answering tasks. Their study found that LLM-as-a-judge methods correlate much more strongly with human…

TOOL · CL_34127 · May 16 · 06:16

Ranking Metrics Explained for Recommender Systems

This article provides an introduction to ranking metrics used in recommender systems. It explains various metrics such as precision, recall, F1-score, and Mean Average Precision (MAP). The piece aims to help developers …

TOOL · CL_20775 · May 7 · 04:00

Consensus Entropy improves VLM OCR accuracy by measuring inter-model agreement

Researchers have developed a new metric called Consensus Entropy (CE) to assess the reliability of Optical Character Recognition (OCR) outputs from Vision-Language Models (VLMs). CE measures the agreement between multip…

RESEARCH · CL_15558 · May 4 · 02:35

AI fusion of SAR data enhances flood mapping accuracy

Researchers have developed a deep learning framework that fuses cross-polarization Synthetic Aperture Radar (SAR) data for more accurate flood mapping. By combining VV and VH polarization observations, the model can bet…

RESEARCH · CL_06642 · Apr 28 · 04:00

Transformer models improve AI reading comprehension with bias correction and interpretability

This paper introduces a transformer-based AI model designed to improve English reading comprehension assistance for students and teachers. The model integrates attention mechanisms and gradient-based attribution to enha…