TriviaQA
PulseAugur coverage of TriviaQA — every cluster mentioning TriviaQA across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
PersonalAI 2.0 framework boosts LLM knowledge graph retrieval
Researchers have developed PersonalAI 2.0 (PAI-2), a new framework that enhances LLM systems by integrating external knowledge graphs. PAI-2 employs a dynamic, multi-stage query processing pipeline for adaptive, iterati…
-
New method quantifies LLM uncertainty using semantic entropy and conformal calibration
Researchers have developed a new method called Adaptive Conformal Semantic Entropy (ACSE) to better estimate the uncertainty of Large Language Models (LLMs). This approach focuses on the semantic dispersion of different…
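The summary is truncated, but the core idea of semantic entropy is well established: sample several answers, group them into meaning clusters, and compute entropy over the cluster distribution. A minimal sketch, using normalized string match as a stand-in for the entailment-based clustering real implementations use (the ACSE paper's conformal calibration step is not shown):

```python
import math
from collections import Counter

def semantic_entropy(samples):
    """Entropy over meaning clusters of sampled answers.

    Sketch only: real semantic entropy clusters answers via
    bidirectional entailment with an NLI model; normalized string
    match here is an illustrative assumption.
    """
    clusters = Counter(s.strip().lower() for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in clusters.values())

print(semantic_entropy(["Paris", "paris", "Paris"]))      # low: samples agree
print(semantic_entropy(["Paris", "Lyon", "Marseille"]))   # high: samples disperse
```

High semantic dispersion across samples signals that the model is uncertain about the answer, even when each individual response sounds confident.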
-
New methods like SMF and SAM reduce catastrophic forgetting in LLMs
Two new research papers explore methods to mitigate catastrophic forgetting in language models during fine-tuning. One paper introduces Sparse Memory Finetuning (SMF), which adds memory layers and updates only heavily a…
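The common thread in both methods is restricting which parameters move during fine-tuning so earlier knowledge survives. A minimal sketch of that idea as a masked gradient step (the mask here is hand-picked for illustration; SMF's actual memory-slot selection rule is not reproduced):

```python
import numpy as np

def sparse_update(params, grads, mask, lr=0.1):
    """Apply a gradient step only where mask == 1; masked-out
    parameters stay frozen, limiting interference with old knowledge.
    The selection mask is an assumed illustration, not SMF's rule."""
    return params - lr * grads * mask

params = np.array([1.0, 2.0, 3.0, 4.0])
grads  = np.array([0.5, 0.5, 0.5, 0.5])
mask   = np.array([0.0, 1.0, 0.0, 1.0])  # only designated "memory" slots trainable
print(sparse_update(params, grads, mask))  # entries 0 and 2 remain unchanged
```

Because frozen coordinates never receive updates, the fine-tuned model cannot drift on the capabilities those weights encode.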
-
Researchers release Faithfulness-QA dataset to train context-faithful RAG models
Researchers have developed Faithfulness-QA, a new dataset containing nearly 100,000 samples designed to train Retrieval-Augmented Generation (RAG) models to prioritize retrieved context over their internal knowledge. Th…
-
S2G-RAG improves multi-hop QA by judging evidence sufficiency and gaps
Researchers have introduced S2G-RAG, a novel iterative framework designed to improve retrieval-augmented generation (RAG) for multi-hop question answering. The system features a controller, S2G-Judge, which determines i…
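The summary describes an iterative retrieve-then-judge controller; a minimal sketch of that control flow, with every function contract assumed for illustration (this is not the S2G-RAG API):

```python
def s2g_loop(question, retrieve, judge, answer, max_rounds=3):
    """Iterative RAG sketch: retrieve evidence, ask a judge whether it
    suffices, and if not, retrieve again for the reported gap.
    All callables are hypothetical stand-ins for the paper's modules."""
    evidence = retrieve(question)
    for _ in range(max_rounds):
        verdict = judge(question, evidence)  # ("sufficient", None) or ("gap", subquery)
        if verdict[0] == "sufficient":
            break
        evidence += retrieve(verdict[1])
    return answer(question, evidence)

# Toy stubs: the judge deems evidence sufficient once two passages are gathered.
retrieve = lambda q: [f"passage about {q}"]
judge = lambda q, ev: ("sufficient", None) if len(ev) >= 2 else ("gap", f"missing hop for {q}")
answer = lambda q, ev: f"answer from {len(ev)} passages"
print(s2g_loop("Who directed the film adapted from the novel?", retrieve, judge, answer))
```

The round cap keeps the loop bounded even when the judge never reports sufficiency, which matters for multi-hop queries where retrieval can keep surfacing tangential passages.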
-
Gemma 3 4B LLM confidence training shows mixed results, improves accuracy post-hoc
A study on the Gemma 3 4B model investigated methods to improve its verbal confidence in responses. Initial attempts using a filtered dataset for confidence-conditioned supervised fine-tuning (CSFT) yielded negative res…
-
LLMs use internal confidence signals to detect and correct errors
Researchers have investigated how large language models can identify and correct their own mistakes without external input, drawing parallels to second-order confidence models in decision neuroscience. Their findings su…
-
Study finds 3-9B LLMs fail verbal confidence tests, impacting uncertainty estimates
A new study examined the verbal confidence of seven instruction-tuned, open-weight large language models (LLMs) with 3-9 billion parameters. Researchers found that these models failed to meet minimal validity criteria f…