GPT-5.4 Nano

PulseAugur coverage of GPT-5.4 Nano — every cluster mentioning GPT-5.4 Nano across labs, papers, and developer communities, ranked by signal.

Total · 30d

5 over 90d

Releases · 30d

0 over 90d

Papers · 30d

5 over 90d

SENTIMENT · 30D

3 days with sentiment data

RECENT · PAGE 1/1 · 5 TOTAL

TOOL · CL_66425 · Jun 2 · 08:38

LLM agents struggle to patch security bugs, leaving vulnerabilities open

A new benchmark, CVE-Bench, was developed to evaluate LLM agents' ability to patch security vulnerabilities in Python projects. Across 18 projects and 20 real-world CVEs, the best-performing models achieved only a 50% s…

RESEARCH · CL_60622 · May 30 · 04:32

Qwen2.5 fine-tuned for SRE post-mortems outperforms larger models

A developer has fine-tuned the Qwen2.5-0.5B model to generate summaries for SRE post-mortems. This approach uses a 700-sample training set and 4-bit LoRA quantization, allowing it to run on consumer hardware. The fine-t…

TOOL · CL_28337 · May 11 · 16:32

New benchmark tests LLMs on math text continuations

Researchers have developed a new self-supervised benchmark for evaluating language models on mathematical text continuations. This benchmark uses likelihood scoring to assess how well a model's auxiliary forecast string…

RESEARCH · CL_18272 · May 4 · 20:13

PIIGuard shields webpages from LLM PII harvesting via adversarial fragments

Researchers have developed PIIGuard, a novel webpage-level defense system designed to prevent large language models (LLMs) from harvesting personally identifiable information (PII). This system embeds hidden HTML fragme…

RESEARCH · CL_00033 · Oct 17 · 02:00

[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Researchers are developing new benchmarks and evaluation methods for large language models (LLMs) in mathematical reasoning and educational assessment. New datasets like ESTBook and Math-PT aim to go beyond simple accur…
