PulseAugur
GPT-2

PulseAugur coverage of GPT-2: every cluster mentioning GPT-2 across labs, papers, and developer communities, ranked by signal.

Total · 30d: 34 (34 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 27 (27 over 90d)
SENTIMENT · 30D

4 days with sentiment data

RECENT · PAGE 2/2 · 32 TOTAL
  1. COMMENTARY · CL_07342

    Latent reasoning models may offer safer, more interpretable AI

    A LessWrong post explores the potential benefits of latent reasoning models (LRMs) for AI safety and interpretability. These models, which perform Chain-of-Thought (CoT) reasoning within their internal activations rathe…

  2. RESEARCH · CL_06772

    Transformer research probes security flaws, training dynamics, and in-context learning limits

    Researchers have identified vulnerabilities in the shuffling defense mechanism used to secure Transformer models during inference, demonstrating an attack that can extract model weights by aligning permuted activations.…

  3. RESEARCH · CL_06664

    Removing LayerNorm in LLMs acts as an implicit regularizer, with a performance impact that depends on training data size

    Researchers have investigated the impact of removing Layer Normalization (LayerNorm) from neural network architectures, particularly in models like GPT-2 and Llama. Their findings indicate that replacing LayerNorm with …

  4. RESEARCH · CL_13934

    Talkie-1930: New 13B AI model trained on pre-1931 text explores historical knowledge

    A new project called Talkie has released a 13-billion parameter language model trained exclusively on English text from before 1931. This "vintage" model aims to explore AI's ability to predict the future and generate n…

  5. RESEARCH · CL_03012

    New GEM activation functions offer smoother, rational alternatives to ReLU

    Researchers have introduced Geometric Monomial (GEM), a new family of activation functions designed for deep neural networks. These functions utilize purely rational arithmetic and offer $C^{2N}$-smoothness, aiming to i…

  6. RESEARCH · CL_05413

    Researchers find variance doesn't equal importance in transformer compression

    Researchers have conducted a systematic study on transformer compression, analyzing over 40 experiments across GPT-2 and Mistral 7B models. Their findings indicate that variance in activation directions does not correla…

  7. FRONTIER RELEASE · CL_01749

    Anthropic reveals dangerous Claude Mythos model and $30B ARR amid OpenAI rivalry

    Anthropic has announced significant growth, reaching $30 billion in ARR, a substantial increase from $19 billion in March. Concurrently, they unveiled Claude Mythos, a powerful new model deemed too dangerous for public …

  8. FRONTIER RELEASE · CL_01024

    OpenAI launches affordable GPT-4o mini and open-weight gpt-oss models

    OpenAI has released GPT-4o mini, a new, highly cost-efficient small model designed to broaden AI accessibility and application development. This model demonstrates superior performance on benchmarks like MMLU, MGSM, and…

  9. RESEARCH · CL_00954

    EleutherAI releases open-source tool for interpreting AI model features

    EleutherAI has released an open-source library for automatically interpreting features within sparse autoencoders, a method used to decompose model activations. This tool leverages large language models like Llama 3.1 a…

  10. RESEARCH · CL_00955

    OpenAI explores weak-to-strong generalization for AI alignment

    OpenAI has introduced a new research direction called weak-to-strong generalization, aiming to address the challenge of aligning future superintelligent AI systems with human supervision. Their initial experiments show …

  11. SIGNIFICANT · CL_02483

    OpenAI names Mira Murati interim CEO amid Sam Altman's departure

    OpenAI has announced a significant leadership transition, with CEO Sam Altman departing the company. Chief Technology Officer Mira Murati has been appointed interim CEO, effective immediately, while the board initiates …

  12. FRONTIER RELEASE · CL_00176

    Cracking the code of failed AI pilots

    Anthropic has withheld its new Claude Mythos model from public release due to its advanced capabilities in finding and exploiting software vulnerabilities. The company is instead providing access to select cybersecurity…