generative pre-trained transformer
PulseAugur coverage of generative pre-trained transformer (GPT): every cluster mentioning the term across labs, papers, and developer communities, ranked by signal.
-
Perplexity details research on SFT+RL pipeline for accurate, efficient AI answers
Perplexity has detailed its proprietary post-training pipeline that enhances base models for search-augmented question answering. This process involves initial fine-tuning for instruction following and safety, followed …
-
Amazon invests $5B in Anthropic, committing $100B to AWS cloud
Amazon is significantly deepening its partnership with Anthropic through a substantial investment and a long-term cloud computing commitment. This move, totaling up to $33 billion in investment and $100 billion in AWS s…
-
Anthropic's Claude Code faces backlash over token inflation and performance issues
Developers are reporting that Anthropic's Claude Code tool is consuming tokens at an unexpectedly high rate, potentially due to silently injected tokens in requests or changes in how the model processes information. Thi…
-
Show HN: OpenSwarm – Multi-Agent Claude CLI Orchestrator for Linear/GitHub
OpenSwarm is a new command-line interface tool designed to orchestrate multiple AI agents for autonomous code-related tasks. It can integrate with various AI models, including Anthropic's Claude, OpenAI's GPT and Codex,…
-
Google Cloud C4, Intel, and Hugging Face partner for 70% TCO improvement on GPT OSS
Google Cloud, Intel, and Hugging Face report a 70% improvement in total cost of ownership (TCO) for running GPT OSS models on Google Cloud's C4 platform. This optimization is realized thr…
-
Sora 2 System Card
OpenAI has released Sora 2, an advanced video and audio generation model that builds upon its predecessor. This new iteration boasts improved physics simulation, enhanced realism, synchronized audio, and greater user co…
-
New MCP server bridges LLMs like Claude and Gemini to ROS robots
A new open-source project, ROS-MCP-Server, has been developed to bridge large language models with robots. This tool allows LLMs like Claude, GPT, and Gemini to control robots and access their sensor data without modify…
-
Navigating a Broken Dev Culture
A developer working on an AI team describes a dysfunctional corporate culture with nonexistent engineering practices, where management is overly reliant on AI hype. The developer, who has self-taught various AI and deve…
-
Eugene Yan curates essential language modeling papers for study groups
Eugene Yan has compiled a reading list of fundamental language modeling papers, intended to facilitate group study sessions. The list includes seminal works like "Attention Is All You Need," "BERT," and "GPT-3," each ac…
-
Sam Altman shifts OpenAI focus from AGI to broad AI deployment, acknowledging scaling limits
Sam Altman has indicated that achieving Artificial General Intelligence (AGI) will require breakthroughs beyond simply scaling current models, suggesting a need for new architectures. This marks a shift from his previou…
-
RWKV project revives RNNs to challenge Transformer dominance in LLMs
The RWKV (Receptance Weighted Key Value) project introduces a novel architecture that revives Recurrent Neural Networks (RNNs) while incorporating advantages typically found in Transformers. This approach aims to overco…