Together AI
PulseAugur coverage of Together AI: every cluster mentioning Together AI across labs, papers, and developer communities, ranked by signal.
- partners with Pearl Research Labs 90%
- uses Gemma-4-31B-it-Pearl 90%
- founded by Vipul Ved Prakash 90%
- uses Deepgram 90%
- uses Nvidia Blackwell B200 90%
- used by NVIDIA Parakeet-TDT 0.6B v3 90%
- employs Dan Fu 90%
- developed Gemma-4-31B-it-Pearl 90%
- developed Together Code Interpreter 90%
- partners with MiniMax AI 80%
- used by MiniMax AI 75%
- used by DeepSeek-R1 70%
- 2026-06-09 partnership Together AI partnered with Pax8 to offer AI infrastructure and models to small and medium-sized businesses.
- 2026-06-01 product_launch Together AI is announcing a new model called M3.
- 2026-05-29 product_launch Together AI is now serving the two fastest speech-to-text models, including NVIDIA Parakeet-TDT 0.6B v3.
- 2026-05-29 product_launch Together AI launched a new open-source AI translation application.
- 2026-05-22 product_launch Together AI launched updates to its Fine-Tuning Platform, adding support for new LLMs and extending context lengths.
- 2026-05-22 product_launch Together AI announced the addition of 1,000 NVIDIA H100 and H200 GPUs to its infrastructure.
- 2026-05-22 product_launch Together AI launched GPU clusters with the NVIDIA Blackwell platform and an optimized kernel collection, achieving significant performance gains.
- 2026-05-22 product_launch Together AI launched major upgrades to its Batch Inference API.
- 2026-05-22 product_launch Together AI released FlashAttention-3 and FlashAttention-4, optimized attention mechanisms for GPUs.
- 2026-05-22 product_launch Together AI launched access to the Qwen3.7-Max model.
- 2026-05-15 partnership Together AI and Pearl Research Labs formed a partnership to integrate blockchain for AI inference cost reduction.
- 2026-05-14 research_milestone Together AI's speech-to-text models achieved top rankings for transcription speed on a benchmark leaderboard.
- 2026-05-08 product_launch Together AI launched a new feature enabling deployment of any Hugging Face model via its Dedicated Container Inference infrastructure using the Goose CLI agent.
- 2026-04-30 research_milestone Together AI detailed its rapid response and mitigation strategy for the Copy Fail Linux kernel vulnerability.
- 2026-04-30 partnership Together AI announced a partnership with Adaption to integrate their data optimization and model fine-tuning services.
Together AI's ATLAS system demonstrates inference speed on par with specialized hardware
Together AI's newly launched ATLAS system, an adaptive-learning inference engine, reaches up to 500 tokens per second (TPS) on DeepSeek-V3.1. That rivals specialized hardware like Groq and suggests Together AI is optimizing LLM inference beyond what standard GPU serving delivers.
Together AI to offer ATLAS as a distinct inference optimization service
Given the significant performance gains demonstrated by ATLAS, Together AI may soon offer this adaptive-learning inference system as a standalone service or an add-on feature for their existing GPU offerings. This would allow customers to leverage ATLAS's dynamic optimization without needing to manage the underlying infrastructure themselves.
Together AI significantly bolsters inference capacity with H100/H200 GPU expansion
The addition of one thousand NVIDIA H100 and H200 GPUs to Together AI's infrastructure represents a substantial investment in inference capabilities. This move directly supports the growing demand for high-throughput AI model serving and is likely intended to power both their internal services and external customer workloads.
Together AI to integrate NVIDIA Blackwell features into all core services
The 90% training speed boost achieved with NVIDIA Blackwell and custom kernels indicates a deep integration. It's likely Together AI will leverage Blackwell's capabilities across their entire platform, including their new instant clusters and fine-tuning services, to offer a performance edge over competitors.
Together AI's ATLAS system shows strong performance against specialized hardware
The reported performance of Together AI's ATLAS system, achieving up to 500 TPS on DeepSeek-V3.1 and outperforming specialized hardware like Groq, is a significant technical achievement. This suggests their adaptive inference approach is highly effective and could set a new benchmark for LLM inference speed and efficiency.
- Together AI releases RedPajama-3B open-source model
Together AI has released a new open-source model named RedPajama-3B. This model is designed for efficient inference and is available for public use. The release aims to provide a capable yet lightweight option for resea…
- Together AI partners with Pax8 to offer AI to SMBs
Together AI has partnered with Pax8 to make advanced AI infrastructure and open-source models accessible to small and medium-sized businesses. This collaboration aims to democratize access to powerful AI tools, ensuring…
- Together AI adds thousands of NVIDIA B200/B300 chips for inference
Together AI has significantly expanded its cloud computing resources, adding thousands of new chips including NVIDIA's B200 and B300 accelerators. This move is aimed at bolstering their dedicated model inference service…
- AI Developer Directory Lists 180+ Tools and Agents
A comprehensive directory lists over 180 AI tools and agents designed for developers, covering a wide range of applications from coding assistance to creative suites. The list aims to help users navigate the rapidly evo…
- NVIDIA releases open 550B Nemotron 3 models for agents and ASR
NVIDIA has released its Nemotron 3 family of open-source models, including Nemotron 3 Ultra and Nemotron 3.5 ASR. Nemotron 3 Ultra is a 550 billion parameter model designed for long-running AI agents, featuring a hybrid…
- Together AI launches free, open-source agent powered by open models
Together AI is launching a new open-source agent powered by open models. The agent will be free to use and aims to democratize access to AI capabilities.
- MiniMax M3 model ships with 1M context, multimodal capabilities
MiniMax AI has released its M3 model, boasting enhanced speed and a 1 million token context window. The model also incorporates sparse attention mechanisms and multimodality. Together AI played a key role in optimizing …
- MiniMax AI highlights M3 model's Sparse Attention mechanism
MiniMax AI recently held a live session discussing their M3 model, highlighting the MiniMax Sparse Attention (MSA) mechanism. Unlike other attention methods that compress the KV cache, MSA preserves the uncompressed KV …
- Together AI and MiniMax AI Host Live Chat on MiniMax M3 Model
Together AI and MiniMax AI are co-hosting a live Spaces chat to discuss the MiniMax M3 model. The event is scheduled to begin in four hours, and attendees are encouraged to pre-submit their questions.
- Together AI to unveil M3 model with 1M context window
Together AI is announcing a new model called M3, which features sparse attention and a 1 million token context window. The announcement is scheduled for tomorrow and will be a live event, with MiniMax AI also participat…
- MiniMax AI and Together AI to Launch New Model
MiniMax AI is launching a new model in collaboration with Together AI. The announcement was made via X, with the launch scheduled for the following day. Further details about the model or its capabilities were not immed…
- MiniMax M3 launches with 1M-token context and MSA architecture
MiniMax has released its M3 model, featuring a novel MiniMax Sparse Attention (MSA) architecture that enables a 1 million token context window and native multimodality. This new architecture significantly reduces computational …
- AI crypto Pearl's GPU mining rush sees profitability slide
A new cryptocurrency called Pearl, which uses AI matrix multiplication for its proof-of-work, has triggered a surge in GPU mining. Despite initial high returns, profitability has rapidly declined as more miners join the…
- Together AI releases Frontier Agents open-source inference model
Together AI has released a new open-source model called Frontier Agents. This model is designed for inference and is the latest development from their Frontier Agents Research team, led by James Y. Zou.
- Together AI releases open-source Hot Wings model for inference
Together AI has released a new open-source model called Hot Wings, designed for inference. The company showcased this model at NVIDIA's GTC conference, highlighting its capabilities. This release aims to provide a power…
- Ideogram releases open-weight Ideogram 4 model with 2K resolution
Ideogram has released Ideogram 4, an open-weight text-to-image model that excels in design-oriented tasks and text rendering. The model offers native 2K resolution and advanced features like bounding box control and str…
- Together AI serves fastest speech-to-text models
Together AI is now serving the two fastest speech-to-text models, according to Artificial Analysis. The NVIDIA Parakeet-TDT 0.6B v3 model can transcribe 20 hours of audio in less than 10 seconds. This performance is ach…
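To put the transcription claim above in perspective, the arithmetic can be sketched as a speed factor: seconds of audio processed per second of wall-clock time. The function name `speed_factor` is a hypothetical helper, not part of any Together AI or NVIDIA API, and because the summary says "less than 10 seconds", the result is a lower bound.

```python
def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are transcribed per second of compute time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Figures from the summary: 20 hours of audio, under 10 seconds of processing.
audio_seconds = 20 * 3600          # 72,000 seconds of audio
upper_bound_processing = 10.0      # assumed worst case: exactly 10 seconds

# Lower bound on the speed factor implied by the claim.
print(speed_factor(audio_seconds, upper_bound_processing))  # 7200.0
```

In other words, the claim implies transcription at no less than 7,200 times real time.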
- Together AI releases open-source AI translation app
Together AI has released an open-source AI translation application, built using their inference tools. The application is designed to be fun and accessible for users.
- Together AI builds world's fastest speech-to-text stack
Together AI has developed a highly efficient speech-to-text system, significantly outperforming existing models in speed. Their approach addresses the unique challenges of audio data processing, which is substantially l…
- Together AI open-sources OSCAR for efficient LLM serving
Together AI has open-sourced OSCAR, a new system for 2-bit KV cache quantization. This technique aims to improve the efficiency of serving large language models, particularly those with long context windows. The develop…
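As a rough illustration of what 2-bit quantization means in general, the sketch below applies plain uniform min/max quantization to a small group of values and packs four 2-bit codes into each byte. This is a generic textbook scheme, not OSCAR's actual algorithm, which the summary does not describe; all function names here are hypothetical.

```python
from typing import List, Tuple

LEVELS = 3  # 2 bits encode integer codes 0..3, i.e. 3 steps between min and max


def quantize_group(group: List[float]) -> Tuple[List[int], float, float]:
    """Map each float in a group to a 2-bit code using min/max scaling."""
    lo, hi = min(group), max(group)
    scale = (hi - lo) / LEVELS or 1.0  # fall back to 1.0 for constant groups
    codes = [round((x - lo) / scale) for x in group]
    return codes, scale, lo


def pack_codes(codes: List[int]) -> bytes:
    """Pack four 2-bit codes into each byte, first code in the lowest bits."""
    out = bytearray()
    for i in range(0, len(codes), 4):
        byte = 0
        for j, code in enumerate(codes[i:i + 4]):
            byte |= (code & 0b11) << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack_codes(packed: bytes, count: int) -> List[int]:
    """Recover the first `count` 2-bit codes from packed bytes."""
    codes = [(b >> (2 * j)) & 0b11 for b in packed for j in range(4)]
    return codes[:count]


def dequantize_group(codes: List[int], scale: float, lo: float) -> List[float]:
    """Reconstruct approximate floats from codes plus the group's scale and minimum."""
    return [lo + c * scale for c in codes]


# Toy stand-in for a slice of cached key/value activations.
values = [-1.0, 0.2, 0.5, 2.0]
codes, scale, lo = quantize_group(values)
packed = pack_codes(codes)
restored = dequantize_group(unpack_codes(packed, len(values)), scale, lo)
print(len(packed), restored)
```

At 2 bits per value, the codes take one eighth of the space of 16-bit floats, before counting the per-group scale and minimum that must be stored alongside them. The reconstruction error of this simple scheme is at most half a quantization step per value.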