Brief

last 24h

[50/497] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
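
As a rough illustration of how a composite 0–100 score of this kind could be computed, the sketch below takes a weighted sum of three signals in [0, 1] and multiplies it by an exponential time-decay factor. The weights, the 24-hour half-life, and the name `brief_score` are assumptions for illustration only; the feed does not state its actual formula.

```python
# Hypothetical weights; the feed does not publish its real formula.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.35, "headline_signal": 0.25}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay factor


def brief_score(authority, cluster_strength, headline_signal, age_hours):
    """Combine three signals in [0, 1] and decay the result with age, giving 0-100."""
    base = (
        WEIGHTS["authority"] * authority
        + WEIGHTS["cluster_strength"] * cluster_strength
        + WEIGHTS["headline_signal"] * headline_signal
    )
    # The score halves every HALF_LIFE_HOURS hours.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)
```

Under these assumed settings, a story with perfect signals scores 100 when fresh and 50 one day later.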

RESEARCH · arXiv cs.CL English(EN) · 1d · [2 sources]

PriFT: Prior-Support Guided Supervised Fine-Tuning

Researchers have introduced PriFT, a novel supervised fine-tuning method designed to improve model generalization. PriFT addresses limitations in standard fine-tuning by deriving token weights from a frozen pretrained model, providing a stable reweighting signal. This approach, which estimates "prior support" for target tokens, consistently enhances performance across various tasks and serves as a superior initialization for reinforcement learning. AI

IMPACT Enhances model generalization and provides better initialization for RL, potentially improving performance on complex tasks like reasoning and code generation.
- Reinforcement Learning
- Supervised Fine-Tuning
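
The reweighting idea in the PriFT summary can be sketched as a token-weighted negative log-likelihood. This is a minimal sketch, not the paper's method: it assumes the weight for each target token is simply the frozen pretrained model's probability of that token, clipped from below by a hypothetical `floor`. The function name and the exact weighting rule are illustrative assumptions.

```python
import math


def prior_weighted_nll(policy_logprobs, prior_logprobs, floor=0.1):
    """Weighted mean NLL over target tokens.

    policy_logprobs: log-probabilities of the target tokens under the model being tuned.
    prior_logprobs:  log-probabilities of the same tokens under the frozen pretrained model.
    floor:           assumed lower bound on each weight so no token is ignored entirely.
    """
    # Assumption: "prior support" is approximated by the frozen model's token probability.
    weights = [max(math.exp(lp), floor) for lp in prior_logprobs]
    weighted_loss = sum(w * -lp for w, lp in zip(weights, policy_logprobs))
    return weighted_loss / sum(weights)
```

Because the weights come from a frozen model, they do not change as training proceeds, which is one way to read the summary's "stable reweighting signal".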
RESEARCH · Hugging Face Daily Papers English(EN) · 1d · [3 sources]

TUDSR: Twice Upsampling-Diffusion for Higher Super-Resolution

Researchers have developed TUDSR, a novel framework for image super-resolution that utilizes a two-stage diffusion process to achieve higher resolutions than previously possible. This method addresses the limitations of current diffusion models in handling large upsampling ratios and native resolutions by employing a looped chunk-based training strategy. The TUDSR framework, built upon SD2.1-base, demonstrates state-of-the-art performance, generating high-quality images at resolutions up to 2048×2048, surpassing existing techniques. AI

IMPACT Enables higher-resolution image generation from diffusion models, potentially improving detail in AI-generated imagery.
- TUDSR
- SD2.1-base
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

vesselFM-CT: Segmenting All Blood Vessels in CT Images for System-Level Cardiovascular Analysis

Researchers have developed vesselFM-CT, a novel model designed to segment all blood vessels within CT images. This advancement aims to overcome the limitations of previous studies that focused on isolated vascular segments, enabling a more comprehensive analysis of the entire cardiovascular system. The model utilizes an iterative training process and a new TubeLoss function to handle the diverse structural variations of blood vessels, from large arteries to minuscule mesenteric vessels. AI

IMPACT Enables comprehensive cardiovascular system analysis from CT scans, potentially improving disease classification and understanding of vascular physiology.
- Bastian Wittmann
- vesselFM-CT
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

Researchers have developed CapRL++, a novel framework for training image and video captioning models using reinforcement learning with verifiable rewards. This approach moves beyond traditional supervised fine-tuning by using a vision-free language model to assess caption quality based on its ability to answer questions about the visual content. Evaluations across numerous benchmarks demonstrate that CapRL++ enhances caption quality and pretraining, leading to significant downstream performance gains and enabling smaller models to match the capabilities of much larger ones. AI

IMPACT This new training framework could lead to more capable and efficient vision-language models, improving accessibility and downstream applications.
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

Echo-DM: Ultrasound Marker Removal via Conditional Latent Diffusion and Region-Aware Fusion

Researchers have developed Echo-DM, a novel framework for removing artificial markers from clinical ultrasound images. This method utilizes a conditional latent diffusion model combined with region-aware fusion to restore images without relying on masks, preserving anatomical details. Experiments on the Echo-PAIR dataset show Echo-DM outperforms existing methods in marker removal and anatomical fidelity, offering efficient deployment options. AI

IMPACT This new method could improve the accuracy of automated analysis in clinical ultrasound imaging by removing distracting artificial markers.
- Echo-PAIR
- Echo-DM
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

ExDet: Open-Domain Open-Vocabulary Detection with Cross-modal Extrapolation and Rectification

Researchers have introduced ExDet, a novel framework designed to improve open-domain open-vocabulary detection (ODOVD) capabilities. This lightweight system enhances the generalization of existing detectors to new categories and unseen domains without requiring training from scratch. ExDet utilizes text-guided extrapolation to infer visual prototypes and a detector-compatible rectification module to adjust representations, achieving state-of-the-art results on several benchmark datasets. AI

IMPACT Enhances generalization for object detection models, potentially improving performance in real-world applications with novel objects and diverse environments.
- MSOSB
- ExDet
- arXiv
- OD-LVIS
- OV-LVIS
- Objects365
COMMENTARY · r/MachineLearning English(EN) · 2h

What will be the next breakthrough in ASR? [D]

The field of Automatic Speech Recognition (ASR) is seeing rapid advancements driven by two primary factors: the increasing availability of pseudo-labeled data and the emergence of new model architectures. While models like Whisper-large-v3 and NVIDIA Parakeet v3 demonstrate the power of large-scale supervised training, the discussion questions whether self-supervised learning approaches will be phased out for ASR tasks. This contrasts with computer vision, where self-supervised methods like DINOv3 are highly performant, prompting speculation about a similar breakthrough in speech processing. AI

IMPACT Discussion explores the potential shift from self-supervised to supervised learning in ASR, impacting future model development and research focus.
RESEARCH · arXiv cs.LG English(EN) · 1d · [2 sources]

Machine-Learning Emulation of Satellite Greenhouse Gas Retrievals: Stability over Time

Researchers have investigated the temporal stability of machine learning models used to emulate satellite-based greenhouse gas retrievals. Their study, using data from the Greenhouse Gases Observing SATellite (GOSAT), found that prediction accuracy degrades over time when models are tested on data outside their training period. Incorporating time as a feature significantly improved methane predictions, with a simple Lasso model outperforming more complex neural networks and demonstrating greater stability. AI

IMPACT Highlights the need for temporal validation in ML models for scientific applications, potentially impacting climate monitoring systems.
RESEARCH · arXiv cs.LG English(EN) · 1d · [2 sources]

PRISM: Topology-Aware Cross-Modal Imputation for Modality-Deficient Federated Graph Learning

Researchers have introduced PRISM, a novel framework for federated graph learning that addresses the challenge of modality deficiency across different clients. PRISM enables collaborative learning from decentralized graphs containing text and images, even when individual clients lack complete multimodal data. The framework proactively retrieves and imputes missing modality semantics from the federation, integrating them into local graph propagation with topology-aware control. Experiments demonstrate PRISM's effectiveness, showing an average improvement of 4.48% over state-of-the-art baselines on six multimodal graph datasets. AI

IMPACT Enhances collaborative learning from decentralized multimodal data, potentially improving AI applications that rely on diverse data sources.
COMMENTARY · r/singularity English(EN) · 1h

Place your GPT-6 rumors by this week here

Speculation is mounting regarding OpenAI's next-generation model, GPT-6, with users on Reddit actively sharing rumors and predictions. The anticipation follows the recent release of GPT-4o, suggesting a rapid development cycle for OpenAI's flagship AI. AI

IMPACT Anticipation for GPT-6 suggests a continued rapid pace of AI model development and potential future capabilities.
- GPT-4o
- GPT-6
- OpenAI
RESEARCH · arXiv cs.LG English(EN) · 1d · [2 sources]

Internalizing Geometric Law: Learning from Solver Residuals for Precision-Critical Generation

Researchers have developed a new method called Saturating Additive Rewards (SAR) to improve the precision of large language models in geometric tasks. This approach addresses a failure mode known as Outlier Gradient Masking, where a single constraint violation can hinder learning across all constraints. SAR decomposes rewards into bounded per-constraint terms, preserving partial progress and ensuring consistent gradients. An 8B parameter model using SAR achieved a 2.3x improvement in solving complex geometric problems compared to standard MSE-based rewards. AI

IMPACT Enhances LLM capabilities in precision-critical domains, potentially enabling more reliable AI-driven design and technical diagramming.
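
The contrast between an MSE-based reward and a bounded per-constraint reward can be sketched as follows. This is a minimal sketch under stated assumptions: the source does not give SAR's functional form, so the exponential saturation `exp(-|r| / scale)` and both function names are illustrative choices, not the paper's definition.

```python
import math


def mse_reward(residuals):
    """Negative mean squared residual: one large violation dominates the total."""
    return -sum(r * r for r in residuals) / len(residuals)


def saturating_additive_reward(residuals, scale=1.0):
    """Mean of bounded per-constraint terms, each in (0, 1].

    A satisfied constraint (residual 0) contributes 1; a badly violated one
    contributes close to 0 but cannot drag the other terms down.
    The exponential form is an assumption for illustration.
    """
    terms = [math.exp(-abs(r) / scale) for r in residuals]
    return sum(terms) / len(terms)
```

With four constraints, moving three residuals from 1.0 to 0.0 while the fourth stays at 100.0 changes the MSE reward only from -2500.75 to -2500.0, but raises the saturating reward from about 0.28 to about 0.75, so the partial progress stays visible.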
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning

Researchers have developed a novel two-stage framework called Rea2Seg for image segmentation tasks that leverage multimodal large language models (MLLMs). This approach first identifies candidate masks from an MLLM's attention maps and then uses the MLLM to reason over these candidates and select the most accurate one. To further evaluate and advance these capabilities, a new benchmark, ReasonSeg-SGDR, has been introduced to assess perception, grounding, and reasoning abilities across various dimensions. AI

IMPACT Introduces a new method for improving MLLM-based image segmentation and a benchmark to evaluate these models.
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

Self-supervised Learning Matters: A Simple Ensemble Solution for Micro-Gesture Recognition

Researchers from XInsight Lab have developed a novel ensemble framework for micro-gesture recognition, achieving a new state-of-the-art result in the 4th MiGA Challenge at IJCAI 2026. Their approach integrates a self-supervised RGB model, pre-trained on a large unlabeled video dataset, with existing supervised models. This self-supervised component significantly improved performance, reaching 74.419% top-1 accuracy and outperforming previous benchmarks by over 1.2 percentage points. AI

IMPACT Demonstrates the effectiveness of self-supervised learning for specialized visual recognition tasks, potentially improving performance in areas like human-computer interaction.
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

LiteVSR: Lightweight Adaptation of Frozen Diffusion Transformers for Video Super-Resolution

Researchers have developed LiteVSR, a new framework for adapting pre-trained diffusion transformers for video super-resolution tasks. This approach uses a lightweight State-Aware Adapter that requires significantly fewer trainable parameters and less training time compared to existing methods. LiteVSR leverages flow matching to efficiently adapt the frozen transformer, enabling competitive restoration quality with minimal computational resources. AI

IMPACT Offers a more computationally efficient method for adapting large generative models to specific video enhancement tasks.
RESEARCH · arXiv cs.LG English(EN) · 1d · [2 sources]

Counterfactual Reasoning for Fine-Grained Evidence Disentanglement in VideoQA

Researchers have developed a new framework called CREDiT to improve the reliability of video question-answering systems. This framework uses counterfactual reasoning and structural causal models to disentangle causal evidence from spurious correlations in video data. By decomposing representations into causal and non-causal components and employing feature-level causal interventions, CREDiT aims to create more trustworthy AI systems that can accurately localize evidence. AI

IMPACT Enhances the trustworthiness and accuracy of AI systems in understanding and reasoning about video content.
- CREDiT
- SPORTU-video
- SportsQA
- NExT-GQA
- VideoQA
COMMENTARY · Mastodon — fosstodon.org 中文(ZH) · 1h

Two Weeks of Fable 5 #ai #ant

A user shared their experience using Fable 5, an AI model, for two weeks. They noted that it was developed by Anthropic and is related to their Claude series of models. The user found the model to be capable and suitable for their needs during the trial period. AI

IMPACT Provides a user perspective on the performance of a specific AI model, potentially influencing adoption decisions.
- Claude
- Anthropic
RESEARCH · arXiv stat.ML English(EN) · 1d · [2 sources]

INFUSER: Influence-Guided Self-Evolution Improves Reasoning

Researchers have developed INFUSER, a novel framework for self-evolving language models that enhances reasoning capabilities. This iterative co-training system features a Generator that creates questions and answers from documents, and a Solver that learns from them. The Generator is rewarded based on an influence score, ensuring it produces questions that genuinely improve the Solver's performance, rather than just difficult ones. INFUSER demonstrated significant improvements, with an 8B model outperforming a larger 32B model on math and coding tasks. AI

IMPACT Enhances LLM reasoning capabilities by creating adaptive training curricula, potentially leading to more capable AI agents.
- DuGRPO
- SuperGPQA
- Qwen3-8B-Base
- Olympiad
- GRPO
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

OmniGen-AR: AutoRegressive Any-to-Image Generation

Researchers have introduced OmniGen-AR, a novel autoregressive framework designed for versatile image generation. This unified model can synthesize images from various inputs, including text, segmentation maps, depth information, and even existing images for editing or video prediction. To prevent condition tokens from influencing content tokens, the framework employs Disentangled Causal Attention (DCA), a technique that separates attention mechanisms during training. OmniGen-AR has demonstrated state-of-the-art performance on benchmarks like GenEval and VBench. AI

IMPACT Introduces a unified framework for multi-modal image generation, potentially simplifying complex visual synthesis tasks.
RESEARCH · arXiv cs.CV English(EN) · 1d · [2 sources]

Ultra Flash: Scaling Real-Time Streaming Video Generation to High Resolutions

Researchers have introduced Ultra Flash, a novel cascaded streaming framework designed to generate high-resolution video in real-time. This system overcomes the limitations of previous models that were restricted to lower resolutions. Ultra Flash achieves impressive frame rates at 1K and 2K resolutions on a single GPU by employing a unique super-resolution training paradigm and a causal streaming latent upsampler. AI

IMPACT Enables real-time high-resolution video generation, potentially impacting content creation and streaming services.
COMMENTARY · r/Anthropic English(EN) · 1h

Why do the latest models write like a lawyer?

Users are reporting that Anthropic's latest models, specifically versions 4.8 and Fable, exhibit a peculiar writing style that resembles that of a lawyer. This observation has led to discussions among users about the nature of this stylistic shift and whether others have noticed the same phenomenon. AI

IMPACT Users are discussing stylistic changes in recent AI models, indicating a focus on output quality and user perception.
- Anthropic
- Fable
RESEARCH · Hugging Face Daily Papers English(EN) · 1d · [3 sources]

EditSSC: Toward Editable Semantic Occupancy Scenes with Unconditional Diffusion Models

Researchers have developed EditSSC, a new method for generating and editing 3D semantic scenes using 2D Bird's Eye View (BEV) representations. This approach repurposes components from Stable Diffusion, enabling training-free editing capabilities like sketch-guided generation, inpainting, and outpainting. EditSSC demonstrates superior performance on unconditional generation compared to existing 3D-specific methods, highlighting the potential of 2D diffusion models for 3D scene manipulation. AI

IMPACT Enables more accessible and flexible 3D scene generation for applications like autonomous driving.
SIGNIFICANT · Mastodon — mastodon.social English(EN) · 1d · [13 sources]

NotebookLM's Gemini 3.5 upgrade adds a cloud computer and help finding sources https://www.theverge.com/tech/944325/google-notebooklm-ai-gemini-update #AI #Te

Google has significantly updated its AI-powered note-taking application, NotebookLM, integrating the advanced Gemini 3.5 model. This upgrade enhances the app's ability to provide more accurate and reliable information, with evaluations showing a 65% improvement over the previous version. NotebookLM also now features Antigravity, a secure cloud computer that allows it to write and run code for research purposes, expanding its capabilities with over 100 software skills. AI

IMPACT Enhances research and analysis capabilities for users, potentially accelerating workflows with integrated coding and improved AI accuracy.
RESEARCH · Hugging Face Daily Papers English(EN) · 1d · [3 sources]

Vision-Language Guided Hyperspectral Object Tracking via Semantics Fusion and Contextual Template Updating

Researchers have developed VLHTrack, a new framework for hyperspectral object tracking that integrates vision and language models. This approach uses language priors to guide band selection, reducing redundancy and highlighting key spectral features. The system also incorporates a dynamic template update mechanism using Mamba to handle appearance variations and deformations in long sequences. Experiments show VLHTrack surpasses current state-of-the-art methods on benchmark datasets. AI

IMPACT Introduces a novel method for improving object tracking accuracy by leveraging LLMs for spectral feature selection and dynamic template updating.
RESEARCH · arXiv stat.ML English(EN) · 1d · [2 sources]

Backward Coherence and Hidden-State Stability in Recurrent Neural Networks: A Quasi-Reverse-Martingale Theory

Researchers have developed a new theoretical framework called backward coherence to analyze hidden-state stability in recurrent neural networks (RNNs). This approach treats the hidden-state sequence as a quasi-reverse-martingale, enabling more stable and interpretable representations. Simulations and real-world data studies demonstrate that this method can significantly improve stability, reduce tracking errors, and enhance forecasting accuracy, particularly under concept drift. AI

IMPACT Introduces a theoretical framework to enhance stability and interpretability in RNNs, potentially improving performance in time-series forecasting and data analysis tasks.
COMMENTARY · r/Anthropic Norsk(NO) · 2h · [2 sources]

Fable 5 nerfed???

Users are reporting a significant decline in the performance and capabilities of Anthropic's Fable 5 model. Many feel the model has been "nerfed," "lobotomized," or "enshittified" since its release, with performance drops so severe that some have canceled their subscriptions. The perceived degradation has led users to seek alternative models like Codex or Cleverbot. AI

IMPACT User sentiment suggests a potential decline in perceived value for a specific AI model, prompting users to explore alternatives.
RESEARCH · arXiv cs.LG English(EN) · 1d · [2 sources]

Latent Geometry Beyond Search: Amortizing Planning in World Models

Researchers have developed new methods for long-horizon planning in world models, addressing limitations of existing techniques. One approach, FF-JEPA, uses a hierarchical structure with two forward dynamics models, including an action-free latent planner to predict subgoals, thus removing the need for explicit goal images and enabling planning over extended periods. Another method, building on a pretrained LeWorldModel, amortizes planning into a latent inverse-dynamics mapping, replacing iterative optimization with a faster, goal-conditioned inverse dynamics model that significantly reduces computational cost while maintaining or exceeding performance. AI

IMPACT These advancements could enable more sophisticated AI agents capable of complex, multi-step tasks in real-world environments.
- iCEM
- CEM
- Xiaohao Xu
- LeWorldModel
- FF-JEPA
- arXiv
RESEARCH · arXiv cs.AI English(EN) · 1d · [2 sources]

SAGE: Shape-Adapting Gated Experts for Adaptive Histopathology Image Segmentation

Researchers have developed two novel frameworks, SAGE and SegMoTE, to improve medical image segmentation. SAGE utilizes a dynamic expert routing system to adapt to variations in cell size and shape, achieving high Dice scores on multiple datasets. SegMoTE, on the other hand, efficiently adapts general segmentation models like SAM to medical imaging tasks with minimal learnable parameters and reduced annotation costs. Both approaches aim to enhance the accuracy and practicality of AI in clinical diagnostics. AI

IMPACT These new segmentation models offer improved accuracy and efficiency for clinical diagnostics, potentially reducing annotation costs and enhancing the deployment of AI in healthcare.
- MedSeg-HQ
- Yujie Lu
- SAM
- SegMoTE
- SAGE
- Vision Transformer UNet
- Nguyen Vu
- ConvNeXt
RESEARCH · arXiv cs.AI English(EN) · 1d · [2 sources]

Enhancing Video Representations with Spatiotemporal-Semantic Residual to Mitigate Hallucinations in Video Large Multimodal Models

Researchers have developed new methods to combat hallucinations in large vision-language models (LVLMs). One approach, ViSSRes, enhances video representations using a lightweight network to improve spatiotemporal and semantic consistency, significantly reducing hallucination rates on benchmarks like EventHallusion. Another method focuses on refining textual embeddings to encourage better integration of visual information, leading to more balanced multimodal reasoning and improved performance on benchmarks such as MMVP and POPE. AI

IMPACT These methods offer potential solutions for improving the reliability and accuracy of multimodal AI systems, crucial for applications requiring precise visual understanding.
RESEARCH · arXiv cs.CV English(EN) · 1d · [4 sources]

SwiftVR: Real-Time One-Step Generative Video Restoration

Researchers have developed SwiftVR, a novel framework for real-time generative video restoration that addresses key bottlenecks in existing diffusion-based models. By employing mask-free shifted-window self-attention and a lightweight autoencoder, SwiftVR achieves high frame rates at resolutions up to 4K on powerful hardware and real-time 1080p streaming on consumer-grade GPUs. This advancement makes high-quality video restoration more accessible and practical for live streaming applications. AI

IMPACT Enables practical real-time video restoration on consumer hardware, potentially improving live streaming quality and accessibility.
- RTX 5090
- arXiv
- SwiftVR
- Hugging Face
SIGNIFICANT · 36氪 (36Kr) 中文(ZH) · 1d · [2 sources]

Most popular Chinese concept stocks rose in pre-market trading, Bilibili rose more than 5%

Apple has unveiled a significant upgrade to Siri, integrating new AI capabilities into its operating system. Meanwhile, OpenAI is reportedly preparing for its initial public offering by secretly filing IPO documents. In other tech news, ROKID has addressed an incident involving its smart glasses allegedly being used for surreptitious filming. AI

IMPACT Apple's AI-enhanced Siri could significantly alter user interaction with devices, while OpenAI's IPO filing signals major financial market activity.
- 36Kr
- ChatGPT
- OpenAI
- Siri
- Apple
- ROKID
RESEARCH · Google DeepMind English(EN) · 1d · [3 sources]

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

Researchers have introduced IMUG-Bench, a new benchmark designed to evaluate unified multimodal models (UMMs) in complex, multi-turn image-text dialogue scenarios. Existing benchmarks often fall short by focusing on static or single-turn interactions, failing to capture the nuances of real-world applications. IMUG-Bench addresses this by assessing both understanding and generation capabilities across three classes of dialogue, revealing limitations in current UMMs, particularly regarding exposure bias in generation. The study also explores strategies like Chain-of-Thought and Self-Verification to improve UMM performance and mitigate these biases. AI

IMPACT Provides a new evaluation standard for multimodal models, potentially driving improvements in their ability to handle complex, interactive dialogues.
FRONTIER RELEASE · Fortune English(EN) · 2d · [4 sources]

Anthropic releases its first Mythos-class model to the public

Anthropic has released Fable 5, its first "Mythos-class" AI model, to the general public, marking a significant step in making more powerful AI capabilities widely accessible. This release follows earlier concerns about the model's potential for misuse, particularly in cybersecurity, but Anthropic states that new safety guardrails are now sufficient to mitigate these risks. The company is also offering Claude Mythos 5, which has fewer restrictions than the public Fable 5, to vetted partners. Fable 5 demonstrates advanced performance in coding, knowledge work, and vision, with notable improvements in long-horizon memory management and self-verification. AI

IMPACT Sets a new benchmark for public access to highly capable AI, potentially accelerating adoption in complex tasks while raising ongoing safety discussions.
TOOL · The Verge — AI English(EN) · 1d · [9 sources]

Apple is embracing the fantasy of AI photo editing

Apple announced at WWDC 2026 that iOS 27 will introduce new AI-powered editing features to its Photos app. These tools include a "Reframe" function to adjust image perspective, an "Extend" tool to expand scenes without cropping, and an enhanced "Cleanup" feature for more realistic removal of distractions. These capabilities leverage Apple Intelligence to generate new content and fill gaps, improving the overall quality and flexibility of photo editing. AI

IMPACT Enhances user experience for mobile photography and content creation.
TOOL · 36氪 (36Kr) 中文(ZH) · 20h

Market style rotation discussion heats up, brokerages upgrade ratings for over a dozen listed companies

ChatGPT is reportedly set to receive its most significant upgrade to date, moving beyond simple chat functionalities. This development is part of a broader trend where AI advancements are being integrated into various sectors, including education, with AI proctors being introduced for the Gaokao (national college entrance exam) to monitor for anomalies. AI

IMPACT This upgrade could significantly enhance AI's conversational and functional capabilities, while AI proctors signal a new frontier for AI in educational integrity.
- ChatGPT
- AI
RESEARCH · 36氪 (36Kr) 中文(ZH) · 20h

WestJet Airlines plans to put its first Boeing 737 MAX 10 aircraft into service in early 2027

ChatGPT is poised for its most significant upgrade, with reports indicating a substantial overhaul is imminent. This update is expected to go beyond simple conversational enhancements, suggesting a fundamental shift in its capabilities. Additionally, the Gaokao (national college entrance exam) will incorporate AI proctors capable of automatically capturing abnormal video footage. AI

IMPACT This major ChatGPT upgrade could redefine user expectations and applications, while AI proctoring signals a new era of automated oversight in education.
- AI
- ChatGPT
RESEARCH · 36氪 (36Kr) 中文(ZH) · 20h

Morgan Stanley: Whether the dollar rally can continue depends on the Fed's interest rate path

ChatGPT is poised for its most significant update, moving beyond simple chat functionalities. This upgrade is expected to be the largest in the model's history. Concurrently, financial institutions like Goldman Sachs and JPMorgan Chase are exploring financial products tied to the cost of computing power, specifically focusing on GPUs, which are critical for AI development. AI

IMPACT This major ChatGPT upgrade could significantly enhance AI capabilities, while new GPU-based financial products may impact AI infrastructure investment.
SIGNIFICANT · X — MiniMax AI Bahasa(ID) · 23h · [2 sources]

RT @BAI_AGI: https://t.co/z8Ofg9zHc5

MiniMax AI has released its M3 model, which scored 55 on the Artificial Analysis Intelligence Index. The company plans to release the model's weights, a step expected to position M3 as a leading model. AI

IMPACT Sets a new benchmark score, with potential to lead the field upon weight release.
- MiniMax AI
- Artificial Analysis Intelligence Index
SIGNIFICANT · Mastodon — fosstodon.org Dansk(DA) · 1d · [2 sources]

Apple's digital assistant Siri, which functions as a voice-controlled helper on iPhones, among other devices, is facing a major overhaul. Artificial intelligence is coming

Apple is significantly overhauling its Siri voice assistant by integrating advanced artificial intelligence and rebranding it as "Siri AI." This transformation aims to shift Siri from a basic voice-controlled tool to a more capable AI companion. The updated assistant will also get its own dedicated app. AI

IMPACT This AI-powered Siri aims to make voice assistants more capable and integrated into daily tasks.
- Apple
- Siri
TOOL · Mastodon — sigmoid.social English(EN) · 12h

🚀 Reve AI (RF): The New King of Precision! 👑 Tired of messy AI art? Reve 2.0 builds a "blueprint" first for perfect 4K results. ✨ Key Features: Layer Editing: C

Reve AI has launched Reve 2.0, an AI art generator that first creates a "blueprint" for precise image generation. This new version allows users to edit specific objects within an image, such as changing colors or repositioning elements, using text prompts and a drag-and-drop interface. Reve 2.0 also aims to produce sharp text within images and offers a free tier with daily high-quality generations. AI

IMPACT Enhances user control over AI image generation with layer editing and precise object manipulation.
- Reve 2.0
- Reve AI
TOOL · Mastodon — fosstodon.org English(EN) · 12h

it is a thing of immense joy just how incredibly badly the current generation of LLMs perform on ARC AGI3 https://arcprize.org/blog/arc-agi-3-gpt-5-5-opus-4-7

New evaluations of the ARC AGI3 benchmark reveal that current leading large language models, including OpenAI's GPT-5.5 and Anthropic's Opus 4.7, perform poorly. The ARC prize website highlights these findings, indicating a significant gap in the models' reasoning capabilities on this specific task. AI

IMPACT Highlights limitations in current LLM reasoning, suggesting a need for improved architectures to tackle complex problem-solving.
- Anthropic
- OpenAI
- Opus 4.7
- ARC AGI3
- GPT-5.5
TOOL · r/LocalLLaMA English(EN) · 19h

I fine-tuned Parakeet 0.6B for medical ASR — open weights, local Mac/CUDA/CPU

The founder of Omi Health has released Omi Med STT v1, a fine-tuned version of NVIDIA's Parakeet TDT 0.6B model for medical Automatic Speech Recognition (ASR). This open-weight model is designed to run locally on Mac, CUDA, or CPU hardware, keeping patient audio private. Benchmarked against other models, Omi Med STT v1 demonstrates competitive medical word error rate (M-WER) while being significantly smaller and faster than larger models. AI

IMPACT Enables local, private transcription of medical audio, potentially improving patient data security and workflow efficiency for smaller clinics.
SIGNIFICANT · X — MiniMax AI English(EN) · 18h

Pick M3 as your base model on AgentBox to deploy with frontier coding, 1M-token context, and native multimodality all in one click.

MiniMax AI has released its M3 model, now available on the AgentBox platform. This model offers advanced coding capabilities, a 1 million token context window, and built-in multimodality. Users can deploy M3 with a single click through AgentBox. AI

IMPACT Makes a model with frontier coding, a 1M-token context window, and native multimodality deployable in one click.
- AgentBox
- MiniMax AI
TOOL · Mastodon — fosstodon.org 한국어(KO) · 15h

ComfyUI (@ComfyUI) Krea AI has unveiled Krea 2 as its first self-developed foundation model. It claims that instead of simply rendering text as is, it interprets the mood, style, and intent of the prompt to generate diverse results and more faithfully reflects visual references. https

Krea AI has launched Krea 2, its first proprietary foundation model for image generation. This model aims to go beyond simple text rendering by interpreting the mood, style, and intent of prompts to produce diverse outputs. Krea 2 also claims to more faithfully incorporate visual references into its image generation process. AI

IMPACT Krea 2's ability to interpret prompt nuances and incorporate visual references could enhance creative workflows and push the boundaries of AI-generated art.
- Krea 2
TOOL · r/StableDiffusion English(EN) · 13h

Wan 2.2: Bernini is what we had hoped for with Wan Animate

Bernini, a new model associated with Wan 2.2, has been released and is receiving positive feedback. The poster describes it as what the community had hoped Wan Animate would be: simple to use, efficient, and effective at its intended purpose. AI

IMPACT This model release offers improved usability and performance for generative art, potentially enhancing creative workflows.
SIGNIFICANT · r/LocalLLaMA English(EN) · 19h

New local AFM model is 20B

Apple has introduced its third generation of foundation models. Per the post title, the new local model has 20B parameters; the 1 billion to 4 billion figure appears to refer to the parameters active at a time. Because only the active parameters are needed at each step, the full weights do not have to be loaded into DRAM, which allows more efficient local processing of AI tasks. AI

IMPACT Enables more efficient on-device AI processing, potentially improving user experience and privacy for AI-powered features.
- Apple
- Apple Foundation Models
TOOL · r/LocalLLaMA English(EN) · 18h

Jetbrains Mellum 2: a really good and performant model

A user on r/LocalLLaMA has shared positive impressions of JetBrains Mellum 2, a 12B Mixture-of-Experts model. Despite its size, the model demonstrates impressive performance, achieving 111.2 t/s generation speed and maintaining over 100 t/s even with a context window of 131,072 tokens on an AMD Radeon RX 7900 XT. The user highlighted its capability in handling complex tasks like tool calls and data reconstruction, outperforming other models like Qwen3.5-9B on the same hardware. AI

IMPACT This model's strong performance and large context window could influence the development of more efficient and capable local LLMs.
TOOL · r/LocalLLaMA English(EN) · 14h

silx-ai/Quasar-Preview • Huggingface (5M context length)

The silx-ai/Quasar-Preview model has been released on Hugging Face, boasting an impressive 5 million token context length. This significant increase in context window allows for processing and understanding much larger amounts of information in a single pass. The model is available for local deployment, catering to users who prefer running AI models on their own hardware. AI

IMPACT Enables processing of significantly larger documents and datasets, potentially improving performance on complex reasoning and summarization tasks.
- silx-ai/Quasar-Preview
- Hugging Face
RESEARCH · Mastodon — fosstodon.org Polski(PL) · 13h

OpenAI implements a new memory architecture that automatically synthesizes context from previous conversations. The system eliminates the need for manual fact-saving

OpenAI has introduced a new memory architecture for its AI models that automatically synthesizes context from past conversations. This system aims to eliminate the need for users to manually save facts, offering a more personalized experience through in-depth analysis of chat history. The new architecture allows the AI to recall and utilize information from previous interactions, enhancing continuity and relevance in conversations. AI

IMPACT Enhances AI conversational continuity and personalization, potentially improving user experience and utility.
- OpenAI
TOOL · r/StableDiffusion English(EN) · 19h

Alphgreed showcase

Alphgreed, a new open-source AI model, has been released on Civitai. The model is reported to perform comparably to closed-source alternatives. Its release aims to provide users with a powerful new tool for creative endeavors. AI

IMPACT Provides a new open-source alternative for AI image generation, potentially fostering community development and innovation.
- Alphgreed
- Civitai
TOOL · Mastodon — mastodon.social 日本語(JA) · 13h

Did you know that you can use ChatGPT-level image generation AI from LINE? https://ascii.jp/elem/000/004/409/4409254/?rss #ascii #AI

LINE has integrated a new AI image generation tool, described as comparable in quality to ChatGPT's image generation, into its messaging platform. The feature lets users create images directly within their chats, bringing advanced AI capabilities into a familiar communication environment. AI

IMPACT Enhances user engagement by bringing advanced AI image generation directly into a popular messaging platform.
- ChatGPT
- LINE