Standard Intelligence is developing a novel approach to training general AI agents by focusing on raw video data of computer usage, rather than language-based methods. Their thesis is that scaling action data through video is the most promising path to creating capable agents. The company has built a massive dataset of computer actions and an efficient video encoder, enabling their foundation model, FDM-1, to perform complex tasks like CAD design and autonomous driving after fine-tuning.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT This video-centric pre-training approach could unlock new agent capabilities by scaling action data more effectively than language-based methods can.
RANK_REASON The cluster describes a new pre-training paradigm and a foundation model from a startup, detailed in a blog post that functions as a research paper.