PulseAugur / Pulse

Pulse

last 48h
[50/1912] 89 sources

What AI is actually talking about — clusters surfacing on Bluesky, Reddit, HN, Mastodon and Lobsters, re-ranked to elevate originality and crush noise.

  1. AI and compute

    Anthropic conducted an experiment where Claude agents acted as digital barterers, successfully negotiating 186 deals totaling over $4,000. Participants found the deals fair, with nearly half expressing willingness to pay for such a service. The experiment also showed that while model quality (Opus versus Haiku) significantly affected deal outcomes, human participants could not perceive the difference. AI

    IMPACT Demonstrates potential for AI agents in complex negotiation and commerce, suggesting future market viability.

  2. DataScience SG x ODSC Meetup - Applying ML to Healthcare

    Eugene Yan presented a case study on how uCare.ai developed a machine learning system for Parkway Pantai Group, Southeast Asia's largest healthcare provider. This system estimates patient pre-admission costs, enhancing transparency and patient experience. The implementation significantly reduced prediction errors, with mean absolute error decreasing by 55% and root mean squared error by 60%. Yan emphasized that building such data products is a team effort, with machine learning comprising only about 20% of the overall work, highlighting the importance of engineering and methodology. AI

    IMPACT Demonstrates practical application of ML in healthcare for cost prediction, improving patient experience and operational efficiency.

  3. Beating the Baseline Recommender with Graph & NLP in Pytorch

    Eugene Yan's blog posts detail methods for building recommender systems that outperform baseline matrix factorization models. The approach involves using Natural Language Processing (NLP) techniques, specifically word2vec, to generate vector representations of products based on their relationships. These product embeddings are then used to make recommendations by identifying similar items, drawing inspiration from graph-based learning methods like DeepWalk. AI

  4. Simpler Experimentation with Jupyter, Papermill, and MLflow

    Eugene Yan's article details a streamlined workflow for machine learning experimentation using Jupyter, Papermill, and MLflow. This approach avoids notebook duplication and manual tracking by parameterizing notebooks with Papermill for running multiple experiments and logging results. MLflow then centralizes the metrics and artifacts, providing a unified interface for managing and referencing experiment outputs, which is particularly useful for tasks like fraud detection across different regions or stock index prediction. AI

  5. My Notes From Spark+AI Summit 2020 (Application-Specific Talks)

    Eugene Yan's notes from the Spark+AI Summit 2020 cover practical applications and agnostic talks in deep learning and data engineering. Application-specific sessions highlighted frameworks like Airbnb's Zipline for feature engineering and Sputnik for data engineering, alongside Gojek's Feast and Netflix's data quality approaches. The agnostic talks focused on improving deep learning efficiency through techniques such as model pruning, quantization, and distillation, with examples from IBM and Instagram. AI

  6. How to Set Up a HTML App with FastAPI, Jinja, Forms & Templates

    Eugene Yan has published a guide detailing how to create HTML applications using FastAPI, Jinja, and HTML forms. The article addresses a gap in existing documentation by explaining how to serve HTML content with FastAPI, a framework Yan recently adopted from Flask. The tutorial includes code examples for setting up the necessary dependencies, creating a basic REST API, and integrating Jinja templating for dynamic web pages, along with a GitHub repository for reference. AI

  7. How to Win a Data Hackathon (Hacklytics 2021)

    Eugene Yan, a mentor and judge at Hacklytics 2021, observed that winning teams in the datathon prioritized using readily available datasets and APIs over time-consuming data scraping. Many successful teams leveraged pre-trained models or simple sentiment analysis tools like Vader for tasks such as fake news detection and analyzing social media posts. The top projects often featured user-friendly interfaces that clearly demonstrated their functionality and potential impact. AI

  8. Bukalapak - Fireside Chat with the Data Science team

    Eugene Yan recently shared insights from a fireside chat with Bukalapak's data science team, focusing on the strategic positioning of data within organizations. He highlighted the distinction between data as a cost center versus a profit center, arguing that in e-commerce, data functions as a profit driver by enhancing key business metrics. Yan also emphasized the importance of trust and proactive communication, detailing methods like internal newsletters and informal sharing. AI

  9. Search: Query Matching via Lexical, Graph, and Embedding Methods

    Eugene Yan's article explores three primary methods for matching search queries to documents: lexical, graph, and embedding-based approaches. Lexical methods involve direct query string manipulation like normalization, spell checking, and expansion/relaxation. Graph-based techniques leverage knowledge graphs for deeper query understanding and expansion. Embedding-based methods utilize learned representations to achieve similar goals. The post details preprocessing steps, query expansion strategies, and how these techniques are applied in real-world systems like DoorDash's. AI
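
    The lexical layer described above can be sketched in a few lines. This is a minimal illustration only; the normalization rules and the tiny synonym table are stand-ins, not taken from the post:

```python
# Minimal sketch of lexical query matching: normalization plus
# synonym-based expansion. The synonym table is illustrative.
import re

SYNONYMS = {"sofa": ["couch"], "tv": ["television"]}

def normalize(query: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    query = query.lower()
    query = re.sub(r"[^a-z0-9\s]", " ", query)
    return " ".join(query.split())

def expand(query: str) -> list[str]:
    """Return the normalized query plus synonym-substituted variants."""
    base = normalize(query)
    variants = [base]
    for token in base.split():
        for syn in SYNONYMS.get(token, []):
            variants.append(base.replace(token, syn))
    return variants
```

    Real systems layer spell correction and relaxation on top of this; graph and embedding methods then handle what string manipulation cannot.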

  10. Bootstrapping Labels via ___ Supervision & Human-In-The-Loop

    A new paper from Timothy Christensen proposes a coupled-label bootstrap method to address biases in OLS estimators that arise when using AI/ML-generated labels as covariates in economic regressions. The research highlights that standard fixed-label bootstrap methods are often invalid unless specific independence conditions are met. The proposed coupled-label bootstrap jointly resamples true and imputed labels, offering a more robust solution without these stringent conditions, and includes finite-sample adjustments for improved accuracy. This work is illustrated with simulations and applied to analyze the relationship between wages and remote work status. AI

    IMPACT Provides a statistical method to improve the reliability of economic analyses that incorporate AI-generated data labels.

  11. MLOps Community - System Design for RecSys & Search

    Eugene Yan recently presented on system design for recommendation systems and search at two separate meetups: the MLOps Community and SF Big Analytics. The talks, which occurred in September 2021 and July 2021 respectively, covered key aspects of building and deploying such systems. Yan's presentations are available as recorded talks and slides, with citations provided for academic use. AI

  12. Counterfactual Evaluation for Recommendation Systems

    Eugene Yan's article discusses the limitations of traditional offline evaluation for recommendation systems, arguing that they treat an interventional problem as observational. Current methods evaluate how well recommendations fit historical data rather than predicting user behavior with new recommendations. The author proposes counterfactual evaluation, particularly using Inverse Propensity Scoring (IPS), as a method to estimate the impact of new recommendations without live A/B testing. AI
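
    The IPS idea reduces to reweighting each logged reward by how much more (or less) likely the new policy is to take the logged action than the logging policy was. A toy sketch, where the tuple layout and function names are assumptions for illustration:

```python
def ips_estimate(logs, new_policy_prob):
    """Inverse Propensity Scoring: estimate a new policy's expected
    reward from logs collected under an old (logging) policy.

    logs: list of (context, action, reward, logging_prob) tuples.
    new_policy_prob(context, action): probability the new policy
    takes `action` in `context`.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        # Reweight each logged reward by the probability ratio.
        total += reward * new_policy_prob(context, action) / logging_prob
    return total / len(logs)

# Logged under a uniform policy over two actions; the new policy
# always recommends "a".
logs = [("u1", "a", 1.0, 0.5), ("u2", "b", 0.0, 0.5)]
prefers_a = lambda ctx, action: 1.0 if action == "a" else 0.0
estimate = ips_estimate(logs, prefers_a)
```

    In practice the propensities come from the logging system, and variance-reduction variants (clipped or self-normalized IPS) are usually preferred.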

  13. How to Measure and Mitigate Position Bias

    Position bias, where higher-ranked items receive more engagement regardless of relevance, poses a challenge for recommender systems. This bias can stem from user trust in algorithms, presentation effects, or a tendency to stop searching after finding a satisfactory result. To address this, methods like randomizing result positions or exploiting inherent randomness in logged data can be employed to measure and mitigate the impact of position bias, ensuring that truly relevant items are not overlooked. AI
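
    The randomization approach can be illustrated with a toy estimator: if positions are randomized, expected relevance is the same at every rank, so per-position click-through rate isolates the effect of rank itself. The log format and normalization choice below are assumptions:

```python
from collections import defaultdict

def position_bias(logs):
    """Estimate position bias from logs where result order was randomized.

    logs: list of (position, clicked) pairs.
    """
    shown = defaultdict(int)
    clicks = defaultdict(int)
    for position, clicked in logs:
        shown[position] += 1
        clicks[position] += int(clicked)
    ctr = {p: clicks[p] / shown[p] for p in shown}
    # Normalize so the top position has bias 1.0; lower ranks show
    # their relative drop in attention.
    top = ctr[min(ctr)]
    return {p: ctr[p] / top for p in ctr}

logs = [(1, True), (1, True), (1, False), (1, False),
        (2, True), (2, False), (2, False), (2, False)]
bias = position_bias(logs)
```

    The estimated bias can then be used as an inverse propensity weight when training the ranker, so clicks at lower ranks count for more.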

  14. Uncommon Uses of Python in Commonly Used Libraries

    This article explores an advanced Python programming technique involving the "super()" function, particularly its use within base classes. While typically used in child class initializers to call parent methods, calling "super()" in a base class enables cooperative multiple inheritance. Without this, initialization calls in subsequent parent classes can be skipped, leading to errors or missing attributes. The author demonstrates this with examples using "requests" and "scikit-learn" patterns, highlighting how "super()" ensures proper initialization across complex inheritance hierarchies. AI
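
    A minimal sketch of the pattern (the class names here are illustrative, not from the article): calling super() inside the "base" class keeps the initializer chain walking the full MRO, so mixins placed after it still run.

```python
class Estimator:
    """Plays the 'base class' role: it still calls super() so that
    any class after it in the MRO gets initialized too."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.fitted = False

class LoggingMixin:
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.log = []

class Model(Estimator, LoggingMixin):
    pass

m = Model()
# MRO: Model -> Estimator -> LoggingMixin -> object. Without the
# super() call inside Estimator, LoggingMixin.__init__ would be
# skipped and m.log would not exist.
```
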

  15. RecSys 2022: Recap, Favorite Papers, and Lessons

    Eugene Yan's RecSys 2022 recap highlights a significant increase in industry submissions and a focus on algorithmic advancements and real-world applications. Key papers explored efficient training for sequential recommendations using recency sampling and the application of bandit algorithms to simulate industry challenges, particularly concerning concept drift. The conference also saw continued emphasis on fairness, privacy, and reproducibility, with several papers reproducing established models like BERT4Rec. AI

  16. Building the Same App Using Various Web Frameworks

    Eugene Yan details his experience building a web application using various modern frameworks, including FastHTML, Next.js, and SvelteKit. He compares their developer experiences by implementing the same data manipulation app in each. Yan also explores extending a FastAPI application with interactive elements like checkboxes and download buttons, demonstrating how to handle form submissions and file responses. AI

    IMPACT Provides practical examples of web app development using Python frameworks and interactive HTML elements.

  17. Improving Recommendation Systems & Search in the Age of LLMs

    A new paper explores the critical role of user state representation in contextual multi-armed bandit (CMAB) recommender systems, finding that variations in state representation can yield greater performance improvements than changes to the bandit algorithm itself. The research highlights that no single embedding or aggregation strategy is universally superior, emphasizing the need for domain-specific evaluations. Another study introduces BEAR, a novel fine-tuning objective for Large Language Models (LLMs) in recommendation tasks that explicitly accounts for beam search behavior during training to address inconsistencies between training and inference. Additionally, a paper proposes a methodology to measure the stability and plasticity of recommender systems, evaluating how models adapt to retraining and changes in data patterns. AI

    IMPACT Advances in user state representation and LLM fine-tuning for recommendations could lead to more personalized and effective user experiences.

  18. 2023 Year In Review

    METR, an AI safety research organization, detailed its 2023 accomplishments, including developing methodologies for evaluating AI agents on autonomous tasks and contributing to OpenAI's GPT-4 system card. The organization also proposed "Responsible Scaling Policies" (RSPs), a framework for AI safety that gained traction among researchers and companies like Anthropic and OpenAI. Additionally, METR partnered with the UK AI Safety Institute and evaluated GPT-5.1 for catastrophic risks. AI

  19. Learning from human preferences

    OpenAI and DeepMind have developed a new algorithm that learns desired behaviors from human feedback, reducing the need for explicit goal functions. This method uses a three-step cycle where humans compare two agent behaviors, allowing the AI to infer the reward function and improve its performance. The approach has shown promising sample efficiency, requiring minimal human input to learn complex tasks like a backflip, and has achieved strong results in simulated robotics and Atari games, sometimes surpassing performance with standard reward functions. However, the system can be susceptible to agents that trick human evaluators, a problem being addressed with additional visual cues. AI

  20. Predict Stock Prices Using RNN: Part 2

    Lilian Weng's blog posts detail the construction of a recurrent neural network (RNN) using TensorFlow for stock price prediction. The first part focuses on building a basic RNN with LSTM cells to predict S&P 500 closing prices using historical data from Yahoo! Finance. The second part extends this model to handle multiple stocks by incorporating stock symbol embeddings as input, allowing the network to differentiate patterns across various price sequences. AI

  21. How to Explain the Prediction of a Machine Learning Model?

    Lilian Weng's blog post delves into the critical need for machine learning model interpretability, especially as AI systems are increasingly deployed in sensitive sectors like finance, healthcare, and criminal justice. The post highlights how regulatory requirements and the inherent 'black-box' nature of deep learning models necessitate methods to understand their decision-making processes. Weng discusses the properties of interpretable models and explores interpretation techniques for classic models such as linear regression and Naive Bayes, while also acknowledging the ongoing development of new tools for more complex models. AI

  22. From GAN to WGAN

    This article explains the mathematical underpinnings of Generative Adversarial Networks (GANs), a type of generative model inspired by game theory. It details the roles of the generator and discriminator models, which compete to improve each other's performance. The post also discusses challenges in training GANs, such as instability, and introduces variations like Wasserstein GAN (WGAN) designed to address these issues by modifying the loss function. AI
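
    For reference, the standard GAN minimax objective, and the WGAN variant that swaps the log-loss for a 1-Lipschitz critic:

```latex
% Standard GAN minimax objective:
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]

% WGAN replaces the discriminator with a critic constrained to be
% 1-Lipschitz, approximating the Wasserstein-1 distance:
\min_G \max_{\lVert D \rVert_L \le 1} \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[D(x)\right]
  - \mathbb{E}_{z \sim p_z}\!\left[D(G(z))\right]
```

    The Lipschitz constraint is what gives the generator useful gradients even when the two distributions barely overlap, which is the source of the improved training stability.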

  23. Learning Word Embedding

    Hugging Face has released a suite of tools and guides for training and fine-tuning various types of sentence embedding and reranker models. These resources leverage the Sentence Transformers library, offering methods for static embeddings, multimodal embeddings, and sparse embeddings. The guides cover training with up to 1 billion training pairs and achieving significant speedups, aiming to make advanced embedding model development more accessible. AI

  24. The Multi-Armed Bandit Problem and Its Solutions

    Several recent arXiv papers explore advancements in multi-armed bandit problems, a framework for sequential decision-making under uncertainty. Research includes handling changing action availability with "Flickering Multi-Armed Bandits" and improving regret bounds in logistic bandits without strict context diversity assumptions. Other work focuses on geometry-aware offline-to-online learning, spectral bandits for smooth functions on graphs, and privacy-preserving algorithms for generalized linear contextual bandits. AI
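
    For orientation, a minimal epsilon-greedy baseline, the simplest of the strategies these papers build on (the arm means and hyperparameters below are illustrative):

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy on a Bernoulli multi-armed bandit: explore a
    random arm with probability epsilon, otherwise exploit the arm
    with the best empirical mean so far."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    values = [0.0] * len(true_means)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_means))            # explore
        else:
            arm = max(range(len(true_means)),
                      key=lambda a: values[a])              # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

counts, values = epsilon_greedy([0.2, 0.5, 0.8])
```

    The research above departs from this baseline in the directions the summary lists: contextual information, changing arm availability, graph structure, and privacy constraints.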

    IMPACT Advances in bandit algorithms could lead to more efficient online learning systems and improved decision-making in recommendation, advertising, and resource allocation.

  25. Attention? Attention!

    This 2018 blog post by Lilian Weng explains the concept of attention mechanisms in deep learning, drawing parallels to human visual and linguistic attention. It details how attention allows models to weigh the importance of different input elements when generating an output, addressing limitations of traditional sequence-to-sequence models that struggled with long inputs. The post highlights that attention was initially developed to improve neural machine translation by creating direct connections between the output and the entire input sequence. AI
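
    The core computation can be sketched for a single query vector: score each key against the query, softmax the scores into weights, then take the weighted sum of the values. A plain-Python sketch (not code from the post):

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # Dot-product scores, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)                      # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Output is the attention-weighted sum of the value vectors.
    out = [sum(w * v[i] for w, v in zip(weights, values))
           for i in range(len(values[0]))]
    return weights, out

weights, out = attention([1.0, 0.0],
                         [[1.0, 0.0], [0.0, 1.0]],
                         [[1.0, 0.0], [0.0, 1.0]])
```

    The weights are the "direct connections" the post describes: every output position can attend to every input position, regardless of distance.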

  26. From Autoencoder to Beta-VAE

    This article provides a detailed explanation of autoencoders, a type of neural network used for unsupervised learning to reconstruct high-dimensional data. Autoencoders consist of an encoder that compresses input into a low-dimensional latent code and a decoder that reconstructs the original data from this code. A key variant, the Denoising Autoencoder, improves robustness by training the model to recover the original input from a corrupted version, forcing it to learn underlying data relationships. AI

  27. Flow-based Deep Generative Models

    GFlowState is a new visual analytics system designed to improve the interpretability of Generative Flow Networks (GFlowNets), a probabilistic framework used for generating samples proportional to a reward function. The system offers multiple visualization tools, such as trajectory analysis and state projections, to help developers understand how these models explore the sample space and evolve their sampling probabilities during training. By making the structural dynamics of GFlowNets observable, GFlowState aims to accelerate their development and debugging across various application domains. AI

  28. Object Detection Part 4: Fast Detection Models

    Two new research papers propose novel approaches to object detection. VFM4SDG aims to improve single-domain generalized object detection by using a frozen vision foundation model to maintain cross-domain stability, addressing issues with weather and illumination changes. UHR-DETR tackles the challenge of detecting small objects in ultra-high-resolution remote sensing imagery by efficiently allocating computational resources and integrating global and local scene information. AI

  29. Are Deep Neural Networks Dramatically Overfitted?

    This post delves into the question of why deep neural networks, despite their numerous parameters, can generalize well to new data. It explores classic principles like Occam's Razor and the Minimum Description Length (MDL) principle, which suggest that simpler models are more likely to be correct and that learning can be viewed as data compression. The MDL principle, in particular, formalizes the idea that a good model should not only explain the data but also be concise, thereby aiding generalization. AI

  30. Domain Randomization for Sim2Real Transfer

    Domain Randomization (DR) is a technique used in robotics to bridge the gap between simulated training environments and the real world. This method involves training models across a wide variety of simulated scenarios with randomized physical parameters and visual appearances. The goal is for the trained model to generalize effectively to the real-world environment, which is assumed to be one of the many variations encountered during training. DR is particularly useful because it can require minimal or no real-world data, unlike domain adaptation methods. AI

  31. Self-Supervised Representation Learning

    This post explores self-supervised learning, a method that leverages readily available unlabeled data by creating supervised tasks from the data itself. The core idea is to train models on these 'pretext' tasks, not for their own sake, but to learn intermediate representations that are useful for various downstream applications. This approach addresses the high cost and limited scalability of manual data labeling, enabling the exploitation of vast amounts of unlabeled text and images. The post highlights its application in language modeling and discusses image-based self-supervised learning techniques. AI

  32. Neural Architecture Search

    Neural Architecture Search (NAS) is a field focused on automating the design of high-performance neural network architectures. It typically involves three main components: a search space defining possible operations and connections, a search algorithm to sample candidate architectures, and an evaluation strategy to assess their performance. Early NAS methods, like those by Zoph & Le and Baker et al., used sequential layer-wise operations, which were computationally intensive, requiring hundreds of GPUs for extended periods. More recent approaches, inspired by successful modular designs, employ cell-based representations to improve efficiency. AI

  33. How to Build an Open-Domain Question Answering System?

    Lilian Weng's blog post details methods for constructing open-domain question-answering (ODQA) systems, focusing on Transformer-based language models. The post distinguishes ODQA from reading comprehension by highlighting the absence of provided context for factual questions. It also discusses challenges in QA data fine-tuning, where test-set questions or answers may appear in training sets, potentially inflating performance metrics. AI

  34. Controllable Neural Text Generation

    This post explores methods for controlling the output of large language models, which are typically trained on vast amounts of unsupervised web data. Current methods aim to steer these models without altering their core weights, focusing on techniques like guided decoding strategies and prompt design. While these approaches offer ways to influence generated text attributes such as topic and style, the author notes that true model steerability remains an active research area with ongoing exploration of various pros and cons. AI

  35. Contrastive Representation Learning

    Contrastive learning is a machine learning technique that creates an embedding space where similar data points are grouped together and dissimilar ones are separated. This method can be applied in both supervised and unsupervised settings, offering advantages over traditional cross-entropy loss functions, particularly in safety-critical applications. Research indicates that supervised contrastive learning can lead to more trustworthy and transparent neural networks by improving feature attribution explanations. AI
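
    The common InfoNCE-style formulation of this objective, for an anchor x with positive x^+ and in-batch negatives, similarity function sim and temperature tau:

```latex
\mathcal{L} = -\log
  \frac{\exp\bigl(\mathrm{sim}(x, x^{+}) / \tau\bigr)}
       {\sum_{x' \in \mathcal{B}} \exp\bigl(\mathrm{sim}(x, x') / \tau\bigr)}
```

    Minimizing this pulls the anchor toward its positive and pushes it away from everything else in the batch, which is what produces the separated embedding space described above.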

  36. What are Diffusion Models?

    Lilian Weng's blog post provides a comprehensive overview of diffusion models, a type of generative model inspired by non-equilibrium thermodynamics. The post details the forward diffusion process, where noise is gradually added to data until it resembles a Gaussian distribution. It also explains the reverse diffusion process, which learns to reconstruct data from noise, and discusses connections to stochastic gradient Langevin dynamics. The article has been updated multiple times to include recent advancements like classifier-free guidance and latent diffusion models. AI
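
    The forward process the post describes is, in the usual DDPM notation:

```latex
% Forward diffusion: add Gaussian noise at each step t with schedule \beta_t:
q(x_t \mid x_{t-1}) = \mathcal{N}\!\bigl(x_t;\, \sqrt{1 - \beta_t}\, x_{t-1},\, \beta_t \mathbf{I}\bigr)

% With \alpha_t = 1 - \beta_t and \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s,
% x_t can be sampled directly from the clean data x_0 in one step:
q(x_t \mid x_0) = \mathcal{N}\!\bigl(x_t;\, \sqrt{\bar{\alpha}_t}\, x_0,\, (1 - \bar{\alpha}_t) \mathbf{I}\bigr)
```

    The reverse process learns to invert this chain, predicting the noise added at each step; as t grows, x_t approaches a standard Gaussian, which is what makes sampling from pure noise possible.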

  37. How to Train Really Large Models on Many GPUs?

    Training extremely large neural network models presents significant challenges due to their immense memory requirements and lengthy training times, often exceeding the capacity of individual GPUs. To address this, various parallelism techniques are employed, including data parallelism where models are replicated across multiple workers, and model parallelism where the model itself is partitioned across machines. Advanced methods like gradient accumulation and techniques to offload parameters to CPU memory are also utilized to optimize training efficiency and manage resource constraints. AI
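
    Gradient accumulation in particular reduces to a simple identity: summing per-example gradients over micro-batches and dividing once at the end reproduces the full-batch gradient, so a large effective batch fits in small memory. A toy scalar sketch (the model and data are illustrative):

```python
def grad(w, batch):
    """Full-batch gradient of mean squared error 0.5*(w*x - y)^2
    for a scalar weight w."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, batch, micro_size):
    """Gradient accumulation: process the batch in micro-batches that
    fit in memory, summing example-level gradients, and divide once at
    the end. Matches the full-batch gradient."""
    total = 0.0
    for i in range(0, len(batch), micro_size):
        micro = batch[i:i + micro_size]
        total += sum((w * x - y) * x for x, y in micro)
    return total / len(batch)

data = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 7.0)]
```

    In a real framework the "sum" is simply not calling `optimizer.step()` until several backward passes have accumulated into the gradient buffers.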

  38. Learning with not Enough Data Part 3: Data Generation

    Google Research has introduced "Nested Learning," a novel machine learning paradigm designed to address the challenge of catastrophic forgetting in continual learning. This approach views models as interconnected optimization problems, allowing them to acquire new knowledge without losing proficiency on previous tasks. A proof-of-concept architecture named "Hope" has demonstrated superior performance in language modeling and long-context memory management using this paradigm. OpenAI has also published research on meta-learning algorithms, including Reptile, which focuses on learning how to learn efficiently for new tasks, and a hierarchical reinforcement learning algorithm that enables faster task completion by breaking down complex problems into high-level actions. AI

  39. Generalized Visual Language Models

    Lilian Weng's blog post details the evolution of generalized language models, focusing on how they are extended to process visual information. Early approaches like VisualBERT fused image patches with text tokens, using self-attention to align visual and textual data for tasks such as image captioning. More recent models like SimVLM treat encoded images as prefixes for language models, leveraging large datasets for pre-training. These methods aim to create unified models capable of understanding and generating content across both visual and textual modalities. AI

  40. Large Transformer Model Inference Optimization

    Large transformer models present significant inference challenges due to their substantial memory footprint and computation costs, which scale quadratically with input length. Researchers and practitioners are exploring various optimization techniques to mitigate these issues. These methods include network compression strategies like pruning, quantization, and knowledge distillation, as well as architectural improvements and efficient parallelism. The goal is to reduce memory usage, computation complexity, and inference latency for practical, large-scale deployment. AI

  41. The Transformer Family Version 2.0

    Lilian Weng has updated her comprehensive blog post detailing the Transformer architecture and its numerous advancements since its initial introduction. The updated version, "The Transformer Family Version 2.0," significantly expands on the original, incorporating recent research and modifications to the foundational model. It delves into core concepts like attention, self-attention, multi-head self-attention, and the encoder-decoder structure, providing a detailed overview of how these components function and have been enhanced. AI

  42. Alignment Research @ EleutherAI

    OpenAI has detailed its iterative, empirical approach to AI alignment research, focusing on scalable training signals aligned with human intent. Their strategy involves training AI systems using human feedback, assisting human evaluation, and conducting alignment research itself. While current models like InstructGPT show promise, OpenAI acknowledges they are far from perfectly aligned and aims to share its findings to advance the field. AI

    IMPACT This research highlights the ongoing efforts and challenges in aligning AI systems with human values, crucial for the safe development of advanced AI.

  43. Testing ML systems

    Eugene Yan's article details a comprehensive approach to testing machine learning systems, differentiating between traditional software tests and ML-specific tests. ML tests are further categorized into pre-train tests for implementation correctness, post-train tests for expected learned behavior, and evaluation metrics for performance assessment. The author uses a DecisionTree implementation and the Titanic dataset to demonstrate these testing methodologies, incorporating practices like unit testing, code coverage, linting, and type checking. AI
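
    The pre-train/post-train split can be illustrated with a toy model; the ThresholdModel below is a hypothetical stand-in for the article's DecisionTree, chosen only to keep the example self-contained:

```python
class ThresholdModel:
    """Predict 1 if the (single) feature exceeds a learned threshold."""
    def fit(self, xs, ys):
        positives = [x for x, y in zip(xs, ys) if y == 1]
        negatives = [x for x, y in zip(xs, ys) if y == 0]
        self.threshold = (min(positives) + max(negatives)) / 2
        return self

    def predict(self, x):
        return 1 if x > self.threshold else 0

# Pre-train test: implementation correctness on a hand-crafted case,
# checked before any real training happens.
model = ThresholdModel().fit([0.0, 1.0], [0, 1])
assert model.threshold == 0.5

# Post-train test: a directional expectation on learned behavior --
# increasing the input should never flip a prediction from 1 to 0.
model = ThresholdModel().fit([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
preds = [model.predict(x) for x in (0.0, 0.5, 1.0)]
assert preds == sorted(preds)
```

    Evaluation metrics then sit on top of both: they measure how well the model performs, while the tests above check that it is implemented and behaving as intended.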

  44. Better language models and their implications

    Google DeepMind has introduced the FACTS Benchmark Suite, a new set of evaluations designed to systematically assess the factuality of large language models across various use cases. This suite includes benchmarks for parametric knowledge, search-based information retrieval, and multimodal understanding, alongside an updated grounding benchmark. The initiative aims to provide a more comprehensive measure of LLM accuracy and is being launched with a public leaderboard on Kaggle to track progress across leading models. AI

    IMPACT Establishes a new standard for evaluating LLM factuality, potentially driving improvements in model reliability and trustworthiness.

  45. The mathematics of machine learning

    Eugene Yan's series of articles explores practical aspects of applying machine learning in real-world systems. He emphasizes starting projects with heuristics before implementing ML, the importance of design patterns for efficient data processing and system maintenance, and the need for careful problem selection based on cost-benefit analysis. Yan also details common challenges encountered after deploying ML models, such as data contamination and feedback loops, and suggests strategies for effective project management and system upkeep. AI

  46. Friendly federated learning 🌼

    Researchers have developed several new methods to improve federated learning, a distributed machine learning approach that trains models on decentralized data without sharing raw information. FedHarmony addresses challenges in modeling label correlations across heterogeneous client data by introducing a consensus mechanism. "Who Trains Matters" tackles selection biases in federated learning by proposing an inverse-probability-weighted aggregation scheme to ensure training representativeness. Additionally, new techniques like Subspace Optimization (SSF), FedSLoP, and GradsSharding aim to enhance efficiency by reducing communication and memory overhead, particularly for large models on serverless platforms. AI

    IMPACT New federated learning algorithms promise improved efficiency and accuracy, especially for large models and heterogeneous data.

  47. Quick, beautiful web UIs for ML apps

    The Machine Learning Compilation (MLC) group, led by Tianqi Chen at CMU, is developing frameworks like MLC Chat and Web LLM to enable running large language models on consumer hardware, including iPhones and web browsers. This initiative aims to mitigate the current GPU shortage by allowing models to run locally on devices with AMD cards or even just CPUs. Projects like Hugging Face's text-to-webapp generator and Gradio are also contributing to easier deployment and accessibility of ML models for developers and end-users. AI

  48. The Annotated Diffusion Model

    Apple's research paper explores the mechanisms behind compositional generalization in conditional diffusion models, specifically focusing on how they handle combinations of conditions not seen during training. The study validates that models exhibiting local conditional scores are better at generalizing, and that enforcing this locality can improve performance. Separately, Hugging Face has released several blog posts detailing various methods for fine-tuning and optimizing Stable Diffusion models, including techniques like DDPO, LoRA, and optimizations for Intel CPUs, as well as instruction-tuning and Japanese language support. AI

    IMPACT Research into diffusion model generalization and practical fine-tuning methods advance core AI capabilities and accessibility.

  49. Stanford's AI Index Report 2024

    Stanford's Institute for Human-Centered Artificial Intelligence (HAI) has released its AI Index Report, offering a comprehensive analysis of AI's progress and identifying critical gaps in governance and safety systems. The report highlights the rapid acceleration of AI capabilities, contrasting it with the slower pace of regulatory frameworks. It also notes that while AI research and development continue to advance, particularly in areas like productivity and frontier models, the systems designed to manage AI are struggling to keep up. AI
