ENTITY stochastic gradient descent

PulseAugur coverage of stochastic gradient descent — every cluster mentioning stochastic gradient descent across labs, papers, and developer communities, ranked by signal.

Total: 24 over 90d
Releases: 0 over 90d
Papers: 24 over 90d

Sentiment: 9 days with sentiment data over 30d

RECENT · PAGE 1/2 · 24 TOTAL

TOOL · CL_80134 · Jun 9 · 04:00

New study reveals SGD noise-covariance link to loss landscape curvature

Researchers have uncovered a new relationship between the noise introduced by Stochastic Gradient Descent (SGD) and the curvature of the loss landscape in deep learning models. Their findings indicate that this noise is…

RESEARCH · CL_79219 · Jun 6 · 19:40

New method predicts neural network generalization using Fourier fractal dimension

Researchers have developed a new method to predict how well deep neural networks will generalize without needing separate validation data. This approach uses the Fourier fractal dimension of the network's weight variati…

RESEARCH · CL_77144 · Jun 4 · 23:04

Deep Neural Networks Achieve Optimal Generalization Rates

Two new papers submitted to arXiv analyze the generalization performance of gradient descent methods in deep neural networks. The research establishes minimax-optimal rates for excess population risk in deep ReLU networ…

RESEARCH · CL_62198 · May 29 · 13:41

Lyapunov framework analyzes stochastic algorithm convergence

Researchers have published a paper detailing a Lyapunov-based framework for analyzing the finite-time convergence of stochastic iterative algorithms. This approach uses generalized Moreau envelopes as universal Lyapunov…

RESEARCH · CL_44035 · May 21 · 15:50

New paper re-evaluates SGD dynamics, challenging Brownian motion analogy

A new paper challenges the common assumption that Stochastic Gradient Descent (SGD) noise behaves like Brownian motion. Researchers propose an alternative model where SGD dynamics occur within a fluctuating loss landsca…

RESEARCH · CL_43952 · May 21 · 13:39

Simple Random Node Sampling outperforms full-graph training for GNNs

Researchers have found that a simple Random Node Sampling (RNS) method for training Graph Neural Networks (GNNs) can match or exceed the performance of full-graph training. This surprising result holds true across numer…

RESEARCH · CL_40767 · May 19 · 15:39

New Bayesian Framework Optimizes Neural Network Learning Rates

Researchers have introduced a novel probabilistic framework to optimize the learning rate in neural network training, moving beyond empirical trial-and-error. This new approach develops classic Bayesian statistics into …

RESEARCH · CL_39978 · May 19 · 10:24

New method adds missingness to SGD to reduce bias in incomplete data

Researchers have developed a novel method called Richardson-SGD to address gradient bias in stochastic gradient descent when dealing with incomplete data. The technique involves deliberately introducing additional missi…

RESEARCH · CL_39974 · May 19 · 03:10

Factor-Augmented SGD optimizes high-dimensional machine learning

Researchers have introduced Factor-Augmented SGD (FSGD), a novel optimization method designed for high-dimensional machine learning tasks. FSGD operates on streaming data, enabling scalability for large-scale problems w…

RESEARCH · CL_37641 · May 18 · 20:18

Adam optimizer corrects SGD's frequency bias in language model training

New research highlights a frequency bias in Stochastic Gradient Descent (SGD) when training language models on imbalanced token distributions. This bias causes parameters for common tokens to converge quickly, while tho…

RESEARCH · CL_38336 · May 18 · 16:18

New theory shows momentum enables perfect parallelization in SGD

Researchers have developed a new theory explaining how classical momentum schemes like Polyak's heavy ball can accelerate stochastic gradient descent (SGD) for large-scale machine learning. The theory applies to quadrat…

RESEARCH · CL_44682 · May 18 · 03:09

LLM training research explores distillation, feedback, and optimizers

New research explores methods to improve Large Language Model (LLM) training efficiency and effectiveness. One study challenges the necessity of a strong teacher model in knowledge distillation, finding that even smalle…

RESEARCH · CL_38201 · May 16 · 22:26

Paper analyzes SGD dynamics in high-dimensional linear networks

A new paper details the high-dimensional behavior of stochastic gradient descent (SGD) on diagonal linear networks. The research shows that in high dimensions, SGD dynamics can be accurately modeled by a stochastic diff…

RESEARCH · CL_28342 · May 11 · 16:08

New papers analyze gradient descent convergence in neural networks

Two new research papers explore the convergence properties of gradient descent in neural network training. The first paper, focusing on wide shallow models with bounded nonlinearities, proves that non-global minimizers …

TOOL · CL_25633 · May 8 · 10:02

New method tackles unbounded variance in variational inference

Researchers have developed a new approach to optimize Black-Box Variational Inference (BBVI) by addressing the inherent unbounded variance in its stochastic gradients. Their method, detailed in a new paper, focuses on t…

TOOL · CL_22024 · May 8 · 04:00

New research derives advanced optimizers from evolutionary principles

Researchers have developed a new method to derive advanced optimization algorithms directly from evolutionary principles, unifying previously disparate views of evolution. This approach introduces Darwinian Lineage Simu…

TOOL · CL_20532 · May 7 · 04:00

Bayesian Parameter Shift Rule enhances VQE gradient estimation

Researchers have introduced a Bayesian variant of the parameter shift rule (PSR) for variational quantum eigensolvers (VQEs). This new method utilizes Gaussian processes to estimate objective function gradients, offerin…

TOOL · CL_15827 · May 5 · 04:00

Researchers explore efficient parameter estimation for truncated Boolean product distributions

Researchers have developed a new method for estimating parameters of truncated Boolean product distributions, a problem previously unaddressed in discrete settings. The approach relies on a concept of 'fatness' for the …

TOOL · CL_15818 · May 5 · 04:00

Researchers develop novel bootstrap for SGD confidence sets

Researchers have developed a novel method for constructing confidence sets in Stochastic Gradient Descent (SGD) algorithms. This new approach utilizes the multiplier bootstrap procedure and establishes its non-asymptoti…

RESEARCH · CL_16296 · May 4 · 14:39

Evolutionary game theory deciphers shortcut learning in deep neural networks

Researchers have developed a new theoretical framework using evolutionary game theory to understand shortcut learning in deep neural networks. The study formally defines core and shortcut features, modeling data samples…
