Recent pubs of Manfred K. Warmuth

Noise misleads rotation invariant algorithms on sparse targets,
arXiv paper

Tempered Calculus for ML: Application to Hyperbolic Model Embedding,
arXiv paper

The Tempered Hilbert Simplex Distance and its Application to Nonlinear Embeddings of TEMs,
arXiv paper

A Mechanism for Sample-Efficient In-Context Learning
for Sparse Retrieval Tasks,
ALT24 paper

Optimal Transport with Tempered Exponential Measures,
AAAI24 paper

Boosting with Tempered Exponential Measures.
NeurIPS23 paper

Open Problem: Learning sparse linear concepts
by priming the features
COLT23 paper
talk

Layerwise Bregman Representation Learning with Applications
to Knowledge Distillation.
TMLR23 paper

Clustering above Exponential Families with Tempered Exponential Measures.
AISTATS23 paper
software

Unlabeled sample compression schemes and corner peelings for
ample and maximum classes.
Journal of Computer and System Sciences paper

LocoProp: Enhancing BackProp via Local Loss Optimization.
AISTATS22 paper

A case where a spindly two-layer linear network
decisively outperforms any neural network
with a fully connected input layer
ALT21 paper
video
long talk

Reparameterizing Mirror Descent as Gradient Descent.
NeurIPS20 paper
talk
video
poster
See also previous version which contains more material on
matrix updates and experiments on deep neural nets:
Interpolating Between Gradient Descent and Exponentiated Gradient
Using Reparameterized Gradient Descent
arXiv,v1

Rank-smoothed Pairwise Learning in Perceptual Quality
Assessment.
ICIP20 paper
talk
video

Divergence-Based Motivation for Online EM
and Combining Hidden Variable Models.
UAI20 paper
video
talk

Winnowing with Gradient Descent.
COLT20 paper
video
talk
video Long talk that also covers
"Reparameterizing Mirror Descent as Gradient Descent"
long talk

TriMap: Large-scale dimensionality reduction using triplets
arXiv
Also: A more globally accurate dimensionality reduction method using triplets
arXiv
And: github
twitter
thread

An Implicit Form of Krasulina's k-PCA Update without the Orthonormality Constraint.
AAAI20 paper
AAAI20 poster

Robust Bi-Tempered Logistic Loss Based on Bregman
Divergences.
NeurIPS19 paper
poster
talk
Updated arXiv version w. appendix

Unlabeled sample compression schemes and corner peelings
for ample and maximum classes
ICALP19 paper

Minimax experimental design: Bridging the gap between
statistical and worst-case approaches to least squares
regression
COLT19 paper

Adaptive scale-invariant online algorithms for learning
linear models
ICML19 paper
supplement

Two-temperature logistic regression based on the Tsallis
divergence
AISTATS19 paper
supplement
poster