Recent pubs of Manfred K. Warmuth
-
Noise misleads rotation invariant algorithms on sparse targets,
arXiv paper
-
The Tempered Hilbert Simplex Distance and its Application to Non-linear Embeddings of TEMs,
arXiv paper
-
Hyperbolic Embeddings of Supervised Models,
to appear in NeurIPS24
-
A Mechanism for Sample-Efficient In-Context Learning
for Sparse Retrieval Tasks,
ALT24 paper
-
Optimal Transport with Tempered Exponential Measures,
AAAI24 paper
-
Boosting with Tempered Exponential Measures.
NeurIPS23 paper
-
Open Problem: Learning sparse linear concepts
by priming the features
COLT23 paper
talk
-
Layerwise Bregman Representation Learning with Applications
to Knowledge Distillation.
TMLR23 paper
-
Clustering above Exponential Families with Tempered Exponential Measures.
AISTATS23 paper
software
-
Unlabeled sample compression schemes and corner peelings for
ample and maximum classes.
Journal of Computer and System Sciences paper
-
LocoProp: Enhancing BackProp via Local Loss Optimization.
AISTATS22 paper
-
A case where a spindly two-layer linear network
decisively outperforms any neural network
with a fully connected input layer
ALT21 paper
video
long talk
-
Reparameterizing Mirror Descent as Gradient Descent.
NeurIPS20 paper
talk
video
poster
See also the previous version, which contains more material on
matrix updates and experiments on deep neural nets:
Interpolating Between Gradient Descent and Exponentiated Gradient
Using Reparameterized Gradient Descent
arXiv,v1
-
Rank-smoothed Pairwise Learning in Perceptual Quality
Assessment.
ICIP20 paper
talk
video
-
Divergence-Based Motivation for Online EM
and Combining Hidden Variable Models.
UAI20 paper
video
talk
-
Winnowing with Gradient Descent.
COLT20 paper
video
talk
Long talk that also covers
"Reparameterizing Mirror Descent as Gradient Descent":
long talk
video
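The trick shared by this paper and the reparameterization one above is that gradient descent on v with u = v * v tracks the multiplicative EGU/Winnow update in the small step-size limit. A toy numpy sketch of that correspondence (illustrative only, not code from the papers):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    w_star = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # sparse nonnegative target
    y = X @ w_star

    def grad(u):
        # gradient of the average squared loss 0.5 * ||X u - y||^2 / n
        return X.T @ (X @ u - y) / len(y)

    eta = 0.05
    u_egu = np.full(5, 0.2)       # unnormalized EG (Winnow-style multiplicative update)
    v = np.sqrt(np.full(5, 0.2))  # gradient descent on v, representing u = v * v

    for _ in range(2000):
        u_egu = u_egu * np.exp(-eta * grad(u_egu))
        v = v - (eta / 4) * 2 * v * grad(v * v)  # chain rule; eta/4 matches the EGU flow

    print(np.round(u_egu, 3), np.round(v * v, 3))  # both approach the sparse target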
-
TriMap: Large-scale dimensionality reduction using triplets
arXiv
Also: A more globally accurate dimensionality reduction method using triplets
arXiv
And: github
twitter thread
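A minimal usage sketch of the trimap package from the github link (assuming the PyPI release; the exact API may differ across versions):

    # pip install trimap
    import trimap
    from sklearn.datasets import load_digits

    digits = load_digits()

    # Embed the 64-dimensional digits into 2D (the default) via triplet constraints.
    embedding = trimap.TRIMAP().fit_transform(digits.data)
    print(embedding.shape)  # (1797, 2)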
-
An Implicit Form of Krasulina's k-PCA Update without the Orthonormality Constraint.
AAAI20 paper
AAAI20 poster
-
Robust Bi-Tempered Logistic Loss Based on Bregman
Divergences.
NeurIPS19 paper
poster
talk
Updated arXiv version with appendix
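The loss is built from the tempered logarithm log_t(u) = (u^(1-t) - 1)/(1-t) and its inverse exp_t(x) = [1 + (1-t)x]_+^(1/(1-t)). A small numpy sketch of this inverse pair (illustrative, not the paper's reference code):

    import numpy as np

    def log_t(u, t):
        # tempered logarithm; recovers np.log(u) as t -> 1
        if t == 1.0:
            return np.log(u)
        return (u ** (1.0 - t) - 1.0) / (1.0 - t)

    def exp_t(x, t):
        # tempered exponential, the inverse of log_t; recovers np.exp(x) as t -> 1
        if t == 1.0:
            return np.exp(x)
        return np.maximum(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))

    u = np.array([0.25, 0.5, 1.0, 2.0])
    print(np.allclose(exp_t(log_t(u, 0.5), 0.5), u))  # True: exp_t inverts log_t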
-
Unlabeled sample compression schemes and corner peelings
for ample and maximum classes
ICALP19 paper
-
Minimax experimental design: Bridging the gap between
statistical and worst-case approaches to least squares
regression
COLT19 paper
-
Adaptive scale-invariant online algorithms for learning
linear models
ICML19 paper
supplement
-
Two-temperature logistic regression based on the Tsallis
divergence
AISTATS19 paper
supplement
poster