Last ten or so pubs of Manfred K. Warmuth

Recent pubs of Manfred K. Warmuth



  1. Hard labels sampled from sparse targets mislead rotation invariant algorithms, arXiv paper, March 21, 2026.
    Talk by co-author Avrajit Ghosh @ CPAL2026 conference

  2. Selective Matching Losses - Not All Scores Are Created Equal, arXiv paper, June 9, 2025

  3. Noise misleads rotation invariant algorithms on sparse targets, ALT25 paper talk poster

  4. The Tempered Hilbert Simplex Distance and its Application to Non-linear Embeddings of TEMs, arXiv paper

  5. Hyperbolic Embeddings of Supervised Models, NeurIPS24 paper

  6. A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks, ALT24 paper

  7. Optimal Transport with Tempered Exponential Measures, AAAI24 paper

  8. Boosting with Tempered Exponential Measures. NeurIPS23 paper

  9. Open Problem: Learning sparse linear concepts by priming the features. COLT23 paper talk

  10. Layerwise Bregman Representation Learning with Applications to Knowledge Distillation. TMLR23 paper

  11. Clustering above Exponential Families with Tempered Exponential Measures. AISTATS23 paper software

  12. Unlabeled sample compression schemes and corner peelings for ample and maximum classes. Journal of Computer and System Sciences paper

  13. LocoProp: Enhancing BackProp via Local Loss Optimization. AISTATS22 paper

  14. A case where a spindly two-layer linear network decisively outperforms any neural network with a fully connected input layer. ALT21 paper video long talk

  15. Reparameterizing Mirror Descent as Gradient Descent. NeurIPS20 paper talk video poster
    See also previous version which contains more material on matrix updates and experiments on deep neural nets:
    Interpolating Between Gradient Descent and Exponentiated Gradient Using Reparameterized Gradient Descent arXiv,v1

  16. Rank-smoothed Pairwise Learning in Perceptual Quality Assessment. ICIP20 paper talk video

  17. Divergence-Based Motivation for Online EM and Combining Hidden Variable Models. UAI20 paper video talk

  18. Winnowing with Gradient Descent. COLT20 paper video talk

    video Long talk that also covers "Reparameterizing Mirror Descent as Gradient Descent" long talk

  19. TriMap: Large-scale dimensionality reduction using triplets. arXiv
  20. Also: A more globally accurate dimensionality reduction method using triplets. arXiv
    And: github twitter thread

  21. An Implicit Form of Krasulina's k-PCA Update without the Orthonormality Constraint. AAAI20 paper AAAI20 poster

  22. Robust Bi-Tempered Logistic Loss Based on Bregman Divergences. NeurIPS19 paper poster talk Updated arXiv version with appendix

  23. Unlabeled sample compression schemes and corner peelings for ample and maximum classes. ICALP19 paper

  24. Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression. COLT19 paper

  25. Adaptive scale-invariant online algorithms for learning linear models. ICML19 paper supplement

  26. Two-temperature logistic regression based on the Tsallis divergence. AISTATS19 paper supplement poster