Last ten or so pubs of Manfred K. Warmuth

Recent pubs of Manfred K. Warmuth



  1. Noise misleads rotation invariant algorithms on sparse targets, arXiv paper

  2. The Tempered Hilbert Simplex Distance and its Application to Non-linear Embeddings of TEMs, arXiv paper

  3. Hyperbolic Embeddings of Supervised Models, to appear in NeurIPS24

  4. A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks, ALT24 paper

  5. Optimal Transport with Tempered Exponential Measures, AAAI24 paper

  6. Boosting with Tempered Exponential Measures. NeurIPS23 paper

  7. Open Problem: Learning sparse linear concepts by priming the features. COLT23 paper talk

  8. Layerwise Bregman Representation Learning with Applications to Knowledge Distillation. TMLR23 paper

  9. Clustering above Exponential Families with Tempered Exponential Measures. AISTATS23 paper software

  10. Unlabeled sample compression schemes and corner peelings for ample and maximum classes. Journal of Computer and System Sciences paper

  11. LocoProp: Enhancing BackProp via Local Loss Optimization. AISTATS22 paper

  12. A case where a spindly two-layer linear network decisively outperforms any neural network with a fully connected input layer. ALT21 paper video long talk

  13. Reparameterizing Mirror Descent as Gradient Descent. NeurIPS20 paper talk video poster
    See also previous version which contains more material on matrix updates and experiments on deep neural nets:
    Interpolating Between Gradient Descent and Exponentiated Gradient Using Reparameterized Gradient Descent arXiv,v1

  14. Rank-smoothed Pairwise Learning in Perceptual Quality Assessment. ICIP20 paper talk video

  15. Divergence-Based Motivation for Online EM and Combining Hidden Variable Models. UAI20 paper video talk

  16. Winnowing with Gradient Descent. COLT20 paper video talk

    video Long talk that also covers "Reparameterizing Mirror Descent as Gradient Descent" long talk

  17. TriMap: Large-scale dimensionality reduction using triplets arXiv
  18. Also: A more globally accurate dimensionality reduction method using triplets arXiv
    And: github twitter thread

  19. An Implicit Form of Krasulina's k-PCA Update without the Orthonormality Constraint. AAAI20 paper AAAI20 poster

  20. Robust Bi-Tempered Logistic Loss Based on Bregman Divergences. NeurIPS19 paper poster talk Updated arXiv version w. appendix

  21. Unlabeled sample compression schemes and corner peelings for ample and maximum classes ICALP19 paper

  22. Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression COLT19 paper

  23. Adaptive scale-invariant online algorithms for learning linear models ICML19 paper supplement

  24. Two-temperature logistic regression based on the Tsallis divergence. AISTATS19 paper supplement poster