Kwangjun Ahn
PhD Student, MIT
Verified email at mit.edu - Homepage
Title · Cited by · Year
Transformers learn to implement preconditioned gradient descent for in-context learning
K Ahn, X Cheng, H Daneshmand, S Sra
Advances in Neural Information Processing Systems 36, 2024
Cited by 82 · 2024
From Nesterov's Estimate Sequence to Riemannian Acceleration
K Ahn, S Sra
Proceedings of Thirty Third Conference on Learning Theory (COLT), PMLR 125 …, 2020
Cited by 76 · 2020
Hypergraph spectral clustering in the weighted stochastic block model
K Ahn, K Lee, C Suh
IEEE Journal of Selected Topics in Signal Processing 12 (5), 959-974, 2018
Cited by 72 · 2018
Optimal dimension dependence of the Metropolis-adjusted Langevin algorithm
S Chewi, C Lu, K Ahn, X Cheng, T Le Gouic, P Rigollet
Conference on Learning Theory (COLT), 1260-1300, 2021
Cited by 65 · 2021
SGD with shuffling: optimal rates without component convexity and large epoch requirements
K Ahn, C Yun, S Sra
Advances in Neural Information Processing Systems 33, 17526-17535, 2020
Cited by 65 · 2020
Understanding the unstable convergence of gradient descent
K Ahn, J Zhang, S Sra
International Conference on Machine Learning, 247-257, 2022
Cited by 64 · 2022
Efficient constrained sampling via the mirror-Langevin algorithm
K Ahn, S Chewi
Advances in Neural Information Processing Systems 34, 28405-28418, 2021
Cited by 55 · 2021
Community recovery in hypergraphs
K Ahn, K Lee, C Suh
IEEE Transactions on Information Theory 65 (10), 6561-6579, 2019
Cited by 41 · 2019
Binary rating estimation with graph side information
K Ahn, K Lee, H Cha, C Suh
Advances in Neural Information Processing Systems 31, 2018
Cited by 35 · 2018
Learning threshold neurons via edge of stability
K Ahn, S Bubeck, S Chewi, YT Lee, F Suarez, Y Zhang
Advances in Neural Information Processing Systems 36, 2024
Cited by 32 · 2024
Graph Matrices: Norm Bounds and Applications
K Ahn, D Medarametla, A Potechin
arXiv preprint arXiv:1604.03423, 2020
Cited by 31* · 2020
Linear attention is (maybe) all you need (to understand transformer optimization)
K Ahn, X Cheng, M Song, C Yun, A Jadbabaie, S Sra
ICLR 2024 (arXiv:2310.01082), 2023
Cited by 19 · 2023
Reproducibility in optimization: Theoretical framework and limits
K Ahn, P Jain, Z Ji, S Kale, P Netrapalli, GI Shamir
Advances in Neural Information Processing Systems 35, 18022-18033, 2022
Cited by 16 · 2022
Riemannian perspective on matrix factorization
K Ahn, F Suarez
arXiv preprint arXiv:2102.00937, 2021
Cited by 14 · 2021
Mirror descent maximizes generalized margin and can be implemented efficiently
H Sun, K Ahn, C Thrampoulidis, N Azizan
Advances in Neural Information Processing Systems 35, 31089-31101, 2022
Cited by 12 · 2022
Understanding Nesterov's Acceleration via Proximal Point Method
K Ahn, S Sra
Symposium on Simplicity in Algorithms (SOSA), 117-130, 2022
Cited by 12 · 2022
The crucial role of normalization in sharpness-aware minimization
Y Dai, K Ahn, S Sra
Advances in Neural Information Processing Systems 36, 2024
Cited by 9 · 2024
One-pass learning via bridging orthogonal gradient descent and recursive least-squares
Y Min, K Ahn, N Azizan
2022 IEEE 61st Conference on Decision and Control (CDC), 4720-4725, 2022
Cited by 8 · 2022
On tight convergence rates of without-replacement SGD
K Ahn, S Sra
arXiv preprint arXiv:2004.08657, 2020
Cited by 7 · 2020
From proximal point method to Nesterov’s acceleration
K Ahn
arXiv preprint arXiv:2005.08304, 2020
Cited by 7 · 2020
Articles 1–20