Yuanzhi Li
Assistant Professor at CMU
Verified email at andrew.cmu.edu - Homepage
Title · Cited by · Year
LoRA: Low-rank adaptation of large language models
EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen
arXiv preprint arXiv:2106.09685, 2021
Cited by 4032 · 2021
Sparks of artificial general intelligence: Early experiments with GPT-4
S Bubeck, V Chandrasekaran, R Eldan, J Gehrke, E Horvitz, E Kamar, ...
arXiv preprint arXiv:2303.12712, 2023
Cited by 2078 · 2023
A convergence theory for deep learning via over-parameterization
Z Allen-Zhu, Y Li, Z Song
International Conference on Machine Learning, 242-252, 2019
Cited by 1483 · 2019
Learning and generalization in overparameterized neural networks, going beyond two layers
Z Allen-Zhu, Y Li, Y Liang
Advances in Neural Information Processing Systems 32, 2019
Cited by 816 · 2019
Convergence analysis of two-layer neural networks with ReLU activation
Y Li, Y Yuan
Advances in Neural Information Processing Systems 30, 2017
Cited by 728 · 2017
Learning overparameterized neural networks via stochastic gradient descent on structured data
Y Li, Y Liang
Advances in Neural Information Processing Systems 31, 2018
Cited by 684 · 2018
A theoretical analysis of NDCG type ranking measures
Y Wang, L Wang, Y Li, D He, TY Liu
Conference on Learning Theory, 25-54, 2013
Cited by 673 · 2013
A latent variable model approach to PMI-based word embeddings
S Arora, Y Li, Y Liang, T Ma, A Risteski
Transactions of the Association for Computational Linguistics 4, 385-399, 2016
Cited by 650 · 2016
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning
Z Allen-Zhu, Y Li
arXiv preprint arXiv:2012.09816, 2020
Cited by 335 · 2020
An alternative view: When does SGD escape local minima?
B Kleinberg, Y Li, Y Yuan
International Conference on Machine Learning, 2698-2707, 2018
Cited by 328 · 2018
Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations
Y Li, T Ma, H Zhang
Conference on Learning Theory, 2-47, 2018
Cited by 328 · 2018
Towards explaining the regularization effect of initial large learning rate in training neural networks
Y Li, C Wei, T Ma
Advances in Neural Information Processing Systems 32, 2019
Cited by 305 · 2019
Linear algebraic structure of word senses, with applications to polysemy
S Arora, Y Li, Y Liang, T Ma, A Risteski
Transactions of the Association for Computational Linguistics 6, 483-495, 2018
Cited by 251 · 2018
Algorithmic framework for model-based deep reinforcement learning with theoretical guarantees
Y Luo, H Xu, Y Li, Y Tian, T Darrell, T Ma
arXiv preprint arXiv:1807.03858, 2018
Cited by 234 · 2018
What can ResNet learn efficiently, going beyond kernels?
Z Allen-Zhu, Y Li
Advances in Neural Information Processing Systems 32, 2019
Cited by 209 · 2019
Textbooks are all you need
S Gunasekar, Y Zhang, J Aneja, CCT Mendes, A Del Giorno, S Gopi, ...
arXiv preprint arXiv:2306.11644, 2023
Cited by 196 · 2023
Gradient descent on neural networks typically occurs at the edge of stability
J Cohen, S Kaur, Y Li, JZ Kolter, A Talwalkar
International Conference on Learning Representations, 2020
Cited by 195 · 2020
On the convergence rate of training recurrent neural networks
Z Allen-Zhu, Y Li, Z Song
Advances in Neural Information Processing Systems 32, 2019
Cited by 190 · 2019
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
S Chen, S Chewi, J Li, Y Li, A Salim, AR Zhang
arXiv preprint arXiv:2209.11215, 2022
Cited by 157 · 2022
Feature purification: How adversarial training performs robust deep learning
Z Allen-Zhu, Y Li
2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), 2022
Cited by 150 · 2022
Articles 1–20