Neural Tangent Kernel: Convergence and Generalization in Neural Networks Paper • 1806.07572 • Published Jun 20, 2018
Unraveling the Gradient Descent Dynamics of Transformers Paper • 2411.07538 • Published Nov 12, 2024