Publications
2024
- Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis. Advances in Neural Information Processing Systems, 2024
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK. Journal of Machine Learning Research, 2024
- Pruning before training may improve generalization. Journal of Machine Learning Research, 2024
2023
- On the neural tangent kernel analysis of randomly pruned neural networks. In International Conference on Artificial Intelligence and Statistics, 2023
2021
- Comparison of accuracy and scalability of Gauss–Newton and alternating least squares for CANDECOMC/PARAFAC decomposition. SIAM Journal on Scientific Computing, 2021
2020
- Continuous regular functions. Logical Methods in Computer Science, 2020