Localizing Reasoning Training-Induced Changes in Large Language Models
Published in Mechanistic Interpretability Workshop at Neural Information Processing Systems (NeurIPS), 2025
Published in International Conference on Learning Representations (ICLR), 2025
Published in Second Workshop on Representational Alignment at International Conference on Learning Representations (ICLR), 2025
Published in ACM Computing Surveys, 2025
Published in UniReps Workshop at Neural Information Processing Systems (NeurIPS), 2023
Published in European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2022
Published as an arXiv preprint, 2020