Investigating the catastrophic forgetting in multimodal large language models Y Zhai, S Tong, X Li, M Cai, Q Qu, YJ Lee, Y Ma CPAL 2024, 2023 | 60* | 2023 |
Eyes wide shut? Exploring the visual shortcomings of multimodal LLMs S Tong, Z Liu, Y Zhai, Y Ma, Y LeCun, S Xie CVPR 2024 (Oral), 2024 | 57 | 2024 |
White-box transformers via sparse rate reduction Y Yu, S Buchanan, D Pai, T Chu, Z Wu, S Tong, B Haeffele, Y Ma NeurIPS 2023, 2023 | 36 | 2023 |
CTRL: Closed-loop transcription to an LDR via minimaxing rate reduction X Dai*, S Tong*, M Li*, Z Wu*, M Psenka, KHR Chan, P Zhai, Y Yu, ... Entropy 24 (4), 456, 2022 | 30* | 2022 |
Mass-producing failures of multimodal systems with language models S Tong*, E Jones*, J Steinhardt NeurIPS 2023, 2023 | 18 | 2023 |
Incremental learning of structured memory via closed-loop transcription S Tong, X Dai, Z Wu, M Li, B Yi, Y Ma ICLR 2023, 2022 | 18 | 2022 |
Revisiting sparse convolutional model for visual recognition M Li, P Zhai, S Tong, X Gao, SL Huang, Z Zhu, C You, Y Ma NeurIPS 2022, 2022 | 16 | 2022 |
EMP-SSL: Towards self-supervised learning in one training epoch S Tong*, Y Chen*, Y Ma, Y LeCun arXiv preprint arXiv:2304.03977, 2023 | 15 | 2023 |
Emergence of segmentation with minimalistic white-box transformers Y Yu*, T Chu*, S Tong, Z Wu, D Pai, S Buchanan, Y Ma CPAL 2024, 2023 | 14 | 2023 |
Unsupervised manifold linearizing and clustering T Ding, S Tong, KHR Chan, X Dai, Y Ma, BD Haeffele ICCV 2023, 2023 | 7 | 2023 |
Image clustering via the principle of rate reduction in the age of pretrained models T Chu*, S Tong*, T Ding*, X Dai, BD Haeffele, R Vidal, Y Ma ICLR 2024, 2023 | 5 | 2023 |
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning Y Zhai, H Bai*, Z Lin*, J Pan*, S Tong*, Y Zhou*, A Suhr, S Xie, Y LeCun, ... arXiv preprint arXiv:2405.10292, 2024 | 3 | 2024 |
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? Y Yu, S Buchanan, D Pai, T Chu, Z Wu, S Tong, H Bai, Y Zhai, ... arXiv preprint arXiv:2311.13110, 2023 | 3 | 2023 |
Closed-loop transcription via convolutional sparse coding X Dai, K Chen, S Tong, J Zhang, X Gao, M Li, D Pai, Y Zhai, XI Yuan, ... CPAL 2024, 2023 | 3 | 2023 |
Unsupervised learning of structured representations via closed-loop transcription S Tong*, X Dai*, Y Chen, M Li, Z Li, B Yi, Y LeCun, Y Ma CPAL 2024, 2022 | 3 | 2022 |
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs S Tong*, E Brown*, P Wu*, S Woo, M Middepogu, SC Akula, J Yang, ... arXiv preprint arXiv:2406.16860, 2024 | | 2024 |
Ctrl123: Consistent Novel View Synthesis via Closed-Loop Transcription H Zhao*, X Dai*, J Wang, S Tong, J Zhang, W Wang, L Zhang, Y Ma arXiv preprint arXiv:2403.10953, 2024 | | 2024 |