🔥 2025.2 I started my internship at DeepSeek, working towards AGI!
🔥 2025.1 I completed my internship at ByteDance. I am grateful to my wonderful mentor, Mingxuan Wang, and my colleagues. During my time there, I worked on an academic project on autoregressive diffusion and a production project on an o1-style model, Doubao-AS1-preview.
🔥 2024.5 I started my internship at ByteDance’s LLM team!
🔥 2024.4 We released LEGENT, an open platform for embodied agents. The main contributor, Zhili Cheng, is awesome at Unity programming!
🔥 2024.4 We released the MiniCPM paper on arXiv. MiniCPM is a small LLM with 2.4B non-embedding parameters that rivals Llama-13B or Mistral-7B. We also introduce the WSD (Warmup-Stable-Decay) learning rate scheduler.