
Varun Singh is currently the pre-training lead at Arcee AI, where he works on the end-to-end pre-training of large language models, with a strong interest in architecture and optimization. He has led the pre-training of Arcee's Trinity series of models, ranging from a 6B mixture-of-experts model to a 400B mixture-of-experts model.
Varun Singh went from software engineer to leading pre-training on a 400B-parameter sparse MoE model (Trinity Large) at Arcee AI in roughly one year, and is first author of the Trinity Large technical report. That makes him a rare practitioner with hands-on experience in frontier open-source pre-training at startup scale. Attending his session offers direct insight into the practical engineering and data decisions that let Arcee train a model rivaling closed-source frontier systems on a fraction of the budget.
Public activity researched automatically · as of Jun 2026