Salman Munaf

Lead Site Reliability Engineer · TikTok

Bio

Salman Munaf is a Lead Site Reliability Engineer at TikTok, where he builds and operates large-scale video infrastructure serving millions of users. He specializes in distributed systems, observability, and reliability at scale, with prior experience as a Software Engineer at Meta. Salman is passionate about helping developers embed reliability into their workflows from day one, making complex systems more resilient and easier to operate.

Session (1)

Salman presented "Observability for LLMs: Understanding What's Happening Under the Hood" at SREcon26 Americas (March 2026). The talk covered why traditional metrics fall short for AI systems and which signals (token throughput, GPU utilization, memory pressure) actually matter, and it offered production-tested frameworks for monitoring LLM-driven products, drawn from one of the world's highest-traffic platforms.

Recent talks (1)

Public activity researched automatically · as of Jun 2026