
Philip Kiely leads Developer Relations at Baseten. Prior to joining Baseten in 2022, he worked across software engineering and technical writing for a variety of startups. Outside of work, you'll find Philip practicing martial arts, reading a new book, or cheering for his adopted Bay Area sports teams.
Philip Kiely leads Developer Relations at Baseten and is the author of "Inference Engineering" (Baseten, 2026), a practical guide to running AI models in production covering quantization, speculative decoding, KV cache reuse, model parallelism, and multi-modal serving. His recent talks, podcast appearances, and technical writing make him a high-signal speaker for engineers building or optimizing production AI inference infrastructure.
Inference Engineering for Hypergrowth with Philip Kiely · Sigsum 2025 · Oct 2025
Inference Engineering Is the Next Big Role in AI w/ Philip Kiely · Big Ideas in App Architecture
Interview: Philip Kiely (Baseten) on production-grade AI and the shift toward open-source inference · Dec 2025
Sponsor Session: Low-Precision Inference without Quality Loss: Selective Quantization and Microscaling · PyTorch Conference 2025 · Oct 2025
Philip Kiely | Inference Engineering · CTO Confidential
Public activity researched automatically · as of Jun 2026