LATEST ESSAY — MAY 2026
Notes on serving the systems we don't yet understand.
Long-form essays on the infrastructure, evaluation, and governance of modern AI, written for the engineers who have to ship it and for the people who have to trust the result.
Senior Manager, AI & Data Security at Hyland · Writing on serving infrastructure, agentic evaluation, and the gap between deployment and governance.
Featured
Why You Can't Serve LLMs Like Regular Models
MAY 2026 • 25 MIN READ
Evergreen
AI INFRASTRUCTURE
A practical walk through five techniques that separate modern LLM serving from traditional ML inference: continuous batching, prefill/decode disaggregation, PagedAttention, prefix-aware routing, and Mixture-of-Experts sharding. Built around hand-drawn diagrams and interactive simulators.
Varun Varia
Hyland · AI & Data Security