LATEST ESSAY — MAY 2026

Notes on serving the systems we don't yet understand.

Long-form essays on the infrastructure, evaluation, and governance of modern AI, written for the engineers who have to ship these systems and for the people who have to trust the result.

Senior Manager, AI & Data Security at Hyland · Writing on serving infrastructure, agentic evaluation, and the gap between deployment and governance.

Featured

Why You Can't Serve LLMs Like Regular Models

MAY 2026 · 25 MIN READ · EVERGREEN · AI INFRASTRUCTURE

A practical walkthrough of the five fundamental differences between traditional ML inference and modern LLM serving: continuous batching, prefill/decode disaggregation, PagedAttention, prefix-aware routing, and Mixture-of-Experts sharding. Built around hand-drawn diagrams and interactive simulators.

Varun Varia
Hyland · AI & Data Security