In this talk, we'll explore the practical journey of building and running a local AI coding setup: choosing models, hosting them on consumer hardware, connecting frontends like LM Studio, and evaluating what really works (and what doesn't). We'll discuss the trade-offs in latency, memory, and tool integration, the role of the KV cache and model routing, and how close open-source models can come to replicating commercial AI development environments.