# Production AI Engineering

> A common-sense guide for engineers shipping LLM-backed systems to production. Covers foundations, retrieval, agents, evaluation, and production concerns in a single opinionated document. Written for engineers, not procurement.

The site hosts one long-form article. The markdown source is the preferred form for LLM consumption: it is the same content the HTML is rendered from, with no navigation chrome. Headings, tables, and code blocks are stable section anchors.

Stance: opinionated where evidence supports it (hybrid retrieval over pure vector, native structured outputs over "please return JSON", host-UI HITL over model-generated confirmations), neutral where it does not. Does not claim novelty; claims usefulness as a single reference.

## Article

- [Production AI Engineering (markdown)](https://ai.jokokko.com/production-ai-engineering.md): Full article in raw markdown. Preferred source for LLM ingestion.
- [Production AI Engineering (llms-full.txt)](https://ai.jokokko.com/llms-full.txt): Full article with a metadata header inlined; a single-fetch ingestion target for AI crawlers.
- [Production AI Engineering (HTML)](https://ai.jokokko.com/): Same content rendered for web reading. Self-contained single-file page.
- [Production AI Engineering (PDF)](https://ai.jokokko.com/production-ai-engineering.pdf): Printable/readable PDF generated from the web version.

## Sections

- [TL;DR](https://ai.jokokko.com/#tldr-top-5-if-you-read-nothing-else): Five highest-leverage recommendations: eval set, prompt caching, hybrid retrieval + rerank, native structured outputs, agent budget caps.
- [1. Foundations](https://ai.jokokko.com/#1-foundations): Classical vs. LLM systems, request cycle, model selection, transformer mechanics, controlling randomness (temperature, top_p).
- [2. Context engineering and RAG](https://ai.jokokko.com/#2-context-engineering-and-rag): Prompt-engineering principles, chunking and enrichment, hybrid retrieval (vector + BM25 via RRF), reranking, long-context vs. RAG tradeoffs, retrieval evaluation, semantic caching.
- [3. Agents](https://ai.jokokko.com/#3-agents): Agent loop, MCP, workflow vs. agent patterns, structured outputs and constrained decoding, tool design, resilience, side effects, sandboxing, HITL.
- [4. Evaluation](https://ai.jokokko.com/#4-evaluation): The eval loop, error analysis, LLM-as-judge, human evaluation, synthetic and adversarial testing.
- [5. Production](https://ai.jokokko.com/#5-production): Quantization, fine-tuning, guardrails, observability and telemetry, streaming UX, pre-launch checklist.

## Optional

- [Source repository](https://github.com/jokokko/ai.jokokko.com): GitHub repo containing the article source and rendered web artifacts.
- [Author](https://jokokko.com): Joona-Pekka Kokko.
- [License](https://github.com/jokokko/ai.jokokko.com/blob/master/LICENSE): Content licensed under CC BY 4.0.