Engineering LLM systems course: From PoC purgatory to production + $2,500 in free compute credits

Includes guest lectures from DeepMind, Moderna, and more — and a structured approach to building real LLM apps.

Apr 03, 2025

Most LLM apps aren’t apps.

They’re bundles of brittle prompts, chained together with glue code and the latest agentic framework no one can introspect.

They break when data shifts or you change the model. Fail silently. And rarely improve with usage.

That’s why I started teaching Building LLM Applications for Data Scientists and Software Engineers — a course for engineers and data scientists who want to ship real systems. Not vibes-based demos. Not fragile agents. Just structured, evaluable, production-grade software.

The next cohort starts April 7th! Don't worry if you can't attend all live sessions - you'll get lifetime access to all workshops, guest lectures, and code. If you've been following my work on evaluation frameworks, synthetic data, and structured development workflows — this is the deeper dive.

💸 The course includes over $2,500 in cloud credits — from Modal, BaseTen, Google Cloud, and more — so you can actually build and deploy during the course.

Guest lecturers include builders and engineers from DeepMind, Moderna, and more — including Ravin Kumar, Eric Ma, Katharine Jarmul, Charles Frye, and Hamel Husain.

👉 I know not everyone’s in a position to join right now, so I’ve also included a full stack of free resources at the end of this post. Let me know if they’re useful.

Traditional versus GenAI software: Excitement builds steadily—or crashes after the demo.


“This course covers a wide range of essential topics and uses a practical teaching style that cuts through complexity… The guest lectures are especially informative, as they show how different LLM-based solutions are being deployed in production.”

– Nishant H., Staff Analytics Engineer, Netflix

This isn’t prompt engineering theater or agentic cosplay. It’s software engineering — applied to LLMs and GenAI.

In this course, you’ll build systems that don’t just run — they evolve, adapt, and hold up in production. You’ll learn to:

Design, test, and deploy structured LLM apps;
Evaluate them rigorously — before they break in prod;
Add observability, logging, and feedback loops.
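To make that last point concrete, here is a minimal sketch of call-level logging with a feedback hook. Everything in it is hypothetical and for illustration only: `echo_model` stands in for a real model client, and `with_logging` and `add_feedback` are names invented here, not part of any library or of the course materials.

```python
import json
import time
import uuid
from typing import Callable


def with_logging(llm_call: Callable[[str], str], log: list) -> Callable[[str], dict]:
    """Wrap an LLM call so every request is recorded with a trace ID and latency."""
    def wrapped(prompt: str) -> dict:
        start = time.perf_counter()
        response = llm_call(prompt)
        record = {
            "trace_id": str(uuid.uuid4()),
            "prompt": prompt,
            "response": response,
            "latency_s": round(time.perf_counter() - start, 4),
            "feedback": None,  # filled in later by a user or reviewer
        }
        log.append(record)
        return record
    return wrapped


def add_feedback(log: list, trace_id: str, label: str) -> bool:
    """Attach a feedback label to a logged call; return False if the ID is unknown."""
    for record in log:
        if record["trace_id"] == trace_id:
            record["feedback"] = label
            return True
    return False


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model client."""
    return prompt.upper()


LOG: list = []
ask = with_logging(echo_model, LOG)
result = ask("summarize this ticket")
add_feedback(LOG, result["trace_id"], "thumbs_down")
print(json.dumps(LOG[0], indent=2))
```

The in-memory list is only there to keep the sketch self-contained. In a real system the same record would more likely go to a file, database, or tracing tool, so that feedback can be joined back to the exact prompt and response that produced it.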

You'll walk away with a portfolio-ready project and a clear technical process.

🧠 What’s new in Cohort 2

This round, we’re adding more structure, support, and depth.

🧱 Builders in Residence

We launched a Builders-in-Residence program: engineers from Carvana, Included Health, and Salesforce — who built serious GenAI systems during Cohort 1 — are now back, running weekly sessions to help you debug, iterate, and ship.

Nathan Danielsen (Carvana) — building domain-specific RAG pipelines using small + frontier LLMs;
William Horton (Included Health) — built internal GenAI platforms for clinicians and ops;
Geoffrey Pidcock (ex- Salesforce, Atlassian, ANSTO) — brings product intuition and platform discipline.

Alongside their weekly sessions, they’ll stay active in Discord to offer practical support.

🔁 SDLC-first approach

We structure the course around the full software development lifecycle:

Build → Deploy → Monitor → Evaluate → Repeat

That’s what separates GenAI apps that survive from ones that don’t.

To support all this, we’ve partnered with leading providers to give you the compute and tools you need to actually build (see below) — from open-weight models (Mistral) to cloud-native agent orchestration (Gemini, Modal, HuggingFace).

These changes make Cohort 2 more hands-on, production-focused, and community-driven than ever before.

In AI systems, evaluation and monitoring don’t come last—they drive the build process from day one.

🎓 Guest lectures from industry experts & Cohort 1 lectures now unlocked

In Cohort 1, we brought in engineers from across the industry. In Cohort 2, we're bringing in even more:

Ravin Kumar (DeepMind) – End-to-end LLM product development
Eric Ma (Moderna) – How We Build Agents at Moderna Therapeutics
Hamel Husain (Parlance Labs) – A Field Guide to AI Engineering
Katharine Jarmul (kjamistan) – Adversarial Threats in LLM Systems: A Practical Guide
Ines Montani (spaCy / Prodigy) – Human-in-the-Loop Development and Distillation Workflows (OR Applied NLP in the Age of Generative AI)

And more! You’ll also get access to all guest sessions from Cohort 1.

“This course has been an invaluable learning experience that has far surpassed all my expectations… very engaging and high-quality guest speaker sessions.”

– Pasquale, Data Scientist, US Air Force

“I subscribed to this course because of its title on how to build production grade LLM systems. It definitely lived up to the expectations I had… It taught me theoretical as well as practical mode of implementing a LLM based product.”

– Badrinath, Principal Engineer, Wells Fargo

🔧 Tooling and compute included

Cohort 2 includes access to the real-world tools you’ll need to build:

💻 $1,000 in Modal credits for cloud compute – giving you resources to power real AI applications;
🤖 $1,000 in BaseTen credits for deploying, monitoring, and iterating on LLM-powered apps;
☁️ $300 Google Cloud & Gemini credits (first 100);
🔓 $100 Mistral credits (first 50) — for open-weight experimentation;
🎨 $100 in Replicate credits (first 50) for building and deploying multimodal apps;
🤗 6 months HuggingFace Pro (first 250);
✍️ 3 months Learn Prompting Plus;
🎨 6 months of free access to Prodigy (from the creators of spaCy) – a powerful annotation and model improvement tool used for human-in-the-loop training, rapid iteration, and custom NLP workflows.

These aren’t gimmicks. They’re infrastructure for building and iterating during (and after) the course.

To show how we move from prompting to measurable improvement, here’s what evaluation actually looks like inside the course:

*The evaluation loop we teach: test sets, structured outputs, and LLM-as-judge — before real users ever see your app.*
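As a rough illustration of that loop (not the course's actual code), here is a minimal sketch with a tiny test set, a structured-output check, and a pluggable judge. The names `fake_app`, `EvalCase`, `parse_structured`, and `exact_match_judge` are all hypothetical. The judge here is a plain string comparison standing in for an LLM-as-judge call, and `fake_app` returns canned strings so the example runs without any model.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EvalCase:
    question: str
    expected: str  # reference answer the judge compares against


def fake_app(question: str) -> str:
    """Hypothetical stand-in for an LLM app; returns a raw JSON string."""
    canned = {
        "What is 2 + 2?": '{"answer": "4", "confidence": 0.9}',
        "Capital of France?": '{"answer": "Lyon", "confidence": 0.6}',
    }
    # Unknown questions yield malformed output, to exercise the schema check.
    return canned.get(question, "not json")


def parse_structured(raw: str) -> Optional[dict]:
    """Return the parsed output if it has a string 'answer' field, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not isinstance(data.get("answer"), str):
        return None
    return data


def exact_match_judge(answer: str, expected: str) -> bool:
    """Placeholder judge; in practice this could be an LLM call that grades the answer."""
    return answer.strip().lower() == expected.strip().lower()


def run_eval(
    app: Callable[[str], str],
    judge: Callable[[str, str], bool],
    cases: list,
) -> dict:
    """Run every case through the app and tally passed, failed, and malformed outputs."""
    results = {"passed": 0, "failed": 0, "malformed": 0}
    for case in cases:
        parsed = parse_structured(app(case.question))
        if parsed is None:
            results["malformed"] += 1
        elif judge(parsed["answer"], case.expected):
            results["passed"] += 1
        else:
            results["failed"] += 1
    return results


TEST_SET = [
    EvalCase("What is 2 + 2?", "4"),
    EvalCase("Capital of France?", "Paris"),
    EvalCase("Who wrote Hamlet?", "Shakespeare"),
]

report = run_eval(fake_app, exact_match_judge, TEST_SET)
print(report)  # {'passed': 1, 'failed': 1, 'malformed': 1}
```

Keeping malformed outputs as their own bucket is a deliberate choice in this sketch: a schema failure and a wrong answer usually call for different fixes, so it helps to see them separately before real users do.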

🎁 Free resources for serious builders

Not ready to join? No problem — here’s a set of high-signal resources that lay the foundation for building LLM systems that actually work:

📘 Evaluation-Driven Development: Escaping PoC Purgatory (O’Reilly Radar) - How to move beyond fragile prototypes and vibes-based iteration — introducing a practical SDLC for GenAI centered around evaluation, feedback loops, and testability.
📘 Beyond Prompt-and-Pray (O’Reilly Radar) - A short, direct piece on why prompt engineering alone isn’t enough — and how to think more like a software engineer when building with LLMs.
🎥 Lightning Lesson: Evaluation & Synthetic Data - A 30-minute crash course in evaluation-driven development — showing how to use synthetic data to define success before real users arrive.
✉️ 10-Day Email Course: Build Reliable LLM-Powered Apps with Evaluation-Driven Development - A structured walkthrough of modern LLM app development — covering prompting, iteration, testing, observability, and agentic workflows, with a focus on real-world use.
📚 Essential LLM & AI Engineering Resources: A Curated Guide for Engineers & Data Scientists - A curated set of tools, papers, and guides for building structured LLM apps — from prompt chaining and logging to vector DBs and beyond.
🤖 Master AI Agents: A Handpicked Guide to the Best Resources - A systems-first collection of design patterns, failure modes, and debugging strategies — for building agents you can introspect, test, and trust.
🎓 Building AI Apps for Real-World Use Cases: From Basics to Production with Ravin Kumar (DeepMind, Google, ex-Tesla) - A live walkthrough of building a fully local LLM app using Ollama and Google’s Gemma models, covering everything from retrieval to evaluation to observability, with no cloud dependencies and no LangChain.
⚙️ Building AI Agents with Gemma 3 - A live session focused on production-grade AI agents: using function calling, open models, and transparent execution — co-led with Ravin (DeepMind).

This is the mindset shift we teach — and the foundation for evaluation-driven development:

*From subjective vibes to structured, reproducible evaluation — this is the shift the course (and this lesson) helps you make.*

💸 Substack reader bonus

Use code LLM10 for 10% off. I’ll also offer a free 15-minute consult to help you scope your LLM project, evaluate tools, or talk through your goals (available to the first 15 people who use the code).

Want to bring a colleague? Email me and we’ll set you both up with 20% off.

🚀 Join Cohort 2 (starts April 7)

If you’re serious about moving beyond PoCs and building systems that work — we’d love to have you.

👉 Enroll now (starts April 7) or view the full syllabus

— Hugo

Vanishing Gradients