Building Reliable GenAI Systems: Lessons, Conversations, and Tools

Exploring the Foundations of Modern Data Science

Dec 03, 2024

Welcome back to Vanishing Gradients! In this edition, we explore the challenges and opportunities in building reliable generative AI systems, featuring key takeaways from industry leaders and recent workshops. Whether you’re scaling AI into production or guiding data teams, there’s plenty here to help you tackle today’s pressing challenges.

Here’s what’s inside:

Lessons from Stefan Krawczyk (CEO of DAGWorks, ex-Stitch Fix) on building robust GenAI systems and overcoming challenges like hallucinations and unpredictability.
A discussion with Gabriel Weintraub, the Amman Mineral Professor at Stanford GSB, on how to create data-driven cultures and the value of experimentation in organizations.
A conversation with Ravin Kumar, Senior Research Data Scientist at Google Labs, about starting with evaluations and building AI systems that deliver real-world value.
Tutorials, fireside chats, and live events tailored for professionals working to build reliable ML and AI systems.

Let’s dive in!
📖 Reading time: 8 minutes

Building Reliable GenAI Systems: Lessons from a Lightning Lesson with Stefan Krawczyk

What’s the difference between traditional software development and generative AI? It’s not just the technology—it’s the process. Traditional software follows a clear path: build, test, deploy. GenAI, on the other hand, requires a continuous, iterative approach that spans both development and production.

Last week, Stefan Krawczyk (CEO of DAGWorks, ex-Stitch Fix) and I hosted a Maven Lightning Lesson that brought together over 300 participants to tackle these unique challenges.

We explored the real hurdles many face after the initial excitement of a flashy demo:

Unpredictable outputs (non-determinism): How do you manage variability in GenAI systems?
Accuracy issues (hallucinations): What steps can you take to ensure reliable results?
Scaling to production: How do you turn MVPs into production-ready systems with measurable ROI?

One key takeaway: Stefan showed how to adapt traditional software development practices to GenAI, emphasizing robust logging, iterative development, and aligning outputs with business objectives.
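To make the logging point concrete, here is a minimal sketch (not taken from the session) of wrapping each model call in a structured log record and then measuring how consistent repeated outputs are. Everything named here is an assumption for illustration: `fake_llm` is a hypothetical stand-in for a real model call, and `logged_call` and `agreement_rate` are made-up helper names.

```python
import json
import random
import time
from collections import Counter


def fake_llm(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a real model call; picks an answer at random."""
    return rng.choice(["Paris", "Paris", "Paris", "Lyon"])


def logged_call(prompt: str, rng: random.Random, log: list) -> str:
    """Call the model and append a structured record of the call to `log`."""
    start = time.perf_counter()
    output = fake_llm(prompt, rng)
    log.append({
        "prompt": prompt,
        "output": output,
        "latency_s": time.perf_counter() - start,
    })
    return output


def agreement_rate(outputs: list) -> float:
    """Share of outputs equal to the most common one (1.0 means fully consistent)."""
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the sketch is repeatable
    log = []
    outputs = [
        logged_call("What is the capital of France?", rng, log)
        for _ in range(20)
    ]
    print(f"agreement rate: {agreement_rate(outputs):.2f}")
    print(json.dumps(log[0], indent=2))
```

In a real system the log records would go to a file or an observability tool rather than an in-memory list, but the idea is the same: you can only reason about variability in outputs you have actually captured.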

If you missed it, here’s a link to the full session, along with two short clips where we discuss these challenges:

1️⃣ Stefan on navigating the differences between traditional software and GenAI development:

2️⃣ My thoughts on tackling hallucinations, unpredictability, and building for business impact:

And if this resonated with you, we’re expanding on these ideas in our 4-week course, Building GenAI Applications Using First Principles. In the course, you’ll learn how to:

Design and deploy robust, end-to-end GenAI systems.
Confront non-determinism and align outputs with business goals.
Iterate effectively with real-world logging and monitoring techniques.
Build production-ready applications, like a PDF-querying system powered by multimodal models.

🛠️ Cost: $800 (discounts available; email me for a code).

📍 Learn more and register for the course here.
Let me know what your biggest challenge has been in building GenAI systems. I’d love to hear from you!

Building Data-Driven Cultures: A Conversation with Gabriel Weintraub

💭 Does your leadership ask for AI solutions before your team even has access to clean, reliable data? Too often, organizations leap into AI without laying the necessary groundwork, leading to wasted effort and frustration.

In the latest episode of High Signal, I spoke with Gabriel Weintraub, the Amman Mineral Professor at Stanford Graduate School of Business, about the strategies organizations need to build successful data-driven cultures.

We covered foundational strategies, the value of experimentation, and the importance of local innovation:

📊 Building Foundations Before AI: Start with reliable data and high-ROI, low-complexity projects to ensure sustainable AI success.
🤝 Closing the Gap Between Leadership and Data Teams: Collaboration between executives and technical teams is essential for aligning priorities and solving real business problems.
🔬 Experimentation as a Cultural Shift: Even “negative” results are valuable. Gabriel discusses how fostering a culture of experimentation leads to better learning and outcomes.
🌍 The Role of Startups and Local Innovation: Gabriel highlights Latin America’s untapped potential in AI and the importance of building local solutions.
🎬 Watch Gabriel share why starting with reliable data and foundational analytics is critical for success:

Catch the full episode here or wherever you listen to podcasts.

💭 How does your organization approach decision-making—by letting the data speak or by deferring to the loudest voice in the room? Let me know what you think!

Start with Evaluations: Insights from Ravin Kumar on Building Better AI Systems

Building generative AI systems isn’t just about choosing the right model—it’s about creating solutions that deliver real-world impact. In the latest episode of Vanishing Gradients, I sat down with Ravin Kumar, Senior Research Data Scientist at Google Labs, to dive into the principles of building impactful AI systems.

Here’s what we discussed:
👉 Ravin’s unique journey from SpaceX to Sweetgreen to Google Labs, and the lessons he’s learned about creating AI systems that work in practice.

⚙️ Defining meaningful evaluations: Why starting with evaluations—rather than jumping straight into model selection—is key to building principled AI products.

📈 Real-world applications: From helping small businesses like bakeries to scaling generative AI systems at Google, Ravin shared how AI can bridge the gap between technology and business outcomes.
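As a rough illustration of what "starting with evaluations" can look like in code, the sketch below writes the test cases first and treats the model as a swappable function. This is an assumed, simplified setup and not Ravin's method: `EvalCase`, `run_evals`, `baseline_model`, and the bakery prompts are all hypothetical, and substring matching is only one of many possible scoring rules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # substring a passing answer has to include


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected substring."""
    passed = 0
    for case in cases:
        output = model(case.prompt)
        if case.must_contain.lower() in output.lower():
            passed += 1
    return passed / len(cases)


# Hypothetical cases for a bakery assistant, written before picking a model.
CASES = [
    EvalCase("What time do you open on Sunday?", "9"),
    EvalCase("Do you sell gluten-free bread?", "yes"),
    EvalCase("Can I order a cake for tomorrow?", "24 hours"),
]


def baseline_model(prompt: str) -> str:
    """Hypothetical stub returning canned answers; swap in a real model call."""
    canned = {
        "What time do you open on Sunday?": "We open at 9 am on Sundays.",
        "Do you sell gluten-free bread?": "Yes, we bake it fresh daily.",
    }
    return canned.get(prompt, "Sorry, I am not sure.")


if __name__ == "__main__":
    print(f"pass rate: {run_evals(baseline_model, CASES):.2f}")
```

Because `run_evals` only needs a function from prompt to text, the same cases can score a stub, a small model, or a large one, which keeps model selection downstream of the evaluation.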

Want to hear more? Catch the full episode here (or on your app of choice).

High Signal: Exploring the Foundations of Modern Data Science

Last week, Duncan Gilchrist (Delphina, ex-Uber) and I joined Demetrios on an MLOps Community LinkedIn Live to talk about the High Signal podcast and why we’re doing it: to bring the best from the best and give everyone clear, actionable insights about their careers in data and AI.

Here are some of the critical themes we discussed from the podcast, drawing from top minds in the field:

Reasoning Under Uncertainty: Michael Jordan (UC Berkeley) emphasized frameworks that integrate computation, statistics, and economics to build intelligent AI infrastructure.
Simulation and Data-Generating Processes: Andrew Gelman (Columbia) highlighted the importance of simulations to understand how data is generated and inform robust analysis.
Organizational Strategies for AI Success: Chiara Farronato (Harvard Business School) shared how social and organizational structures can enable rapid, business-aligned product delivery.
Becoming Self-Learning Organizations: Ramesh Johari (Stanford, Uber, Airbnb, and more) discussed leveraging online experimentation to drive continuous improvement.
Strong Data Foundations: Gabriel Weintraub (Stanford) underscored the pivotal role of well-structured data in powering AI systems.

📢 Coming up: Hilary Mason will share her thoughts on data science in the age of large language models (LLMs), offering a timely perspective on how the field is evolving.

Hit reply and let me know: Do these themes resonate with you? Are there other concerns or ideas you think are fundamental to modern data science and AI?

Check out High Signal wherever you listen to podcasts to hear more from these conversations, and take a look at what Demetrios and team are up to at MLOps Community.

Other Highlights

Here are a few other things I’ve been up to recently and what’s coming up:

🎙️ Upcoming Data Dialog with Geetu Ambwani

Join me on Thursday, December 5, at 12 PM ET, for a conversation with Geetu Ambwani (Spring Health, Flatiron Health, HuffPost) about:

Positioning data science as a strategic function.
Building impactful data products in the generative AI era.
Navigating career pathways for data leaders.

📅 Apply to join

💻 PyData NYC Tutorial Live on YT!

My PyData NYC tutorial, "Building Your First Multimodal GenAI App," is now available on YouTube!

Watch the tutorial
Explore the code on GitHub

🔥 Outerbounds Fireside Chat

I recently had a conversation with Alexander Filipchik (Head of Infrastructure, Cloud Kitchens) about turning ML and AI into engineering disciplines. You can check it out here.

That’s It for Now

Thanks for reading Vanishing Gradients! I’d love to hear your thoughts—what resonated with you, what you’d like to see more of, and how I can make this newsletter more relevant to your interests.

To stay up to date on livestreams, events, and new episodes, subscribe to the Vanishing Gradients calendar on lu.ma or follow us on YouTube.

Looking forward to continuing the conversation in the next edition!
Hugo
