Forget Agent Skills
Everything we’re learning from builders who use agents every day: workflows, harnesses, review loops, and production-ready systems.
Forget agent skills, forget subagents, forget OpenClaw, forget autoresearch, forget ralph loops, forget your Twitter timeline.
What are people at the top of the game actually building with agents, and what does their day-to-day workflow look like?
Below you’ll find lessons, conversations, and resources on agent workflows, production-ready agents, and agentic data science, including:
Why builders we trust seem more obsessed with verification, memory, review, personal software, and workflow design than with swarms or autonomous loops
How coding-agent harnesses are changing as frontier models absorb planning, orchestration, and reasoning patterns into the model itself (Nicolay Gerold, Amp Code)
How data science can get back to better decisions, using agents, causal tooling, and Bayesian workflows instead of just making dashboards faster (Thomas Wiecki, PyMC Labs)
Quick links below to what’s coming up, what just dropped, and how to plug in:
Live online events
Jun 18: Coding Agents are Dead with Nicolay Gerold (Amp Code)
Jun 19: Show Us Your (Agent) Skills Ep. 05 with Isaac Flath (Kentro Tech, ex-Answer.AI), John Berryman (Arcturus Labs, early GitHub Copilot, O’Reilly author), Thomas Wiecki (PyMC Labs), and Matt Palmer (Conductor, ex-Replit)
Jul 2: How to Build A Coding Agent with Nicolay Gerold (Amp Code)
Podcasts & Recordings
Building effective agent harnesses with Doug Turnbull (led search at Reddit, Shopify, and Wikipedia): Watch the lightning lesson
The Future of Agentic Data Science with Thomas Wiecki (PyMC Labs): Listen to podcast
The Economic Reality of AI: Friction, Talent, and the Future of the Firm with Steve Tadelis (UC Berkeley, ex-eBay and Amazon): Listen to podcast
Agent-Harness.ipynb* with Vincent Warmerdam (marimo): Listen to podcast
Agentic Engineering and the Lost Art of Verification with Wes McKinney (Posit, creator of pandas), Jeremiah Lowin (Prefect), and Randy Olson (Good Eye Labs): Listen to podcast
The 100-Year Lead: What Baseball Teaches Us About the Future of AI with Chris Fonnesbeck (PyMC Labs, creator of PyMC): Listen to podcast
How to Evaluate Agentic Workflows (Show Us Your (Agent) Skills Ep. 04) with Hamel Husain, Chris Fonnesbeck, and Doug Turnbull. Skill scepticism, plan review, implementation review, agentic search, and hidden holdout tests: Watch on YouTube
From Skills to Agent Harnesses (Show Us Your (Agent) Skills Ep. 03) with Paul Iusztin, Eleanor Berger, Alan Nichol, Vincent Warmerdam, Nicolay Gerold, Matthew Honnibal, and Ines Montani. Research memory, local boxes, debug panes, live notebooks, video generation, and code repair: Watch on YouTube
“Notebooks as Canvas: 3 Live Demos for the Agent Era” with Vincent Warmerdam (marimo): Watch on YouTube
The Future of Agentic Data Science with Thomas Wiecki (PyMC Labs, Decision.AI): Watch on YouTube
Building Agents That Improve the Workflow (Show Us Your (Agent) Skills Ep. 02) with Hilary Mason, Bryan Bischof, Eric Ma, and Tomasz Tunguz. Prompt & context refinement, eval-driven charts, human-in-the-loop EDA, and local-first inference: Watch on YouTube
Blog posts and essays
The Agentic Data Science Research Lab with Thomas Wiecki (PyMC Labs)
The Agentic Software Factory with Thomas Wiecki (PyMC Labs), Wes McKinney (Posit, creator of pandas), Jeremiah Lowin (Prefect), and Randy Olson (Good Eye Labs)
15 Privacy Questions Every AI Builder is Asking with Katharine Jarmul (author of Practical Data Privacy)
Courses
Aug 27: Build Production-Ready AI Agents for the Enterprise with Doug Turnbull (led search at Reddit, Shopify, and Wikipedia)
Build Production-Ready AI Agents for the Enterprise
We asked 1,000 builders what they needed to learn next.
Again and again, the answer was production-ready AI agents for the enterprise.
So I’m launching a new hands-on course with Doug Turnbull, agentic search expert and former search lead at Reddit and Shopify. Doug has spent years thinking about retrieval, ranking, evaluation, and the messy systems work that makes search useful in production, which is exactly the kind of discipline agent builders need now.
We’ll build an e-commerce agent from the ground up, covering agent loops, SDKs, MCP, retrieval, evals, and deployment, so you leave with a working architecture you can adapt to your own company data.
If you want a preview of the kind of thinking behind the course, Doug and I recently did a lightning lesson on building effective agent harnesses. I also put together a reading list on building agents and agent harnesses: tools, memory, evals, hooks, orchestration patterns, and recent conversations with people like Ivan Leo, Jeff Huber, Doug Turnbull, and Lance Martin.
Very early bird pricing is $750 until July 1.
-> Join Build Production-Ready AI Agents for the Enterprise
-> Watch the lightning lesson or browse the agent harness reading list
Coding Agents Are Dead. Here’s How to Build One.
AI agents exploded by adding more scaffolding around models: planning systems, retrieval pipelines, memory layers, reflection loops, tool orchestration, multi-agent workflows.
But frontier models are starting to absorb many of those capabilities directly. A lot of today’s “best practices” for building agents may already be temporary.
I’m talking with Nicolay Gerold from Amp Code about what’s actually changing in agent engineering right now: what still matters, what’s collapsing into the model itself, and how builders should adapt. Amp is one of the coolest and most interesting coding agents out there, so I’m excited to get concrete about what coding-agent infrastructure looks like in production.
We’ll cover why newer models often break older harnesses, the shift from prompt engineering to context engineering, why planning and orchestration are moving into the model, when retrieval and context management still matter, and where durable engineering leverage remains for people building agents.
Then on July 2, Nicolay and I are doing the hands-on version: How to Build a Coding Agent. We’ll build a modern coding-agent harness with TypeScript and Pi that can search and navigate large codebases, manage context across long-running tasks, execute tools, recover from failures, coordinate execution loops, and use modern reasoning models without overengineering the harness.
-> Register for Coding Agents Are Dead
-> Join How to Build a Coding Agent
Show Us Your Agent Workflows
What are people at the top of the game actually building with agents, and what does their day-to-day workflow look like?
Forget agent skills, subagents, OpenClaw, autoresearch, ralph loops, and whatever your Twitter timeline is excited about this week.
Thomas Wiecki and I have now spent roughly 10 hours talking with 16 Python, data, ML, and AI builders we trust about what they’re actually using, what works, and what doesn’t.
The builders we trust keep coming back to verification, memory, review, personal software, and workflow design. Much less swarms, autonomous loops, or agent frameworks. More surfaces, checks, context, and handoffs that let humans and agents work together without losing the plot.
For Show Us Your (Agent) Skills Ep. 05, Thomas Wiecki and I are joined by John Berryman (Arcturus Labs, early engineer on GitHub Copilot, O’Reilly author), Isaac Flath (Kentro Tech, ex-Answer.AI), and Matt Palmer (Conductor, ex-Replit). We want the harness, the workflow, the artifacts, and the parts that make the machine useful.
John Berryman will dig into building harnesses and designing agents that are useful beyond software engineering. He wants more people building interesting AI products, not just using them, and he is thinking hard about what makes agent workflows transferable into other domains.
Isaac Flath is bringing the personal writing++ tool he says would hurt his productivity the most if it were taken away. He uses it every working hour for blog posts, skills, presentations, diagrams, specs, notes, emails, brainstorms, todos, HTML prototypes, and code. Writing is thinking, and Isaac has built a system around that.
And Matt Palmer will show how he uses Conductor to build projects, including Conductor itself. Conductor lets you run Claude Code, Codex, and Cursor in parallel, so I am particularly excited to see what it looks like when the person building the machine uses the machine to build the machine.
The show archive now has previous episodes and guest dossiers: Hamel Husain on skill scepticism and constraints, Chris Fonnesbeck on review loops for Bayesian modeling and agent-written code, Doug Turnbull on agentic search and hidden validation, Eleanor Berger on local boxes and agent boundaries, Vincent Warmerdam on notebooks as shared state, and more.
-> Register to join us live this week, or get the recording afterwards
-> Browse previous episodes and guest workflows
The Agentic Software Factory
In Ep. 01 of Show Us Your (Agent) Skills, Wes McKinney (Posit, creator of pandas), Jeremiah Lowin (Prefect, FastMCP), and Randy Olson (Good Eye Labs) showed three ways agents can work inside a factory of reviewable workflows.
Wes had long-running agents, commits every turn, and review queues that agents need to drain. Jeremiah had personal commands like explain, ship-it, and GitHub replies: tiny pieces of personal software that make agents act more like teammates and less like autocomplete. Randy had chart workflows and LLM-as-judge review loops that keep visual work tied to actual communicative quality.
The agentic software factory is many small workflows, explicit handoffs, strong review loops, and tools that make agent work inspectable.
-> Read The Agentic Software Factory
-> Listen to the podcast or browse the Ep. 01 guest workflows
Notebooks as a Shared Canvas for Agents and Humans
How can we reimagine data, computation, and notebooks so they become a canvas on which we can collaborate with agents?
I talked about this recently with Vincent D. Warmerdam of marimo. Vincent has been building open-source tools for data people for years, and he thinks the Python notebook is evolving from a static scratchpad into a working agent harness.
Notebooks can become shared memory for humans and agents. Agents can manipulate global state, generate code a cell or two at a time, and expose their reasoning through interactive UI elements. Humans can stay in the loop visually: dragging sliders, inspecting outputs, and turning code into something closer to a physical object.
We talked about giving agents modular “Lego” components instead of raw boilerplate; letting algorithms dictate which visualization is needed rather than choosing charts up front; using hooks and linters to constrain what an agent can touch; and remembering that sometimes the best AI workflow still starts with pen and paper, a walk, and enough calm to know what you’re actually trying to build.
Vincent also makes a useful point about models: sometimes a faster, “worse” open-weight model is better for exploratory work because it keeps you alert and engaged. If the model is too good, you can slip into the slot-machine state: click, wait, trust, repeat. Notebooks may be one of the best interfaces we have for resisting that passivity.
-> Listen to the podcast or watch the full conversation on YouTube
-> Watch Vincent’s three demos
The Agentic Data Science Research Lab
Data science was supposed to help organizations make better decisions.
Instead, in too many places, it became a dashboard factory. Fifteen years in, most data science teams are still fighting not to be seen as a cost center.
Thomas Wiecki of PyMC Labs thinks we’re one shift away from finally getting the version we were promised. In our recent conversation and companion essay, we explore how AI agents may help data science deliver on its original promise: not coding faster, but making better decisions.
Three things are making that possible: decision science is finally tractable, causal and Bayesian tooling like PyMC and DoWhy is mature and battle-tested, and agentic interfaces can remove the expert bottleneck. Agentic data science means end-to-end, agent-driven causal analysis that gives you the right answers, not just any answers.
We also get into agentic dashboards, encoding professional judgment as reusable skills, and why grounding decisions in generative processes is one of the real guardrails against hallucination. This is not about replacing data scientists with agents. It is about giving data scientists leverage over the work that has always mattered most: asking better questions, representing uncertainty, and helping teams make decisions they can defend.
-> Listen to the podcast, watch on YouTube, or read the essay
Want to Support Vanishing Gradients?
If you’ve been enjoying Vanishing Gradients and want to support my work, here are a few ways to do so:
🧑🏫 Join (or share) my AI agents course: I’m teaching Build Production-Ready AI Agents for the Enterprise with Doug Turnbull (former search lead at Reddit and Shopify). If you or your team need to build agents that work with real company data, retrieval, evals, tools, and deployment, we’d love to have you.
📣 Spread the word: If you find this newsletter valuable, share it with a friend, colleague, or your team. More thoughtful readers = better conversations.
📅 Stay in the loop: Subscribe to the Vanishing Gradients calendar on lu.ma to get notified about livestreams, workshops, and events.
▶️ Subscribe to the YouTube channel: Get full episodes, livestreams, and AI deep dives. Subscribe here.
💡 Work with me: I help teams navigate AI, data, and ML strategy. If your company needs guidance, feel free to reach out by hitting reply.
Thanks for reading Vanishing Gradients! Subscribe for free to receive new posts and support my work.
If you’re enjoying it, consider sharing it, dropping a comment, or giving it a like: it helps more people find it.
Until next time ✌️
Hugo







