
Vanishing Gradients


By: Hugo Bowne-Anderson

A podcast for people who build with AI. Long-format conversations with people shaping the field about agents, evals, multimodal systems, data infrastructure, and the tools behind them. Guests include Jeremy Howard (fast.ai), Hamel Husain (Parlance Labs), Shreya Shankar (UC Berkeley), Wes McKinney (creator of pandas), Samuel Colvin (Pydantic) and more.

hugobowne.substack.com · Hugo Bowne-Anderson
Science
Episodes
  • Episode 72: Why Agents Solve the Wrong Problem (and What Data Scientists Do Instead)
    Mar 20 2026
    “I often see what I would consider to be b******t evals, especially in data, like ‘write this dumb SQL.’ Almost every one of these dumb SQL questions I’ve seen in benchmarks is either obviously easy or overwhelmingly adversarial. They don’t feel valuable as a data scientist; it’s something you would never ask a real data scientist to do. So I went out of my way to create real ones. Let me read one to you.”
    Bryan Bischof, Head of AI at Theory Ventures, joins Hugo to talk about what happened when 150 people spent six hours using AI agents to answer real data science questions across SQL tables, log files, and 750,000 PDFs.
    They discuss:
    * Failure Funnels: pinpoint where agent reasoning breaks down using causal-chain binary evaluations instead of vague 1-5 scales;
    * Median Score of 23 out of 65: what happened when world-class engineers turned agents loose on real data work, and why general-purpose coding agents with human prodding beat fancy frameworks;
    * Zero-Cost Submissions Kill Trust: without a penalty for wrong answers, agents hill-climb to correct submissions through brute force instead of building confidence;
    * Data Science is “Zooming”: moving beyond binary decisions to iterative problem framing, refining “does our inventory suck?” into a tractable hypothesis;
    * MCP as Semantic Layer: model your organization’s proprietary knowledge once and distribute it to whatever LLM interface your team prefers;
    * The Subagent vs. Tool Debate: a distinction that adds cognitive load without hiding complexity;
    * The Self-Orchestration Gap: agents don’t yet realize they should trigger specialized extraction frameworks like DocETL instead of reading 750K PDFs one by one;
    * The Future of Evals: from vibe checks to objective functions and continuous user feedback that lets systems converge on reliability.
    You can find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript in NotebookLM; if you do, let us know anything you find in the comments!
    👉 Want to learn more about building AI-powered software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort has started and registration is still open; all sessions are recorded, so don’t worry about having missed any. Here is a 25% discount code for readers. 👈
    LINKS
    * Bryan Bischof on Twitter/X
    * Bryan Bischof on LinkedIn
    * Theory Ventures
    * The Hunt for a Trustworthy Data Agent (blog post)
    * America’s Next Top Modeler GitHub repo
    * Hamel’s evals FAQ: How do I evaluate agentic workflows?
    * DocETL
    * LLM Judges and AI Agents at Scale (Hugo’s podcast with Shreya Shankar)
    * When Your Metrics Are Lying (Cimo Labs)
    * Lessons from a Year of Building with LLMs (livestream on YouTube)
    * Bryan Bischof: The Map is Not the Territory (YouTube)
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
    This is a public episode. If you would like to discuss it with other subscribers or get access to bonus episodes, visit hugobowne.substack.com
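The “failure funnel” idea from this episode can be sketched in a few lines: instead of asking a judge for a vague 1-5 score, run ordered binary checks along the agent’s causal chain and stop at the first failure, which tells you exactly where reasoning broke. The stage names and trace fields below are hypothetical illustrations, not taken from Bryan’s actual eval code.

```python
# A minimal failure-funnel sketch: ordered binary checks over an agent trace.
# All stage names and trace fields are made up for illustration.

FUNNEL = [
    ("retrieved_right_table", lambda t: "orders" in t["tables_used"]),
    ("wrote_valid_sql",       lambda t: t["sql_error"] is None),
    ("joined_correctly",      lambda t: t["row_count"] > 0),
    ("final_answer_correct",  lambda t: t["answer"] == t["expected"]),
]

def evaluate(trace: dict) -> dict:
    """Return pass/fail per stage, stopping at the first failure."""
    results = {}
    for name, check in FUNNEL:
        passed = bool(check(trace))
        results[name] = passed
        if not passed:
            break  # later stages are meaningless once the chain breaks
    return results

trace = {
    "tables_used": ["orders", "users"],
    "sql_error": None,
    "row_count": 0,   # the join produced nothing
    "answer": None,
    "expected": 42,
}
print(evaluate(trace))
# The funnel stops at "joined_correctly", so debugging starts there,
# rather than at an uninformative "2/5" overall score.
```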
    1 h 34 min
  • Episode 71: Durable Agents - How to Build AI Systems That Survive a Crash with Samuel Colvin
    Feb 18 2026
    “Our thesis is that AI is still just engineering… those people who tell us, for fun and profit, that somehow AI is so profound, so new, so different from anything that’s gone before that it eclipses the need for good engineering practice are wrong. We need that good engineering practice still, and for the most part, most things are not new. But there are some things that have become more important with AI. One of those is durability.”
    Samuel Colvin, creator of Pydantic AI, joins Hugo to talk about applying battle-tested software engineering principles to build durable and reliable AI agents.
    They discuss:
    * Production agents require engineering-grade reliability: unlike messy coding agents, production agents need high constraint, reliability, and the ability to perform hundreds of tasks without drifting into unusual behavior;
    * Agents are the new “quantum” of AI software: modern architecture uses discrete “agentlets”, small, specialized building blocks stitched together for sub-tasks within larger, durable systems;
    * Stop building “chocolate teapot” execution frameworks: ditch rudimentary snapshotting; use battle-tested durable execution engines like Temporal for robust retry logic and state management;
    * AI observability will be a native feature: in five years, token counts and prompt traces will be standard features of all observability platforms;
    * Split agents into deterministic workflows and stochastic activities: ensure true durability by isolating deterministic workflow logic from stochastic activities (IO, LLM calls) to cache results and prevent redundant model calls;
    * Type safety is essential for enterprise agents: sacrificing type safety for flexible graphs leads to unmaintainable software; professional AI engineering demands strict type definitions for parallel node execution and state recovery;
    * Standardize on OpenTelemetry for portability: use OpenTelemetry (OTel) to keep agent traces and logs portable, preventing vendor lock-in and integrating seamlessly with existing enterprise monitoring.
    You can find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript in NotebookLM; if you do, let us know anything you find in the comments!
    👉 Want to learn more about building AI-powered software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort starts March 10, 2026. Here is a 25% discount code for listeners: https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs 👈
    LINKS
    * Samuel Colvin on LinkedIn
    * Pydantic
    * Pydantic Stack Demo repo
    * Deep research example code
    * Temporal
    * DBOS (Postgres alternative to Temporal)
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast video on YouTube
    This is a public episode. If you would like to discuss it with other subscribers or get access to bonus episodes, visit hugobowne.substack.com
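The workflow/activity split discussed in this episode can be sketched in plain Python: the workflow stays deterministic, and every stochastic step (an LLM call, IO) runs through an “activity” wrapper that persists its result, so a replay after a crash skips steps that already completed instead of re-calling the model. Temporal implements this for real; everything below is an illustrative toy, not Temporal’s API.

```python
# Toy durable-execution sketch: deterministic workflow, cached stochastic
# activities persisted to disk so a re-run replays instead of re-executing.
import json
import os

CACHE_FILE = "activity_results.json"

def load_cache() -> dict:
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as f:
            return json.load(f)
    return {}

def activity(cache: dict, key: str, fn) -> str:
    """Run a stochastic step once; replays return the stored result."""
    if key not in cache:
        cache[key] = fn()
        with open(CACHE_FILE, "w") as f:
            json.dump(cache, f)  # persist before moving on
    return cache[key]

def fake_llm(prompt: str) -> str:
    """Stand-in for a real (stochastic, expensive) model call."""
    return f"output for: {prompt}"

def workflow() -> str:
    """Deterministic orchestration; only activities touch the outside world."""
    cache = load_cache()
    draft = activity(cache, "draft", lambda: fake_llm("write a summary"))
    review = activity(cache, "review", lambda: fake_llm(f"review: {draft}"))
    return review

print(workflow())
```

Running `workflow()` a second time (even in a new process) loads the persisted results and makes no new model calls, which is the property that lets a crashed agent resume mid-run.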
    51 min
  • Episode 70: 1,400 Production AI Deployments
    Feb 12 2026
    “There’s a company who spent almost $50,000 because an agent went into an infinite loop and they forgot about it for a month. It had no failures, and I guess no one was monitoring the costs. It’s nice that people write about that in the database as well. After it happened, they said: watch out for infinite loops. Watch out for cascading tool failures. Watch out for silent failures where the agent reports it has succeeded when it didn’t!”
    Alex Strick van Linschoten joins Hugo to discuss what 1,400 production AI deployments reveal about building with LLMs.
    We discuss:
    * Why the most successful teams are ripping out and rebuilding their agent systems every few weeks as models improve, and why over-engineering now creates technical debt you can’t afford later;
    * The $50,000 infinite-loop disaster, and why “silent failures” are the biggest risk in production: agents confidently report success while spiraling into expensive mistakes;
    * How ELIOS built emergency voice agents with sub-400ms response times by aggressively throwing away context every few seconds, and why these extreme patterns are becoming standard practice;
    * Why DoorDash uses a three-tier agent architecture (manager, progress tracker, and specialists) with a persistent workspace that lets agents collaborate across hours or days;
    * Why simple text files and markdown are emerging as the best “continual learning” layer: human-readable memory that persists across sessions without fine-tuning models;
    * The 100-to-1 problem: for every useful output, tool-calling agents generate 100 tokens of noise, and the three tactics (reduce, offload, isolate) teams use to manage it;
    * Why companies are choosing Gemini Flash for document processing and Opus for long reasoning chains, and how to match models to your actual usage patterns;
    * The debate over vector databases versus simple grep and cat, and why giving agents standard command-line tools often beats complex APIs;
    * What “re-architect” as a job title reveals about the shift from 70% scaffolding / 30% model to 90% model / 10% scaffolding, and why knowing when to rip things out may be the most important skill today.
    You can find the full episode on Spotify, Apple Podcasts, and YouTube. You can also interact directly with the transcript in NotebookLM; if you do, let us know anything you find in the comments!
    👉 Want to learn more about building AI-powered software? Check out our Building AI Applications course. It’s a live cohort with hands-on exercises and office hours. Our final cohort starts March 10, 2026. Here is a 25% discount code for readers. 👈
    Show Notes Links
    * Alex Strick van Linschoten on LinkedIn
    * Alex Strick van Linschoten on Twitter/X
    * LLMOps Database
    * LLMOps Database Dataset on Hugging Face
    * Hugo’s MCP Server for LLMOps Database
    * Alex’s Blog: What 1,200+ Production Deployments Reveal About LLMOps in 2025
    * Previous Episode: Practical Lessons from 750 Real-World LLM Deployments
    * Previous Episode: Tales from 400 LLM Deployments
    * Context Rot Research by Chroma
    * Hugo’s Post: AI Agent Harness - 3 Principles for Context Engineering
    * Hugo’s Post: The Rise of Agentic Search
    * Episode with Nick Moy: The Post-Coding Era
    * Hugo’s Personal Podcast Prep Skill Gist
    * Claude Tool Search Documentation
    * Gastown on GitHub (Steve Yegge)
    * Welcome to Gastown by Steve Yegge
    * ZenML - Open Source MLOps & LLMOps Framework
    * Upcoming Events on Luma
    * Vanishing Gradients on YouTube
    * Watch the podcast livestream on YouTube
    * Join the final cohort of our Building AI Applications course in March 2026 (25% off for listeners)
    This is a public episode. If you would like to discuss it with other subscribers or get access to bonus episodes, visit hugobowne.substack.com
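The three tactics this episode names for taming tool-call noise (reduce, offload, isolate) can be sketched roughly as follows. The function names, thresholds, and log format are all made up for illustration; real agent harnesses implement these ideas in their own ways.

```python
# Three context-engineering tactics for noisy tool output, as toy functions:
# reduce (truncate), offload (persist + pointer), isolate (summarize out-of-band).
from pathlib import Path

MAX_CHARS = 200

def reduce(output: str) -> str:
    """Reduce: keep only the head of a verbose tool result."""
    if len(output) <= MAX_CHARS:
        return output
    return output[:MAX_CHARS] + " …[truncated]"

def offload(output: str, name: str, workdir: Path = Path(".")) -> str:
    """Offload: persist the full result; put only a reference in context."""
    path = workdir / f"{name}.txt"
    path.write_text(output)
    return f"[full output saved to {path}, {len(output)} chars]"

def isolate(output: str, summarize):
    """Isolate: process noisy output out-of-band; only a summary enters context."""
    return summarize(output)

# Example: one error buried in 500 heartbeat lines of tool output.
log = "ERROR timeout\n" + "INFO heartbeat\n" * 500
print(reduce(log)[:50])
print(offload(log, "tool_call_7"))
print(isolate(log, lambda s: f"errors={s.count('ERROR')}, lines={len(s.splitlines())}"))
```

Each tactic trades fidelity for context budget differently: `reduce` is lossy, `offload` keeps everything recoverable via standard file tools (the grep-and-cat point above), and `isolate` spends extra compute to keep only meaning.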
    1 h 10 min
No reviews yet