Episodes

  • Conditional Intelligence: Inside the Mixture of Experts architecture
    Oct 7 2025

    What if not every part of an AI model needed to think at once? In this episode, we unpack Mixture of Experts, the architecture behind efficient large language models like Mixtral. From conditional computation and sparse activation to routing, load balancing, and the fight against router collapse, we explore how MoE breaks the old link between size and compute. As scaling hits physical and economic limits, could selective intelligence be the next leap toward general intelligence? A toy routing sketch closes this entry.

    Sources:

    • What is mixture of experts? (IBM)
    • Applying Mixture of Experts in LLM Architectures (Nvidia)
    • A 2025 Guide to Mixture-of-Experts for Lean LLMs
    • A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
    14 m
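
    A minimal sketch of the sparse routing idea discussed above, assuming a top-k gating scheme in the spirit of Mixtral but with toy numpy matrices standing in for real feed-forward experts; the dimensions and names (num_experts, top_k) are illustrative, not taken from any production model.

    ```python
    # Toy sparse Mixture-of-Experts routing: each token activates only top_k of
    # num_experts "experts", so per-token compute stays flat as total parameters grow.
    # (A real MoE also adds an auxiliary load-balancing loss so the router does not
    # collapse onto a few favourite experts.)
    import numpy as np

    rng = np.random.default_rng(0)
    d_model, num_experts, top_k = 8, 4, 2

    experts = [rng.normal(size=(d_model, d_model)) for _ in range(num_experts)]
    router_w = rng.normal(size=(d_model, num_experts))  # learned routing weights

    def moe_layer(x):
        """x: (tokens, d_model) -> (tokens, d_model), using top_k experts per token."""
        logits = x @ router_w                             # router score for every expert
        chosen = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of the top_k experts
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            scores = logits[t, chosen[t]]
            gates = np.exp(scores - scores.max())
            gates /= gates.sum()                          # softmax over chosen experts only
            for gate, e in zip(gates, chosen[t]):
                out[t] += gate * (x[t] @ experts[e])      # weighted sum of expert outputs
        return out

    tokens = rng.normal(size=(3, d_model))
    print(moe_layer(tokens).shape)  # (3, 8): same output shape, but only 2 of 4 experts ran
    ```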
  • Protocols for the AI Age: Unpacking MCP, A2A, and AP2
    Sep 26 2025

    In this episode of The Second Brain AI Podcast, we dive into the protocols quietly wiring the agentic AI ecosystem. From MCP (Model Context Protocol) that lets models securely access tools, to A2A (Agent-to-Agent) that standardizes how agents collaborate, and AP2 (Agent Payments Protocol) that anchors transactions in cryptographic trust, these frameworks form the plumbing of the AI future.

    We explore why interoperability is the real bottleneck, how these standards build a “digital delegation stack,” and why the future of trust in AI won’t rely on human oversight but on mathematical proof. A schematic tool-call message closes this entry.

    16 m
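
    As a rough illustration of the plumbing discussed above: MCP rides on JSON-RPC 2.0, and a tool call from a client to an MCP server has roughly the shape below. The tool name and arguments are hypothetical, and the exact schema should be checked against the MCP specification.

    ```python
    # Schematic JSON-RPC 2.0 message in the shape MCP uses for tool calls.
    # Field names are approximate; consult the MCP spec for the authoritative schema.
    import json

    tool_call_request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",          # ask the MCP server to run a named tool
        "params": {
            "name": "search_documents",  # hypothetical tool exposed by the server
            "arguments": {"query": "Q3 revenue report"},
        },
    }

    print(json.dumps(tool_call_request, indent=2))
    ```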
  • AI at Work, AI at Home: How we really use LLMs each day?
    Sep 21 2025

    How are people really using AI, at home, at work, and across the globe? In this episode of The Second Brain AI Podcast, we dive into two reports from OpenAI and Anthropic that reveal the surprising split between consumer and enterprise use.

    From billions in hidden consumer surplus to the rise of automation vs augmentation, and from emerging markets skipping skill gaps to enterprises wrestling with “context bottlenecks,” we explore what these usage patterns mean for productivity, global inequality, and the future of knowledge work.

    Sources:

    • Anthropic Economic Index report: Uneven geographic and enterprise AI adoption
    • How people are using ChatGPT
    • Building more helpful ChatGPT experiences for everyone
    16 m
  • Deterministic by Design: Why "Temp=0" Still Drifts and How to Fix It
    Sep 15 2025

    Why do LLMs still give different answers even with temperature set to zero? In this episode of The Second Brain AI Podcast, we unpack new research from Thinking Machines Lab on defeating nondeterminism in LLM inference. We cover the surprising role of floating-point math, the real system-level culprit (a lack of batch invariance), and how redesigned kernels can finally deliver bit-identical outputs. We also explore the trade-offs, real-world implications for testing and reliability, and how this breakthrough enables reproducible research and true on-policy reinforcement learning. A short floating-point demonstration closes this entry.

    Sources:

    • Defeating Nondeterminism in LLM Inference
    • Non-Determinism of “Deterministic” LLM Settings
    25 m
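
    The root cause is easy to see in a few lines: floating-point addition is not associative, so the same values reduced in a different order (for example, because requests were batched differently) need not produce the same bits. A minimal numpy illustration, not the redesigned kernels from the paper:

    ```python
    # Summing the same float32 values in two different orders.
    # On most machines the results differ in the last bits, which is exactly the
    # kind of drift that batch-dependent reduction orders introduce at temp=0.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=100_000).astype(np.float32)

    sequential = np.float32(0.0)
    for v in x:                 # strict left-to-right accumulation
        sequential += v

    blocked = x.reshape(1000, 100).sum(axis=1).sum()   # block-wise reduction order

    print(sequential, blocked, sequential == blocked)  # typically not bit-identical
    ```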
  • Hallucinations in LLMs: When AI Makes Things Up & How to Stop It
    Sep 8 2025

    In this episode, we explore why large language models hallucinate and why those hallucinations might actually be a feature, not a bug. Drawing on new research from OpenAI, we break down the science, explain key concepts, and share what this means for the future of AI and discovery.

    Sources:

    • "Why Language Models Hallucinate" (OpenAI)
    16 m
  • Mind the Context: The Silent Force Shaping AI Decisions
    Jul 16 2025

    In this episode of The Second Brain AI Podcast, we dive into the emerging discipline of context engineering: the practice of curating and managing the information that AI systems rely on to think, reason, and act.

    We unpack why context engineering is becoming so important, especially as the use of AI shifts from static chatbots to dynamic, multi-step agents. You'll learn why hallucinations often stem from poor context, not weak models, and how real-world systems like McKinsey's "Lilli" are solving this problem at scale.

    From strategies like write, select, compress, and isolate to key challenges around data fragmentation and semantic unification, this episode breaks down how to design smarter, more reliable AI by managing information, not just prompts. A toy select-and-compress sketch closes this entry.

    Sources:

    • "Beyond Prompts: The Rise of Context Engineering​​" by Rahul Singh
    • "The rise of context engineering" by LangChain
    • "Context Engineering is the New Vibe Coding" by Analytics India Magazine
    • "Why Context Engineering Matters More Than Prompt Engineering" by TowardsAI
    23 m
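
    A toy sketch of two of those strategies, "select" and "compress", with deliberately naive keyword scoring and a character budget standing in for real retrieval and summarization:

    ```python
    # Toy context-engineering pipeline: select the most relevant notes for a query,
    # then compress them into a budget before they reach the model.
    # The scoring and truncation here are placeholders, not a production approach.

    def select(notes: list[str], query: str, k: int = 2) -> list[str]:
        """Rank notes by crude keyword overlap with the query and keep the top k."""
        q = set(query.lower().split())
        return sorted(notes, key=lambda n: -len(q & set(n.lower().split())))[:k]

    def compress(notes: list[str], budget_chars: int = 160) -> str:
        """Join the selected notes and truncate to a rough character budget."""
        return " | ".join(notes)[:budget_chars]

    notes = [
        "Churn rose 4% in Q2 after the pricing change.",
        "Office plants were replaced in March.",
        "Churn is concentrated in the small-business tier.",
    ]
    context = compress(select(notes, "why did churn increase"))
    print(f"Context:\n{context}\n\nQuestion: Why did churn increase?")
    ```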
  • The SLM Advantage: Rethinking Agent Design with SLMs
    Jun 29 2025

    In this episode, we explore why Small Language Models (SLMs) are emerging as powerful tools for building agentic AI. From lower costs to smarter design choices, we unpack what makes SLMs uniquely suited for the future of AI agents.

    Source:

    • "Small Language Models are the Future of Agentic AI" by NVIDIA Research
    20 m
  • Getting to Know LLMs: Generative Models Fundamentals (Part 1)
    Jun 23 2025

    In this episode, we introduce large language models (LLMs), what they are, how they work at a high level, and why prompting is key to using them effectively. You’ll learn about different types of prompts, how to structure them, and what makes an LLM respond the way it does. A minimal structured-prompt sketch closes this entry.

    Source:

    • "Foundations of Large Language Models" by Tong Xiao and Jingbo Zhu
    22 m
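
    A minimal sketch of the prompt structure the episode describes: an instruction, a few-shot example, and the actual query, written here in the chat-message convention most LLM APIs share; no particular provider or model is assumed.

    ```python
    # Structured prompt: system instruction + one few-shot example + the real query.
    # The role names follow the common chat-API convention; adapt to your provider.

    messages = [
        {"role": "system",
         "content": "You are a concise assistant. Answer in one sentence."},
        # Few-shot example showing the desired input/output pattern.
        {"role": "user", "content": "Summarize: 'The meeting moved to Friday at 10am.'"},
        {"role": "assistant", "content": "The meeting is now on Friday at 10am."},
        # The real query, which the model should answer in the same style.
        {"role": "user", "content": "Summarize: 'Shipping is delayed two weeks due to customs.'"},
    ]

    for m in messages:
        print(f"{m['role']:>9}: {m['content']}")
    ```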