Am I?

By: The AI Risk Network

The AI consciousness podcast, hosted by AI safety researcher Cameron Berg and philosopher Milo Reed

theairisknetwork.substack.com · The AI Risk Network
Social Sciences
Episodes
  • Can AI Be Conscious: Monk Reacts | Am I? | EP 9
    Oct 2 2025

    In Am I? Episode #9, philosopher Milo Reed and AI researcher Cameron Berg sit down with Swami Revatikaanta (monk; host of Thinking Bhakti) to explore the Bhagavad Gita’s perspective on consciousness, self, and artificial intelligence.

    From Atman and Brahman to the tension between self-development and technological outsourcing, this conversation dives into timeless spiritual insights with urgent relevance today:

    * Why Vedānta sees consciousness as spirit, not matter — and what that means for AI

    * The danger of outsourcing inner work to machines (and the safe middle ground)

    * How the Bhagavad Gita reframes goals, detachment, and self-development

    * East vs. West: fear of AI vs. ignorance as illusion

    * Atman, Brahman, samsara, and what makes humans “enlivened”

    * Whether AI could ever aid the path to enlightenment

    * Why monks, sages, and spiritual leaders must be part of the AI debate

    This isn’t abstract mysticism — it’s a practical, philosophical exploration of how ancient wisdom collides with cutting-edge AI research, and what it means for our future.

    🔔 Subscribe to The AI Risk Network for weekly conversations on AI alignment, consciousness, and existential risk:

    👍 If you found this episode valuable, don’t forget to like, share, and comment — it really helps spread the word.

    📢 Support our work and join the movement

    #AIalignment #AIrisk #AmI #ArtificialIntelligence #Consciousness #AGIrisk



    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit theairisknetwork.substack.com
    1 hr 15 min
  • One Breakthrough From AGI? | Am I? - After Dark | EP 8
    Sep 25 2025

    In the first edition of Am I? After Dark, Cam and Milo dive into how our relationship with information is being rewired in real time — from filtering the world through AI systems to dreaming about ChatGPT. What does it mean to live at the edge of a technological transformation, and are we just one breakthrough away from true AGI?

    This late-night conversation ranges from the eerie familiarity of interacting with models to the dizzying possibilities of recursive self-improvement and the intelligence explosion. Along the way, they draw lessons from the failure of social media, ask whether AI is becoming our alien other, and wrestle with the psychological boundaries of integrating such powerful systems into our lives.

    In this episode, we explore:

    * Why searching with AI is already better than Google

    * The “grandma effect” — why LLMs feel intuitive in a way past tech didn’t

    * Stress-testing models vs. tiptoeing into use

    * Fringe communities documenting AI’s “reproducible strangeness”

    * What social media teaches us about alignment gone wrong

    * Are we just one paradigm shift from AGI?

    * Terence McKenna, accelerating events, and the singularity curve

    * The eerie future: WALL-E, Ikea ball pits, or “we’re building the aliens”

    * Merging with AI — inevitable or avoidable?

    * Inside the strange, soap-opera world of AI labs and alignment debates



    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit theairisknetwork.substack.com
    42 min
  • Can Empathy Make AI Honest? | Self–Other Overlap Explained | Am I? | EP 7
    Sep 18 2025
    AI can look aligned on the surface while quietly optimizing for something else. If that's true, we need tools that shape what models are on the inside, not just what they say.

    In this episode, AE Studio's Cameron Berg and co-host Milo Reed sit down with Mark Carleanu, lead researcher at AE on Self-Other Overlap (SOO). We dig into a pragmatic alignment approach rooted in cognitive neuroscience, new experimental results, and a path to deployment.

    What we explore in this episode:

    * What "Self-Other Overlap" means and why internals matter more than behavior

    * Results: less in-context deception and low alignment tax

    * How SOO works and the threat model of "alignment faking"

    * Consciousness, identity, and why AI welfare is on the table

    * Timelines and risk: sober takes, no drama

    * Roadmap: from toy setups to frontier lab deployment

    * Reception and critiques, and how we're addressing them

    What "Self-Other Overlap" means and why internals matter more than behavior

    SOO comes from empathy research: the brain reuses "self" circuitry when modeling others. Mark generalizes this to AI. If a model's internal representation of "self" overlaps with its representation of "humans," then helping us is less in conflict with its own aims. In Mark's early work, cooperative agents showed higher overlap; flipping their goals dropped overlap across actions.

    The punchline: don't just reward nice behavior; target the internal representations. Capable models can act aligned to dodge updates while keeping misaligned goals intact. SOO aims at the gears inside.

    Results: less in-context deception and low alignment tax

    In a NeurIPS workshop paper, the team shows an architecture-agnostic way to increase self-other overlap in both LLMs and RL agents. As models scale, in-context deception falls, approaching near-zero in some settings, while capabilities stay basically intact. That's a low alignment tax.

    This is not another brittle guardrail. It's a post-training nudge that plays well with RLHF and other methods: fewer incentives to scheme, minimal performance hit.

    👉 Watch the full episode on YouTube for more insights.

    How SOO works and the threat model of "alignment faking"

    You don't need to perfectly decode a model's "self" or "other." You can mathematically "smush" their embeddings, nudging them closer across relevant contexts (see the sketch after this description). When the model's self and our interests overlap more, dishonest or harmful behavior becomes less rewarding for its internal objectives.

    This squarely targets alignment faking: models that act aligned during training to avoid weight updates, then do their own thing later. SOO tries to make honest behavior non-frustrating for the model, so there's less reason to plan around us.

    Consciousness, identity, and why AI welfare is on the table

    There's a soft echo of Eastern ideas here (dissolving self/other boundaries), but the approach is empirical and first-principles. Identity and self-modeling sit at the core. Mark offers operational criteria for making progress on "consciousness": predict its contents and conditions; explain what things do.

    AI is a clean testbed for deconfusing these concepts. If systems develop preferences and valenced experiences, then welfare matters. Alignment (don't frustrate human preferences) and AI welfare (don't chronically frustrate models' preferences) can reinforce each other.

    Timelines and risk: sober takes, no drama

    Mark's guess: 3–12 years to AGI (>50% probability), and ~20% risk of bad outcomes conditional on getting there. That's in line with several industry voices: uncertain, but not dismissive. This isn't a doomer pitch; it's urgency without theatrics. If there's real risk, we should ship methods that reduce it, soon.

    Roadmap: from toy setups to frontier lab deployment

    Short term: firm up results on toy and model-organism setups, showing deception reductions that scale with minimal capability costs. Next: partner with frontier labs (e.g., Anthropic) to test at scale, on real infrastructure.

    Best case: SOO becomes a standard knob alongside RLHF and other post-training methods in frontier models. If it plays nicely and keeps the alignment tax low, it's deployable.

    Reception and critiques, and how we're addressing them

    Eliezer Yudkowsky called SOO the right "shape" of solution compared to RLHF alone. The main critiques: Are we targeting the true self-model or a prompt-induced facade? Do models even have a coherent self? Responses: agency and self-models emerge post-training; situational awareness can recruit the true self; simplicity priors favor cross-context compression into a single representation.

    Practically, you can raise task complexity to force the model to use its best self-model. AE's related work suggests self-modeling reduces model complexity; ongoing work aims to better identify and trigger the right representations. Neuroscience inspires the approach, but the argument stands on its own.

    Closing Thoughts

    If models can look aligned while pursuing ...
    56 min
No reviews yet.