
muckrAIkers


By: Jacob Haimes and Igor Krawczuk

About this listen

Join us as we dig a tiny bit deeper into the hype surrounding "AI" press releases, research papers, and more. Each episode, we'll highlight ongoing research and investigations, bringing some much-needed contextualization, constructive critique, and even a smidge of occasional good-natured teasing to the conversation, trying to find the meaning under all of this muck.

© Kairos.fm
Science, Mathematics
Episodes
  • DeepSeek: 2 Months Out
    Apr 9 2025
    DeepSeek has been out for over 2 months now, and things have begun to settle down. We take this opportunity to contextualize the developments that have occurred in its wake, both within the AI industry and the world economy. As systems get more "agentic" and users are willing to spend increasing amounts of time waiting for their outputs, the value of supposed "reasoning" models continues to be peddled by AI system developers, but does the data really back these claims? Check out our DeepSeek minisode for a snappier overview!

    EPISODE RECORDED 2025.03.30

    • (00:40) - DeepSeek R1 recap
    • (02:46) - What makes it new?
    • (08:53) - What is reasoning?
    • (14:51) - Limitations of reasoning models (why we hate reasoning)
    • (31:16) - Claims about R1 training on OpenAI
    • (37:30) - “Deep Research”
    • (49:13) - Developments and drama in the AI industry
    • (56:26) - Proposed economic value
    • (01:14:20) - US government involvement
    • (01:23:28) - OpenAI uses MCP
    • (01:28:15) - Outro

    Links
    • DeepSeek website
    • DeepSeek paper
    • DeepSeek docs - Models and Pricing
    • DeepSeek repo - 3FS

    Understanding DeepSeek/DeepResearch

    Explainers
    • Language Models & Co. article - The Illustrated DeepSeek-R1
    • Towards Data Science article - DeepSeek-V3 Explained 1: Multi-head Latent Attention
    • Jina.ai article - A Practical Guide to Implementing DeepSearch/DeepResearch
    • Han, Not Solo blogpost - The Differences between Deep Research, Deep Research, and Deep Research

    Analysis and Research
    • Preprint - Understanding R1-Zero-Like Training: A Critical Perspective
    • Blogpost - There May Not be Aha Moment in R1-Zero-like Training — A Pilot Study
    • Preprint - Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
    • Preprint - Chain-of-Thought Reasoning In The Wild Is Not Always Faithful

    Fallout coverage
    • TechCrunch article - OpenAI calls DeepSeek 'state-controlled,' calls for bans on 'PRC-produced' models
    • The Verge article - OpenAI has evidence that its models helped train China’s DeepSeek
    • Interesting Engineering article - $6M myth: DeepSeek’s true AI cost is 216x higher at $1.3B, research reveals
    • Ars Technica article - Microsoft now hosts AI model accused of copying OpenAI data
    • The Signal article - Nvidia loses nearly $600 billion in DeepSeek crash
    • Yahoo Finance article - The 'Magnificent 7' stocks are having their worst quarter in more than 2 years
    • Reuters article - Microsoft pulls back from more data center leases in US and Europe, analysts say

    US governance
    • National Law Review article - Three States Ban DeepSeek Use on State Devices and Networks
    • CNN article - US lawmakers want to ban DeepSeek from government devices
    • House bill - No DeepSeek on Government Devices Act
    • Senate bill - Decoupling America's Artificial Intelligence Capabilities from China Act of 2025

    Leaderboards
    • aider
    • LiveBench
    • LM Arena
    • Konwinski Prize
    • Preprint - SWE-Bench+: Enhanced Coding Benchmark for LLMs
    • Cybernews article - OpenAI study proves LLMs still behind human engineers in over 1400 real-world tasks

    Other References
    • Anthropic report - The Anthropic Economic Index
    • METR report - Measuring AI Ability to Complete Long Tasks
    • The Information article - OpenAI Discusses Building Its First Data Center for Storage
    • DeepMind report backing up this idea
    • TechCrunch article - OpenAI adopts rival Anthropic's standard for connecting AI models to data
    • Reuters article - OpenAI, Meta in talks with Reliance for AI partnerships, The Information reports
    • 2024 AI Index report
    • NDTV article - Ghibli-Style Images To Memes: White House Embraces Alt-Right Online Culture
    • Elk post on DOGE and AI
    1 h 32 m
  • DeepSeek Minisode
    Feb 10 2025

    DeepSeek R1 has taken the world by storm, causing a stock market crash and prompting further calls for export controls within the US. Since this story is still very much in development, with follow-up investigations and calls for governance being released almost daily, we thought it best to hold off for a little while longer so we can tell the whole story. Nonetheless, it's a big story, so we provide a brief overview of all that's out there so far.

    • (00:00) - Recording date
    • (00:04) - Intro
    • (00:37) - DeepSeek drop and reactions
    • (04:27) - Export controls
    • (08:05) - Skepticism and uncertainty
    • (14:12) - Outro


    Links
    • DeepSeek website
    • DeepSeek paper
    • Reuters article - What is DeepSeek and why is it disrupting the AI sector?

    Fallout coverage

    • The Verge article - OpenAI has evidence that its models helped train China’s DeepSeek
    • The Signal article - Nvidia loses nearly $600 billion in DeepSeek crash
    • CNN article - US lawmakers want to ban DeepSeek from government devices
    • Fortune article - Meta is reportedly scrambling ‘war rooms’ of engineers to figure out how DeepSeek’s AI is beating everyone else at a fraction of the price
    • Dario Amodei's blogpost - On DeepSeek and Export Controls
    • SemiAnalysis article - DeepSeek Debates
    • Ars Technica article - Microsoft now hosts AI model accused of copying OpenAI data
    • Wiz Blogpost - Wiz Research Uncovers Exposed DeepSeek Database Leaking Sensitive Information, Including Chat History

    Investigations into "reasoning"

    • Blogpost - There May Not be Aha Moment in R1-Zero-like Training — A Pilot Study
    • Preprint - s1: Simple test-time scaling
    • Preprint - LIMO: Less is More for Reasoning
    • Blogpost - Reasoning Reflections
    • Preprint - Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH
    15 m
  • Understanding AI World Models w/ Chris Canal
    Jan 27 2025
    Chris Canal, co-founder of EquiStamp, joins muckrAIkers as our first ever podcast guest! In this ~3.5 hour interview, we discuss intelligence vs. competencies, the importance of test-time compute, moving goalposts, the orthogonality thesis, and much more.

    A seasoned software developer, Chris started EquiStamp in late 2023 as a way to improve our current understanding of model failure modes and capabilities. Now a key contractor for METR, EquiStamp evaluates the next generation of LLMs from frontier model developers like OpenAI and Anthropic.

    EquiStamp is hiring, so if you're a software developer interested in a fully remote opportunity with flexible working hours, join the EquiStamp Discord server and message Chris directly; oh, and let him know muckrAIkers sent you!

    • (00:00) - Recording date
    • (00:05) - Intro
    • (00:29) - Hot off the press
    • (02:17) - Introducing Chris Canal
    • (19:12) - World/risk models
    • (35:21) - Competencies + decision making power
    • (42:09) - Breaking models down
    • (01:05:06) - Timelines, test time compute
    • (01:19:17) - Moving goalposts
    • (01:26:34) - Risk management pre-AGI
    • (01:46:32) - Happy endings
    • (01:55:50) - Causal chains
    • (02:04:49) - Appetite for democracy
    • (02:20:06) - Tech-frame based fallacies
    • (02:39:56) - Bringing back real capitalism
    • (02:45:23) - Orthogonality Thesis
    • (03:04:31) - Why we do this
    • (03:15:36) - EquiStamp!

    Links
    • EquiStamp
    • Chris's Twitter
    • METR paper - RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts
    • All Trades article - Learning from History: Preventing AGI Existential Risks through Policy by Chris Canal
    • Better Systems article - The Omega Protocol: Another Manhattan Project

    Superintelligence & Commentary
    • Wikipedia article - Superintelligence: Paths, Dangers, Strategies by Nick Bostrom
    • Reflective Altruism article - Against the singularity hypothesis (Part 5: Bostrom on the singularity)
    • Into AI Safety interview - Scaling Democracy w/ Dr. Igor Krawczuk

    Referenced Sources
    • Book - Man-made Catastrophes and Risk Information Concealment: Case Studies of Major Disasters and Human Fallibility
    • Artificial Intelligence paper - Reward is Enough
    • Wikipedia article - Capital and Ideology by Thomas Piketty
    • Wikipedia article - Pantheon

    LeCun on AGI
    • "Won't happen" - Time article - Meta’s AI Chief Yann LeCun on AGI, Open-Source, and AI Risk
    • "But if it does, it'll be my research agenda, latent state models, which I happen to research" - Meta Platforms blogpost - I-JEPA: The first AI model based on Yann LeCun’s vision for more human-like AI

    Other Sources
    • Stanford CS senior project - Timing Attacks on Prompt Caching in Language Model APIs
    • TechCrunch article - AI researcher François Chollet founds a new AI lab focused on AGI
    • White House fact sheet - Ensuring U.S. Security and Economic Strength in the Age of Artificial Intelligence
    • New York Post article - Bay Area lawyer drops Meta as client over CEO Mark Zuckerberg’s ‘toxic masculinity and Neo-Nazi madness’
    • OpenEdition academic review of Thomas Piketty
    • Neural Processing Letters paper - A Survey of Encoding Techniques for Signal Processing in Spiking Neural Networks
    • BFI working paper - Do Financial Concerns Make Workers Less Productive?
    • No Mercy/No Malice article - How to Survive the Next Four Years by Scott Galloway
    3 h 20 m
No reviews yet