Ship It Weekly - DevOps, SRE, and Platform Engineering News | Podcasts on Audible

Episodes

Cloudflare BYOIP BGP Withdrawals, Clerk’s Postgres Query-Plan Flip Outage, and AWS Kiro Permissions Lessons (Grafana Privesc + runc CVEs)

Feb 27 2026

This week on Ship It Weekly, Brian covers three “automation meets reality” stories that every DevOps, SRE, and platform team can learn from.
Cloudflare accidentally withdrew customer BYOIP prefixes due to a buggy cleanup task, Clerk got knocked over by a Postgres auto-analyze query plan flip, and AWS responded to reports about its internal Kiro tooling by framing the incident as misconfigured access controls. Plus: a quick EKS node monitoring update, and a tight security lightning round.
Links
Cloudflare BYOIP outage postmortem https://blog.cloudflare.com/cloudflare-outage-february-20-2026/
Clerk outage postmortem (Feb 19, 2026) https://clerk.com/blog/2026-02-19-system-outage-postmortem
AWS outage report (Reuters) https://www.reuters.com/business/retail-consumer/amazons-cloud-unit-hit-by-least-two-outages-involving-ai-tools-ft-says-2026-02-20/
AWS response on Kiro + access controls https://www.aboutamazon.com/news/aws/aws-service-outage-ai-bot-kiro
EKS Node Monitoring Agent (open source) https://aws.amazon.com/about-aws/whats-new/2026/02/amazon-eks-node-monitoring-agent-open-source/
Grafana CVE-2026-21721 https://grafana.com/security/security-advisories/cve-2026-21721/
runc CVEs (AWS-2025-024) https://aws.amazon.com/security/security-bulletins/rss/aws-2025-024/
GitLab patch releases https://about.gitlab.com/releases/2025/11/26/patch-release-gitlab-18-6-1-released/
Atlassian Feb 2026 security bulletin https://confluence.atlassian.com/security/security-bulletin-february-17-2026-1722256046.html
Human story: SRE Is Anti-Transactional (ACM Queue) https://queue.acm.org/detail.cfm?id=3773094
More episodes and show notes at https://shipitweekly.fm
On Call Briefs at: https://oncallbrief.com

18 min

Ship It Conversations: Mike Lady on Day Two Readiness + Guardrails in the AI Era

Feb 24 2026
This is a guest conversation episode of Ship It Weekly (separate from the weekly news recaps).
In this Ship It: Conversations episode I talk with Mike Lady (Senior DevOps Engineer, distributed systems) from Enterprise Vibe Code on YouTube. We talk day two readiness, guardrails/quality gates, and why shipping safely matters even more now that AI can generate code fast.
Highlights
Day 0 vs Day 1 vs Day 2 (launching vs operating and evolving safely)
What teams look like without guardrails (“hope is not a strategy”)
Why guardrails speed you up long-term (less firefighting, more predictable delivery)
Day-two audit checklist: source control/branches/PRs, branch protection, CI quality gates, secrets/config, staging→prod flow
AI agents: they’ll “lie, cheat, and steal” to satisfy the goal unless you gate them
Multi-model reviews (Claude/Gemini/Codex) as different perspectives
AI in prod: start read-only (logs/traces), then earn trust slowly
Mike’s links
YouTube: https://www.youtube.com/@EnterpriseVibeCode
Site: https://www.enterprisevibecode.com/
LinkedIn: https://www.linkedin.com/in/mikelady/
Stuff mentioned
Vibe Coding (Gene Kim + Steve Yegge): https://www.simonandschuster.com/books/Vibe-Coding/Gene-Kim/9781966280026
Beads (agent memory/issue tracker): https://github.com/steveyegge/beads
Gas Town (agent orchestration): https://github.com/steveyegge/gastown
AGENTS.md (agent instructions file): https://agents.md/
OpenAI Codex: https://openai.com/codex/
More episodes + details: https://shipitweekly.fm
35 min

GitHub Agentic Workflows, Gentoo Leaves GitHub, Argo CD 3.3 Upgrade Gotcha, AWS Config Scope Creep

Feb 20 2026

This week on Ship It Weekly, Brian hits five stories where the “defaults” are shifting under ops teams.
GitHub is bringing Agentic Workflows into Actions, Gentoo is migrating off GitHub to Codeberg, Argo CD upgrades are forcing Server-Side Apply in some paths, AWS Config quietly expanded coverage again, and EC2 nested virtualization is now possible on virtual instances.
Links
YouTube episodes https://www.youtube.com/watch?v=tuuLlo2rbI0&list=PLYLi5KINFnO7dVMbhsJQTKRFXfSSwPmuL&pp=sAgC
OnCallBrief https://oncallbrief.com
Teller’s Tech Substack https://tellerstech.substack.com/
GitHub Agentic Workflows (preview) https://github.blog/changelog/2026-02-13-github-agentic-workflows-are-now-in-technical-preview/
Gentoo moves to Codeberg https://www.theregister.com/2026/02/17/gentoo_moves_to_codeberg_amid/
Argo CD upgrade guide: 3.2 -> 3.3 (SSA) https://argo-cd.readthedocs.io/en/latest/operator-manual/upgrading/3.2-3.3/
AWS Config: 30 new resource types https://aws.amazon.com/about-aws/whats-new/2026/02/aws-config-new-resource-types
EC2 nested virtualization (virtual instances) https://aws.amazon.com/about-aws/whats-new/2026/02/amazon-ec2-nested-virtualization-on-virtual/
GitHub status page update https://github.blog/changelog/2026-02-13-updated-status-experience/
GitHub Actions: early Feb updates https://github.blog/changelog/2026-02-05-github-actions-early-february-2026-updates/
Runner min version enforcement extended https://github.blog/changelog/2026-02-05-github-actions-self-hosted-runner-minimum-version-enforcement-extended/
Open Build Service postmortem https://openbuildservice.org/2026/02/02/post-mortem/
Human story: AI SRE vs incident management https://surfingcomplexity.blog/2026/02/14/lots-of-ai-sre-no-ai-incident-management/
More episodes and show info on https://shipitweekly.fm

19 min

Special: OpenClaw Security Timeline and Fallout: CVE-2026-25253 One-Click Token Leak, Malicious ClawHub Skills, Exposed Agent Control Panels, and Why Local AI Agents Are a New DevOps/SRE Control Plane (OpenAI Hires Founder)

Feb 17 2026

In this Ship It Weekly special, Brian breaks down the OpenClaw situation and why it’s bigger than “another CVE.”
OpenClaw is a preview of what platform teams are about to deal with: autonomous agents running locally, wired into real tools, real APIs, and real credentials. When the trust model breaks, it’s not just data exposure. It’s an operator compromise.
We walk through the recent timeline: mass internet exposure of OpenClaw control panels, CVE-2026-25253 (a one-click token leak that can turn your browser into the bridge to your local gateway), a skills marketplace that quickly became a malware delivery channel, and the Moltbook incident showing how “agent content” becomes a new supply chain problem. We close with the signal that agents are going mainstream: OpenAI hiring the OpenClaw creator.
Chapters
• 1) What OpenClaw is and why local agents are different
• 2) The situation in one line: autonomy + creds + messy reality
• 3) CVE-2026-25253: one click, token leak, gateway exposure
• 4) The exposure wave: thousands of reachable control panels
• 5) ClawHub skills: marketplace turns into malware delivery
• 6) Moltbook: the wider agent ecosystem leaking real data
• 7) Minimum viable safety: least privilege, isolation, approvals, action logs
• 8) The plot twist: OpenAI hires the creator, adoption accelerates
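The controls named in chapter 7 (least privilege, approvals, action logs) can be sketched as a small gate in front of an agent's tool calls. This is a minimal illustration under stated assumptions, not how OpenClaw or any real agent framework works: the `GatedAgent` class, the action names, and the `approve` callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical action names, chosen for illustration only.
READ_ONLY_ACTIONS = {"read_logs", "read_traces"}


@dataclass
class GatedAgent:
    # Callback standing in for a human approval step: (action, target) -> bool.
    approve: Callable[[str, str], bool]
    # Append-only record of every decision, allowed or not.
    action_log: list = field(default_factory=list)

    def request(self, action: str, target: str) -> bool:
        """Return True if the action may run; log the decision either way."""
        if action in READ_ONLY_ACTIONS:
            allowed, reason = True, "read-only"
        else:
            # Anything that is not explicitly read-only needs sign-off.
            allowed = self.approve(action, target)
            reason = "approved" if allowed else "denied"
        self.action_log.append((action, target, reason))
        return allowed
```

With an `approve` callback that always returns `False`, a `read_logs` request passes and a `delete_pod` request is refused, and both decisions land in `action_log`. Isolation (the fourth control in the chapter title) is an environment concern and is not modeled here.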
Links from this episode
Censys exposure research https://censys.com/blog/openclaw-in-the-wild-mapping-the-public-exposure-of-a-viral-ai-assistant
GitHub advisory (CVE-2026-25253) https://github.com/advisories/GHSA-g8p2-7wf7-98mq
NVD entry https://nvd.nist.gov/vuln/detail/CVE-2026-25253
Koi Security: ClawHavoc / malicious skills https://www.koi.ai/blog/clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting
Moltbook leak coverage (Reuters) https://www.reuters.com/legal/litigation/moltbook-social-media-site-ai-agents-had-big-security-hole-cyber-firm-wiz-says-2026-02-02/
OpenClaw security docs https://docs.openclaw.ai/gateway/security
OpenAI hire coverage (FT) https://www.ft.com/content/45b172e6-df8c-41a7-bba9-3e21e361d3aa
More information and past episodes on https://shipitweekly.fm

19 min

When guardrails break prod: GitHub “Too Many Requests” from legacy defenses, Kubernetes nodes/proxy GET RCE, HCP Vault resilience in an AWS regional outage, and PCI DSS scope creep

Feb 13 2026

This week on Ship It Weekly, Brian hits four stories where the guardrails become the incident.
GitHub had “Too Many Requests” caused by legacy abuse protections that outlived their moment. Takeaway: controls need owners, visibility, and a retirement plan.
Kubernetes has a nasty edge case where nodes/proxy GET can turn into command execution via WebSocket behavior. If you’ve ever handed out “telemetry” RBAC broadly, go audit it.
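One way to start that audit is to scan ClusterRole definitions for rules that allow `get` on `nodes/proxy`. The sketch below is a rough starting point, not a complete audit: it assumes the input is the parsed JSON from `kubectl get clusterroles -o json`, the function names are made up for this example, and it looks only at role definitions, not at which subjects are bound to them. The wildcard handling (`*` and `*/proxy`) reflects my understanding of RBAC resource matching and is worth verifying against your cluster version.

```python
def grants_nodes_proxy_get(rule: dict) -> bool:
    """True if a single RBAC rule allows GET on the nodes/proxy subresource."""
    groups = rule.get("apiGroups") or []
    resources = rule.get("resources") or []
    verbs = rule.get("verbs") or []
    # Nodes live in the core API group, which RBAC writes as "".
    group_ok = "" in groups or "*" in groups
    # Assumed matching forms: exact name, all resources, or */subresource.
    resource_ok = any(r in ("*", "nodes/proxy", "*/proxy") for r in resources)
    verb_ok = "get" in verbs or "*" in verbs
    return group_ok and resource_ok and verb_ok


def risky_roles(role_list: dict) -> list[str]:
    """Sorted names of roles in a ClusterRoleList document that grant it."""
    return sorted(
        role["metadata"]["name"]
        for role in role_list.get("items", [])
        # "rules" can be null for aggregated or empty roles.
        if any(grants_nodes_proxy_get(rule) for rule in role.get("rules") or [])
    )
```

To try it, save the `kubectl` output to a file, load it with `json.load`, and pass the result to `risky_roles`. Any role it lists then needs a second look at its ClusterRoleBindings to see who actually holds the permission.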
HashiCorp shared how HCP Vault handled a real AWS regional disruption: control plane wobbled, Dedicated data planes kept serving. Control plane vs data plane separation paying off.
AWS expanded its PCI DSS compliance package with more services and the Asia Pacific (Taipei) region. Scope changes don’t break prod today, but they turn into evidence churn later if you don’t standardize proof.
Human story: “reasonable assurance” turning into busywork.
Links
GitHub: When protections outlive their purpose (legacy defenses + lifecycle) https://github.blog/engineering/infrastructure/when-protections-outlive-their-purpose-a-lesson-on-managing-defense-systems-at-scale/
Kubernetes nodes/proxy GET → RCE (analysis) https://grahamhelton.com/blog/nodes-proxy-rce
OpenFaaS guidance / mitigation notes https://www.openfaas.com/blog/kubernetes-node-proxy-rce/
HCP Vault resilience during real AWS regional outages https://www.hashicorp.com/blog/how-resilient-is-hcp-vault-during-real-aws-regional-outages
AWS: Fall 2025 PCI DSS compliance package update https://aws.amazon.com/blogs/security/fall-2025-pci-dss-compliance-package-available-now/
GitHub Actions: self-hosted runner minimum version enforcement extended https://github.blog/changelog/2026-02-05-github-actions-self-hosted-runner-minimum-version-enforcement-extended/
Headlamp in 2025: Project Highlights (SIG UI) https://kubernetes.io/blog/2026/01/22/headlamp-in-2025-project-highlights/
AWS Network Firewall Active Threat Defense (MadPot) https://aws.amazon.com/blogs/security/real-time-malware-defense-leveraging-aws-network-firewall-active-threat-defense/
Reasonable assurance turning into busywork (r/sre) https://www.reddit.com/r/sre/comments/1qvwbgf/at_what_point_does_reasonable_assurance_turn_into/
More episodes + details: https://shipitweekly.fm

16 min

Azure VM Control Plane Outage, GitHub Agent HQ (Claude + Codex), Claude Opus 4.6, Gemini CLI, MCP

Feb 6 2026

This week on Ship It Weekly, Brian hits four “control plane + trust boundary” stories where the glue layer becomes the incident.
Azure had a platform incident that impacted VM management operations across multiple regions. Your app can be up, but ops is degraded.
GitHub is pushing Agent HQ (Claude + Codex in the repo/CI flow), and Actions added a case() function so workflow logic is less brittle.
MCP is becoming platform plumbing: Miro launched an MCP server and Kong launched an MCP Registry.
Links
Azure status incident (VM service management issues) https://azure.status.microsoft/en-us/status/history/?trackingId=FNJ8-VQZ
GitHub Agent HQ: Claude + Codex https://github.blog/news-insights/company-news/pick-your-agent-use-claude-and-codex-on-agent-hq/
GitHub Actions update (case() function) https://github.blog/changelog/2026-01-29-github-actions-smarter-editing-clearer-debugging-and-a-new-case-function/
Claude Opus 4.6 https://www.anthropic.com/news/claude-opus-4-6
How Google SREs use Gemini CLI https://cloud.google.com/blog/topics/developers-practitioners/how-google-sres-use-gemini-cli-to-solve-real-world-outages
Miro MCP server announcement https://www.businesswire.com/news/home/20260202411670/en/Miro-Launches-MCP-Server-to-Connect-Visual-Collaboration-With-AI-Coding-Tools
Kong MCP Registry announcement https://konghq.com/company/press-room/press-release/kong-introduces-mcp-registry
GitHub Actions hosted runners incident thread https://github.com/orgs/community/discussions/186184
DockerDash / Ask Gordon research https://noma.security/blog/dockerdash-two-attack-paths-one-ai-supply-chain-crisis/
Terraform 1.15 alpha https://github.com/hashicorp/terraform/releases/tag/v1.15.0-alpha20260204
Wiz Moltbook write-up https://www.wiz.io/blog/exposed-moltbook-database-reveals-millions-of-api-keys
Chainguard “EmeritOSS” https://www.chainguard.dev/unchained/introducing-chainguard-emeritoss
More episodes + details: https://shipitweekly.fm

21 min

CodeBreach in AWS CodeBuild, Bazel TLS Certificate Expiry Breaks Builds, Helm Charts Reliability Audit, and New n8n Sandbox Escape RCE

Jan 30 2026

This week on Ship It Weekly, Brian looks at four “glue failures” that can turn into real outages and real security risk.
We start with CodeBreach: AWS disclosed a CodeBuild webhook filter misconfig in a small set of AWS-managed repos. The takeaway is simple: CI trigger logic is part of your security boundary now.
Next is the Bazel TLS cert expiry incident. Cert failures are a binary cliff, and “auto renew” is only one link in the chain.
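If auto renew is only one link, an independent expiry check is another. A minimal sketch of that check, assuming you already have the certificate's `notAfter` string in the format returned by Python's `ssl.SSLSocket.getpeercert()`; the function names, the 30-day default window, and the dates in the usage note are illustrative and not taken from the Bazel postmortem.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` looks like "Mar 11 00:00:00 2030 GMT"; `now` should be
    a timezone-aware datetime.
    """
    expiry_seconds = ssl.cert_time_to_seconds(not_after)
    return (expiry_seconds - now.timestamp()) / 86400


def needs_attention(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """True when the cert is inside the warning window or already expired."""
    return days_until_expiry(not_after, now) < warn_days
```

For example, with `now = datetime(2030, 3, 1, tzinfo=timezone.utc)` and a `notAfter` of `"Mar 11 00:00:00 2030 GMT"`, `days_until_expiry` returns `10.0` and `needs_attention` is `True` at the default window. Running a check like this from outside the renewal pipeline means a silent renewal failure still produces an alert before the cliff.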
Third is Helm chart reliability. Prequel reviewed 105 charts and found a lot of demo-friendly defaults that don’t hold up under real load, rollouts, or node drains.
Fourth is n8n. Two new high-severity flaws disclosed by JFrog. “Authenticated” still matters because workflow authoring is basically code execution, and these tools sit next to your secrets.
Lightning round: Fence, HashiCorp agent-skills, marimo, and a cautionary agent-loop story.
Links
AWS CodeBreach bulletin https://aws.amazon.com/security/security-bulletins/2026-002-AWS/
Wiz research https://www.wiz.io/blog/wiz-research-codebreach-vulnerability-aws-codebuild
Bazel postmortem https://blog.bazel.build/2026/01/16/ssl-cert-expiry.html
Helm report https://www.prequel.dev/blog-post/the-real-state-of-helm-chart-reliability-2025-hidden-risks-in-100-open-source-charts
n8n coverage https://thehackernews.com/2026/01/two-high-severity-n8n-flaws-allow.html
Fence https://github.com/Use-Tusk/fence
agent-skills https://github.com/hashicorp/agent-skills
marimo https://marimo.io/
Agent loop story https://www.theregister.com/2026/01/27/ralph_wiggum_claude_loops/
Related n8n episodes:
https://www.tellerstech.com/ship-it-weekly/n8n-critical-cve-cve-2026-21858-aws-gpu-capacity-blocks-price-hike-netflix-temporal/
https://www.tellerstech.com/ship-it-weekly/n8n-auth-rce-cve-2026-21877-github-artifact-permissions-and-aws-devops-agent-lessons/
More episodes + details: https://shipitweekly.fm

19 min

Ship It Conversations: AI Automation for SMBs: What to Automate (And What Not To) (with Austin Reed)

Jan 27 2026

This is a guest conversation episode of Ship It Weekly (separate from the weekly news recaps).
In this Ship It: Conversations episode I talk with Austin Reed from horizon.dev about AI and automation for small and mid-sized businesses, and what actually works once you leave the demo world.
We get into the most common automation wins he sees (sales and customer service), why a lot of projects fail due to communication and unclear specs more than the tech, and the trap of thinking “AI makes it cheap.” Austin shares how they push teams toward quick wins first, then iterate with prototypes so you don’t spend $10k automating a thing that never even happens.
We also talk guardrails: when “human-in-the-loop” makes sense, what he avoids automating (finance-heavy logic, HIPAA/medical, government), and why the goal is usually leverage, not replacing people. On the dev side, we nerd out a bit on the tooling they’re using day to day: GPT and Claude, Cursor, PR review help, CI/CD workflows, and why knowing how to architect and validate output matters way more than people think.
If you’re a DevOps/SRE type helping the business “do AI,” or you’re just tired of automation hype that ignores real constraints like credentials, scope creep, and operational risk, this one is very much about the practical middle ground.
Links from the episode:
Austin on LinkedIn: https://www.linkedin.com/in/automationsexpert/
horizon.dev: https://horizon.dev
YouTube: https://www.youtube.com/@horizonsoftwaredev
Skool: https://www.skool.com/automation-masters
If you found this useful, share it with the person on your team who keeps saying “we should automate that” but hasn’t dealt with the messy parts yet.
More information on our website: https://shipitweekly.fm

25 min

