Episodes

  • I put GLM 5.2, the open-weight coding model from Z.AI, through four real tasks inside my actual codebase: a codebase architecture audit, an HTML architecture and roadmap page, a UI redesign, and a 45-minute autonomous bug-hunting session pulling from Sentry and Vercel logs. Total cost: $3.36 for roughly 6 million tokens. The output: a prioritized bug-fix dashboard I’m actually shipping from and a landing page redesign that matched ChatPRD’s design system on the first try.

    What you’ll learn:

    • What “open-weight” actually means and why it matters for cost and vendor independence

    • How to connect GLM 5.2 to Cursor and Claude Code

    • How it performs on codebase exploration and autonomous architecture summarization in a real production Next.js app

    • Whether GLM 5.2 can match an existing design system

    • How the model handles a 45-minute long-running autonomous task

    • Where GLM 5.2 stumbled

    • The actual cost breakdown
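
    The headline cost figure reduces to a blended per-token rate. A minimal sketch of that arithmetic, assuming the rounded figures quoted in the episode summary ($3.36 for roughly 6 million tokens); the function name is illustrative, and the result is only as precise as those rounded inputs.

```python
def blended_cost_per_million(total_cost_usd: float, total_tokens: int) -> float:
    """Return the blended cost in USD per one million tokens, rounded to cents."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return round(total_cost_usd / (total_tokens / 1_000_000), 2)


# Figures from the episode summary; "roughly 6 million" is treated as exactly 6,000,000.
rate = blended_cost_per_million(3.36, 6_000_000)
print(f"${rate:.2f} per million tokens")  # prints: $0.56 per million tokens
```

    This is a blended rate across input and output tokens; the episode's own cost breakdown may split them differently.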

    —

    Brought to you by:

    Mercury—Radically different banking loved by over 300K entrepreneurs

    —

    In this episode, we cover:

    (00:00) What open-weight models are and why GLM 5.2 is worth testing

    (01:38) GLM 5.2 model overview

    (04:02) Capabilities and benchmark results

    (06:02) How to set up GLM 5.2 in Cursor

    (08:37) How to set up GLM 5.2 in Claude Code

    (11:04) Live test 1: codebase exploration and architecture audit on ChatPRD

    (12:43) Live test 2: generating an HTML architecture and roadmap page

    (16:37) Live test 3: redesigning the How I AI landing page in Cursor

    (20:57) Live test 4: 45-minute autonomous task, pulling Sentry errors and Vercel logs

    (22:35) Where it struggled

    (23:49) My verdict on the output

    (25:23) Cost breakdown

    —

    Tools referenced:

    • z.ai: https://z.ai

    • GLM 5.2: https://z.ai/blog/glm-5.2

    • OpenRouter: https://openrouter.ai

    • Cursor: https://cursor.com

    • Claude Code: https://docs.anthropic.com/en/docs/claude-code

    • Sentry: https://sentry.io

    • Vercel: https://vercel.com

    —

    Other references:

    • SWE-Bench Pro leaderboard (coding benchmark scores referenced in episode): https://www.swebench.com

    • Frontier Suite and Post-Train Bench (additional benchmarks cited): https://scale.com/leaderboard

    • Use Claude Code with OpenRouter: https://openrouter.ai/docs/cookbook/coding-agents/claude-code-integration

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • Brian Grinstead is a distinguished engineer at Mozilla, where he’s worked on Firefox and the web platform since 2013 (he joined to help launch Firefox DevTools). Recently he and his team pointed an agentic bug-finding pipeline at Firefox—a codebase with tens of thousands of files and tens of millions of lines of code—and shipped a record month of security fixes. The viral chart everyone saw gave the credit to Anthropic’s new Mythos model. Brian’s take is that the harness and pipeline did just as much of the work, and he walks through exactly how it runs and how anyone can build a starter version.

    What you’ll learn:

    • How to build a basic bug-finding harness by running Claude Code or Codex with one prompt and the -p flag, no SDK required

    • Why pointing an agent at a whole codebase fails, and how an LLM judge can score and rank files before you spend any compute

    • How a verifier subagent kills false positives by catching the agent when it cheats

    • The goal-loop pattern: give an agent a tightly scoped problem, a clear pass/fail signal, and let it retry far past the point a human would quit

    • Why teams that already invested in fuzzing, CI, and dev tooling are so far ahead

    • How to weigh model versus harness, and why Brian splits the credit close to 50-50

    • How a non-engineer can reuse the same score, verify, and fix loop for design quality, conversion rate, or tech debt

    • Why AI-generated patches still can’t ship on their own, and where humans stay in the loop
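
    The score, verify, and retry pattern described in this episode can be sketched in a few lines. This is an illustrative outline only, not Mozilla's pipeline: `rank_files`, `goal_loop`, `Finding`, and the `judge`, `attempt`, and `verify` callables are all hypothetical names, and `build_command` only constructs a one-shot `claude -p` invocation without running it.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Finding:
    path: str
    report: str
    attempts: int


def build_command(prompt: str) -> list[str]:
    """Build a one-shot, non-interactive Claude Code invocation (not executed here)."""
    return ["claude", "-p", prompt]


def rank_files(paths: Sequence[str], judge: Callable[[str], float], top_n: int) -> list[str]:
    """Score every file with a cheap judge and keep only the top_n highest scores."""
    return sorted(paths, key=judge, reverse=True)[:top_n]


def goal_loop(
    path: str,
    attempt: Callable[[str, int], Optional[str]],  # hypothetical: one agent run on one file
    verify: Callable[[str, str], bool],            # hypothetical: independent verifier
    max_attempts: int = 10,
) -> Optional[Finding]:
    """Retry a tightly scoped task until the verifier passes or attempts run out."""
    for n in range(1, max_attempts + 1):
        report = attempt(path, n)
        # Only a report that the separate verifier accepts counts as a finding.
        if report is not None and verify(path, report):
            return Finding(path, report, n)
    return None
```

    In a real harness, `attempt` would shell out to the agent (for example with `subprocess.run(build_command(...))`) and `verify` would be a second, independent check such as a reproducing test case; both are left as injected callables here so the control flow stays visible.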

    —

    Brought to you by:

    WorkOS—Make your app enterprise-ready today

    Metaview—The agentic recruiting platform for winning teams

    —

    In this episode, we cover:

    (00:00) Introduction to Brian Grinstead

    (02:43) The viral chart: Firefox Security Bug Fixes by Month

    (05:32) How the custom harness works

    (10:22) Goal loops and guardrails

    (14:45) How they built it

    (16:55) Real bugs, including a 15-year-old one

    (23:00) Open-sourcing it

    (26:26) Why humans still review every fix

    (32:30) Live demo and prioritizing files

    (40:18) Mobilizing the team and recap

    (42:33) Lightning round

    —

    Tools referenced:

    • Claude Code: https://claude.ai/code

    • Claude Agent SDK: https://code.claude.com/docs/en/agent-sdk/overview

    • Codex: https://openai.com/index/openai-codex/

    • OpenAI Agent SDK: https://developers.openai.com/api/docs/guides/agents

    • VS Code: https://code.visualstudio.com/

    • Docker: https://www.docker.com/

    • Firefox: https://www.mozilla.org/firefox/

    • Address Sanitizer: https://github.com/google/sanitizers

    • RLBox: https://rlbox.dev/

    —

    Other references:

    • Mozilla Bug Bounty Program: https://www.mozilla.org/security/bug-bounty/

    • Mozilla GitHub: https://github.com/mozilla

    —

    Where to find Brian Grinstead:

    LinkedIn: https://www.linkedin.com/in/bgrins/

    GitHub: https://github.com/bgrins

  • I break down every loop type from scratch—what a heartbeat, cron, hook, and goal loop actually are, when each one fits, and the five things any effective loop needs before it touches production. Then I build two live loops: a daily aging-PR reviewer in Claude Code that schedules itself at 10:15 a.m. and spins off its own subagents, and a weekly skills-identification loop in Codex that spawns goal-based subagents to validate its own output in real time.

    What you’ll learn:

    • The plain-English definition of a loop, and why it’s just an automated prompt, not a scary new paradigm

    • The four loop types (heartbeat, cron, hook, and goal) and when each one actually fits your workflow

    • How to think about loop design using the “onboarding an employee” mental model

    • The five things every effective loop needs: work trees, skills, plugins/connectors, subagents, and state tracking

    • How to build a scheduled PR-review routine in Claude Code that babysits aging PRs and alerts your team

    • How to set up a weekly skills-identification automation in Codex that spawns its own validating subagents

    • Why goal-based loops are the hardest to write well, and where most people burn tokens for nothing

    • The two warning signs that your loop is going to get expensive before it gets useful

    —

    Brought to you by:

    WorkOS—Make your app enterprise-ready today

    Runway—The creative AI platform for images, video, and more

    —

    In this episode, we cover:

    (00:00) Prompts are out and loops are in

    (02:30) Defining a loop

    (03:03) The four ways to automate a prompt: heartbeat, cron, hooks, and goals

    (06:03) Five things every effective loop needs

    (09:26) The “onboarding an employee” framework for designing loops

    (11:58) Live build #1: Daily aging PR loop in Claude Code

    (17:08) Subagents inside loops

    (19:00) Live build #2: Weekly skills identification loop in Codex

    (22:57) Watching subagents spin up in real time

    (25:28) Warning signals around loops

    (27:31) What listeners are doing with loops

    —

    Tools referenced:

    • Claude Code: https://claude.ai/code

    • Codex: https://chatgpt.com/codex

    • OpenClaw: https://openclaw.ai/

    —

    Other references:

    • Claire’s article “Why OpenClaw Feels Alive Even Though It’s Not”: https://x.com/clairevo/article/2017741569521271175

    • Addy Osmani’s article on loop engineering: https://addyosmani.com/blog/loop-engineering/

    • Using Goals in Codex: https://developers.openai.com/cookbook/examples/codex/using_goals_in_codex

  • In this episode, I sit down with Ankur Goyal, founder and CEO of Braintrust, the AI evals and observability platform used by teams like Notion, Stripe, Vercel, and Zapier. This one is for the senior engineers, staff engineers, VPs of engineering, and CTOs in my audience. We get into how coding agents can take on deeply technical architecture and infrastructure work that no single human engineer could tackle before, and then we demystify evals so you can use them to make your AI products better without touching the implementation.

    What you’ll learn:

    • How Ankur uses Codex to run week-long benchmark experiments across database indexes, column store formats, and execution engines to speed up slow queries

    • Why he argues there’s no excuse to skip rigorous benchmarking now that agents can run them tirelessly

    • The “agent line” framework: how to decide which decisions, directions, and interactions you can hand off to an agent

    • How I think about the practical vs. theoretical quality of AI on hard technical problems, and why human attention decays on tedious work

    • Why evals are the modern version of a PRD, and how to encode “what good looks like” so a model can figure out the “how”

    • How to build a scoring function live and let an agent improve your prompt inside a safe playground

    • How Ankur turned his designer David’s taste into a repeatable eval so quality scales beyond one person

    • Why fixing your CI is the highest-leverage way to speed up engineering velocity
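
    To make "scoring function" concrete, here is a minimal sketch of an eval for documentation answers. It is not Braintrust's API and not the scorer built in the episode: `score_answer`, `run_eval`, the required-terms check, and the word limit are all assumptions chosen to show how "what good looks like" can be encoded as a number between 0 and 1.

```python
from typing import Callable


def score_answer(answer: str, required_terms: list[str], max_words: int = 120) -> float:
    """Score an answer from 0 to 1: coverage of required terms, halved if too long."""
    text = answer.lower()
    if required_terms:
        hits = sum(term.lower() in text for term in required_terms)
        coverage = hits / len(required_terms)
    else:
        coverage = 1.0
    # Hypothetical conciseness rule: overly long answers keep only half their score.
    concise = 1.0 if len(answer.split()) <= max_words else 0.5
    return round(coverage * concise, 3)


def run_eval(cases: list[dict], generate: Callable[[str], str]) -> float:
    """Average the score over a dataset of {'question', 'required_terms'} cases."""
    if not cases:
        raise ValueError("cases must not be empty")
    scores = [score_answer(generate(c["question"]), c["required_terms"]) for c in cases]
    return sum(scores) / len(scores)
```

    With a scorer like this fixed in place, a prompt change can be judged by whether `run_eval` goes up across the whole dataset rather than by spot-checking a single answer.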

    —

    Brought to you by:

    Guru—The AI layer of truth

    Persona—Trusted identity verification for any use case

    —

    In this episode, we cover:

    (00:00) Introduction to Ankur Goyal

    (03:00) Using AI agents for database optimization

    (06:10) Running exhaustive benchmarks with coding agents

    (09:03) Why staff engineers are wrong about AI limitations

    (11:30) The “agent line” framework for delegation

    (14:00) Ankur’s workflow: running 4 to 6 concurrent agents

    (17:16) Technical setup: foreground agents, background agents, and cloud environments

    (20:32) Spending time with AI tools

    (23:06) Demystifying evals

    (26:02) Live demo: Building an eval for documentation answers

    (30:20) The alternative to evals: vibe checks and whack-a-mole

    (32:09) Capturing designer taste in scoring functions

    (33:13) Quick recap

    (33:44) Managing velocity and throughput

    (35:40) Why CI/CD investment is critical for AI-accelerated teams

    (37:30) Ankur’s prompting strategy when agents fail

    (39:10) Closing thoughts and how to connect

    —

    Tools referenced:

    • Braintrust: https://www.braintrust.dev/

    • Codex: https://openai.com/codex/

    • GPT 5.4: https://developers.openai.com/api/docs/models/gpt-5.4

    • Claude: https://claude.ai/

    —

    Other references:

    • GPT 5.5 just did what no other model could: https://www.lennysnewsletter.com/p/gpt-55-just-did-what-no-other-model

    • Paul Graham’s Maker vs. Manager Schedule: http://www.paulgraham.com/makersschedule.html

    • tmux: https://github.com/tmux/tmux

    • Chris Tate at Vercel: https://www.linkedin.com/in/ctatedev/

    —

    Where to find Ankur Goyal:

    LinkedIn: https://www.linkedin.com/in/ankrgyl/

  • Claude Fable 5 is the first Mythos-class intelligence model to be generally available, and I got early access to test it before launch. In this episode, I walk through what Anthropic is promising, what actually stood out when I used it on real work, and where I think it fits in your AI stack.

    —

    In this episode, we cover:

    (00:00) Introduction: Fable 5 is finally here

    (00:31) What Anthropic says about the model

    (05:14) Token-intensive by design

    (06:28) Safety classifiers and the new fallback concept

    (07:46) Is this or is this not Mythos?

    (08:30) New product launches: Managed Agents and more

    (09:20) Crushing benchmarks

    (09:55) What it’s actually like to use (the good and the bad)

    (11:40) Test 1: product graph spec

    (12:56) Test 2: designing a skills registry

    (14:04) Conservative on execution

    (14:43) Test 3: multi-agent orchestration

    (15:39) My takeaways

    —

    Tools referenced:

    • Claude Fable 5: https://www.anthropic.com/news/claude-fable-5-mythos-5

    • Claude Managed Agents: https://platform.claude.com/docs/en/managed-agents/overview

    —

    Other reference:

    • SWE-Bench Pro benchmark: https://www.swebench.com/

  • Nicole Ruiz is a writer and parent who has built a comprehensive AI-powered shopping system to help her family buy high-quality, long-lasting items while avoiding the noise of drop-shipping brands, paid ads, and poorly made products. She writes an interview series on Substack about how technology is changing the household.

    What you’ll learn:

    • How to build a Claude Project with custom instructions for vetting brands based on heritage, craftsmanship, and return policies

    • The shopping criteria that help surface century-old manufacturers over trendy direct-to-consumer brands

    • How to use Claude to search through trusted vendor websites that have terrible UX

    • Why AI actually helps small artisans and heritage brands compete against Amazon’s infrastructure

    • How to use Claude Cowork to automate returns by finding receipts in your email and drafting refund requests

    • The technique for getting Claude to analyze whether a brand is legitimate or just a drop-shipping operation

    • How to shop within a specific budget or with gift cards using AI assistance

    —

    Brought to you by:

    Orkes—The enterprise platform for reliable applications and agentic workflows

    Metaview—The agentic recruiting platform for winning teams

    —

    In this episode, we cover:

    (00:00) Introduction to Nicole and AI-powered shopping

    (02:29) The problem

    (04:55) Building a Claude Project for household purchasing

    (07:44) The “anti-to-do list” concept for reducing mental overhead

    (10:30) Shopping for a can opener: the system in action

    (15:53) How AI helps century-old brands with terrible websites

    (18:45) Processing returns with Claude Cowork

    (25:06) Using gift cards strategically

    (26:33) Vetting brands

    (29:40) Recap, lightning round, and final thoughts

    —

    Tools referenced:

    • Claude: https://claude.ai/

    • Claude Cowork: https://www.anthropic.com/product/claude-cowork

    —

    Other references:

    • Boston General Store: https://bostongeneralstore.com/

    • L.L.Bean: https://www.llbean.com/

    • Manufactum: https://www.manufactum.com/

    • 5 OpenClaw agents run my home, finances, and code | Jesse Genet: https://www.lennysnewsletter.com/p/5-openclaw-agents-run-my-home-finances

    • From a $6.90 newsletter to $3M API: How a non-coder built Memelord | Jason Levin: https://www.lennysnewsletter.com/p/from-a-690-newsletter-to-3m-api-how

    —

    Where to find Nicole Ruiz:

    X: https://x.com/nwilliams030

    Substack (The Third Oikos): https://www.thirdoikos.com/

  • In this experimental episode, I document my real-time attempt to create an AI avatar of myself using Google Flow and the new Gemini Omni video generation model. I walk through the entire process—from scanning my face with my phone to generating a complete one-minute hype video for the podcast, all in about 15 minutes.

    What you’ll learn:

    • How to create an AI avatar using Google Flow in under five minutes

    • Why video AI tools unlock creative possibilities for people with zero video production skills

    • The step-by-step process of generating a full storyboard using AI as your creative producer

    • How to use character consistency features to generate multiple video scenes with the same avatar

    • The uncanny-valley moments you’ll encounter when your AI clone doesn’t quite nail emotions or physics

    • How to stitch together AI-generated scenes into a complete video using built-in editing tools

    —

    Brought to you by:

    Merge—Connective infrastructure for production AI

    Jira Product Discovery—Prioritize with insights, build with confidence

    —

    In this episode, we cover:

    (00:00) Getting started with Google Flow and Gemini Omni

    (01:38) The avatar creation process: scanning and photo capture

    (02:55) Using Flow to brainstorm a hype video storyboard

    (06:59) Generating the first video scene with the avatar

    (08:41) Troubleshooting: accidentally generating images instead of videos

    (09:32) Generating all seven scenes for the complete video

    (11:37) Reviewing the avatar videos

    (13:13) Stitching the videos together in the browser-based editor

    (14:32) The complete How I AI hype video

    (15:32) What worked and what didn’t

    (19:04) Final thoughts

    —

    Tools referenced:

    • Google Flow: https://labs.google/fx/tools/flow

    • Gemini Omni: https://gemini.google/overview/video-generation/

    • Veo 3: https://deepmind.google/technologies/veo/

  • Bryce Rattner Keithley has spent her career in talent and recruiting, working with technical leaders but never writing a line of code herself. Yet she managed to build Daily Hundred—a fitness app featuring custom AI-generated videos of anthropomorphic animals demonstrating exercises—and ship it to the App Store before her software engineer friends. Using Replit, Claude, Gemini, and a relentless beginner’s mindset, Bryce proves that in the AI era, execution is no longer the constraint on good ideas.

    What you’ll learn:

    • How to build and ship an iPhone app using Replit without any coding knowledge

    • The step-by-step process for creating custom AI-generated workout videos by combining Gemini images with real exercise footage

    • How to use Claude as your technical architect and Claude Code as your software engineer

    • How to navigate App Store submission requirements (including fixing rejection feedback)

    • Why being hyper-literal in your prompts unlocks better AI results

    • Why a beginner’s mind is actually an advantage when building with AI tools

    —

    Brought to you by:

    WorkOS—Make your app enterprise-ready today

    Metaview—The agentic recruiting platform for winning teams

    —

    In this episode, we cover:

    (00:00) Introduction to Bryce and Daily Hundred

    (04:48) Building with Replit

    (06:16) The beginner’s mindset advantage

    (11:17) Creating anthropomorphic animals

    (22:55) Moving from static image to video

    (27:15) The floating genie and other anthropomorphic animal generations

    (30:46) Shifting from web app to App Store submission

    (36:24) User feedback

    (37:41) Lightning round and final thoughts

    —

    Tools referenced:

    • Replit: https://replit.com/

    • Lovable: https://lovable.dev/

    • Claude: https://claude.ai/

    • Claude Code: https://claude.ai/code

    • Gemini: https://gemini.google.com/

    • Higgsfield: https://higgsfield.ai/

    • Kling: https://kling.ai/

    • Railway: https://railway.app/

    • TestFlight: https://developer.apple.com/testflight/

    —

    Other references:

    • How a 91-year-old vibe coded a complex event management system using Claude and Replit | John Blackman: https://www.lennysnewsletter.com/p/how-a-91-year-old-vibe-coded-a-complex

    • What Got You Here Won’t Get You There: https://www.amazon.com/What-Got-Here-Wont-There/dp/1401301304

    • How Women Rise: https://www.amazon.com/How-Women-Rise-Holding-Careers/dp/0316440124

    • A Whole New Mind: https://www.amazon.com/Whole-New-Mind-Right-Brainers-Future/dp/1594481717

    • How to Win Friends and Influence People: https://www.amazon.com/How-Win-Friends-Influence-People/dp/0671027034

    —

    Where to find Bryce Rattner Keithley:

    LinkedIn: https://www.linkedin.com/in/brycerattner/

    GitHub: https://github.com/brk-bot/

    Daily Hundred on the App Store: https://apps.apple.com/us/app/daily100-fitness-challenge/id6762108062

  • I got a few hours of early-access testing with Anthropic’s newly released model Opus 4.8. I walk through real coding, design, and strategy tasks across Claude Code and Claude Cowork, and give you my unfiltered view on what impressed me and what didn’t.

    —

    What you’ll learn:

    • Where Opus 4.8 excels: greenfield prototypes, one-shot features, and fast execution

    • Where it struggles: the last 10%, edge cases in existing codebases, and hallucinations

    • How Opus 4.8 compares to Opus 4.7 on business strategy work

    • Why I’m still reaching for Opus 4.7 on data-heavy strategy and roadmap work

    • The new features shipping alongside the model: dynamic workflows with parallel subagents and effort control in Claude.ai and Cowork

    • The prompting and harness strategy I’d use to get the most out of it

    —

    In this episode, we cover:

    (00:00) Introduction to Opus 4.8

    (00:44) Benchmark performance and pricing

    (01:53) First coding test: Building a prototyping tool

    (03:00) Where it failed: The last 10% problem

    (03:27) The hallucination problem

    (04:23) Testing Opus 4.8 on existing codebases

    (05:24) The ambition test: Building games for a 9-year-old

    (07:03) Business strategy test: 4.7 vs 4.8

    (08:23) The roadmap test

    (09:17) Final verdict

    —

    References:

    • System Card: Claude Opus 4.8: https://cdn.sanity.io/files/4zrzovbb/website/c886650a2e96fc0925c805a1a7ca77314ccbf4a6.pdf

    • Introducing Claude Opus 4.8 on X: https://x.com/claudeai/status/2060042702150930686?s=20

  • In this 30-minute episode, I walk through my favorite feature in Codex: the /goal command. I show how Goals transform AI from a turn-based assistant that needs constant ‘what’s next?’ prompting into an autonomous agent that can work for hours on complex, multi-step tasks. I share three real examples: eliminating thousands of Sentry errors, cleaning 3,900 emails down to 68, and organizing hundreds of Linear tasks.

    What you’ll learn:

    • What Goals are and how they differ from standard prompts

    • How I used /goal to eliminate hundreds of error logs in my codebase over a five-hour autonomous run

    • The non-technical use cases that make Goals incredibly powerful: cleaning up 3,900 emails in under four hours and organizing hundreds of project management tasks in Linear

    • How to write effective /goal prompts with measurable outcomes, verification methods, and constraints

    • When not to use Goals and what makes a strong versus weak Goal

    • Why Goals represent a fundamental shift in how we work with AI, from babysitting the model to managing it
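
    The three ingredients named here (a measurable outcome, a verification method, and constraints) can be assembled into a prompt mechanically. The sketch below is only an illustration: `build_goal_prompt` is a hypothetical helper, the layout of the text after `/goal` is an assumption rather than a format Codex requires, and the example values are invented.

```python
def build_goal_prompt(outcome: str, verification: str, constraints: list[str]) -> str:
    """Assemble a /goal prompt from a measurable outcome, a check, and constraints."""
    if not outcome.strip() or not verification.strip():
        raise ValueError("a goal needs both a measurable outcome and a way to verify it")
    lines = [f"/goal {outcome.strip()}", f"Verify by: {verification.strip()}"]
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {c}" for c in constraints)
    return "\n".join(lines)


# Hypothetical example values, loosely modeled on the Sentry cleanup described in the episode.
print(build_goal_prompt(
    outcome="Reduce unresolved Sentry errors in the web app to zero",
    verification="re-query Sentry after each fix and confirm the count dropped",
    constraints=["Do not change public API responses", "Open one PR per root cause"],
))
```

    Refusing to build a prompt without a verification step is the point of the helper: a goal the agent cannot check for itself is the kind that runs for hours without converging.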

    —

    Brought to you by:

    Mercury—Radically different banking loved by over 300K entrepreneurs

    —

    In this episode, we cover:

    (00:00) Introduction

    (01:50) What is /goal and when should you use it?

    (02:45) The difference between prompts and Goal-based loops

    (04:06) Claire’s first five-hour, 45-minute autonomous coding task

    (05:05) How to manage a Goal lifecycle: view, pause, resume, and clear

    (06:06) How to write strong goals: outcomes vs. outputs

    (07:34) The six components of effective Goals

    (08:57) Example: Reducing P95 checkout latency with /goal

    (09:36) Demo: Using /goal to eliminate Sentry errors in ChatPRD

    (13:18) Demo: Burning down Vercel API errors

    (17:28) Non-technical use case: Cleaning 3,900 emails with /goal

    (21:24) Demo: Using /goal to clean up Linear project tasks

    (24:41) When not to use /goal

    (26:10) Why /goal changes everything

    —

    Tools referenced:

    • Codex: https://openai.com/codex/

    • Sentry: https://sentry.io/

    • Vercel: https://vercel.com/

    • Linear: https://linear.app/

    —

    Other reference:

    • OpenAI blog post “Using Goals in Codex”: https://developers.openai.com/cookbook/examples/codex/using_goals_in_codex

  • Felix Rieseberg is the engineering lead for Claude Cowork and Claude Code Desktop at Anthropic. He previously spent five years at Slack building developer tools. In this episode, Felix demonstrates how he uses Claude to solve real-life problems: analyzing floor plans to build interactive 3D house walkthroughs, automatically tracking promises he makes on Twitter, and building a $20 hardware device that physically approves Claude actions with a button press.

    What you’ll learn:

    • How to use Claude Cowork to turn a 2D floor plan into an interactive 3D walkthrough where you can move furniture around

    • The “go one abstraction layer up” philosophy: why you should never manually enter data Claude can find itself

    • How to use your email as an inventory database for furniture, clothing, and personal purchases

    • When to use Opus vs. Sonnet 4.6 (hint: it’s about how well you can scope the problem, not technical complexity)

    • How live artifacts work and why they’re powerful for dashboards that refresh with real-time data from your connectors

    • The product philosophy behind making latency delightful

    • How to build your own $20 hardware device using Claude Code (no hardware experience required)

    • Why Felix never reads the code Claude writes and judges it purely on output

    —

    Brought to you by:

    Magic Patterns—Prototypes that look like your product

    Guru—The AI layer of truth

    —

    In this episode, we cover:

    (00:00) Introduction to Felix Rieseberg

    (02:40) Felix’s role at Anthropic

    (03:25) The multiple tabs in Claude and why they exist

    (05:55) Using Claude Cowork to design a new house using floor plans

    (09:52) When to use Opus versus Sonnet 4.6

    (12:37) Building an interactive 3D furniture planner

    (14:30) Using your email as a source of truth for personal inventory

    (15:58) The anti-to-do list: going one abstraction layer up

    (23:14) Introduction to live artifacts

    (26:02) Building a personal dashboard with live data

    (28:37) Being polite to Claude (and why it matters for your humanity)

    (30:28) Claude interaction tips

    (32:33) Looking at the daily dashboard

    (33:55) How live artifacts work with connectors

    (35:02) Redesigning the dashboard

    (37:55) The biggest gap: people don’t know what problems AI can solve

    (41:52) The reverse interview

    (42:30) Making latency delightful through asynchronous design

    (44:05) The redesigned dashboard

    (45:28) AI should free up your creative energy

    (46:44) Building a $20 hardware Claude buddy

    (52:33) Why kids are magical AI users

    (54:30) Recap and final thoughts

    —

    Tools referenced:

    • Claude Cowork: https://www.anthropic.com/product/claude-cowork

    • Claude Code: https://claude.ai/code

    • Claude for Chrome: https://code.claude.com/docs/en/chrome

    • Claude Desktop: https://claude.ai/download

    • Live Artifacts: https://support.claude.com/en/articles/14729249-use-live-artifacts-in-claude-cowork

    • Connectors (Spotify, Gmail, Calendar, Notion): https://claude.ai/settings/connectors

    • Slack: https://slack.com/

    —

    Where to find Felix Rieseberg:

    Website: https://felixrieseberg.com/

    LinkedIn: https://www.linkedin.com/in/felixrieseberg/

    X: https://x.com/felixrieseberg

    GitHub: https://github.com/felixrieseberg

  • Today is day one of Google I/O 2026, and I walk through every major announcement live, from the new Gemini 3.5 model family to Antigravity 2.0, Google AI Studio, Gemini’s consumer redesign, the Omni video model, Flow, Stitch, and Pomelli. I test them in real time and tell you exactly which ones delivered.

    What you’ll learn:

    • How Gemini 3.5 Flash benchmarks against Claude and GPT models on speed and agentic coding tasks

    • How Antigravity 2.0’s new features (projects, scheduled tasks, subagents, slash commands) compare to Codex and Claude Code

    • Why the /grill-me slash command could be a more aggressive alternative to Claude Code’s clarification flow, and how to use it

    • How Google AI Studio’s new Workspace integration is designed to own the internal productivity app use case

    • How Google’s new creative tools work in practice: Omni (video generation), Flow (cinematic video editing and character consistency), Stitch (streaming UI design with inline edits), and Pomelli (brand identity and asset generation)

    • Why Google’s launch-to-availability gap is still a problem, and what to do when a featured product doesn’t actually work yet

    —

    Brought to you by:

    Magic Patterns—Prototypes that look like your product

    Thoughtspot—Build AI-powered analytics into your product

    —

    In this episode, we cover:

    (00:00) Google I/O 2026 day 1 overview

    (01:47) Gemini 3.5 Flash

    (04:19) Antigravity updates

    (06:32) CLI test and agent features

    (07:59) Core agent features released today—May 19th, 2026

    (09:43) New slash commands

    (11:20) Antigravity test results and takeaways

    (12:25) AI Studio updates

    (13:52) Access issues

    (15:20) Gemini redesign

    (17:24) Gemini image gen test

    (19:16) Omni (video generation)

    (22:56) Flow (cinematic editing)

    (24:31) Avatar creation test

    (26:45) Pomelli and Stitch

    (31:13) Recap and final thoughts

    —

    Tools referenced:

    • Gemini 3.5 Flash: https://deepmind.google/technologies/gemini/

    • Antigravity: https://antigravity.google/

    • Google AI Studio: https://aistudio.google.com/

    • Google Gemini: https://gemini.google.com/

    • Omni (video generation): https://gemini.google/overview/video-generation/

    • Google Flow: https://flow.google/

    • Stitch: https://stitch.withgoogle.com/

    • Pomelli (Google brand tool): https://labs.google.com/pomelli/about/

    —

    Other references:

    • Google I/O 2026 announcements: https://blog.google/innovation-and-ai/sundar-pichai-io-2026/

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • Thariq Shihipar is an engineer at Anthropic working on the Claude Code team. He’s spent the past several months experimenting with HTML as a replacement for Markdown in planning and implementation workflows, discovering that richer visual formats lead to better human engagement—and, ultimately, better products. In this episode, filmed at Anthropic’s Code with Claude event in San Francisco, Thariq demonstrates how to use HTML artifacts to create interactive plans, build throwaway UIs for specific problems, and maintain living design systems that travel with your codebase.

    What you’ll learn:

    • Why HTML has replaced Markdown as the ideal format for AI agent communication and planning

    • How to brainstorm in HTML to get visual mockups and interactive demos instead of text lists

    • The technique for building throwaway micro-UIs to edit specific parts of your plan

    • How to create a living design system in HTML that lives in your repo and travels with every project

    • Why “complexity has to earn its keep” and how HTML helps you stay in the loop without over-constraining Claude

    • The prompting technique that gives Claude flexibility while ensuring that you get what you need

    • Why 99% of your AI-generated tokens should go to planning, interfaces, and communication, not production code

    —

    Brought to you by:

    Celigo—Intelligent automation built for AI

    Persona—Trusted identity verification for any use case

    —

    In this episode, we cover:

    (00:00) Introduction

    (02:39) HTML as the new Markdown

    (04:30) The compute allocator mindset

    (05:51) How HTML makes specs more engaging

    (06:48) Demo: Brainstorming in HTML with Claude Code

    (09:24) From brainstorm to full implementation plan

    (11:20) Prompting philosophy: Trust Claude but give it constraints

    (13:50) The future of PRDs and tech specs

    (18:16) Making HTML specs editable

    (20:23) The abundance mindset

    (24:17) Just-in-time documentation and throwaway software

    (25:39) Using plans as artifacts for implementation

    (26:39) Demo: Living design systems in HTML

    (30:16) Adding comments and annotations to HTML plans

    (31:42) Recap: The HTML workflow

    (32:21) Lightning round and final thoughts

    —

    Tools referenced:

    • Claude Code: https://claude.ai/code

    • Claude Design: https://claude.ai/design

    • AWS: https://aws.amazon.com/

    • Figma: https://www.figma.com/

    • GitHub: https://github.com/

    —

    Other references:

    • Anthropic Code with Claude event: https://claude.com/code-with-claude

    • SpaceX partnership announcement: https://www.anthropic.com/news/higher-limits-spacex

    • Jevons paradox: https://en.wikipedia.org/wiki/Jevons_paradox

    —

    Where to find Thariq Shihipar:

    Website: https://www.thariq.io/

    LinkedIn: https://www.linkedin.com/in/thariqshihipar/

    X: https://x.com/trq212

    GitHub: https://github.com/ThariqS

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • Ryan Nystrom is a software engineer at Notion. He joined in December 2024 after Notion acquired Campsite, the team communication platform he co-founded with Brian Lovin. At Notion, he’s been a core builder of Notion AI and the Custom Agents feature launched in February 2026. He manages a team of six to seven engineers while still writing code himself, currently running Project Afterburner, a push to cut Notion’s CI time to a quarter of its current duration.

    What you’ll learn:

    • How to build a Notion AI custom agent that auto-generates your daily standup pre-read by pulling from Slack, GitHub, Honeycomb metrics, and yesterday’s meeting transcript

    • How to configure subagents and MCP integrations within Notion AI

    • How Notion’s internal “Boxy” system lets engineers @mention Codex from within Notion comments and get a full pull request with screenshots in 20 minutes

    • The spec-first development workflow: dictate an idea into Whisper, have Codex format it as a proper spec, commit it to the repo, and let the agent implement and verify it autonomously

    • Why fast CI is absolutely critical in the age of AI coding agents

    • How to prompt AI coding agents to defend their reasoning under pushback

    • Why engineering managers and even senior executives should keep writing code

    —

    Brought to you by:

    WorkOS—Make your app enterprise-ready today

    Orkes—The enterprise platform for reliable applications and agentic workflows

    —

    In this episode, we cover:

    (00:00) Introduction to Ryan Nystrom

    (02:48) How AI has upended 12+ years of the same working routine

    (04:30) Project Afterburner: Notion’s push to cut CI time to a quarter

    (09:00) Why high-frequency, high-quality meetings beat lower-frequency standups

    (11:10) How automated context surfaces every engineer’s work equally

    (12:15) Why cutting meeting prep is a burnout protection mechanism

    (14:26) The case for engineering managers writing code

    (16:13) Inside “Boxy”: Notion’s internal VM-based background agent system

    (20:30) Old World vs. New World code review

    (24:51) Prompting Codex from Notion comments

    (29:20) The emotions around code review

    (31:01) Quick recap

    (32:00) Spec-first development: writing and checking agent specs into the repo

    (35:10) The spec as changelog: version control for how a feature actually works

    (37:53) How engineers’ roles are evolving

    (39:00) Lightning round

    (45:21) Where to find Ryan

    —

    Tools referenced:

    • Notion AI: https://www.notion.com/product/ai

    • Notion Custom Agents: https://www.notion.com/blog/introducing-custom-agents

    • Codex (OpenAI): https://openai.com/codex

    • Claude Code (Anthropic): https://claude.ai/code

    • Honeycomb (observability + MCP): https://www.honeycomb.io

    • Whisper (OpenAI voice transcription): https://openai.com/research/whisper

    • Slack: https://slack.com

    • GitHub: https://github.com

    —

    Other references:

    • How Stripe built “minions”—AI coding agents that ship 1,300 PRs weekly from Slack reactions | Steve Kaliski (Stripe): https://www.chatprd.ai/how-i-ai/stripes-ai-minions-ship-1300-prs-weekly-from-a-slack-emoji

    • Notion 3.3 Custom Agents launch (February 24, 2026): https://www.notion.com/releases/2026-02-24

    —

    Where to find Ryan Nystrom:

    X: https://x.com/ryannystrom

    LinkedIn: https://www.linkedin.com/in/ryannystrom/

    GitHub: https://github.com/rnystrom

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • Claire breaks down the biggest announcements from Anthropic’s “Code with Claude” event and what they actually mean for builders shipping AI products today. From scheduled AI routines to outcome-based agents, multi-agent orchestration, and new memory systems, Claire walks through the features she’s most excited to use immediately—and how they could reshape the future of agentic software.

    What you’ll learn:

    • How Claude Code routines let you automate recurring workflows on schedules or webhooks

    • What “Outcomes” are and how rubric-based agent grading works

    • How multi-agent orchestration enables specialized AI teams with different roles and tools

    • Why Anthropic’s new “Dreams” memory system matters for long-term agent behavior

    • Why increased Claude Code usage limits are a bigger deal than they sound

    • How Claire thinks about building practical agentic products today

    —

    Resources:

    • Code with Claude: https://claude.com/code-with-claude

    • Claude Code Routines Docs: https://code.claude.com/docs/en/routines

    • Define Outcomes Docs: https://platform.claude.com/docs/en/managed-agents/define-outcomes

    • Dreams Docs: https://platform.claude.com/docs/en/managed-agents/dreams

    • Multi-Agent Docs: https://platform.claude.com/docs/en/managed-agents/multi-agent

    • Managed Agent Webhooks Docs: https://platform.claude.com/docs/en/managed-agents/webhooks#supported-event-types

    • Codex (OpenAI): https://openai.com/codex

    • GitHub: https://github.com

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • John Kim is the co-founder and CEO of Delight.ai, a customer experience platform that’s transforming how companies deploy AI. But what makes John’s story fascinating isn’t just his product; it’s how he’s turned his entire company into an AI-native organization. His marketing team built a fully functional e-commerce swag store with Stripe integration in days. His sales team built their own CRM tools. His recruiting team automated their entire workflow. And it’s all tracked, measured, and celebrated through an internal platform called Automators.

    What you’ll learn:

    • How Sendbird’s marketing team built a fully functional swag store with Stripe integration in a day (with no engineering support)

    • How the Automators platform works: an internal marketplace where anyone can request AI tools and engineers (or AI agents) can build them

    • How to create secure, compliant templates so non-technical teams can ship to production safely

    • How Sendbird built a token usage dashboard with five tiers (beginner through AI God) and why tracking the smoothness of the curve matters more than the total

    • Why visible leadership usage is the most powerful adoption signal

    • Why Sendbird rewrote job descriptions to prioritize curiosity, agency, and energy over years of experience

    • How John uses AI for his own learning

    —

    Brought to you by:

    WorkOS—Make your app enterprise-ready today

    ThoughtSpot—Build AI-powered analytics into your product

    —

    In this episode, we cover:

    (00:00) Introduction to John Kim

    (02:45) The Delight.ai swag store built by marketing in two days

    (05:51) The before times: when fun had to earn its place on the roadmap

    (07:55) Demo: The Automators platform and quest system

    (13:47) The AI Engineer for Internal Operations role

    (16:06) Demo: The company-wide skills marketplace

    (17:19) Treating AI adoption as a product

    (18:43) Real wins: team-level and campaign examples

    (21:51) Why SaaS isn’t dead—it’s being rebuilt internally

    (23:46) Demo: The token tracking dashboard

    (26:32) Measuring without fear: setting expectations, not punishments

    (28:54) Quick recap

    (30:51) Personal AI use cases: endless knowledge at your fingertips

    (36:15) Lightning round and final thoughts

    —

    Tools referenced:

    • Claude Code: https://claude.ai/code

    • Codex (OpenAI): https://openai.com/codex

    • Obsidian: https://obsidian.md

    • GitHub: https://github.com

    • Stripe: https://stripe.com

    —

    Other references:

    • Jason Levin (CEO of Memelord) on How I AI: https://www.lennysnewsletter.com/p/from-a-690-newsletter-to-3m-api-how

    • Konami Code: https://en.wikipedia.org/wiki/Konami_Code

    • Andrew Huberman’s podcast: https://hubermanlab.com/

    • Y Combinator: https://www.ycombinator.com/

    —

    Where to find John Kim:

    X: https://x.com/doshkim

    Instagram: https://instagram.com/dosh

    LinkedIn: https://www.linkedin.com/in/doshkim/

    Company: https://delight.ai

    Delight.ai Spark Conference (May 7, SF): https://delight.ai/spark

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • Owen Williams is a design manager at Stripe who built Protodash, an internal AI-powered prototyping platform that lets designers and PMs create high-quality Stripe dashboard prototypes without writing code. What started as a bundle of Cursor rules and React components evolved into a full web-based prototyping studio that runs in dev boxes, complete with design review modes, variant testing, and AI-powered iteration. Surprisingly, PMs now use Protodash just as much as designers, fundamentally changing how Stripe approaches prototyping, design reviews, and engineering handoffs.

    What you’ll learn:

    • How Stripe built an internal AI prototyping tool using Cursor rules, MCPs, and their design system

    • Why “blurple slop” happens when designers use generic AI tools, and how to fix it

    • The architecture behind Protodash: React Router, design system components, and MCP integrations

    • How Stripe prototypes in dev boxes so designers never have to worry about local setup

    • Why “demos, not memos” transformed Stripe’s design review culture

    • How Stripe built design review modes, variant testing, and AI annotation directly into its prototyping tool

    • Why internal tools don’t need to be production-grade to be transformative

    —

    Brought to you by:

    Celigo—Intelligent automation built for AI

    Cursor—The best way to code with AI

    —

    In this episode, we cover:

    (00:00) Welcome and intro to Owen Williams

    (02:19) The “blurple slop” problem with AI design tools

    (03:50) Protodash: an internal vibe-coding tool for Stripe prototypes

    (05:26) Why an engineering background helped Owen lower the bar for designers

    (07:55) The Cursor rules that taught the Stripe design system

    (09:04) Running prototypes on dev boxes vs. locally

    (10:30) “Demos, not memos” and rewiring design reviews at Stripe

    (14:50) Building Protodash Studio: a browser-based wrapper for prototyping

    (19:04) Live demo: variants, line charts, and remixing prototypes in browser

    (21:02) Self-testing prototypes that take screenshots and check their work

    (23:20) Multiple variant features

    (26:08) The annotate-for-AI button for in-canvas feedback

    (27:21) Design review mode: comments, summaries, and AI follow-up

    (29:39) Why building internal tools beats buying off-the-shelf

    (32:50) PMs as the surprise power users of Protodash

    (35:20) Live demo: a Black Friday/Cyber Monday pet store dashboard

    (42:03) Lo-fi modes, monospace fonts, and “Comic Sans for WIP” at Shopify

    (44:45) Quick recap

    (45:35) The Radar prototype that changed engineering handoff

    (49:08) Lightning round and final thoughts

    —

    Blog & detailed workflow walkthroughs from this episode:

    Stripe’s Owen Williams on Killing ‘Blurple Slop’ with an Internal Prototyping Studio: http://chatprd.ai/how-i-ai/stripe-owen-williams-on-buildling-internal-prototyping-studio

    ↳ How To Connect a Design System to an AI Code Editor for High Fidelity Prototypes: https://www.chatprd.ai/how-i-ai/workflows/how-to-connect-a-design-system-to-an-ai-code-editor-for-high-fidelity-prototypes

    ↳ Streamline Design Reviews with an AI-Powered Prototyping Studio: https://www.chatprd.ai/how-i-ai/workflows/streamline-design-reviews-with-an-ai-powered-prototyping-studio

    ↳ Build a Personal AI App to Track Purchases and User Manuals: https://www.chatprd.ai/how-i-ai/workflows/build-a-personal-ai-app-to-track-purchases-and-user-manuals

    —

    Tools referenced:

    • v0: https://v0.app/

    • Cursor: https://cursor.com/

    • Claude Code: https://www.claude.com/product/claude-code

    • Claude Design: https://www.anthropic.com/news/claude-design-anthropic-labs

    • Figma: https://www.figma.com/

    • Stripe Radar: https://stripe.com/radar

    • Balsamiq: https://balsamiq.com/

    —

    Where to find Owen Williams:

    X: https://x.com/ow

    Website: https://owenwillia.ms/

    LinkedIn: https://www.linkedin.com/in/owenpwilliams

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • Jason Levin is the CEO and founder of Memelord, an AI-powered meme creation platform that helps brands and individuals create contextual, trending memes. He started Memelord as a $6.90-per-month newsletter sending subscribers to a Google Slides deck, grew it to $100K ARR on Bubble without hiring engineers, then raised $3M to build it into an API-first product.

    What you’ll learn:

    • How Jason grew Memelord from a $6.90/month newsletter to $100K ARR without writing a single line of code

    • Why “no UX is the best UX” and how agents are becoming Memelord’s primary users

    • The mandatory vibe-coding rule for his marketing team and how it unlocks unprecedented creativity

    • Why free tools are the new PDF downloads and how they’ve generated hundreds of thousands of emails

    • Jason’s hardware hacking projects, including a bedside keyboard that creates Linear tickets without waking his wife

    • Why AI can be funny (but humans are still funnier) and which model is the funniest

    • The philosophy of building hyper-personalized software just for yourself

    —

    Brought to you by:

    WorkOS—Make your app enterprise-ready today

    Persona—Trusted identity verification for any use case

    —

    In this episode, we cover:

    (00:00) Introduction to Jason Levin and Memelord

    (04:28) Demo: Agentic meme creation with OpenClaw

    (06:55) “No UX is the best UX”—building for an agent-first future

    (08:35) How Memelord started as a $6.90 newsletter with Google Slides

    (12:35) Building to $100K ARR on Bubble with 395 workflows

    (15:20) Demo: Free tools section that generates hundreds of thousands of emails

    (17:59) Why Cursor is perfect for non-technical founders

    (20:20) Let your marketers cook—or watch them leave

    (24:19) Commit graph that shows the vibe-coding inflection point

    (25:25) Tools: Claude, Gemini, Linear, PostHog

    (28:19) Build weird stuff in the real world

    (33:24) Creative AI use cases

    (39:56) Using OpenClaw for calendar analysis

    (43:37) Can AI be funny? Which model is funniest?

    (45:26) Memes are not slop

    (46:45) What Jason doesn’t use AI for

    (48:12) Final thoughts

    —

    Blog & detailed workflow walkthroughs from this episode:

    How I AI: Jason Levin’s Workflows for Agentic Memes, Vibe Coding, and Hardware Hacking: https://www.chatprd.ai/how-i-ai/jason-levins-workflows-for-agentic-memes-vibe-coding-and-hardware-hacking

    ↳ Build a Custom Bedside Keyboard for Idea Capture with Raspberry Pi and ChatGPT: https://www.chatprd.ai/how-i-ai/workflows/build-a-custom-bedside-keyboard-for-idea-capture-with-raspberry-pi-and-chatgpt

    ↳ Build Free Marketing Tools as Lead Magnets Using AI Code Assistants: https://www.chatprd.ai/how-i-ai/workflows/build-free-marketing-tools-as-lead-magnets-using-ai-code-assistants

    ↳ Automate Meme Marketing with an AI Agent and OpenClaw: https://www.chatprd.ai/how-i-ai/workflows/automate-meme-marketing-with-an-ai-agent-and-openclaw

    —

    Tools referenced:

    • Memelord API: https://memelord.com/api

    • Cursor: https://cursor.com/

    • Bubble: https://bubble.io/

    • OpenClaw: https://openclaw.ai

    • Claude: https://claude.ai/

    • ChatGPT: https://chat.openai.com/

    • Gemini: https://gemini.google.com/

    • Grok: https://grok.x.ai/

    • Linear: https://linear.app/

    • PostHog: https://posthog.com/

    • Zapier: https://zapier.com/

    —

    Other references:

    • Diego Zaks—“The best UX is no UX”: https://x.com/diegozaks/status/1966526522136649980

    • Sam Lessin: https://wlessin.com/

    • “Stop giving me advice”: https://stopgivingmeadvice.com

    • Memelord free tools: https://memelord.com/tools

    —

    Where to find Jason Levin:

    Twitter: https://twitter.com/iamjasonlevin

    Instagram: https://instagram.com/iamjasonlevin

    LinkedIn: https://www.linkedin.com/in/iamjasonlevin/

    Memelord: https://memelord.com

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • In this mini episode, I break down OpenAI’s new GPT 5.5 and GPT 5.5 Pro after weeks of early testing. I walk through three real jobs I threw at the model: building an app for me to teach my second grader more advanced subtraction concepts, tackling a tech debt problem in the ChatPRD codebase, and hacking into a proprietary Bluetooth pixel display that every other model had failed me on. My verdict: higher intelligence, better efficiency, and genuinely autonomous long-running loops that change what I think is worth tackling.

    What you’ll learn:

    • How I think about GPT 5.5 Pro’s pricing vs engineering time, and when I believe the “intelligence tax” is worth paying

    • Why I treat GPT 5.5 as a developer model first, and why I couldn’t find a consumer use case that justified its intelligence

    • The exact prompt pattern I use to unlock a long-running autonomous subagent loop

    • How I got a near-six-hour autonomous run to one-shot 98% of edge cases in a migration over millions of chat threads and drop my Sentry error rate to the floor

    • Why I’m now throwing GPT 5.5 at tech debt, flaky tests, and security backlogs first

    • How I combined a Bluetooth packet sniffer and GPT 5.5 to reverse-engineer a proprietary pixel speaker after Claude Code and GPT 5.4 both gave up

    • How I use the /personality command inside Codex to swap the default “baked potato” tone for something I actually enjoy working with

    —

    In this episode, I cover:

    (00:00) Introduction to GPT 5.5 testing

    (00:40) What is GPT 5.5 and how much does it cost?

    (03:23) Testing GPT 5.5 in ChatGPT: the intelligence overhang problem

    (07:12) Moving to Codex: where GPT 5.5 really shines

    (16:01) Hacking a Chinese Bluetooth speaker

    (21:47) Final thoughts on GPT 5.5’s intelligence and efficiency

    —

    Tools referenced:

    • GPT 5.5 and GPT 5.5 Pro: https://openai.com/index/introducing-gpt-5-5/

    • Codex: https://openai.com/codex/

    • ChatGPT: https://chat.openai.com/

    • Claude Code: https://claude.ai/code

    • Sentry: https://sentry.io/

    • Divoom MiniToo: https://divoom.com/products/minitoo

    —

    Other references:

    • OpenAI Codex Security: https://openai.com/index/codex-security-now-in-research-preview/

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

  • In this mini episode, I do a full walkthrough of the AI design tools that dropped in April 2026: Anthropic’s new Claude Design, OpenAI’s GPT Images 2.0, and Google Labs’ open-source DESIGN.md format. I import a full design system from Lenny’s Newsletter, build a landing page, turn my own article into a polished deck, generate a brand kit for ChatPRD, and run a personal color analysis from a photo.

    What you’ll learn:

    • How Claude Design handles design system imports and whether it can actually replace Figma

    • The three best use cases for Claude Design: marketing landing pages, slide decks, and creative redesigns

    • Why ChatGPT Images 2.0 is a breakthrough for brand kits and layout work

    • Google’s new DESIGN.md standard

    • The practical limits of AI design tools (spoiler: you’ll hit credit limits fast)

    —

    Brought to you by:

    WorkOS—Make your app enterprise-ready today

    Rippling—Stop wasting time on admin tasks, build your startup faster

    —

    In this episode, we cover:

    (00:00) Welcome and what’s in the spring 2026 AI design drop

    (01:45) Claude Design overview

    (03:05) Importing Lenny’s Newsletter design system into Claude Design

    (04:06) How Claude Design structures a design system

    (05:42) Google Labs’ DESIGN.md standard

    (06:41) Building Lenny Doc, a PRD generator landing page using the Lenny design system

    (09:44) Why the three-variation output is Claude Design’s smartest UX choice

    (10:20) Hitting the Claude Design limit and paying $200 to keep going

    (11:05) Where Figma still wins

    (13:20) Reviewing Lenny Doc

    (16:19) Turning an Open Claude article into a branded slide deck

    (17:57) The ’90s GeoCities “Lenny’s Product Zone” redesign

    (19:44) Claude Design recap

    (20:15) ChatGPT Images 2.0 and what makes it the first “thinking” image model

    (21:25) Generating a multi-page brand kit for ChatPRD and iterating with reference images

    (23:43) Personal color analysis demo

    (26:02) Recap

    —

    Detailed workflow walkthroughs from this episode:

    • How I Put Claude Design and GPT Images 2.0 to the Test: Building Landing Pages, Slides, and Brand Kits: https://www.chatprd.ai/how-i-ai/claude-design-and-gpt-images-2-building-landing-pages-slides-and-brand-kits

    • How to Generate a Professional Brand Kit with GPT Images 2.0: https://www.chatprd.ai/how-i-ai/workflows/how-to-generate-a-professional-brand-kit-with-gpt-images-2-0

    • How to Convert an Article into a Polished Slide Deck with AI: https://www.chatprd.ai/how-i-ai/workflows/how-to-convert-an-article-into-a-polished-slide-deck-with-ai

    • How to Build a High-Fidelity Landing Page with Claude Design: https://www.chatprd.ai/how-i-ai/workflows/how-to-build-a-high-fidelity-landing-page-with-claude-design

    —

    Tools referenced:

    • Claude Design: https://claude.ai/design

    • ChatGPT Images 2.0: https://openai.com/index/introducing-chatgpt-images-2-0/

    • Midjourney: https://www.midjourney.com/

    —

    Other references:

    • Google’s DESIGN.md: https://stitch.withgoogle.com/docs/design-md/overview

    • Lenny’s Newsletter: https://www.lennysnewsletter.com/

    • Jamie Gannon “How I AI” episode on reference styles: https://www.lennysnewsletter.com/p/mastering-midjourney-how-to-create

    • Brand prompt inspiration: https://x.com/riomadeit/status/2046682442791071787

    • Figma team “How I AI” episode on design systems: https://www.lennysnewsletter.com/p/from-figma-to-claude-code-and-back

    —

    Where to find Claire Vo:

    ChatPRD: https://www.chatprd.ai/

    Website: https://clairevo.com/

    LinkedIn: https://www.linkedin.com/in/clairevo/

    X: https://x.com/clairevo

    —

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].