Episodes
-
Justin Lebar (jlebar.com) recently spent $10,000 in an afternoon, uncovering critical miscompiles across NVIDIA's PTXAS, LLVM's AMD GPU, and X86 backends. He joins Jordan Nanos (@JordanNanos) to detail his methodology, which combined traditional fuzzing techniques with novel LLM-assisted bug finding. Their discussion highlights the unique challenges of detecting flaws in less-tested ML compilers compared to mature CPU environments.
Lebar shares specific high-severity X86 findings, including a bug in which an atomic operation is split into two non-atomic operations. They explore the comparative efficacy of fuzzing versus LLM agents in identifying these elusive errors. This episode offers critical insights into compiler security and the burgeoning role of AI in automating rigorous code verification for AI infrastructure.
00:00 Introduction and Content Overview
00:25 Justin Lebar's Background and Recent Project
00:59 Fuzzing Techniques for Compiler Bugs
01:56 Motivation Behind the Project
02:48 Challenges in Bug Detection in GPU and ML Compilers
04:13 Bug Severity and Findings in AMD and x86
05:38 Using LLMs to Read and Find Bugs in Code
07:56 Impact of New Models and UltraCode Mode
12:18 Estimating Time and Effort Without AI Assistance
14:22 Limitations of Manual Code Review for Bugs
15:03 Optimism About AI in Software Development
16:17 Next Steps and Future Projects
18:11 Key Takeaways for Developers and Researchers
21:48 Call for Community Engagement and Scientific Approach
-
AWS operating margins jumped 10 percentage points while Microsoft Azure and Google Cloud stayed flat. The driver: Anthropic's Claude usage routing through Bedrock, Amazon's token-as-a-service platform.
Jordan Nanos (@JordanNanos), Jeremie Eliahou Ontiveros (@JeremieEO), Joey Brookhart (@SaasquatchC), and Crystal Huang (@Egg1459) break down why stabilized token margins are fundamentally richer than GPU-as-a-service for hyperscalers. The crew analyzes Anthropic's recent $65B Series H raise, Claude Opus 4.8 release, and SpaceX partnership against the backdrop of 300+ neo clouds fragmenting the traditional cloud moat.
The team forecasts how AWS's workload mix advantage creates sustainable returns while competitors struggle with asset-heavy GPU service models. They examine the $22.7T TAM question, earnings-before-training dynamics, and whether the 2026 AI infrastructure beat belongs to silicon vendors or platform integrators. Subscribe for weekly deep dives into semiconductor and AI infrastructure economics.
00:00 Intro: Episode 13 and the AWS margins article
00:56 What is Bedrock? The three hyperscaler buckets
02:33 AWS margins rising while peers lag
03:33 Cloud moats collapsing and the neo cloud explosion
06:32 Why stabilized token-as-a-service margins are so rich
09:54 Amazon's workload mix advantage
12:41 Forecasting Anthropic and the 4.8 release
16:33 The SpaceX deal and the $65B Series H raise
19:30 Bullish or bearish? Demand becoming supply
28:55 The $22.7T TAM and does the race even matter
31:59 Earnings before training and open-ended TAM
36:27 The 2026 beat is basically one company
40:22 Who wins long term: silicon, partnerships, integration
-
Jordan Nanos (@JordanNanos), Howie, and Myron Xie break down the economics of Cerebras's IPO on the eve of their public debut, examining their OpenAI and Amazon deals that have shifted the company away from Middle Eastern investor concentration toward frontier AI labs willing to pay exponentially more for speed.
The discussion covers Cerebras's radical stitching innovation across a full wafer, creating compute density equivalent to an entire NVL72 rack without off-chip data movement. The hosts analyze whether businesses will accept these premium economics as fast tokens become the new standard for interactive AI applications.
Subscribe for weekly deep dives into semiconductor economics and AI infrastructure developments.
(00:00) Cerebras IPO Preview
(18:58) Need for Speed Fast Tokens
(21:19) Wafer Scale Engine Architecture
(25:31) Radical Stitching Innovation
(31:55) Power Delivery and Cooling
(34:31) Bandwidth and IO Limitations
(37:12) Scaling Beyond Wafer Size
(40:05) Manufacturing and Assembly Bottlenecks
(42:54) Data Center Service Model
-
OpenAI was in serious trouble at the beginning of this year. Anthropic's Claude Opus 4.5 release had triggered a wave of developers to start using Claude Code, pushing Anthropic's revenue past OpenAI's on a like-for-like basis by April. OpenAI's GPT 5.4 response was such an embarrassment they didn't even compare it to Claude in their model release card. Then came GPT 5.5 - finally back on the frontier, but is it enough to reclaim the crown?
Jordan Nanos (@JordanNanos), Dylan Patel (@Dylan522p), Doug O'Laughlin (@FabricatedKnowledge), and Max Kan (@maxkan_) break down the latest AI model wars, from Claude 4.7's coding dominance to DeepSeek's long-delayed v4 release and what it reveals about China's AI capabilities. They analyze token efficiency, benchmark gaming, and why fast mode might be fake news. Subscribe for weekly deep dives into the semiconductor and AI infrastructure powering the future.
The Coding Assistant Breakdown
AI Value Capture
Timestamps:
00:00 OpenAI's Comeback and the Latest AI Model Wars
04:05 The High Cost of AI Models and Fast Mode Effectiveness
08:16 When AI Tokens Become Too Expensive for Tasks
13:11 Why AI Model Quality Degrades and Benchmarks Fail
18:42 Deep Dive into Claude 4.7 Features and Tokenizer Changes
25:29 DeepSeek's Release and China's AI Compute Constraints
28:20 The Future of Context Windows and Agent Orchestration
30:47 The Great Debate: CLI vs. App for AI Interaction
36:33 Debunking AI Fake News and Context Window Limitations
40:51 The AI Race: China, Meta, and the Neo Cloud Vision
43:46 Final Thoughts and Listener Feedback Request
-
This episode features Jordan Nanos (@JordanNanos) and Daniel Nishball (@dnishball) breaking down the economics of GPU clusters through real-world data and experience. Joined by Kang Wen Cheang and Zane Fong, the team moves beyond theoretical TCO models to examine how reliability differences between top-tier and lower-tier providers create significant cost disparities that aren't captured in simple per-GPU pricing. The discussion introduces practical frameworks for measuring goodput and understanding how system failures cascade through entire training jobs.
Nanos walks through the mechanics of fault-tolerant frameworks, including AWS's Checkpointless Training, and explains why a single GPU failure can halt progress across hundreds of nodes. The conversation reveals how hyperscalers and NeoClouds price their services and why paying premium rates for reliable infrastructure often delivers better value than chasing the lowest per-hour costs. Subscribe to SemiAnalysis for in-depth analysis of AI hardware economics and infrastructure trends that impact the entire semiconductor ecosystem.
-
This week the team from ChipBook (formerly Chips & Wafers) joins. Jordan Nanos talks with Chaim Eisenberg and Simi Sherman about how they build the ChipBook with open source data, and how that drives investment decisions. This episode dives deep into the ChipBook itself, revealing how granular, historical data collection provides insights into supply chain dynamics, memory markets, wafer fabrication equipment trends and more. The guests also share compelling examples of how their data-driven approach has recently generated some viral social media posts.
For more: SemiAnalysis.com/chipbook
00:00 The Chipbook: Understanding Open Source Data
07:56 Granularity in Data: The Key to Investment Insights
09:11 Understanding the Semiconductor Supply Chain
10:52 Memory Market Insights and Trends
14:23 Tracking WFE and Its Impact on Production
15:20 The Importance of Early Signals in Investment
16:51 Geopolitical Implications on Semiconductor Supply
20:24 The Impact of Tariffs and Regulations
22:48 Granular Tracking for Investment Decisions
27:07 Data-Driven Insights and Investment Strategies
29:13 The Structure of the Chipbook
33:02 Collaboration and Integration at SemiAnalysis
37:02 Geopolitical Analysis and Its Impact on TSMC
43:48 Helium Supply Chain and Its Importance
-
Jordan Nanos (@jordannanos), Daniel Nishball (@dnishball), and Sam Harshe (@sharshe02) break down how SemiAnalysis is deploying Claude Code agents across its Singapore office at a scale that outpaces Meta on a per-employee basis. They cover the practical workflow changes, the trust and reliability questions that come with AI-generated analysis, and what it actually takes to build an agent swarm that does useful work. The conversation also gets into cybersecurity risks and where AI model development is headed next.
00:00 - Introduction and Team Dynamics
09:10 - The Evolution of Agent Utilization
14:49 - Conference Insights and Research Efficiency
15:58 - AI's Role in Learning and Analysis
21:13 - Trust and Reliability in AI Outputs
27:04 - Market Impact and Adoption of AI Tools
32:14 - Cybersecurity and AI: Opportunities and Challenges
39:32 - Future of AI Models and User Experience
-
The Core Research team is on SemiAnalysis Weekly this week. Jordan Nanos, Nick Doyle, Nigel Chiang, and Konrad Wang walk through the biggest bottlenecks in AI infrastructure: TSMC capacity constraints, PCB and substrate shortages, memory cycle dynamics, modular data center construction, and behind-the-meter power. The team also gets into how Claude Code is transforming their day-to-day research, what enterprise AI adoption actually looks like today, and the signals they're watching for when the cycle turns.
-
This week, Sravan Kundojjala (@SKundojjala) and Ivan Chiam from our team join Jordan (@JordanNanos) to break down the AI silicon shortage and why its ripple effects are hitting everything from GPU pricing to your next smartphone.
We cover what's driving the crisis, how TSMC is allocating scarce capacity across its biggest customers, and why memory constraints could cut consumer electronics production by 10–15%. We dig into TSMC's $70B+ capex plans, the structural dynamics reshaping the memory market, and near-term node migration strategies that could offer some relief. Then we shift to the GPU rental market, tracking real pricing trends and what they signal about supply and demand heading into the back half of the year.
In the second half, we unpack Nvidia's co-packaged optics (CPO) roadmap, one of the most significant infrastructure announcements to come out of recent industry events. We cover highlights from the OFC conference, explain why optical interconnects matter for next-gen AI clusters, and break down the dueling MSA standards battle playing out across the optical components industry.
Whether you're an investor, engineer, or just trying to understand why AI hardware is so hard to get right now, this one's for you.
Chapters
00:00 Introduction and Episode Overview
00:30 AI Silicon Shortage: Causes and Demand Growth
03:25 TSMC's Capacity Constraints and Customer Allocation
06:24 Impact of Memory Shortages on Consumer Electronics
10:16 Memory Market Dynamics and Structural Trends
13:25 Near-term Solutions and Node Migration Strategies
15:31 Modeling the Memory Shortage and Industry Outlook
17:16 Signs of Relief and Demand Trends
19:35 Capex and Industry Investment Outlook
23:25 GPU Rental Market and Pricing Trends
26:58 Nvidia's CPO Roadmap and Industry Implications
39:26 OFC Conference Highlights and Optical Interconnects
40:24 Understanding Co-Packaged Optics (CPO) and Its Significance
46:06 Industry Significance of Nvidia's Announcements and Market Outlook
50:04 Dueling MSAs and Industry Standardization in Optical Components
56:31 Summary and Final Thoughts on Industry Trends
-
This episode explores the impact of AI on memory costs, market dynamics, and economic measurement. Our experts discuss how AI is transforming industries, the challenges of measuring economic value, and the future of AI adoption across sectors. This week Jordan hosts Ray Wang, Malcolm Splitter, and Joey Brookhart from SemiAnalysis.
Chapters
00:00 Introduction and Guest Credibility
03:44 Memory Constraints and Market Impact
07:14 Elasticity of Supply and Economic Analogies
10:00 AI Adoption in Personal and Enterprise Life
14:57 Global Distribution of AI Usage and Tokens
20:00 Consumer vs Enterprise AI Use Cases
24:56 Future Market Growth and Adoption Scenarios
29:48 AI's Impact on GDP and Economic Measurement
40:12 AI and the Future of Work and Productivity
50:02 Global Workforce and AI's Economic Effects
01:00:03 Building AI-Driven Dashboards and Tools
01:09:59 Reevaluating GDP and Economic Value of AI
01:19:56 Closing Remarks and Future Outlook
-
Join Jordan Nanos, Doug O’Laughlin, and special guest Jeremie Eliahou Ontiveros for an in-depth analysis of the intersection between AI infrastructure and energy markets. Jeremie, Head of Datacenter & Energy Infrastructure Research, shares expert insights on how AI data centers are reshaping electricity pricing, market dynamics, and the regulatory challenges facing the grid today.
00:00 AI Data Centers and Electricity Prices
02:15 Market Dynamics: PJM vs. ERCOT
04:56 Supply and Demand Challenges
08:02 Thermal Accreditation and Market Reforms
11:05 Future of Energy Supply and Coal Retirements
17:13 The Future of Coal Power Plants
18:27 Balancing Energy Prices and Consumer Expectations
19:29 Customized Tariffs and Their Impact
20:28 Investment Commitments and Grid Reliability
22:30 Negotiation Timelines for Data Centers
24:32 Financing Mismatches in Energy Projects
25:10 The AI Boom and Its Impact on Energy
27:13 Anthropic vs. OpenAI: The Revenue Race
29:15 The Role of Government in AI Adoption
32:00 Demand Drivers in AI and Software Development
33:33 The Shift in Consumer Perception of AI
35:53 The Future of Coding and AI Integration
39:55 The Evolution of Software and AI Tools
43:53 The Canadian Tech Scene and AI Growth
-
Article (latest): https://newsletter.semianalysis.com/p...
00:00 Introduction to Vera Rubin and Extreme Co-Design
02:30 Innovations in GPU Architecture: From Blackwell to Rubin
05:38 Memory Bandwidth and HBM4: A New Era
08:26 NVLink 6 and Interconnect Enhancements
11:38 Cable-less Design: Revolutionizing System Assembly
14:29 Thermal Management Innovations in Rubin
17:30 Power Management and Performance Expectations
33:36 Innovations in Packaging and Heat Management
38:15 Chiller-less Design and Data Center Infrastructure
40:47 Power Delivery Innovations in Rubin
46:16 Memory Solutions and Supply Chain Management
54:57 Deployment Timeline and Future Expectations
-
InferenceX, formerly InferenceMAX: https://inferencex.com/
Article (latest): https://newsletter.semianalysis.com/p/inferencex-v2-nvidia-blackwell-vs
GitHub: https://github.com/SemiAnalysisAI/InferenceX
Article (original): https://newsletter.semianalysis.com/p/inferencemax-open-source-inference
00:00 Introduction to InferenceX
02:52 Evolution from InferenceMAX to InferenceX
06:06 Benchmarking and Performance Insights
08:43 The Scale of Benchmarking Work
11:39 Collaboration with AMD and Nvidia
14:52 The Evolution of Inference Benchmarking
17:34 Optimizations and Their Impact
20:47 Challenges in Composability
23:51 Multi-Token Prediction Explained
26:52 Cost Implications of Optimizations
31:06 Understanding Inference Workloads and Benchmarks
33:44 Future Plans for Inference Optimization
37:16 Roadmap for New Models and Data Sets
39:03 Challenges in Benchmarking Multi-Turn and Multi-Modal Data
42:44 Experiences with AI Models and Their Limitations
48:43 Skepticism About Future AI Improvements
-
Claude Code Rising: https://semianalysis.com/institutional/claude-code-adoption-note/ and https://newsletter.semianalysis.com/p/claude-code-is-the-inflection-point
CPUs are back: https://newsletter.semianalysis.com/p/cpus-are-back-the-datacenter-cpu
Memory Mania: https://newsletter.semianalysis.com/p/memory-mania-how-a-once-in-four-decades
Claude Fast: https://x.com/SemiAnalysis_/status/2020922445989822709?s=20
Agent Swarms: https://x.com/SemiAnalysis_/status/2021283054019330194?s=20
Taiwan: https://x.com/SemiAnalysis_/status/2021222800707538980?s=20
Seedance2: https://x.com/kirkinator_sol/status/2012588116536631778?s=20 , https://docs.google.com/document/d/1du1Ld94b1d2TU4maYIcU-K6v405iWoYTILnO9ipsEXQ/edit?tab=t.0