Podcast Guide
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers, and tech-savvy business and IT leaders. The podcast is hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator, and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, and data science.

All episodes (30)

  • How Capital One Delivers Multi-Agent Systems with Rashmi Shetty

    Published Apr 16, 2026

    Rashmi Shetty

    In this episode, Rashmi Shetty, senior director of enterprise generative AI platform at Capital One, joins us to explore how the company is designing, deploying, and scaling multi-agent systems in a highly regulated environment. Rashmi walks us through Chat Concierge, a multi-agent chat experience for auto dealerships that handles intent disambiguation, tool invocation, and human handoffs to deliver safer, more personalized customer journeys. We discuss Capital One’s platform-centric approach to…

  • The Race to Production-Grade Diffusion LLMs with Stefano Ermon

    Published Mar 26, 2026

    Stefano Ermon

    Today, we're joined by Stefano Ermon, associate professor at Stanford University and CEO of Inception Labs, to discuss diffusion language models. We dig into how diffusion approaches—traditionally used for images—are being adapted for text and code generation, the technical challenges of applying continuous methods to discrete token spaces, and how diffusion models compare to traditional autoregressive LLMs. Stefano introduces Mercury 2, a commercial-scale diffusion LLM that can generate multiple…

  • Agent Swarms and Knowledge Graphs for Autonomous Software Development with Siddhant Pardeshi

    Published Mar 10, 2026

    Siddhant Pardeshi

    In this episode, Sid Pardeshi, co-founder and CTO of Blitzy, joins us to discuss building autonomous development systems able to deliver production-ready software at enterprise scale. Sid contrasts AI-assisted coding with end-to-end autonomy, arguing that “code is a commodity” and acceptance is the real metric—security, standards, tests, and maintainability included. We explore Blitzy’s hybrid graph-plus-vector approach, which grounds agents and combines semantic signals with keyword search to…

  • AI Trends 2026: OpenClaw Agents, Reasoning LLMs, and More with Sebastian Raschka

    Published Feb 26, 2026

    Sebastian Raschka

    In this episode, Sebastian Raschka, independent LLM researcher and author, joins us to break down how the LLM landscape has changed over the past year and what is likely to matter most in 2026. We discuss the shift from raw model scaling to reasoning-focused post-training, inference-time techniques, and better tool integration. Sebastian explains why methods like self-consistency, self-refinement, and verifiable-reward reinforcement learning have become central to progress in domains like math…

  • The Evolution of Reasoning in Small Language Models with Yejin Choi

    Published Jan 29, 2026

    Yejin Choi

    Today, we're joined by Yejin Choi, professor and senior fellow at Stanford University in the Computer Science Department and the Institute for Human-Centered AI (HAI). In this conversation, we explore Yejin’s recent work on making small language models reason more effectively. We discuss how high-quality, diverse data plays a central role in closing the intelligence gap between small and large models, and how combining synthetic data generation, imitation learning, and reinforcement learning can…

  • Intelligent Robots in 2026: Are We There Yet? with Nikita Rudin

    Published Jan 8, 2026

    Nikita Rudin

    Today, we're joined by Nikita Rudin, co-founder and CEO of Flexion Robotics, to discuss the gap between current robotic capabilities and what’s required to deploy fully autonomous robots in the real world. Nikita explains how reinforcement learning and simulation have driven rapid progress in robot locomotion—and why locomotion is still far from “solved.” We dig into the sim2real gap, and how adding visual inputs introduces noise and significantly complicates sim-to-real transfer. We also explore…

  • Rethinking Pre-Training for Agentic AI with Aakanksha Chowdhery

    Published Dec 17, 2025

    Aakanksha Chowdhery

    Today, we're joined by Aakanksha Chowdhery, member of technical staff at Reflection, to explore the fundamental shifts required to build true agentic AI. While the industry has largely focused on post-training techniques to improve reasoning, Aakanksha draws on her experience leading pre-training efforts for Google’s PaLM and early Gemini models to argue that pre-training itself must be rethought to move beyond static benchmarks. We explore the limitations of next-token prediction for multi-step…

  • Why Vision Language Models Ignore What They See with Munawar Hayat

    Published Dec 9, 2025

    Munawar Hayat

    In this episode, we’re joined by Munawar Hayat, researcher at Qualcomm AI Research, to discuss a series of papers presented at NeurIPS 2025 focusing on multimodal and generative AI. We dive into the persistent challenge of object hallucination in Vision-Language Models (VLMs), why models often discard visual information in favor of pre-trained language priors, and how his team used attention-guided alignment to enforce better visual grounding. We also explore a novel approach to generalized…

  • Scaling Agentic Inference Across Heterogeneous Compute with Zain Asgar

    Published Dec 2, 2025

    Zain Asgar

    In this episode, Zain Asgar, co-founder and CEO of Gimlet Labs, joins us to discuss heterogeneous AI inference across diverse hardware. Zain argues that the current industry standard of running all AI workloads on high-end GPUs is unsustainable for agents, which consume significantly more tokens than traditional LLM applications. We explore Gimlet’s approach to heterogeneous inference, which involves disaggregating workloads across a mix of hardware—from H100s to older GPUs and CPUs—to optimize…

  • Proactive Agents for the Web with Devi Parikh

    Published Nov 19, 2025

    Devi Parikh

    Today, we're joined by Devi Parikh, co-founder and co-CEO of Yutori, to discuss browser use models and a future where we interact with the web through proactive, autonomous agents. We explore the technical challenges of creating reliable web agents, the advantages of visually-grounded models that operate on screenshots rather than the browser’s more brittle document object model, or DOM, and why this counterintuitive choice has proven far more robust and generalizable for handling complex web interactions…

  • AI Orchestration for Smart Cities and the Enterprise with Robin Braun and Luke Norris

    Published Nov 12, 2025

    Today, we're joined by Robin Braun, VP of AI business development for hybrid cloud at HPE, and Luke Norris, co-founder and CEO of Kamiwaza, to discuss how AI systems can be used to automate complex workflows and unlock value from legacy enterprise data. Robin and Luke detail high-impact use cases from HPE and Kamiwaza’s collaboration on an “Agentic Smart City” project for Vail, Colorado, including remediation and automation of website accessibility for 508 compliance, digitization and understanding…

  • Building an AI Mathematician with Carina Hong

    Published Nov 4, 2025

    Carina Hong

    In this episode, Carina Hong, founder and CEO of Axiom, joins us to discuss her work building an "AI Mathematician." Carina explains why this is a pivotal moment for AI in mathematics, citing a convergence of three key areas: the advanced reasoning capabilities of modern LLMs, the rise of formal proof languages like Lean, and breakthroughs in code generation. We explore the core technical challenges, including the massive data gap between general-purpose code and formal math code, and the difficulty…

  • High-Efficiency Diffusion Models for On-Device Image Generation and Editing with Hung Bui

    Published Oct 28, 2025

    Hung Bui

    In this episode, Hung Bui, Technology Vice President at Qualcomm, joins us to explore the latest high-efficiency techniques for running generative AI, particularly diffusion models, on-device. We dive deep into the technical challenges of deploying these models, which are powerful but computationally expensive due to their iterative sampling process. Hung details his team's work on SwiftBrush and SwiftEdit, which enable high-quality text-to-image generation and editing in a single inference step…

  • Vibe Coding's Uncanny Valley with Alexandre Pesant

    Published Oct 22, 2025

    Alexandre Pesant

    Today, we're joined by Alexandre Pesant, AI lead at Lovable, to discuss the evolution and practice of vibe coding. Alex shares his take on how AI is enabling a shift in software development from typing characters to expressing intent, creating a new layer of abstraction similar to how high-level code compiles to machine code. We explore the current capabilities and limitations of coding agents, the importance of context engineering, and the practices that separate successful vibe coders…

  • Dataflow Computing for AI Inference with Kunle Olukotun

    Published Oct 14, 2025

    Kunle Olukotun

    In this episode, we're joined by Kunle Olukotun, professor of electrical engineering and computer science at Stanford University and co-founder and chief technologist at SambaNova Systems, to discuss reconfigurable dataflow architectures for AI inference. Kunle explains the core idea of building computers that are dynamically configured to match the dataflow graph of an AI model, moving beyond the traditional instruction-fetch paradigm of CPUs and GPUs. We explore how this architecture is well-suited…

  • Recurrence and Attention for Long-Context Transformers with Jacob Buckman

    Published Oct 7, 2025

    Jacob Buckman

    Today, we're joined by Jacob Buckman, co-founder and CEO of Manifest AI, to discuss achieving long context in transformers. We discuss the bottlenecks of scaling context length and recent techniques to overcome them, including windowed attention, grouped query attention, and latent space attention. We explore the idea of weight-state balance and the weight-state FLOP ratio as a way of reasoning about the optimality of compute architectures, and we dig into the Power Retention architecture, which…

  • The Decentralized Future of Private AI with Illia Polosukhin

    Published Sep 30, 2025

    Illia Polosukhin

    In this episode, Illia Polosukhin, a co-author of the seminal "Attention Is All You Need" paper and co-founder of Near AI, joins us to discuss his vision for building private, decentralized, and user-owned AI. Illia shares his unique journey from developing the Transformer architecture at Google to building the NEAR Protocol blockchain to solve global payment challenges, and now applying those decentralized principles back to AI. We explore how Near AI is creating a decentralized cloud that leverages…

  • Inside Nano Banana 🍌 and the Future of Vision-Language Models with Oliver Wang

    Published Sep 23, 2025

    Oliver Wang

    Today, we’re joined by Oliver Wang, principal scientist at Google DeepMind and tech lead for Gemini 2.5 Flash Image—better known by its code name, “Nano Banana.” We dive into the development and capabilities of this newly released frontier vision-language model, beginning with the broader shift from specialized image generators to general-purpose multimodal agents that can use both visual and textual data for a variety of tasks. Oliver explains how Nano Banana can generate and iteratively edit images…

  • Is It Time to Rethink LLM Pre-Training? with Aditi Raghunathan

    Published Sep 16, 2025

    Aditi Raghunathan

    Today, we're joined by Aditi Raghunathan, assistant professor at Carnegie Mellon University, to discuss the limitations of LLMs and how we can build more adaptable and creative models. We dig into her ICML 2025 Outstanding Paper Award winner, “Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction,” which examines why LLMs struggle with generating truly novel ideas. We explore the "Roll the dice" approach, which encourages structured exploration by injecting…

  • Building an Immune System for AI Generated Software with Animesh Koratana

    Published Sep 9, 2025

    Animesh Koratana

    Today, we're joined by Animesh Koratana, founder and CEO of PlayerZero, to discuss his team’s approach to making agentic and AI-assisted coding tools production-ready at scale. Animesh explains how rapid advances in AI-assisted coding have created an “asymmetry” where the speed of code output outpaces the maturity of processes for maintenance and support. We explore PlayerZero’s debugging and code verification platform, which uses code simulations to build a "memory bank" of past bugs and leverages…

  • Autoformalization and Verifiable Superintelligence with Christian Szegedy

    Published Sep 2, 2025

    Christian Szegedy

    In this episode, Christian Szegedy, Chief Scientist at Morph Labs, joins us to discuss how the application of formal mathematics and reasoning enables the creation of more robust and safer AI systems. A pioneer behind concepts like the Inception architecture and adversarial examples, Christian now focuses on autoformalization—the AI-driven process of translating mathematical concepts from their human-readable form into rigorously formal, machine-verifiable logic. We explore the critical distinction…

  • Multimodal AI Models on Apple Silicon with MLX with Prince Canuma

    Published Aug 26, 2025

    Prince Canuma

    Today, we're joined by Prince Canuma, an ML engineer and open-source developer focused on optimizing AI inference on Apple Silicon devices. Prince shares his journey to becoming one of the most prolific contributors to Apple’s MLX ecosystem, having published over 1,000 models and libraries that make open, multimodal AI accessible and performant on Apple devices. We explore his workflow for adapting new models in MLX, the trade-offs between the GPU and Neural Engine, and how optimization methods…

  • Genie 3: A New Frontier for World Models with Jack Parker-Holder and Shlomi Fruchter

    Published Aug 19, 2025

    Today, we're joined by Jack Parker-Holder and Shlomi Fruchter, researchers at Google DeepMind, to discuss the recent release of Genie 3, a model capable of generating “playable” virtual worlds. We dig into the evolution of the Genie project and review the current model’s scaled-up capabilities, including creating real-time, interactive, and high-resolution environments. Jack and Shlomi share their perspectives on what defines a world model, the model's architecture, and key technical challenges…

  • Closing the Loop Between AI Training and Inference with Lin Qiao

    Published Aug 12, 2025

    Lin Qiao

    In this episode, we're joined by Lin Qiao, CEO and co-founder of Fireworks AI. Drawing on key lessons from her time building PyTorch, Lin shares her perspective on the modern generative AI development lifecycle. She explains why aligning training and inference systems is essential for creating a seamless, fast-moving production pipeline, preventing the friction that often stalls deployment. We explore the strategic shift from treating models as commodities to viewing them as core product assets.

  • Context Engineering for Productive AI Agents with Filip Kozera

    Published Jul 29, 2025

    Filip Kozera

    In this episode, Filip Kozera, founder and CEO of Wordware, explains his approach to building agentic workflows where natural language serves as the new programming interface. Filip breaks down the architecture of these "background agents," explaining how they use a reflection loop and tool-calling to execute complex tasks. He discusses the current limitations of agent protocols like MCPs and how developers can extend them to handle the required context and authority. The conversation challenges…

  • Infrastructure Scaling and Compound AI Systems with Jared Quincy Davis

    Published Jul 22, 2025

    Jared Quincy Davis

    In this episode, Jared Quincy Davis, founder and CEO at Foundry, introduces the concept of "compound AI systems," which allows users to create powerful, efficient applications by composing multiple, often diverse, AI models and services. We discuss how these "networks of networks" can push the Pareto frontier, delivering results that are simultaneously faster, more accurate, and even cheaper than single-model approaches. Using examples like "laconic decoding," Jared explains the practical techniques…

  • Building Voice AI Agents That Don’t Suck with Kwindla Kramer

    Published Jul 15, 2025

    Kwindla Kramer

    In this episode, Kwindla Kramer, co-founder and CEO of Daily and creator of the open source Pipecat framework, joins us to discuss the architecture and challenges of building real-time, production-ready conversational voice AI. Kwin breaks down the full stack for voice agents—from the models and APIs to the critical orchestration layer that manages the complexities of multi-turn conversations. We explore why many production systems favor a modular, multi-model approach over the end-to-end models…

  • Distilling Transformers and Diffusion Models for Robust Edge Use Cases with Fatih Porikli

    Published Jul 9, 2025

    Fatih Porikli

    Today, we're joined by Fatih Porikli, senior director of technology at Qualcomm AI Research, for an in-depth look at several of Qualcomm's accepted papers and demos featured at this year’s CVPR conference. We start with “DiMA: Distilling Multi-modal Large Language Models for Autonomous Driving,” an end-to-end autonomous driving system that incorporates distilling large language models for structured scene understanding and safe motion planning in critical "long-tail" scenarios. We explore how DiMA…

  • Building the Internet of Agents with Vijoy Pandey

    Published Jun 24, 2025

    Vijoy Pandey

    Today, we're joined by Vijoy Pandey, SVP and general manager at Outshift by Cisco, to discuss a foundational challenge for the enterprise: how do we make specialized agents from different vendors collaborate effectively? As companies like Salesforce, Workday, and Microsoft all develop their own agentic systems, integrating them creates a complex, probabilistic, and noisy environment, a stark contrast to the deterministic APIs of the past. Vijoy introduces Cisco's vision for an "Internet of Agents"…

  • LLMs for Equities Feature Forecasting at Two Sigma with Ben Wellington

    Published Jun 17, 2025

    Ben Wellington

    Today, we're joined by Ben Wellington, deputy head of feature forecasting at Two Sigma. We dig into the team’s end-to-end approach to leveraging AI in equities feature forecasting, covering how they identify and create features, collect and quantify historical data, and build predictive models to forecast market behavior and asset prices for trading and investment. We explore the firm's platform-centric approach to managing an extensive portfolio of features and models, the impact of multimodal…