Podcast Guide

Results for "evals"

2 results

Episodes

  • Latent Space: The AI Engineer Podcast

    METR’s Joel Becker on exponential Time Horizon Evals, Threat Models, and the Limits of AI Productivity

    Latent Space: The AI Engineer Podcast · Feb 27, 2026

    This is a free preview of a paid episode. To hear more, visit www.latent.space. AIE Europe CFP and AIE World’s Fair paper submissions for CAIS peer review are due TODAY - do not delay! Last call ever. We’re excited to welco

    evals
  • Latent Space: The AI Engineer Podcast

    ⚡️The End of SWE-Bench Verified — Mia Glaese & Olivia Watkins, OpenAI Frontier Evals & Human Data

    Latent Space: The AI Engineer Podcast · Feb 23, 2026

    Olivia Watkins (Frontier Evals team) and Mia Glaese (VP of Research at OpenAI, leading the Codex, human data, and alignment teams) discuss a new blog post (https://openai.com/index/why-we-no-longer-evaluate-swe-bench-ver

    openai, evals