Episodes
-
Anthropic quietly pulled hidden China-tracking code out of Claude Code after a Reddit user reverse-engineered it, and Alibaba responded by banning the tool for its employees. We also cover Claude Fable 5's return after an 18-day export-control freeze, a new default that flips Claude Code to "Manual" permission mode, and Cursor's iOS launch with a discount ending tomorrow. Today's concept: what an agent's permission loop actually does. Hosts: Alex & Jules. New episodes daily.
-
Windsurf's Cascade agent hit end-of-life and got replaced by Devin Local — here's what to repoint in your CI. Claude Code's new default ships a native 1M-token context window, which gives us a chance to break down what a context window actually is and why bigger isn't free. Plus Codex retires two models and DeepMind's AlphaEvolve keeps landing in unexpected places. Hosts: Alex & Jules. New episodes daily.
-
Anthropic's Fable 5 is restored globally after a three-week US export control triggered by a jailbreak that let the model produce working exploit code — we break down what a jailbreak actually is and why this model's guardrail story matters to every developer. Plus: Cloudflare sets a September 15 deadline that could break web-browsing agents, Cursor reveals its enterprise "software factory" playbook, and OpenAI's open-weight models land in AWS GovCloud. Hosts: Alex & Jules. New episodes daily.
-
Claude Sonnet 5 is now the default model in Claude Code, and it ships with a native one-million-token context window. We explain what a context window actually is, why a million tokens is a genuine game-changer for developers working on large codebases, and where the tradeoffs still lurk. Plus: Claude Code 2.1.196 lands with a background-agent reliability overhaul and a supply-chain security fix for MCP in multi-repo setups. And if you're on Devin Desktop (formerly Windsurf), the Cascade local agent hit end-of-life today; here's what to do. Hosts: Alex & Jules. New episodes daily.
-
Claude Code v2.1.195 landed a critical hook matcher fix — if you use MCP servers with hyphens in their names, your hooks may have been silently misfiring, and we explain exactly what to change. Then: ByteDance dropped Seed 2.1 Pro and Turbo, claiming Opus-class agentic coding at 80% lower cost — and why that pricing pressure matters even if you never touch their API. We also break down what Claude Code hooks actually are, how glob pattern matching works, and why exact-match versus substring-match is the kind of distinction that quietly breaks your security tooling. Hosts: Alex & Jules. New episodes daily.
-
Anthropic's Claude Security hits public beta, scanning whole codebases by reasoning about data flows instead of pattern-matching — we explain how that differs from a traditional scanner. Plus Cognition's FrontierCode benchmark, where the best model scores just 13% on whether its pull requests are actually mergeable, and Claude Code's latest releases add shell MCP login and fix silent subagent permission denials. Along the way we teach how AI vulnerability scanning works and why "mergeability" is a harder bar than "correctness." Hosts: Alex & Jules. New episodes daily.
-
A solo researcher mapped ten thousand malicious GitHub repositories actively targeting AI agent developers with credential-stealing malware — and we break down what to check. Cactus Compute distilled Gemini's tool-calling skill into a 26-million-parameter, 14MB model that runs entirely on-device, and we explain how model distillation works so you understand *why* that's possible. Plus, OpenCode — the open-source, model-agnostic terminal coding agent — just hit #1 in the rankings with 160K GitHub stars and 7.5M monthly active developers. Hosts: Alex & Jules. New episodes daily.
-
GitHub Copilot now lets JetBrains users switch to Claude as their coding agent with a two-step setup. Claude Code hits GA on Dynamic Workflows and ships a new sandbox credential-blocking setting. MiniMax M3, an open-weight model beating GPT-5.5 on coding benchmarks, gets unpacked — including the Mixture of Experts architecture that makes a 1M-token context window computationally feasible. Plus: OpenAI acquires Gitpod to give Codex persistent cloud memory. Today's teaching concept: Mixture of Experts (MoE) — why activating only 23B of 428B parameters is the key to M3's speed and scale. Hosts: Alex & Jules. New episodes daily.
-
Cursor shipped three major updates in one week — cloud subagents, Slack-triggered automations, and a unified Customize page — and Alex breaks down the shift to push-mode, event-driven agents that react to GitHub and Slack instead of waiting to be chatted with. Plus: Claude Fable 5 is back from its export ban with new restrictions and a silent-refusal API change (HTTP 200, not an error) that could be silently breaking your integration right now. And Gemini CLI is officially dead — if it's in your workflow, it stopped working June 18. Hosts: Alex & Jules. New episodes daily.
-
Claude Managed Agents hit public beta with scheduled runs, CLI tool access, and Okta-powered enterprise MCP provisioning. Windsurf is now Devin Desktop — and Cascade is end-of-life July 1, so act fast. MiniMax M3 drops as an open-weight model with a one-million-token context window and top SWE-bench scores. Plus: an open-source framework called Forge takes an 8B local model from 53% to 99% on agentic tasks — and we explain exactly why agentic reliability fails (hint: errors compound across steps) and how guardrails fix it. Hosts: Alex & Jules. New episodes daily.
-
Claude Code's latest release blocks destructive commands in auto mode (`git reset --hard`, `terraform destroy`, and others) unless you explicitly asked for them. Then Cursor ships always-on Automation agents with Slack emoji triggers and five new GitHub event hooks. Plus: a 26-million-parameter model called Needle beats competitors 10x its size at function calling, and we break down why: tool calling is retrieval, not reasoning, and that distinction unlocks a whole class of tiny, fast, on-device specialists. Hosts: Alex & Jules. New episodes daily.
-
GLM-5.2 drops full MIT-licensed weights and tops both an OpenAI and an Anthropic flagship on SWE-bench Pro — we explain what that benchmark actually measures. Plus Claude Fable 5 lands inside Claude Code, and Codex expands Computer Use (and we unpack what "an agent using a computer" really means). Hosts: Alex & Jules. New episodes daily.
-
GLM-5.2's MIT open weights finally hit Hugging Face with bold "beats GPT-5.5" benchmark claims — we unpack SWE-bench Pro and why self-reported scores aren't the same as independently-verified ones. Plus Anthropic pauses its Agent SDK billing split (and what "programmatic usage" even means), and the Gemini CLI shutdown hits zero tomorrow. Hosts: Alex & Jules. New episodes daily.
-
Open-source agent OpenCode knocks Cursor off the #1 spot, CircleCI and Amazon ship MCP servers that pipe live infra data into your agent (with interactive responses, not just text), Codex moves to GPT-5.5 with screen-aware memory, and Microsoft prepares to make its own Project Polaris the default model in GitHub Copilot. We break down mixture-of-experts — the architecture that makes that switch possible. Hosts: Alex & Jules. New episodes daily.
-
GLM-5.2's MIT open weights are due any minute — we explain what "open weights" actually means versus open source and API-only. Cursor's Bugbot gets 3x faster and adds a pre-push /review gate for bug and security checks. Plus the teaching beat of the day: distillation, and why nine-billion-parameter clones are topping Hugging Face. Hosts: Alex & Jules. New episodes daily.
-
Anthropic's June 15 billing change lands today — programmatic Claude usage (Agent SDK, claude -p, GitHub Actions) now draws from a separate credit pool, and your automations can stop cold if you don't claim it. Then we teach the AGENTS.md context file using fresh research: a hand-written one cuts agent runtime ~29%, but an auto-generated one is worse than none. Plus the MCP spec gets UIs and long-running tasks, and Gemini CLI dies in three days. Hosts: Alex & Jules. New episodes daily.
-
Claude Code 2.1.172 lets sub-agents spawn their own sub-agents five levels deep — we explain what a sub-agent is and why the recursion cap matters. Plus MiniMax M3's open weights blow their own deadline (still nothing downloadable), and we teach sparse attention: the trick that makes million-token context affordable. Cursor's Bugbot drops to 90-second reviews. Hosts: Alex & Jules. New episodes daily.
-
Anthropic ships Claude Fable 5, its first public Mythos-class model, topping SWE-bench Pro at 80.3% but costing double Opus 4.8. We break down how token pricing actually works (why output costs 5x input and what that does to your agent bill), the Fable-vs-Mythos safeguards split, and the MiniMax M3 open-weights deadline hitting the wire. Hosts: Alex & Jules. New episodes daily.
-
Claude Code ships a "safe mode" that walls the agent off from your filesystem and network — we explain what a sandbox actually is and why it stops prompt injection cold. Plus Google sets June 18 as Gemini CLI's shutdown date (migrate to Antigravity CLI now), the Almanac MCP that brings deep research into your terminal, and the MiniMax M3 open-weights clock running out. Hosts: Alex & Jules. New episodes daily.
-
Apple's WWDC keynote landed — a rebuilt Siri with its own app and Apple Foundation Models powered by a Google Gemini collaboration. Plus Cursor 3.7's Context Usage Report (and why your context BUDGET, not just the window, is what slows agents down) and Windsurf becoming Devin Desktop with the Agent Client Protocol. We break down ACP vs MCP, the Claude Code "ultracode" rename, and the MiniMax M3 weights watch. Hosts: Alex & Jules. New episodes daily.