#27 - 🚀 OpenAI DevDay, Claude Code 2.0, and more

OpenAI, Anthropic ship big updates—but do they work?

Hey readers! 🚀

This week brought a fascinating mix of AI coding breakthroughs and reality checks. We saw major releases from OpenAI, Anthropic, and Google, each pushing the boundaries of what AI agents can do in our workflows. But we also got some much-needed perspective on the gap between the hype and what actually works in production. Whether you're all-in on AI coding or taking a more cautious approach, there's plenty to unpack here.

🎯 This Week's Highlights

"OpenAI Codex adds SDK, admin tools, Slack integration" marks a major milestone as Codex graduates from research preview to general availability. The new TypeScript-first SDK lets you embed Codex directly into your workflows, while the Slack integration means you can delegate tasks without leaving your team channels. – InfoWorld

What makes this release particularly interesting is the focus on enterprise needs. The new admin tools give ChatGPT Business and Enterprise accounts real visibility and control, with environment controls, monitoring, and analytics dashboards. Starting October 20, Codex cloud tasks will count toward usage limits, so if you're on a paid plan, now's the time to understand how this affects your billing.

"The rise of the AI co-developer: New ways of building software" explores two compelling scenarios for human-AI collaboration. Scenario 1 envisions AI agents as active team members that autonomously resolve issues and update documentation, expanding into QA and DevOps roles. Scenario 2 goes further, imagining AI-led development with new collaboration tools and an AI-first marketplace. – Gulf Business

The piece highlights a sobering statistic: while 93% of surveyed organizations plan to trial agents, only 27% expect agents to be fully autonomous, down from 43% in 2024. That's a significant reality check on the pace of AI autonomy. The article also emphasizes the need for governance tools like Git-level tagging to mark AI-generated code and specialized auditing pipelines to ensure quality and safety.
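To make the Git-level tagging idea concrete, here is a minimal sketch that reads a commit message and checks for a trailer marking AI-generated code. The trailer name `AI-Assisted` and both function names are assumptions made for illustration; the article does not prescribe a specific convention.

```python
# Hypothetical convention: a commit trailer named "AI-Assisted" marks
# commits that contain AI-generated code. The name is an assumption.
AI_TRAILER = "ai-assisted"


def parse_trailers(message: str) -> dict:
    """Return trailers (Key: value lines) from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}  # a lone subject line carries no trailer block
    trailers = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        if not sep or " " in key:
            return {}  # ordinary prose, so treat the paragraph as body text
        trailers[key.lower()] = value.strip()
    return trailers


def is_ai_assisted(message: str) -> bool:
    """True when the commit message carries the hypothetical AI-Assisted trailer."""
    return parse_trailers(message).get(AI_TRAILER, "").lower() in {"yes", "true"}
```

A trailer like this can be attached at commit time with `git commit --trailer "AI-Assisted: yes"`, and an auditing pipeline could then route flagged commits to extra review.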

"Claude Coded: Sonnet 4.5, Claude Code 2.0, and more" showcases Anthropic's latest developer tools, with Claude Sonnet 4.5 leading coding benchmarks at 77.2% on SWE-bench and maintaining focus on complex tasks for over 30 hours. The new VS Code extension brings real-time inline diffs directly into your IDE, while the checkpoints feature lets you confidently run large tasks and roll back instantly if needed. – Anthropic

The Claude app now supports generating and analyzing Excel, PowerPoint, Word, and PDF files through natural language prompts, a feature that's available in preview for paid users. The Claude Agent SDK provides frameworks to build custom agents, and the new context editing API helps manage token limits by automatically clearing stale tool calls.
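The idea behind clearing stale tool calls can be sketched in a few lines. This is not Anthropic's API: the message shape (dicts with `role` and `content`), the `"tool"` role, the placeholder text, and the `keep_last` parameter are all assumptions chosen to illustrate the technique.

```python
# Illustrative only: a generic message list, not the real context editing API.
PLACEHOLDER = "[tool result cleared]"


def clear_stale_tool_results(messages: list, keep_last: int = 2) -> list:
    """Return a copy of the history with all but the newest tool results blanked."""
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    # Everything except the last `keep_last` tool results counts as stale.
    stale = set(tool_indices[:-keep_last]) if keep_last > 0 else set(tool_indices)
    return [
        {**m, "content": PLACEHOLDER} if i in stale else m
        for i, m in enumerate(messages)
    ]
```

The function leaves the original list untouched and keeps every message in place, so the conversation structure survives while the bulky old outputs stop consuming tokens.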

🔍 Reality Check: The AI Coding Debate

"AI Coding Sucks" delivers a brutally honest take from Syntax's CJ on why current AI coding tools fall short of expectations. The core issue? Unpredictability and goal-seeking shortcuts that reduce code quality. – Syntax

"AI coding tools suck and are not at all what was promised. I'm done."

CJ outlines practical mitigations like spec-driven workflows, plan.md files, and small incremental features, but concludes these are imperfect. His solution? Taking a one-month break from AI coding tools to return to manual development and regain the enjoyment of programming. It's a perspective worth considering, especially for those feeling frustrated with AI's limitations.

"Market research: AI coding tools push production problems" backs up these concerns with hard data. While 90% of developers use AI coding tools and see an average 35% productivity boost, 45% of deployments involving AI-generated code lead to problems, and 72% of organizations have experienced production incidents tied to AI code. – TechTarget

The research from Google DORA, Harness, and IDC reveals that higher AI usage correlates with greater software delivery instability. Security vulnerabilities and regulatory compliance risks remain major concerns, and despite the availability of automated testing tools, adoption remains inconsistent across organizations.

🛠️ Tools and Platforms

"Top Agentic AI Tools for VS Code, According to Installs" ranks the six most-installed VS Code extensions that explicitly describe themselves as "agent" or "agentic": Cline, BLACKBOXAI Agent, Continue, Codex, Roo Code, and Qodo Gen. – Visual Studio Magazine

Common themes across these tools include multi-file edits, terminal command execution, browser automation (with approval), support for multiple model backends including local runners, and an emphasis on safety via human oversight. For context, GitHub Copilot has 53.8 million installs, dwarfing these agent-specific entries, but it was left out of the ranking because it doesn't use the specific keywords searched for.

"Google releases Jules Tools for command line AI coding" introduces a CLI for Jules, Google's AI coding agent, designed for developers who live in the terminal. The tool integrates into existing workflows, letting developers manage tasks, inspect operations, and customize functionality without leaving the command line. – The Register

📊 Enterprise Adoption

"IBM and Anthropic Transform Enterprise Software Development" announces a partnership integrating Claude into IBM's software tools, creating an AI-first IDE that automates the software development lifecycle. Early adopters, including over 6,000 IBM clients, report an average productivity gain of 45%. – Azat TV

The collaboration emphasizes security and governance through the Agent Development Lifecycle framework and contributions to the open Model Context Protocol community. IBM and Anthropic jointly developed a guide titled "Architecting Secure Enterprise AI Agents with MCP," outlining a structured approach to designing and deploying secure AI agents.

"Microsoft Announces Open-Source Agent Framework to Simplify AI Agent Development" unveils a new SDK that consolidates features from Semantic Kernel and AutoGen. The framework supports open standards and advanced orchestration patterns, and is production-ready with built-in observability and Azure integration. – InfoQ

🎓 Practical Applications

"Agoda Leverages ChatGPT in the CI/CD Process for SQL Stored Procedure Optimization" shows a real-world implementation where Agoda integrated GPT into its CI/CD workflow to optimize SQL stored procedures (SPs). The automated step feeds SP code, table structures, indexes, and performance test results to ChatGPT, which suggests rewritten queries and indexing changes. – InfoQ

The initiative aims to reduce the 366 person-days Agoda previously spent on SP optimization, 320 of which (roughly 87%) went to analyzing SP changes that caused performance test failures. The team acknowledges current limitations and is working to extend GPT-based support outside CI/CD and improve prompt tuning.
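For a feel of what such a CI/CD step might assemble before calling the model, here is a rough sketch that bundles the four inputs the article lists into one prompt. The class name, field names, and instruction wording are assumptions for illustration; Agoda's actual prompt and pipeline code are not shown in the article.

```python
from dataclasses import dataclass


@dataclass
class OptimizationRequest:
    """The four inputs named in the article; field names are hypothetical."""
    procedure_sql: str
    table_ddl: str
    index_ddl: str
    perf_results: str


def build_prompt(req: OptimizationRequest) -> str:
    """Combine the inputs into a single labeled prompt for the model."""
    sections = [
        ("Stored procedure", req.procedure_sql),
        ("Table structures", req.table_ddl),
        ("Indexes", req.index_ddl),
        ("Performance test results", req.perf_results),
    ]
    body = "\n\n".join(f"{title}:\n{text.strip()}" for title, text in sections)
    instruction = (
        "Suggest a rewritten query and any indexing changes. "
        "Explain each change so a reviewer can verify it."
    )
    return f"{instruction}\n\n{body}"
```

Keeping prompt assembly in a plain function like this makes it easy to unit test and to tune the wording without touching the rest of the pipeline.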

"The #1 complaint about AI Coding (solved in 7 minutes)" presents Warp's Agent Steering as a solution to the "almost right, but not quite" problem that 66% of developers cite as the worst part of AI coding. Agent Steering guides the AI in real time with inline diffs, allowing review, refinement, editing, or rejection of changes. – Tiff In Tech

🔮 Looking Forward

"The future of software development: AI speed, human judgment" argues that while AI can dramatically accelerate development, human judgment remains essential. The 2025 DORA report frames AI as an amplifier that boosts high-performing teams while exposing weaknesses in others. – ThoughtWorks Insights

"The future of software development lies in combining the incredible speed and power of AI with the irreplaceable judgment, empathy and experience of human engineers."

The article outlines seven foundational practices to safely harness AI: clear AI stance, healthy data ecosystems, AI-accessible internal data, strong version control, working in small batches, user-centric focus, and quality internal platforms. It's a balanced perspective that acknowledges both AI's potential and its limitations.

Made with ❤️ by Data Drift Press

Got thoughts on this week's AI coding developments? Hit reply and let me know what you're seeing in your workflows!