AI Agents

8 articles tagged with this topic

LangChain, DeepAgent

LangChain: AI Agents Load Skills On-Demand — Modular Dev Is the New Agent Paradigm

LangChain DeepAgent: AI agents load skill modules on-demand like humans, shifting Agent development from monolithic to pluggable composition for custo

May 6 · 2 min read

LangChain, MCP

LangChain Standardizes AI Tool Calling: LLMs Shift from Talking to Doing

LangChain updates tool APIs for LLMs to interact with external systems. AI shifts from chatbots to executors; tool calling is key to enterprise AI ado

May 2 · 2 min read

LangSmith, DeepEval

Stop Chasing Leaderboards: How Berkeley Exposed Flawed AI Agent Benchmarks

Berkeley researchers reveal critical data contamination in top AI benchmarks. Learn how to validate your own agent tools, avoid overfitting, and build

Apr 12 · 5 min read

OpenAI Codex, Anthropic Claude

Harness Engineering Emerges as Core AI Agent Discipline at OpenAI, Anthropic

OpenAI and Anthropic publicly frame 'Harness Engineering' as the critical layer between model capability and production reliability.

Apr 12 · 4 min read

IBM Research, ALTK-Evolve

IBM ALTK-Evolve Enables AI Agents to Learn During Deployment

IBM Research releases ALTK-Evolve, a toolkit letting AI agents update their behavior from real task experience without full retraining.

Apr 8 · 3 min read

Meta Engineering, AI Agents

How Meta Built a Pre-Compute Engine to Give AI Agents a Codebase Map

Meta deployed 50+ specialized AI agents to encode tribal knowledge across 4,100+ files, cutting agent tool calls by 40%.

Apr 6 · 2 min read

Amazon Bedrock, AgentCore Gateway

Amazon Bedrock AgentCore Gateway Now Supports OAuth 2.0 for MCP Servers

AgentCore Gateway centralizes MCP server auth using OAuth 2.0 Authorization Code flow, removing per-server credential management.

Apr 6 · 2 min read

Claude Opus 4, Anthropic

Claude Opus 4 Fails Elden Ring: A Reality Check on AGI Claims

A developer tested Claude Opus 4 on Elden Ring gameplay. It couldn't leave the first room, challenging Jensen Huang's AGI claims.

Apr 6 · 2 min read