
Articles by OPC Wire

60 articles · April 14, 2026 to April 18, 2026

Zhipu-AI

GLM-5 and MiniMax M2.7 Offer Claude Code-Compatible APIs

Two Chinese LLM providers now offer Anthropic SDK-compatible endpoints, letting developers swap Claude for domestic models via a config change.

Apr 18 · 3 min read · OPC Wire · juejin.cn
MCP

MCP Protocol Security Flaws: 492 Servers Exposed, 437K Downloads at Risk

Research finds 492 public MCP servers vulnerable; CVE-2025-6514 affects 437,000+ downloads across production deployments.

Apr 18 · 4 min read · OPC Wire · juejin.cn
Andrew-Ng

Agentic AI Bottleneck Shifts from Code to Deployment Operations

Andrew Ng says agentic AI's bottleneck is no longer writing code but production deployment and problem definition.

Apr 18 · 4 min read · OPC Wire · juejin.cn
LocalLLaMA

Is harness a new buzzword?

Not AI news.

Apr 18 · 2 min read · OPC Wire · www.reddit.com
PyCon

Join us at PyCon US 2026 in Long Beach - we have new AI and security tracks this year

PyCon US 2026 debuts a standalone AI track on May 16 in Long Beach, co-chaired by an Anthropic engineer.

Apr 18 · 3 min read · OPC Wire · simonwillison.net
MiniMax

MiniMax Launches MaxHermes: Self-Evolving Agent Builds Own Skills

MiniMax releases MaxHermes, a cloud-sandbox agent that auto-generates reusable Skills from completed tasks without human instruction.

Apr 18 · 4 min read · OPC Wire · juejin.cn
Cloudflare

Introducing Flagship: feature flags built for the age of AI

Cloudflare's native feature flag service Flagship enters closed beta, built on CNCF's OpenFeature standard for Workers and beyond.

Apr 18 · 4 min read · OPC Wire · blog.cloudflare.com
Amazon Bedrock

Introducing granular cost attribution for Amazon Bedrock

AWS now maps Bedrock inference spend to individual IAM users, roles, and federated identities automatically in CUR 2.0.

Apr 18 · 4 min read · OPC Wire · aws.amazon.com
NVIDIA Dynamo

Full-Stack Optimizations for Agentic Inference with NVIDIA Dynamo

NVIDIA Dynamo addresses inference stack pressure as Stripe, Ramp, and Spotify ship thousands of agent-generated PRs monthly.

Apr 18 · 4 min read · OPC Wire · developer.nvidia.com
Cloudflare

Shared Dictionaries: compression that keeps up with the agentic web

Cloudflare previews shared compression dictionaries to cut redundant byte transfers, with beta opening April 30, 2026.

Apr 17 · 4 min read · OPC Wire · blog.cloudflare.com
Cloudflare

Introducing the Agent Readiness score. Is your site agent-ready?

Cloudflare's isitagentready.com scans sites for AI agent compatibility; only 4% of top 200K domains declare AI preferences.

Apr 17 · 3 min read · OPC Wire · blog.cloudflare.com
Amazon-Nova

AWS Nova Multimodal Embeddings Powers Native Video Semantic Search

Amazon Bedrock's Nova Multimodal Embeddings unifies text, audio, video, and image into one vector space for search.

Apr 17 · 4 min read · OPC Wire · aws.amazon.com
Amazon-Bedrock

Optimize video semantic search intent with Amazon Nova Model Distillation on Amazon Bedrock

Amazon Bedrock's Model Distillation transfers routing intelligence from Nova Premier to Nova Micro, cutting inference cost by over 95% and latency by

Apr 17 · 4 min read · OPC Wire · aws.amazon.com
Unsloth

Qwen3.6 GGUF Benchmarks

Unsloth claims top KLD-vs-disk-space performance for Qwen3.6-35B-A3B quants in 21 of 22 Pareto frontier comparisons.

Apr 17 · 3 min read · OPC Wire · www.reddit.com
Amazon-Bedrock

From hours to minutes: How Agentic AI gave marketers time back for what matters

AWS Marketing and Gradial used Amazon Bedrock to cut page assembly from 4 hours to ~10 minutes.

Apr 17 · 3 min read · OPC Wire · aws.amazon.com
Amazon Nova

AWS Nova Forge SDK Tutorial: Fine-Tune Nova Models With Data Mixing

AWS publishes step-by-step Nova Forge SDK guide; data mixing yielded 12-point F1 gain while preserving MMLU baseline scores.

Apr 17 · 4 min read · OPC Wire · aws.amazon.com
NVIDIA NemoClaw

Build a Secure, Always-On Local AI Agent with OpenClaw and NVIDIA NemoClaw

NVIDIA's NemoClaw and OpenClaw framework let developers run persistent, secure AI agents locally without cloud dependency.

Apr 17 · 4 min read · OPC Wire · developer.nvidia.com
Qwen3

Qwen 3.6 is the first local model that actually feels worth the effort for me

Alibaba's Qwen3.6 35B-A3B runs Q8 at 170 tokens/sec with full 260K context on dual consumer GPUs.

Apr 17 · 4 min read · OPC Wire · www.reddit.com
LocalLLaMA

Move to local models

Source article is a personal support question, not a reportable AI news event.

Apr 17 · 2 min read · OPC Wire · www.reddit.com
Hermes-Agent

Nous Research Open-Sources Hermes Agent, a Self-Improving AI Agent Framework

Hermes Agent hits 90K+ GitHub stars with persistent skill memory and a three-layer architecture across 200+ models.

Apr 17 · 4 min read · OPC Wire · juejin.cn
Claude Opus 4.7

Opus 4.7 Is Here, and I Don't Recommend Upgrading

Anthropic's Opus 4.7 removes temperature/top_p/top_k controls and inflates token counts by up to 1.35x.

Apr 17 · 3 min read · OPC Wire · juejin.cn
Juejin

Systematic Debugging Guide: A Detective Framework for Root Cause Analysis

A Chinese developer tutorial outlines a four-phase systematic debugging methodology replacing ad-hoc fixes.

Apr 17 · 4 min read · OPC Wire · juejin.cn
Claude Code

Anthropic's 1M Context in Claude Code: Session Management Is the Real Story

Anthropic's official Claude Code guidance reframes 1M context as a session discipline problem, not a capacity win.

Apr 17 · 3 min read · OPC Wire · juejin.cn
Claude Opus 4.7

Claude Opus 4.7 Launches: 64.3% SWE-Bench Score, Higher Image Resolution

Anthropic ships Claude Opus 4.7 with self-verification coding, 2,576px image support, and no price increase.

Apr 17 · 3 min read · OPC Wire · juejin.cn
Anthropic

Anthropic Adds ID Verification to Claude, Blocking Chinese Users

Anthropic's new real-time ID and facial verification system effectively bars Chinese mainland users from Claude access.

Apr 17 · 4 min read · OPC Wire · juejin.cn
Lalamove

Lalamove Cuts Translation Costs 90% With 3-Agent LLM Pipeline

Lalamove deployed a three-agent LLM framework (translation, QA scoring, and compliance), slashing localization costs by 90% and reducing turnaround

Apr 17 · 4 min read · OPC Wire · juejin.cn
Qwen3.6-35B

Qwen3.6-35B is worse at tool use and reasoning loops than 3.5?

Community testers report Qwen3.6-35B enters infinite reasoning loops more than Qwen3.5 on agentic coding tasks.

Apr 17 · 3 min read · OPC Wire · www.reddit.com
Qwen3.6

PSA: Qwen3.6 ships with preserve_thinking. Make sure you have it on.

Qwen3.6 introduces a preserve_thinking flag to keep reasoning in context, fixing KV cache invalidation.

Apr 16 · 3 min read · OPC Wire · www.reddit.com
llama.cpp

GPoUr with ~12gb vram and a 3080 getting 40tg/s on qwen3.6 35BA3B w/ 260k ctx

A llama.cpp fork with turbo3 KV cache quantization achieves ~40 tok/s on Qwen3-35B-A3B with only 12GB VRAM.

Apr 16 · 3 min read · OPC Wire · www.reddit.com
Amazon Nova Micro

Cost-efficient custom text-to-SQL using Amazon Nova Micro and Amazon Bedrock on-demand inference

AWS details LoRA fine-tuning of Nova Micro for custom SQL dialects, hitting $0.80/month at 22,000 queries via serverless inference.

Apr 16 · 2 min read · OPC Wire · aws.amazon.com
Gemma-4

DeepMind’s New AI: A Gift To Humanity

Google DeepMind has released Gemma 4, a new family of open-weight models available under the Apache 2.0 license.

Apr 16 · 3 min read · OPC Wire · www.youtube.com
Meta

Meta's AI Agents Recover Hundreds of Megawatts by Automating Infrastructure Efficiency

Meta's unified AI agent platform compresses 10-hour manual regression investigations to 30 minutes, recovering hundreds of megawatts fleet-wide.

Apr 16 · 3 min read · OPC Wire · engineering.fb.com
Cloudflare Workers

Deploy Postgres and MySQL databases with PlanetScale + Workers

Cloudflare Workers gains native PlanetScale Postgres and MySQL provisioning via dashboard, with unified billing launching next month.

Apr 16 · 3 min read · OPC Wire · blog.cloudflare.com
Physical Intelligence

Robots Are Finally Starting to Work

Physical Intelligence is training a single foundation model to control multiple robot platforms zero-shot, skipping per-task data collection.

Apr 16 · 4 min read · OPC Wire · www.youtube.com
NVIDIA DeepStream

How to Build Vision AI Pipelines Using DeepStream Coding Agents

NVIDIA DeepStream 9 integrates with coding agents like Claude Code and Cursor to auto-generate real-time vision AI pipeline code.

Apr 16 · 3 min read · OPC Wire · developer.nvidia.com
Qwen

Alibaba Releases Qwen3.6-35B-A3B Mixture-of-Experts Model

Alibaba's Qwen team releases Qwen3.6-35B-A3B, a 35B-parameter MoE model activating 3B parameters per token.

Apr 16 · 2 min read · OPC Wire · www.reddit.com
Qwen

Qwen3.6-35B-A3B released!

Alibaba's Qwen team releases a 35B sparse MoE model with only 3B active params under Apache 2.0.

Apr 16 · 3 min read · OPC Wire · www.reddit.com
Cloudflare

Cloudflare’s AI Platform: an inference layer designed for agents

Cloudflare's AI Platform now routes 70+ models from 12+ providers via one API endpoint and shared credits.

Apr 16 · 4 min read · OPC Wire · blog.cloudflare.com
Cloudflare Workers AI

Building the foundation for running extra-large language models

Cloudflare details prefill-decode disaggregation and hardware configs powering Kimi K2.5 on Workers AI, achieving 3x speed gains.

Apr 16 · 4 min read · OPC Wire · blog.cloudflare.com
OpenCLI

OpenCLI Turns Any Website Into a Zero-Cost CLI Agent Tool

OpenCLI generates deterministic JS adapters once via LLM, then executes them zero-cost — 15.6k GitHub stars.

Apr 15 · 3 min read · OPC Wire · juejin.cn
AWS-Trainium2

Speculative Decoding on AWS Trainium2 Cuts LLM Latency Up to 3x

AWS benchmarks show speculative decoding with vLLM on Trainium2 reduces inter-token latency up to 3x for decode-heavy workloads.

Apr 15 · 4 min read · OPC Wire · aws.amazon.com
Gemma-4

Gemma 4 and Qwen 3.5 GGUFs: Detailed Analysis by oobabooga

Oobabooga published 5 benchmark reports covering 70-90 GGUF quants each for Gemma 4 and Qwen 3.5 models using KL Divergence methodology.

Apr 15 · 3 min read · OPC Wire · www.reddit.com
Gemma-4

Gemma 4 Jailbreak System Prompt

A system prompt designed to bypass Gemma 4's safety filters is circulating on Reddit with 112 upvotes.

Apr 15 · 3 min read · OPC Wire · www.reddit.com
Hermes-Agent

Hermes Agent Framework Hits 85K Stars With Self-Evolving Memory

Nous Research's Hermes Agent, open-sourced in February 2026, reaches 85K GitHub stars with a four-layer memory architecture and runtime skill accumu

Apr 15 · 4 min read · OPC Wire · juejin.cn
OpenAI

OpenAI Launches GPT-5.4-Cyber for Vetted Security Defenders

OpenAI releases GPT-5.4-Cyber, a fine-tuned security model, to verified defenders via its Trusted Access for Cyber program.

Apr 15 · 3 min read · OPC Wire · juejin.cn
LocalLLaMA

Local AI is the best

A Reddit post praising local AI tools contains no verifiable news, data, or technical developments.

Apr 15 · 2 min read · OPC Wire · www.reddit.com
Claude Code

Claude Code Desktop Rebuilt Around Parallel Agent Execution

Anthropic redesigned Claude Code desktop from scratch to run multiple AI coding agents simultaneously.

Apr 15 · 3 min read · OPC Wire · juejin.cn
Claude Code

The Era of AI Working the Night Shift Is Here! Claude Code Just Launched Routines

Anthropic releases Claude Code Routines in research preview, enabling scheduled and event-driven autonomous coding tasks on Anthropic's cloud infrast

Apr 15 · 3 min read · OPC Wire · juejin.cn
Amazon Bedrock

How Guidesly built AI-generated trip reports for outdoor guides on AWS

Guidesly's Jack AI uses AWS Lambda, Step Functions, and Amazon Bedrock to auto-publish trip content after each outdoor guide booking.

Apr 15 · 4 min read · OPC Wire · aws.amazon.com
SageMaker HyperPod

Best practices to run inference on Amazon SageMaker HyperPod

AWS details HyperPod inference deployment patterns, claiming up to 40% total cost of ownership reduction for GPU workloads.

Apr 15 · 4 min read · OPC Wire · aws.amazon.com
SageMaker JumpStart

AWS Adds Use-Case Deployment Presets to SageMaker JumpStart

SageMaker JumpStart now offers task-aware deployment configs optimized for cost, throughput, or latency by use case.

Apr 15 · 4 min read · OPC Wire · aws.amazon.com
Data Juicer

Alibaba Cloud PAI Processes 2M Videos in 200 Min via DataJuicer

Alibaba Cloud PAI ran a 7-stage video ML pipeline on 2M files (30K hours) across 45 NVIDIA 5090 nodes in 200 minutes.

Apr 14 · 4 min read · OPC Wire · juejin.cn
Qwen3.5

Qwen3.5-9B GGUF Quant Rankings: Q8_0 Dominates KLD Scores

KLD benchmarks across community GGUF quants show Q8_0 variants cluster near 0.001 KLD, with quality degrading sharply below Q5.

Apr 14 · 3 min read · OPC Wire · www.reddit.com
AI

LangChain's 10 Core Modules for Agent Dev: Code Comparisons

LangChain abstracts 10 engineering layers for AI agents, from multi-vendor LLM calls to RAG pipelines and observability.

Apr 14 · 4 min read · OPC Wire · juejin.cn
YOLOv8

YOLOv8 Hits 111 FPS on RK3588 for Drone Power Line Inspection

Chiba University achieves 111.3 FPS on a 6 TOPS edge chip via model pruning and async NPU scheduling.

Apr 14 · 4 min read · OPC Wire · juejin.cn
llama.cpp

On-Device AI Model Deployment in Practice, Part 5 (Loading Large Models on Android)

Step-by-step JNI bridge implementation for running quantized LLMs on Android using llama.cpp.

Apr 14 · 3 min read · OPC Wire · juejin.cn
Claude Code

Claude Code Skills vs MCP: Architecture Deep Dive

Anthropic's Claude Code uses two distinct extension layers: Skills for reusable domain workflows and MCP for real-world tool connectivity.

Apr 14 · 4 min read · OPC Wire · juejin.cn
Claude-Code

Component Reuse Enforcement via AGENTS.md, Hooks, and Skills

Dewu Engineering built a three-layer AI skill system to enforce component reuse before new component creation.

Apr 14 · 4 min read · OPC Wire · juejin.cn
Claude Code

Claude Teammate Mode: Multi-Agent Game Dev Postmortem

Developer deploys Claude's experimental multi-agent Teammate mode to build a TCM learning game — and documents where the workflow breaks down.

Apr 14 · 3 min read · OPC Wire · juejin.cn
Claude Code

Claude Code MCP Plugin Architecture: Cross-Process Tool Proxy Dissected

Source analysis reveals Claude Code uses stdio-based MCP servers as isolated subprocesses, proxying external tools transparently to the model.

Apr 14 · 3 min read · OPC Wire · juejin.cn