AI Benchmarks for Code

First Benchmark for Legacy Code Comprehension Shows Specialized AI Approach Outperforms General-Purpose Models

LegacyCodeBench tests whether AI can understand COBOL well enough to document it accurately, not just generate plausible ...

This week in AI updates: Google’s UCP standard, a redesigned Slackbot, and more (January 16, 2026)

Google has announced a new open-source standard for agentic commerce called the Universal Commerce Protocol (UCP).

Slator

Italian Benchmark Evaluates Large Language Models, Includes AI Translation

A new community-driven initiative evaluates large language models using Italian-native tasks, with AI translation among the ...

Z.ai's open source GLM-Image beats Google's Nano Banana Pro at complex text rendering, but not aesthetics

Furthermore, Nano Banana Pro still edged out GLM-Image in terms of pure aesthetics — using the OneIG benchmark, Nano Banana 2 ...

AI’s Hacking Skills Are Approaching an ‘Inflection Point’

AI models are getting so good at finding vulnerabilities that some experts say the tech industry might need to rethink how ...

The Next Web

Who decides the best AI?

AI evaluation platform LMArena secures $150M at a $1.7B valuation as it expands human-driven model comparisons. Read more in ...

AI evaluation startup LMArena raises $150M at $1.7B valuation

“We cannot deploy AI responsibly without knowing how it delivers value to humans,” said LMArena co-founder and Chief ...

InfoWorld

AI won’t replace human devs for at least 5 years

Progress towards full AI-driven coding automation continues, but in steps rather than leaps, giving organizations time to ...

DeepSeek V4 Leaked: Coding-First Model Aims at Devs with New Memory & Reasoning AI

Rumors suggest two DeepSeek V4 options, a flagship for long coding and a lighter build, so teams can ship multi-file updates ...

Claude Code just got updated with one of the most-requested user features

For the end user, this update is seamless: Claude Code simply feels "smarter" and retains more memory of the conversation.

SlashGear

Is OpenAI Falling Behind In The Artificial Intelligence 'Arms Race'?

Describing AI development as an "arms race" might seem needlessly bombastic, but there's a reason why this term has entered common usage. It encapsulates the speed and intensity at which companies are ...

Nvidia’s NVentures backs Harmonic AI in Series C for mathematical superintelligence

Mathematical superintelligence startup Harmonic AI Inc. revealed today that NVentures, the venture capital arm of Nvidia Corp., was among the investors in its $120 million Series C round that was ...
