Claude Opus 4.7, Gemini 3.1 Pro, and Others Score 0% on New SWE Benchmark

6:29 PM IST · May 6, 2026

ProgramBench tests SWE agents' ability to develop complete software projects holistically from scratch.

Latest AI News

Apple Agrees to Pay $250 Million Settlement to iPhone 16, iPhone 15 Pro Owners Over AI Claims

Apple launched the iPhone 16 series in September 2024 as its first smartphones to ship with the “Apple Intelligence” suite of tools, the term the company uses instead of artificial intelligence (AI). The Cupertino-based tech giant also announced that the iPhone 15 Pro series, launched in 2023, would get support for the new AI-powered functionalities. However, last year, iPhone owners reportedly filed a number of class action lawsuits against Apple for over-committing and under-delivering on its AI promise. Now, the company has reportedly reached an agreement to pay compensation to millions of affected consumers.

2 hours ago

OpenAI Upgrades ChatGPT’s Default AI Model to GPT-5.5 Instant, Adds New Capabilities

OpenAI announced on Tuesday that it is updating the default artificial intelligence (AI) model in ChatGPT. The default model is the one everyone gets when they first open the website or the app, including those on the free tier. So far, this experience has been powered by GPT-5.3 Instant, but the San Francisco-based AI giant is now replacing it with GPT-5.5 Instant. Some of the key improvements include more personalised responses, better image analysis, stronger answers to science and math questions, and a natural conversational tone.

2 hours ago

Claude Opus 4.7, Gemini 3.1 Pro, and Others Score 0% on New SWE Benchmark

ProgramBench tests SWE agents' ability to develop complete software projects holistically from scratch.

2 hours ago

'AI-First' is Now Mandatory for Big 6 IT Audits, But They Don't Know How

India's AI and cloud boom is forcing IT companies to rethink how they audit, and how they get audited.

2 hours ago
