Claude 4.5 vs GPT-5.1

Which AI writes better code in 2025?

Last updated: November 2025 • Based on real benchmark data

The Verdict (TL;DR)

Claude 4.5 narrowly leads on the SWE-bench Verified coding benchmark (77.2% vs 76.3%, a 0.9-point margin) and excels at complex refactoring tasks.

GPT-5.1 is faster (~70 vs ~45 tokens/sec), cheaper per token, and better suited to general-purpose tasks beyond coding.

Bottom line: For serious coding work, Claude 4.5 has a slight edge. For mixed workloads or tight budgets, GPT-5.1 is competitive.

Benchmark	Claude 4.5	GPT-5.1	Winner
SWE-bench Verified (real GitHub bug fixes)	77.2%	76.3%	Claude
AIME 2025 (advanced math problems)	—	94.0%	GPT-5.1
OSWorld (computer control tasks)	61.4%	—	Claude
Response speed (average tokens/sec)	~45 t/s	~70 t/s	GPT-5.1
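To make the speed row concrete, here is a minimal sketch that converts the approximate throughput figures into generation time for a single response. The 2,000-token response length is a hypothetical value chosen for illustration, and the calculation assumes a constant streaming rate, ignoring time-to-first-token and network latency.

```python
# Approximate throughput figures from the table above (tokens per second).
THROUGHPUT_TPS = {"Claude 4.5": 45.0, "GPT-5.1": 70.0}


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream `output_tokens` at a constant rate.

    Simplifying assumption: ignores time-to-first-token and network latency.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    output_tokens = 2_000  # hypothetical response length, not from the benchmarks
    for model, tps in THROUGHPUT_TPS.items():
        print(f"{model}: {generation_seconds(output_tokens, tps):.1f} s")
```

Under these assumptions, a 2,000-token response takes roughly 44 seconds at ~45 t/s and roughly 29 seconds at ~70 t/s. Real latency depends on load, prompt size, and region.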

Claude 4.5 pricing

Input: $3 / 1M tokens

Output: $15 / 1M tokens

Higher cost but best-in-class coding performance

GPT-5.1 pricing

Input: $2.50 / 1M tokens

Output: $10 / 1M tokens

More affordable with competitive performance
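The per-token prices are easier to compare on a concrete workload. The sketch below estimates cost from the listed rates. It assumes the $3 / $15 prices belong to Claude 4.5 and the $2.50 / $10 prices to GPT-5.1, as the surrounding text implies. The workload size is hypothetical, and the calculation ignores caching or batch discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices as listed above; which model each price belongs to is assumed from context.
PRICING = {
    "Claude 4.5": Pricing(input_per_million=3.00, output_per_million=15.00),
    "GPT-5.1": Pricing(input_per_million=2.50, output_per_million=10.00),
}


def workload_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a workload, billed linearly per token."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly workload: 1M input tokens, 200K output tokens.
    input_tokens, output_tokens = 1_000_000, 200_000
    for model, pricing in PRICING.items():
        cost = workload_cost(pricing, input_tokens, output_tokens)
        print(f"{model}: ${cost:.2f}")
```

For this workload the estimate comes to $6.00 for Claude 4.5 and $4.50 for GPT-5.1. Output tokens dominate the gap, so output-heavy workloads such as code generation widen the difference more than prompt-heavy ones.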

Claude 5 is expected in Q2-Q3 2026.

Sources

• SWE-bench Verified: Anthropic official announcement (Sep 2025), OpenAI System Card (Nov 2025)
• AIME 2025: OpenAI research paper
• OSWorld: Anthropic benchmark results
• Pricing: Official API pricing pages (Nov 2025)
• Performance tests: Axis Intelligence, artificialanalysis.ai