Cursor just dropped Composer 2.5, and the numbers are hard to ignore. The in-house coding model scores 79.8% on SWE-Bench Multilingual, matching Anthropic's Opus 4.7 and OpenAI's GPT-5.5, while charging $0.50 per million input tokens. On input pricing, that is 30x cheaper than Opus 4.7 ($15) and 20x cheaper than GPT-5.5 ($10), and it's live in Cursor right now.
The Technical Breakdown
Composer 2.5 is built on Moonshot AI's Kimi K2.5 checkpoint, an open-source foundation that Cursor fine-tuned with synthetic training data. According to their blog post, the model trained on 25 times more synthetic tasks than its predecessor, Composer 2, and a full 85% of the compute budget went toward extra training and reinforcement learning rather than raw pre-training. The result is a leaner operation that punches well above its weight class. On CursorBench v3.1, the new model hit 63.2%, a second data point suggesting the SWE-Bench result isn't a fluke on a single benchmark.
There are two pricing tiers: standard at $0.50 input / $2.50 output per million tokens, and a faster variant at $3.00 input / $15.00 output. For context, Anthropic charges $15 per million input tokens for Opus 4.7, while OpenAI's GPT-5.5 comes in at $10.
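To make the pricing gap concrete, here is a minimal Python sketch of the arithmetic using only the per-million-token prices quoted above. The function names and the sample workload (2M input tokens, 500K output tokens) are made up for illustration, and since output prices for Opus 4.7 and GPT-5.5 aren't quoted here, the cross-vendor comparison is limited to input tokens.

```python
# Input prices in USD per million tokens, as quoted in this article.
INPUT_PRICE_PER_M = {
    "Composer 2.5 (standard)": 0.50,
    "Composer 2.5 (fast)": 3.00,
    "Opus 4.7": 15.00,
    "GPT-5.5": 10.00,
}

# Composer 2.5 tiers as (input, output) USD per million tokens.
COMPOSER_TIERS = {
    "standard": (0.50, 2.50),
    "fast": (3.00, 15.00),
}


def input_price_ratio(model: str, baseline: str = "Composer 2.5 (standard)") -> float:
    """How many times more expensive `model` is than `baseline` on input tokens."""
    return INPUT_PRICE_PER_M[model] / INPUT_PRICE_PER_M[baseline]


def composer_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of a workload on one of the two Composer 2.5 tiers."""
    input_price, output_price = COMPOSER_TIERS[tier]
    return (input_tokens / 1_000_000) * input_price + (
        output_tokens / 1_000_000
    ) * output_price


if __name__ == "__main__":
    for model in ("Opus 4.7", "GPT-5.5", "Composer 2.5 (fast)"):
        print(f"{model}: {input_price_ratio(model):.0f}x the standard input price")

    # Hypothetical workload: 2M input tokens, 500K output tokens.
    for tier in COMPOSER_TIERS:
        print(f"{tier}: ${composer_cost(tier, 2_000_000, 500_000):.2f}")
```

Note that the 30x figure only holds against Opus 4.7 on the standard tier. Against GPT-5.5 the gap is 20x, and the fast tier narrows it to 5x and roughly 3.3x respectively.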
The Economics Don't Lie
Here's the thing: this fundamentally breaks the pricing model that Anthropic and OpenAI have been running on. If a smaller player using an open-source checkpoint can match frontier performance at a fraction of the cost, the premium those labs charge for coding tasks becomes very hard to justify. Cursor's approach, massive synthetic data generation plus targeted RL, suggests you don't need $10B+ training runs to build top-tier coding models. The recipe matters more than the budget.
What's Next
Cursor hasn't disclosed exact training costs for Composer 2.5, but they're already thinking bigger. A successor model is in active training with backing from SpaceX and xAI, running on the Colossus-2 cluster with one million H100 equivalents, ten times the compute used for the current model. That's a serious signal that Cursor isn't stopping here. SpaceX had previously announced plans to acquire Cursor for $60 billion, which explains how Cursor suddenly has access to this level of infrastructure. If the successor hits the rumored targets, we're looking at a potential benchmark leader by Q3 2026.
Key Takeaways
- Composer 2.5 scores 79.8% on SWE-Bench Multilingual, matching Opus 4.7 and GPT-5.5
- Pricing is $0.50/M input tokens versus $15 for Opus 4.7 (30x) and $10 for GPT-5.5 (20x)
- Built on open-source Kimi K2.5 checkpoint with heavy synthetic data fine-tuning
- Successor model training on Colossus-2 with 1M H100 equivalents backed by SpaceX/xAI
The Bottom Line
Cursor just made a strong case that frontier lab pricing is more about monopoly rents than necessity. If Composer 2.5 holds up in production, and early signs suggest it will, Anthropic and OpenAI are going to have to get comfortable with thinner margins on their coding products, or watch developers migrate wholesale.