Hacker News — vinext + Cloudflare Workers

▲Cursor Introduces Composer 2.5 (twitter.com)

79 points by asar 6 hours ago | 40 comments

granzymes 2 hours ago [-]

Surprised this got pushed off the front page so quickly! It’s exciting to see what the Cursor team has been able to do with significantly fewer resources than the frontier labs.

I do wish they weren’t joining xAI. Something tells me there will be a contingent of researchers that departs Cursor if that merger is consummated.

memoryleakgame 2 hours ago [-]

If these benches from their site hold up (they likely won't)

Wouldn't this compress ai revenue like 15x quickly

If they really have a 4.7 opus high equivalent at 1/16 the cost wouldn't this significantly affect all the current capex and planning

Maybe they are getting elon to cover cost

infecto 1 hour ago [-]

The way I have read their benchmark results is that they trained a model to work insanely well in their coding workflow. It’s not a general purpose model.

One of the surprisingly hard problems to solve is getting a model to use the tools you give it access to.

2001zhaozhao 1 hour ago [-]

> compress ai revenue like 15x

that roughly just puts it on par with OpenAI and Anthropic subscriptions in terms of pricing per token

zackify 1 hour ago [-]

this thing is so awesome on fast mode, so far i am impressed, some of its observations feel similar to opus.

i use gpt 5.5 and opus 4.7 a lot every day, if i can get good results at this speed, hopefully the usage level holds up on my team plan haha

asar 6 hours ago [-]

The model is (like Composer 2) based on Kimi K2.5 and they claim SOTA performance for 1/10th of the cost. The tweet also mentions that they've started a new model from scratch on Colossus 2 (xAI/SpaceX Cluster). Really impressive how they've made this jump from being called the vscode fork with no moat just a couple of months ago.

onlyrealcuzzo 5 hours ago [-]

> Really impressive how they've made this jump from being called the vscode fork with no moat just a couple of months ago.

Impressive, yes. But they still don't have a moat...

infecto 1 hour ago [-]

I am not sure we should dismiss what they have today. Nobody has yet come close with a full-package IDE that works well for coding. Is that not a moat? It is easy for me to discount it in my head, thinking that I could build something myself, but between autocomplete and their workflow for agent use, it feels like they have some tangible moat emerging.

alach11 2 hours ago [-]

Isn't a large user base and the data collected from those users a moat of sorts?

onlyrealcuzzo 2 hours ago [-]

A moat is when you have something others can't easily get.

Every MAG 7 / FAANG company already has more users and more data...

That's not a moat.

That's traction.

AussieWog93 2 hours ago [-]

Honestly the data itself is probably worth heaps even if the company itself collapses. Early attention engineering when humans were still in the loop!!!

kkukshtel 4 hours ago [-]

And it's still just a vscode fork

liuliu 5 hours ago [-]

Since the frontier is only 8 months ahead of DeepSeek, it is hard to see how model training can be a moat as all the tricks are available from open labs in China. You really just need <100m to bootstrap at this point.

Lionga 6 hours ago [-]

They are still a vscode fork with no moat? Like they lost about 70% of users in half a year, which goes to show there is not even the tiniest of moats.

GenerWork 5 hours ago [-]

I feel like they've been targeting enterprise pretty hard. I know my company uses them, and the companies that hire us also use Cursor.

whywhywhywhy 5 hours ago [-]

It's still a VsCode fork just now with a Kimi fine tune and still no moat...

I won't debate that it turns out none of this mattered when it came to being a successful company, though, and it kinda makes anyone who tried to roll their own instead of fork look a little silly.

aurareturn 5 hours ago [-]

I doubt it's a brand new model. It's likely just Kimi K2.5 further trained on coding.

enraged_camel 5 hours ago [-]

They didn't say it's a new model... in fact they said exactly what you just said.

PUSH_AX 6 hours ago [-]

They set themselves up for flak when they use whatever these evals are… they did the same for Composer 2, which was evaled in close competition with frontier models. Spoiler alert: it wasn’t even close in practice.

So now 2.5 is supposed to compete with opus 4.7? Sure…

tuo-lei 5 hours ago [-]

they say it themselves in the post - behavior dimensions "not well captured by existing benchmarks". that was the exact problem with composer 2. not dumber on individual tasks, just bad at session-level decisions like when to stop editing, how much context to carry forward, when to re-read a file vs assume. you don't catch any of that in an isolated eval.

infecto 1 hour ago [-]

As I have said before in prior Composer threads, the proof is in the usage. I am inclined to somewhat believe the results as I use Composer, and I also take the results in their given context. It’s not a general purpose SOTA model. It’s a model that runs inexpensively in their coding workflow and produces results similar to Opus or GPT.

criemen 5 hours ago [-]

Well, is that a statement about the quality of Opus 4.7 or about Composer 2.5? :P

everfrustrated 6 hours ago [-]

Full details https://cursor.com/blog/composer-2-5

jtwaleson 5 hours ago [-]

Ok this might be weird but I've moved everyone in my 4-person team to our team plan and costs seem to have skyrocketed compared to the individual plans. Where before most people spent 20-100 USD, now the total bill is more like 1k USD. I haven't gone into the details but it feels like I'm being scammed.

infecto 1 hour ago [-]

Keep in mind I believe there is a larger buffer given to personal plans. If they have 50% extra with the personal plan you now only get 25%.

danbrooks 4 hours ago [-]

Check which model you're using.

The fast version of Composer is the default now (which costs ~3x as much).

PUSH_AX 5 hours ago [-]

My Cursor costs skyrocketed recently too

lukebrichey 3 hours ago [-]

this feels super bullish on cursor/spacexai's ability to train a frontier level model. could be truly SOTA on coding given that their RL data is this powerful

vanuatu 5 hours ago [-]

It's always great that more companies are throwing their hat in the ring, especially focusing on value (latency + intelligence + cost)

jdlyga 6 hours ago [-]

It's a bit odd that they're not comparing it against Sonnet

jjice 6 hours ago [-]

I don't think so. They're comparing it to the highest tier available models from Anthropic and OpenAI. Generally speaking, Opus is better than Sonnet in almost every way, so why have the redundancy?

CodingJeebus 5 hours ago [-]

The tweet specifies that the new model is geared towards long-running tasks, which is what you'd use a model like Opus for anyway.

polski-g 2 hours ago [-]

I don't know why their model isn't on OpenRouter yet. They must not have enough capacity to offer it.

svclaws 6 hours ago [-]

Their previous Composer was already marketed as a cheap model capable of competing with SOTA on most tasks. The evals they shared back then backed this up but in my day-to-day usage it fell short across the board. Canceled my cursor subscription and switched to Claude Code a few weeks ago. It has its own shortcomings but in terms of model capability and UX quality Cursor will have a hard time competing in the long term. Elon Musk will be a very good way out for them.

re-thc 5 hours ago [-]

Did they just upgrade Kimi 2.5 to 2.6?

lukebrichey 3 hours ago [-]

still uses 2.5

sergiotapia 6 hours ago [-]

Congratulations on the launch! I'm interested in trying Cursor but it's very confusing what I should buy. What does the Pro $20 plan get me in usage if I only use Composer 2.5? How fast is the model?

darkwi11ow 5 hours ago [-]

I have used the $20 plan on a daily basis for more than a year now, and have yet to exhaust that limit. The plan includes $20 in API costs for non-Cursor premium models and $20 for Composer and Auto models provided by Cursor themselves.

That said, I am a pretty old-fashioned coder and use LLMs mostly to overcome the blank page problem, which means I review and often rewrite LLM output by hand and avoid prompt loops for a single task.

People who are aiming to not read code any more might find this $20 plan lacking for their needs, however for my needs it fits perfectly.

kaizoku156 5 hours ago [-]

The limits are probably even higher than that, i seem to get about $100+ of usage on Composer and about 45-50 USD on non-Composer models

ChrisArchitect 5 hours ago [-]

Non-x link: https://cursor.com/blog/composer-2-5 (https://news.ycombinator.com/item?id=48182126)
