Claude Haiku 4.5

System card: https://assets.anthropic.com/m/99128ddd009bdcb/original/Clau...

  • Pretty cute pelican on a slightly dodgy bicycle: https://tools.simonwillison.net/svg-render#%3Csvg%20viewBox%...

  • Very preliminary testing is very promising: it seems far more precise in its code changes than the GPT-5 models, not pulling in code sections irrelevant to the task at hand, which tends to make GPT-5 as a coding assistant take longer than expected. With that being the case, it is possible that in actual day-to-day use Haiku 4.5 may be less expensive than the raw cost breakdown initially suggests, though the price increase is significant.

    Branding is the true issue that Anthropic has, though. Haiku 4.5 may (not saying it is, far too early to tell) be roughly equivalent in code output quality to Sonnet 4, which would serve a lot of users amazingly well, but by virtue of the connotations smaller models carry, alongside recent performance degradations making users more suspicious than before, getting them to adopt Haiku 4.5 over even Sonnet 4.5 will be challenging. I'd love to know whether Haiku 3, 3.5 and 4.5 are roughly in the same ballpark in terms of parameters, and of course, nerdy old me would like that to be public information for all models, but in fairness to the companies, many users would just go for the largest model, thinking it serves all use cases best. GPT-5 to me is still the most impressive because of its pricing relative to performance, and Haiku may end up similar, though with far less adoption. Everyone believes their task requires no less than Opus, it seems, after all.

    For reference:

    Haiku 3: I $0.25/M, O $1.25/M

    Haiku 4.5: I $1.00/M, O $5.00/M

    GPT-5: I $1.25/M, O $10.00/M

    GPT-5-mini: I $0.25/M, O $2.00/M

    GPT-5-nano: I $0.05/M, O $0.40/M

    GLM-4.6: I $0.60/M, O $2.20/M
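    To get a rough sense of what these rates mean per request, here is a small sketch; the 30k-input / 2k-output workload is a purely illustrative assumption, not a measured figure:

    ```python
    # Per-request cost for a hypothetical agentic-coding call:
    # 30k input tokens, 2k output tokens (illustrative numbers only).
    PRICES = {  # (input $/M tokens, output $/M tokens), from the list above
        "Haiku 3": (0.25, 1.25),
        "Haiku 4.5": (1.00, 5.00),
        "GPT-5": (1.25, 10.00),
        "GPT-5-mini": (0.25, 2.00),
        "GPT-5-nano": (0.05, 0.40),
        "GLM-4.6": (0.60, 2.20),
    }

    def request_cost(model, input_tokens=30_000, output_tokens=2_000):
        """Dollar cost of one request at the listed per-million-token rates."""
        i, o = PRICES[model]
        return input_tokens / 1e6 * i + output_tokens / 1e6 * o

    for m in PRICES:
        print(f"{m:>10}: ${request_cost(m):.4f}")
    ```

    At this (assumed) input-heavy mix, Haiku 4.5 comes out at $0.04 per request versus $0.0575 for GPT-5, so the gap narrows compared to the headline output price.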

  • Ain't nobody got time to pick models and compare features. It's annoying enough having to switch from one LLM ecosystem to another all the time due to vague usage restrictions. I'm paying $20/mo to Anthropic for Claude Code, to OpenAI for Codex, and previously to Cursor for...I don't even know what. I know Cursor lets you select a few different models under the covers, but I have no idea how they differ, nor do I care.

    I just want consistent tooling and I don't want to have to think about what's going on behind the scenes. Make it better. Make it better without me having to do research and pick and figure out what today's latest fashion is. Make it integrate in a generic way, like TLS servers, so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.
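    The "generic integration" wish is partly real already: many providers expose OpenAI-compatible chat-completions endpoints, so one request shape can work across backends with only the base URL and model name swapped. A minimal sketch (the endpoints and model names below are illustrative assumptions; check each provider's docs for actual compatibility):

    ```python
    import json
    import urllib.request

    def build_chat_request(base_url, api_key, model, prompt):
        """Build an OpenAI-style chat-completions request. For providers
        that support this shape, only base_url and model change."""
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
        return urllib.request.Request(
            f"{base_url}/chat/completions",
            data=json.dumps(payload).encode(),
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
        )

    # Same client code, different backend (hypothetical values):
    req = build_chat_request("https://api.openai.com/v1", "sk-...", "gpt-5-mini", "hi")
    ```

    This is the TLS-like layer the comment is asking for; the gap today is that tool-calling and streaming details still differ enough between vendors that the abstraction leaks.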

  • I am really interested in the future of Opus; is it going to be an absolute monster, and continue to be wildly expensive? Or is the leap from 4 -> 4.5 going to be more modest for it?

  • Comparing haiku and sonnet for a question needing a code doc fetch:

    haiku https://claude.ai/share/8a5c70d5-1be1-40ca-a740-9cf35b1110b1 sonnet https://claude.ai/share/51b72d39-c485-44aa-a0eb-30b4cc6d6b7b

    Haiku invented the output of a function and gave a bad answer; Sonnet got it right.

  • I've benchmarked it on the Extended NYT Connections (https://github.com/lechmazur/nyt-connections/). It scores 20.0 compared to 10.0 for Haiku 3.5, 19.2 for Sonnet 3.7, 26.6 for Sonnet 4.0, and 46.1 for Sonnet 4.5.

  • $1/M input tokens and $5/M output tokens is good compared to Claude Sonnet 4.5, but nowadays, thanks to the pace at which the industry is developing smaller/faster LLMs for agentic coding, you can get comparable models priced much lower, which matters at the scale agentic coding demands.

    Given that Sonnet is still a popular model for coding despite the much higher cost, I expect Haiku will get traction if the quality is as good as this post claims.

  • I have had a great experience using the previous Haiku with MCP servers. I am looking forward to trying this one out.

  • System card: https://assets.anthropic.com/m/99128ddd009bdcb/original/Clau... (edit: discussed here https://news.ycombinator.com/item?id=45596168)

    This is Anthropic's first small reasoner as far as I know.

  • I am very excited about this. I am a freelance developer and getting responses 3x faster is totally worth the slightly reduced capability.

    I expect I will be a lot more productive using this instead of Sonnet 4.5, which has been my daily driver LLM since it came out.

  • I've tried it on a test case for generating a simple SaaS web page (design + code).

    Usually I use GPT-5-mini for that task. Haiku 4.5 runs 3x faster with roughly comparable results (I slightly prefer the GPT-5-mini output, but I may just have become accustomed to it).

  • What is the use case for these tiny models? Is it speed? Is it to move on device somewhere? Or is it to provide some relief in pricing somewhere in the API? It seems like most use is through the Claude subscription and therefore the use case here is basically non-existent.

  • In our (very) early testing at Hyperbrowser, we're seeing Haiku 4.5 do really well on computer use as well. Pretty cool that Haiku is now the cheapest computer-use model from the big labs.

  • If I'm close to weekly limits on Claude Code with Anthropic Pro, does that go away or stretch out if I switch to Haiku?

  • Sonnet 4.5 is an excellent model for my startup's use case. Chatting with Haiku, it looks promising too, and it may be a great drop-in replacement for some of the inference tasks that have a lot of input tokens but don't require 4.5-level intelligence.

  • > The score reported uses a minor prompt addition: "You should use tools as much as possible, ideally more than 100 times. You should also implement your own tests first before attempting the problem."

    I'm not sure the SWE benchmark score can be compared like for like with OpenAI's scores because of this.

  • Claude has stopped showing code in artifacts unless it knows the extension.

    I used to be able to work on Arduino .ino files in Claude now it just says it can’t show it to me.

    And do we have zip file uploads yet to Claude? ChatGPT and Gemini have done this for ages.

    And all the while Claude’s usage limits keep going up.

    So yeah, less for more with Claude.

  • Curious they don't have any comparison to grok code fast:

    Haiku 4.5: I $1.00/M, O $5.00/M

    Grok Code: I $0.20/M, O $1.50/M

  • While I use cheaper models for summaries (a lot of gemini-2.5-flash), what's the use case of cheaper AI for coding? Getting more errors, or more spaghetti code, never seems worth it.

  • I went looking for the bit about if it blackmails you or tries to murder you... and it was a bit of a cop-out!

    > Previous system cards have reported results on an expanded version of our earlier agentic misalignment evaluation suite: three families of exotic scenarios meant to elicit the model to commit blackmail, attempt a murder, and frame someone for financial crimes. We choose not to report full results here because, similarly to Claude Sonnet 4.5, Claude Haiku 4.5 showed many clear examples of verbalized evaluation awareness on all three of the scenarios tested in this suite. Since the suite only consisted of many similar variants of three core scenarios, we expect that the model maintained high unverbalized awareness across the board, and we do not trust it to be representative of behavior in the real extreme situations the suite is meant to emulate.

    https://www.anthropic.com/research/agentic-misalignment

  • I just don't find the benchmarks on the site here at all believable. Codex with GPT-5 is, for me, so much better than Claude on any model version. Maybe it's because they compare against the gpt-5-codex model, but they don't mention whether that's on high, medium, or low reasoning effort, so it's probably just misleading... but I must reiterate: zero loyalty to any AI vendor. 100% whatever solves the problem more consistently and at higher quality, and currently that's GPT-5 high, hands down.

  • > In the system card, we focus on safety evaluations, including assessments of: ... the model’s own potential welfare ...

    In what way does a language model need to have its own welfare protected? Does this generation of models have persistent "feelings"?

  • Tried it in Claude Code via /config, makes it feel like I'm running on Cerebras. It's seriously fast, bottleneck is on human review at this point.

  • At augmentcode.com, we've been evaluating Haiku for some time, and it's actually a very good model. We found it's 90% as good as Sonnet and ~34% faster than Sonnet!

    Where it doesn't shine as much is on very large coding tasks, but it is a phenomenal model for small coding tasks, and the speed improvement is most welcome.

  • The main thing holding these Anthropic models back is context size. Yes, quality deteriorates over a large context window, but for some applications that is fine. My company is using grok-4-fast, the Gemini family, and GPT-4.1 exclusively at this point for a lot of operations, just due to the huge 1M+ context.

  • What LLM do you guys use for fast inference for voice/phone agents? I feel like to get really good latency I need to "cheat" with Cerebras, groq or SambaNova.

    Haiku 4.5 is very good but still seems to be adding a second of latency.
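    For voice agents, the number that matters is time to first streamed token, not total completion time. A minimal timing harness sketch; `fake_stream` here is a hypothetical stand-in for whatever streaming client you actually use:

    ```python
    import time

    def time_to_first_token(stream_fn):
        """Seconds until the first chunk arrives from a streaming
        generator -- the latency a voice caller actually perceives."""
        start = time.perf_counter()
        for _chunk in stream_fn():
            return time.perf_counter() - start  # stop at first chunk
        return float("inf")  # stream produced nothing

    # Hypothetical stand-in stream for illustration:
    def fake_stream():
        time.sleep(0.05)  # pretend the model took 50 ms to start
        yield "Hello"
        yield " world"

    latency = time_to_first_token(fake_stream)
    ```

    Running the same harness against each candidate provider's streaming call is a quick way to see whether that extra second is model latency or network/queueing overhead.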

  • I'm not seeing it as a model option in Claude Code for my Pro plan. Perhaps, it'll roll out eventually? Anyone else seeing it with the same plan?

  • Awww they took away free tier Sonnet 4.5, that was a beautiful model to talk to even outside coding stuff

  • OK, I use Claude, mostly on default, but with extended thinking and per-project prompts.

    What's the advantage of using Haiku for me?

    Is it just faster?

  • Glad to see it's already available in VS Code Copilot for me.

  • And I was wondering today why Sonnet 4.5 seemed so freaking slow. Now this explains it, Sonnet 4.5 is the new Opus 4.1 where Anthropic does not really want you to use it.

  • What I want to see is an Anthropic + Cerebras partnership.

    Haiku becomes a fucking killer at 2000token/second.

    Charge me double idgaf

  • I'd like to see this price structure for Claude:

    $5/mt for Haiku 4.5

    $10/mt for Sonnet 4.5

    $15/mt for Opus 4.5 when it's released.

  • claude --model Haiku-4.5

    doesn't work

  • Ehh, expensive

  • Was anyone else slightly disappointed that this new product doesn't respond in Haiku, as the name would imply?

  • Claude Code is great but slow to work with.

    Excited to see how fast Haiku can go!

  • For those wondering where the “card” terminology originated: https://arxiv.org/pdf/1810.03993

    Maybe at 39 pages we should start looking for a different term…