Claude Haiku 4.5

System card: https://assets.anthropic.com/m/99128ddd009bdcb/original/Clau...

  • Pretty cute pelican on a slightly dodgy bicycle: https://tools.simonwillison.net/svg-render#%3Csvg%20viewBox%...

  • Very preliminary testing is very promising: it seems far more precise in its code changes than the GPT-5 models, not pulling in code sections irrelevant to the task at hand, which tends to make GPT-5 as a coding assistant take longer than expected. With that being the case, it is possible that in actual day-to-day use Haiku 4.5 may be less expensive than the raw cost breakdown initially suggests, though the price increase is significant.

    Branding is the true issue that Anthropic has, though. Haiku 4.5 may (not saying it is, far too early to tell) be roughly equivalent in code output quality to Sonnet 4, which would serve a lot of users amazingly well, but by virtue of the connotations smaller models carry, alongside recent performance degradations making users more suspicious than before, getting them to adopt Haiku 4.5 over even Sonnet 4.5 will be challenging. I'd love to know whether Haiku 3, 3.5 and 4.5 are roughly in the same ballpark in terms of parameters, and of course, nerdy old me would like that to be public information for all models, but in fairness to the companies, many users would just go for the largest model, thinking it serves all use cases best. GPT-5 to me is still the most impressive because of its pricing relative to performance, and Haiku may end up similar, though with far less adoption. Everyone believes their task requires no less than Opus, it seems, after all.

    For reference:

    Haiku 3: I $0.25/M, O $1.25/M

    Haiku 4.5: I $1.00/M, O $5.00/M

    GPT-5: I $1.25/M, O $10.00/M

    GPT-5-mini: I $0.25/M, O $2.00/M

    GPT-5-nano: I $0.05/M, O $0.40/M

    GLM-4.6: I $0.60/M, O $2.20/M
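    To get a rough sense of what these rates mean per request, here is a small sketch; the 30k-input / 2k-output workload is a purely illustrative assumption, not a measured figure:

    ```python
    # Per-request cost for a hypothetical agentic-coding call:
    # 30k input tokens, 2k output tokens (illustrative numbers only).
    PRICES = {  # (input $/M tokens, output $/M tokens), from the list above
        "Haiku 3": (0.25, 1.25),
        "Haiku 4.5": (1.00, 5.00),
        "GPT-5": (1.25, 10.00),
        "GPT-5-mini": (0.25, 2.00),
        "GPT-5-nano": (0.05, 0.40),
        "GLM-4.6": (0.60, 2.20),
    }

    def request_cost(model, input_tokens=30_000, output_tokens=2_000):
        """Dollar cost of one request at the listed per-million-token rates."""
        i, o = PRICES[model]
        return input_tokens / 1e6 * i + output_tokens / 1e6 * o

    for m in PRICES:
        print(f"{m:>10}: ${request_cost(m):.4f}")
    ```

    At this (assumed) input-heavy mix, Haiku 4.5 comes out at $0.04 per request versus $0.0575 for GPT-5, so the gap narrows compared to the headline output price.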

  • Ain't nobody got time to pick models and compare features. It's annoying enough having to switch from one LLM ecosystem to another all the time due to vague usage restrictions. I'm paying $20/mo to Anthropic for Claude Code, to OpenAI for Codex, and previously to Cursor for...I don't even know what. I know Cursor lets you select a few different models under the covers, but I have no idea how they differ, nor do I care.

    I just want consistent tooling and I don't want to have to think about what's going on behind the scenes. Make it better. Make it better without me having to do research and pick and figure out what today's latest fashion is. Make it integrate in a generic way, like TLS servers, so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.
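    The "generic integration" wish is partly real already: many providers expose OpenAI-compatible chat-completions endpoints, so one request shape can work across backends with only the base URL and model name swapped. A minimal sketch (the endpoints and model names below are illustrative assumptions; check each provider's docs for actual compatibility):

    ```python
    import json
    import urllib.request

    def build_chat_request(base_url, api_key, model, prompt):
        """Build an OpenAI-style chat-completions request. For providers
        that support this shape, only base_url and model change."""
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
        return urllib.request.Request(
            f"{base_url}/chat/completions",
            data=json.dumps(payload).encode(),
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
        )

    # Same client code, different backend (hypothetical values):
    req = build_chat_request("https://api.openai.com/v1", "sk-...", "gpt-5-mini", "hi")
    ```

    This is the TLS-like layer the comment is asking for; the gap today is that tool-calling and streaming details still differ enough between vendors that the abstraction leaks.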

  • I am really interested in the future of Opus; is it going to be an absolute monster, and continue to be wildly expensive? Or is the leap from 4 -> 4.5 going to be more modest for it?

  • Comparing haiku and sonnet for a question needing a code doc fetch:

    haiku https://claude.ai/share/8a5c70d5-1be1-40ca-a740-9cf35b1110b1 sonnet https://claude.ai/share/51b72d39-c485-44aa-a0eb-30b4cc6d6b7b

    Haiku invented the output of a function and gave a bad answer; Sonnet got it right.

  • I've benchmarked it on the Extended NYT Connections (https://github.com/lechmazur/nyt-connections/). It scores 20.0 compared to 10.0 for Haiku 3.5, 19.2 for Sonnet 3.7, 26.6 for Sonnet 4.0, and 46.1 for Sonnet 4.5.

  • $1/M input tokens and $5/M output tokens is good compared to Claude Sonnet 4.5, but nowadays, thanks to the pace at which the industry is developing smaller/faster LLMs for agentic coding, you can get comparable models priced much lower, which matters at the scale agentic coding demands.

    Given that Sonnet is still a popular model for coding despite the much higher cost, I expect Haiku will get traction if the quality is as good as this post claims.

  • I have had a great experience using the previous Haiku with MCP servers. I am looking forward to trying this one out.

  • System card: https://assets.anthropic.com/m/99128ddd009bdcb/original/Clau... (edit: discussed here https://news.ycombinator.com/item?id=45596168)

    This is Anthropic's first small reasoner as far as I know.

  • I am very excited about this. I am a freelance developer and getting responses 3x faster is totally worth the slightly reduced capability.

    I expect I will be a lot more productive using this instead of Sonnet 4.5, which has been my daily driver LLM since it came out.

  • I've tried it on a test case for generating a simple SaaS web page (design + code).

    Usually I use GPT-5-mini for that task. Haiku 4.5 runs 3x faster with roughly comparable results (I slightly prefer the GPT-5-mini output, but I may just have become accustomed to it).

  • What is the use case for these tiny models? Is it speed? Is it to move on device somewhere? Or is it to provide some relief in pricing somewhere in the API? It seems like most use is through the Claude subscription and therefore the use case here is basically non-existent.

  • In our (very) early testing at Hyperbrowser, we're seeing Haiku 4.5 do really well on computer use as well. Pretty cool that Haiku is now the cheapest computer-use model from the big labs.

  • If I'm close to weekly limits on Claude Code with Anthropic Pro, does that go away or stretch out if I switch to Haiku?

  • Sonnet 4.5 is an excellent model for my startup's use case. Chatting with Haiku, it looks promising too, and it may be a great drop-in replacement for some of the inference tasks that have a lot of input tokens but don't require 4.5-level intelligence.

  • > The score reported uses a minor prompt addition: "You should use tools as much as possible, ideally more than 100 times. You should also implement your own tests first before attempting the problem."

    I'm not sure the SWE benchmark score can be compared like for like with OpenAI's scores because of this.

  • Claude has stopped showing code in artifacts unless it knows the extension.

    I used to be able to work on Arduino .ino files in Claude now it just says it can’t show it to me.

    And do we have zip file uploads yet to Claude? ChatGPT and Gemini have done this for ages.

    And all the while Claude’s usage limits keep going up.

    So yeah, less for more with Claude.

  • Curious they don't have any comparison to grok code fast:

    Haiku 4.5: I $1.00/M, O $5.00/M

    Grok Code: I $0.20/M, O $1.50/M

  • While I use cheaper models for summaries (a lot of gemini-2.5-flash), what's the use case of cheaper AI for coding? Getting more errors, or more spaghetti code, never seems worth it.

  • I went looking for the bit about if it blackmails you or tries to murder you... and it was a bit of a cop-out!

    > Previous system cards have reported results on an expanded version of our earlier agentic misalignment evaluation suite: three families of exotic scenarios meant to elicit the model to commit blackmail, attempt a murder, and frame someone for financial crimes. We choose not to report full results here because, similarly to Claude Sonnet 4.5, Claude Haiku 4.5 showed many clear examples of verbalized evaluation awareness on all three of the scenarios tested in this suite. Since the suite only consisted of many similar variants of three core scenarios, we expect that the model maintained high unverbalized awareness across the board, and we do not trust it to be representative of behavior in the real extreme situations the suite is meant to emulate.

    https://www.anthropic.com/research/agentic-misalignment

  • I just don't find the benchmarks on the site here at all believable. Codex with GPT-5 is, for me, so much better than Claude on any model version. Maybe it's because they compare against the gpt-5-codex model, but they don't mention whether that's on high, medium, or low reasoning effort, so it's probably just misleading... but I must reiterate: zero loyalty to any AI vendor. 100% whatever solves the problem more consistently and at higher quality, and currently that's GPT-5 high, hands down.

  • > In the system card, we focus on safety evaluations, including assessments of: ... the model’s own potential welfare ...

    In what way does a language model need to have its own welfare protected? Does this generation of models have persistent "feelings"?

  • Tried it in Claude Code via /config, makes it feel like I'm running on Cerebras. It's seriously fast, bottleneck is on human review at this point.

  • At augmentcode.com, we've been evaluating Haiku for some time, and it's actually a very good model. We found it's 90% as good as Sonnet and ~34% faster than Sonnet!

    Where it doesn't shine as much is on very large coding tasks, but it is a phenomenal model for small coding tasks, and the speed improvement is most welcome.

  • The main thing holding these Anthropic models back is context size. Yes, quality deteriorates over a large context window, but for some applications that is fine. My company is using grok-4-fast, the Gemini family, and GPT-4.1 exclusively at this point for a lot of operations, just due to the huge 1M+ context.

  • What LLM do you guys use for fast inference for voice/phone agents? I feel like to get really good latency I need to "cheat" with Cerebras, groq or SambaNova.

    Haiku 4.5 is very good but still seems to be adding a second of latency.
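    For voice agents, the number that matters is time to first streamed token, not total completion time. A minimal timing harness sketch; `fake_stream` here is a hypothetical stand-in for whatever streaming client you actually use:

    ```python
    import time

    def time_to_first_token(stream_fn):
        """Seconds until the first chunk arrives from a streaming
        generator -- the latency a voice caller actually perceives."""
        start = time.perf_counter()
        for _chunk in stream_fn():
            return time.perf_counter() - start  # stop at first chunk
        return float("inf")  # stream produced nothing

    # Hypothetical stand-in stream for illustration:
    def fake_stream():
        time.sleep(0.05)  # pretend the model took 50 ms to start
        yield "Hello"
        yield " world"

    latency = time_to_first_token(fake_stream)
    ```

    Running the same harness against each candidate provider's streaming call is a quick way to see whether that extra second is model latency or network/queueing overhead.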

  • I'm not seeing it as a model option in Claude Code for my Pro plan. Perhaps, it'll roll out eventually? Anyone else seeing it with the same plan?

  • Awww they took away free tier Sonnet 4.5, that was a beautiful model to talk to even outside coding stuff

  • OK, I use Claude, mostly on default, but with extended thinking and per-project prompts.

    What's the advantage of using Haiku for me?

    Is it just faster?

  • Glad to see it's already available in VS Code Copilot for me.

  • And I was wondering today why Sonnet 4.5 seemed so freaking slow. Now this explains it, Sonnet 4.5 is the new Opus 4.1 where Anthropic does not really want you to use it.

  • What I want to see is an Anthropic + Cerebras partnership.

    Haiku becomes a fucking killer at 2000token/second.

    Charge me double idgaf

  • I'd like to see this price structure for Claude:

    $5/mt for Haiku 4.5

    $10/mt for Sonnet 4.5

    $15/mt for Opus 4.5 when it's released.

  • claude --model Haiku-4.5

    doesn't work

  • Ehh, expensive

  • Was anyone else slightly disappointed that this new product doesn't respond in Haiku, as the name would imply?

  • Claude Code is great but slow to work with.

    Excited to see how fast Haiku can go!

  • For those wondering where the “card” terminology originated: https://arxiv.org/pdf/1810.03993

    Maybe at 39 pages we should start looking for a different term…