FLUX.1-Krea and the Rise of Opinionated Models

  • Wan 2.2 is a video model that people have recently been using for text-to-image, and I think its base model solves this problem way better than Krea does. -- https://www.reddit.com/r/comfyui/comments/1mf521w/wan_22_tex...

    As others have said, you can fine-tune any model with a pretty small data set of images and captions and make your generations not look like 'AI' or all look the same.

    Here's one I made a while back trained on Sony HVS HD video demos from the 80s/90s -- https://civitai.com/models/896279/1990s-analog-hd-or-4k-sony...

  • This is what finetuning has been all about since Stable Diffusion 1.5, and especially SDXL. It is even something Stability AI's base models excelled at in the open-weights category. (Midjourney has always been the champion, but it is proprietary.)

    Sadly, with SAI going effectively bankrupt, things changed: their rushed 3.0 model was broken beyond repair, and the later 3.5 was just unfinished or something (the API version is remarkably better), with gens full of errors and artifacts even though the good ones looked great. It turned out to be hard to finetune as well.

    In the meantime Flux got released, but that model can be fried (as in, one concept trained in) but not finetuned (this Krea Flux is not based on the open-weights Flux). Add to that the fact that, as models got bigger, training/finetuning now costs an arm and a leg. So here we are: a year after Flux got released, a good finetune is celebrated as the next new thing :)

  • Hi there! Thank you for the glowing review! I'm the cofounder of Krea and I'm glad you liked Sangwu's blog post. The team is reading it.

    You'll probably get a lot of replies about how this model is just a fine-tune, and about a potential disregard for LoRAs, as if we didn't know about them. The reality is that we have thousands of them running on our platform. Sadly, there's only so much a LoRA and a fine-tune can do before you run into issues that can't be solved until you apply more advanced techniques such as curated post-training runs (including reinforcement learning-based techniques such as Diffusion-PPO[1]), or even large-scale pre-training.

    -

    [1]: https://diffusion-ppo.github.io

  • > Researchers have been overly focused on the extra fingers problem

    A funny consequence of this is that now it’s really hard to get models to intentionally generate disfigured hands (six fingers, missing middle finger).

  • I did a lot of testing with Krea. The results were certainly very different from flux-dev: less "AI-like" in some ways, with far better details, but very soft, a bit washed out, and more AI-like in other ways.

    I did a 50% mix of flux-dev-krea and flux-dev and it is my new favorite base model.

  • So, question -- does the author know that this post is merely about "what is widely known about" vs. "what is actually possible?"

    Which is to say -- if one is in the business or activity of "making AI images go a certain way," a quick perusal of e.g. Civitai turns up about a million solutions to the "problem" of "all the AI art looks the same."

  • So, the one thing I notice is that in every trio of original image, GPT-4.1 image, and Krea image where the author says GPT-4.1 exhibits the AI look and Krea avoids it (except the first with the cat), comparing the original image to the Krea image shows Krea retains all the described hallmarks of the AI look that are present in the GPT image, but just toned down a little bit. (In the first, it lacks the obvious bokeh because it avoids showing anything at a much different distance than the main subject, which is for that aesthetic issue what avoiding showing hands is for dealing with the correctness issue of bad hands.)

  • All but the last example look better (to me) on Krea than ChatGPT-4.1.

    The problem with AI images, in my opinion, is not the generated image (that can be better or worse) but the prompt and instructions given to the AI and their "defaults".

    So many blog posts and social media updates have that horrible (again, to me) overly plastic look and feel, like a cartoon that has been burn... just like "needs more JPEG" but "needs more AI-vibe".

  • Recent and related:

    Releasing weights for FLUX.1 Krea - https://news.ycombinator.com/item?id=44745555 - July 2025 (107 comments)

  • I look forward to the day someone trains a model that can do good writing, without em dashes, "it's not X, but Y", and all the rest of the AI slop.
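
Note on the "50% mix" comment above: the commenter does not say how flux-dev-krea and flux-dev were mixed. One common way to blend two checkpoints that share an architecture is to linearly interpolate their weights, and the sketch below shows only that idea. It assumes both checkpoints are available as dictionaries mapping parameter names to values with identical keys; `merge_state_dicts`, `alpha`, and the toy parameter names are hypothetical, and plain floats stand in for real tensors.

```python
def merge_state_dicts(a, b, alpha=0.5):
    """Linearly interpolate two weight dictionaries: (1 - alpha) * a + alpha * b.

    Hypothetical helper for illustration; values may be floats or any
    array type that supports scalar multiplication and addition.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must have identical parameter names")
    return {name: (1.0 - alpha) * a[name] + alpha * b[name] for name in a}


# Toy stand-ins for two checkpoints (not real model weights).
flux_dev = {"layer.weight": 1.0, "layer.bias": -2.0}
flux_krea = {"layer.weight": 3.0, "layer.bias": 0.0}

# alpha=0.5 gives an even blend of the two sets of weights.
merged = merge_state_dicts(flux_dev, flux_krea, alpha=0.5)
print(merged)
```

With `alpha=0.5` each merged value is the plain average of the two inputs; moving `alpha` toward 0 or 1 leans the result toward the first or second checkpoint.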