Show HN: Kiln - Interactive LLM fine-tuning, dataset collab & synthetic data gen
Hi HN!
Excited to share a project I’ve been working on called Kiln AI! It solves what I’ve always found to be the hardest part of building AI products: products simply don’t have datasets, and datasets can’t keep up with product evolution. Product goals evolve, are hard to define, and new bugs emerge all the time. Kiln helps you build ML models for your product, from a collaborative dataset. For a bit of context: I’ve been building consumer AI products for a decade at Apple, my own startup, and MSFT.
At its core Kiln is 3 things (so far):
- Really great collaboration: Kiln datasets are designed to be shared/versioned in Git. We make it easy for the full team (PM/QA/subject-experts) to contribute directly, via super intuitive apps. With Kiln, when anyone on the team adds bugs/evals/goals, they go right into the dataset to be picked up in the next build.
- Rapid fine-tuning: dispatch fine-tuning jobs for a range of top models (Llama 3.2, Mixtral, GPT 4o/4o-mini). It’s fine tuning in just a few clicks.
- Synthetic data generation: our interactive data gen helps create large enough datasets for fine-tuning and evals. Build an initial training dataset, or use it to build out enough examples of a bug for the model fix it. It uses large models and heavy prompts (COT, multi-shot), which allows you to fine-tune smaller and faster models.
The link here is to the new fine-tuning feature which just launched today. The demo shows starting a project from scratch, defining a task, generating synthetic training data, fine-tuning 9 models, and deploying them. It’s all very easy: 18 mins of active work, all in an intuitive UI. You can download the apps and follow the same process for your own product goals.
I’d love to chat with folks building AI products, and offer any help I can (tools and/or guidance). Fire me a message at steve@getkiln.ai if interested.
You can download our apps for Mac, Windows or Linux, or `pip install kiln_ai` for the library.
Github with docs, downloads, and guides: https://github.com/Kiln-AI/Kiln
Congrats on the launch!
I'm building in this space[1] and I'm intrigued. When I checked out the repo, this actually looked like possibly a really convenient way to fine-tune models, but I'm trying to understand the piece about "products simply don’t have datasets, and datasets can’t keep up with product evolution". What does this mean in practice and how does this relate to fine-tuning?