Ask HN: Algorithm for searching text with similar meaning
For example, I have a long text that says, "I have to change the screw number 1234," and a short one that says, "screw number 1234 changed."
The two were entered by different people but refer to the same thing.
I thought of using an LLM (GPT-4), but my dataset is too large (millions of entries) and that would be expensive.
Is there a better, or at least good enough, alternative?
Thank you.
These models are self-hosted and cheap to run. They are much smaller than GPT-3 or GPT-4 but trained specifically for this purpose.
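
A rough sketch of how the search step could look, assuming the models in question turn each text into a vector and that similar texts are found by cosine similarity (that is my assumption; the comment does not spell it out). The `embed` function below is a hypothetical toy stand-in that only counts words, so it measures word overlap rather than meaning; you would swap in the real model's encoding call. The names `cosine` and `top_matches` are illustrative too.

```python
import math
import re
from collections import Counter


def embed(text):
    # Hypothetical stand-in for a real embedding model:
    # a sparse vector of lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as Counters.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query, entries, k=1):
    # Score every entry against the query and return the k best (score, entry) pairs.
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(entry)), entry) for entry in entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


entries = [
    "I have to change the screw number 1234",
    "Replaced the filter on pump 7",
]
best_score, best_entry = top_matches("screw number 1234 changed", entries)[0]
print(best_entry)
```

With millions of entries you would probably compute each entry's vector once and store it, rather than re-embedding on every query as this sketch does, and use a nearest-neighbor index instead of scoring every entry.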