How do users search your website?
I'm working on some search tools to enhance my start-up's existing infrastructure. Users will be searching highly-structured data, and I would like to provide them something better than SQL's LIKE operator. I've looked into some options, like the Google Appliance, but many commercial tools are currently too costly for our bootstrapped company. It may be better, in the end, to build my own.
Are there good tools or resources that the YC community would recommend?
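To make the gap concrete, here is a rough sketch of what "better than LIKE" could mean for structured records: a toy inverted index keyed by field and token. Everything in it (the `FieldIndex` class, the field names, the sample records) is made up for illustration and is not how any of the tools mentioned in this thread work internally.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN.findall(text.lower())

class FieldIndex:
    """Toy inverted index (hypothetical): maps (field, token) -> set of record ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, record_id, record):
        # Index every field separately so queries can be scoped to one field.
        for field, value in record.items():
            for token in tokenize(str(value)):
                self.postings[(field, token)].add(record_id)

    def search(self, field, query):
        """Return ids of records whose `field` contains every query token."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self.postings.get((field, token), set())
            result = set(ids) if result is None else result & ids
        return result

# Made-up sample records, assumed for this sketch only.
index = FieldIndex()
index.add(1, {"title": "Red canvas tent", "category": "camping"})
index.add(2, {"title": "Tent pegs, red", "category": "camping"})
index.add(3, {"title": "Attention span", "category": "books"})

print(sorted(index.search("title", "red tent")))
```

Unlike `LIKE '%tent%'`, this matches whole words (so "Attention" is not a hit) and ignores word order (so "Tent pegs, red" is). Ranking, stemming, and persistence are exactly the "extra stuff" a real engine would have to supply.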
If you're looking to search an existing MySQL DB, I hear Sphinx is a good option. I personally use Ferret, but everyone seems to think it sucks horribly, so I'm sure you'll hear that from people. I have had no problems with Ferret at all, however. Both can be used from Ruby: Ferret is a Ruby library, while Sphinx is a standalone engine written in C++ that you reach through Ruby client plugins.
Solr is also an option, but it tends to be slower than Ferret and to require more resources.
Here is a recent post of mine that gives a quick comparison of Ferret and Solr. http://www.embought.com/blog/show/10?t=SOLR-vs-Ferret-as-a-S...
I've built custom search engines using Lucene before -- http://lucene.apache.org/java/docs/index.html -- problem is that you have to build a lot of extra stuff to make it work for your scenarios.
Wikipedia lists a ton of sites that rely on Lucene: http://en.wikipedia.org/wiki/Lucene
You might also want to look at Solr -- I haven't tried it, but it sounds like you have to make less special sauce to get it running. http://en.wikipedia.org/wiki/Solr
Ferret is actually a port of Lucene to Ruby.
If I were engineering something, I'd probably want to use Facebook Thrift to make my searches a service that can be called from anywhere -- just use the Java Thrift binding on top of a Lucene process, and then call into it from my web app.
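The service-boundary idea can be sketched without Thrift or Lucene at all. The example below is only a stand-in: it uses Python's standard-library HTTP server and JSON instead of Thrift, and a tiny in-memory dictionary instead of a Lucene index. The names `DOCS`, `search`, `handle_query`, and `serve`, and the `/search?q=` URL shape, are assumptions made up for this sketch.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory corpus standing in for a real search index.
DOCS = {1: "red canvas tent", 2: "blue sleeping bag", 3: "tent pegs"}

def search(query):
    """Return sorted ids of documents containing every whitespace-separated query term."""
    terms = query.lower().split()
    if not terms:
        return []
    return sorted(doc_id for doc_id, text in DOCS.items()
                  if all(term in text.split() for term in terms))

def handle_query(path):
    """Turn a request path like /search?q=tent into a JSON response body."""
    params = parse_qs(urlparse(path).query)
    query = params.get("q", [""])[0]
    return json.dumps({"query": query, "hits": search(query)})

class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The web app only ever sees this JSON contract, never the index itself.
        body = handle_query(self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8080):
    """Blocking call; run it in the search process, not in the web app."""
    HTTPServer(("127.0.0.1", port), SearchHandler).serve_forever()
```

Calling `serve()` in a separate process and requesting `/search?q=tent` from the web app gives the same separation: the index can be rebuilt, moved, or swapped for a different engine without touching the caller.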
I'd say it depends on how much data you have to search and how tightly the search is linked to your business requirements.
If you are looking for a casual search mechanism with a few thousand records to index, go for open source; if not, you may have to implement your own search engine. I am in the search engine business, though I can’t tell you much without violating my non-compete agreement.
Another option is to look at our hosted search solution, which is not yet in beta. What will interest you is the data search service, a completely customizable search solution. Check it out here: http://www.intelliverb.com/PESS/
Take a look at our open-source tool, http://hounder.org
Among other sites, it powers wordpress.com (over 3M weblogs indexed).
We are looking for feedback and feature suggestions. You can also join our discussion group: http://groups.google.com/group/hounder
I haven't played with it myself (though I intend to) but Lucene might be an option. http://lucene.apache.org/java/docs/index.html
It has also been ported to the Zend Framework, if PHP is your thing.
For a site search, see http://news.ycombinator.com/item?id=184707