I had never heard of it either until I needed to create an internal search engine where I work. After a few days of research, I found that Apache Solr/Lucene is often used for intranet search engines and for e-commerce sites.
I had never heard of it either until I needed to create an internal search engine where I work. After a few days of research, I found that Apache Solr/Lucene is often used for intranet search engines and for e-commerce sites.
We use it to parse and index OCRd PDFs for full-text searching. My advice: Don't use Solr. Don't use PCRd PDFs. Don't support full-text searching, because no one fucking uses it. We get thousands of searches against title, keywords, dates, and other meta shit every day in our internal application. The only full-text searches performed are by me when I'm testing shit.
What would you use as an alternative to Solr? One of the things I use it for is to index several internal wikis so that we can have a centralized search engine (also the default search engines suck). In this case, I need to index the content as well. The thing that gave me the most difficulty was tweaking the config so to get page rankings "just right".
Apache what? (Score:3)
It's so popular that I never heard of it before today.
Re: (Score:4, Informative)
Re: (Score:3)
I had never heard of it either until I needed to create an internal search engine where I work. After a few days of research, I found that Apache Solr/Lucene is often used for intranet search engines and for e-commerce sites.
We use it to parse and index OCRd PDFs for full-text searching.
My advice: Don't use Solr. Don't use PCRd PDFs. Don't support full-text searching, because no one fucking uses it. We get thousands of searches against title, keywords, dates, and other meta shit every day in our internal application. The only full-text searches performed are by me when I'm testing shit.
Re:Apache what? (Score:2)