Open Source Search Engine Software!
SeekQuarry is the parent site for Yioop. Yioop is a GPLv3, open source, PHP search engine.
What can Yioop do?
Yioop software provides many of the same features as larger search portals:
- Search Results. Yioop comes with a crawler which can be used to crawl the open web or a selection of URLs of your choice. It can also index popular archive formats such as Wikipedia XML dumps, arc, warc, and Open Directory Project RDF, as well as dumps of emails or databases. Once you have created Yioop indexes of your desired data sources, Yioop can serve as a search engine for your data. It supports "crawl mixes" of different data sources. Yioop also provides tools to classify and sculpt your data before it is used in search results.
- News Service. News is best when it is still fresh. Yioop has a news updater process that can be used to re-index RSS and Atom feeds on an hourly basis. This more timely information can then be incorporated into Yioop search results.
- Social Groups, Blogs, and Wikis. Yioop can be configured to allow users to create discussion groups, blogs, and wikis. If Yioop is configured to allow multiple users, then users can share mixes of crawls they create. Blogs and discussion groups can be made public or private. Public ones have public RSS feeds, and the better among these can be chosen for inclusion in what Yioop's news service indexes.
- Web Sites. Yioop provides a Model View Adapter framework which can be easily extended to build customized search portal websites. Yioop can also be integrated into existing sites to provide search functionality through an API, OpenSearch RSS, or JSON services.
The software and hardware requirements for Yioop are relatively low. At a minimum, you only need a web server such as Apache and PHP 5.3 or better. A test set-up consisting of three 2011 Mac Minis, each with 8GB RAM, running a single name server and five fetchers, can add 100 million pages to its index every four weeks.
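
To put that test set-up figure in perspective, the short Python sketch below converts 100 million pages per four weeks into an average crawl rate. It assumes the crawl runs continuously for 28 days and that the load is spread evenly across the five fetchers; neither assumption comes from the Yioop documentation, and the function name `crawl_rate` is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_rate(pages: int, days: int, fetchers: int) -> tuple[float, float]:
    """Return (overall pages/second, pages/second per fetcher).

    Assumes a continuous crawl and an even split across fetchers.
    """
    overall = pages / (days * SECONDS_PER_DAY)
    return overall, overall / fetchers


# Figures quoted for the three Mac Mini test set-up.
overall, per_fetcher = crawl_rate(pages=100_000_000, days=28, fetchers=5)
print(f"{overall:.1f} pages/s overall, {per_fetcher:.1f} pages/s per fetcher")
```

Under these assumptions the set-up averages roughly 41 pages per second overall, or about 8 pages per second per fetcher.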