About SeekQuarry/Yioop
SeekQuarry is the parent site for Yioop. Both SeekQuarry and Yioop were written mainly by me, Chris Pollett. The project began in November 2009 and had its first publicly available release in August 2010.
The Yioop and SeekQuarry Names
When looking for names for my search engine, I originally considered SeekQuarry, whose domain name hadn't been registered. After deciding to use Yioop as the name of my search engine site, I decided to use SeekQuarry as the site that publishes the software used in the Yioop engine. That is, yioop.com is a live site that demonstrates the open source search engine software distributed on the seekquarry.com site.
The name Yioop has the following history: I was looking for names that hadn't already been registered. My wife is Vietnamese, so I thought I might have better luck with Vietnamese words, since all the English ones seemed to have been taken. I started with the word giup, which is the way to spell 'help' in Vietnamese if you remove the accents. It was already taken. Then I tried yoop, which is my lame way of pronouncing how giup sounds in English. It was already taken. So then I combined the two to get Yioop.
Dictionary Data
Bloom filters for n-grams on the Yioop test site were generated using Wikimedia Page View Statistics. Tries for word suggestion for all languages other than Vietnamese were built using the Wiktionary Frequency Lists. These are available under a Creative Commons Share Alike 3.0 Unported License as described on Wikipedia's Download page. For a language with IANA tag locale-tag, the derived data files (if any were created for that language) can be found in the locale/locale-tag/resources folder of the Yioop project. These derived files are licensed under the same license. For Vietnamese, I used the Vietnamese Word List obtained with permission from Ho Ngoc Duc.
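As a rough illustration of how a word-suggestion trie can be built from a frequency list, here is a minimal Python sketch. It is not Yioop's code, and it does not reflect the format of the files in the resources folder. The names TrieNode, build_trie, and suggest, as well as the sample words and counts, are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Maps the next character to the child node (hypothetical layout).
    children: dict = field(default_factory=dict)
    # Greater than 0 only where a complete word ends.
    frequency: int = 0


def build_trie(frequency_list):
    """Build a trie from (word, frequency) pairs."""
    root = TrieNode()
    for word, frequency in frequency_list:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.frequency = frequency
    return root


def suggest(root, prefix, limit=3):
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    # Collect every complete word below the prefix node.
    matches = []
    stack = [(node, prefix)]
    while stack:
        current, word = stack.pop()
        if current.frequency > 0:
            matches.append((word, current.frequency))
        for ch, child in current.children.items():
            stack.append((child, word + ch))
    # Sort by descending frequency, breaking ties alphabetically.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in matches[:limit]]


# Made-up sample data standing in for a real frequency list.
SAMPLE = [("search", 900), ("seek", 400), ("seed", 650), ("quarry", 120)]
TRIE = build_trie(SAMPLE)
```

With this sample data, calling suggest(TRIE, "se") ranks the three words that share the prefix by their counts. A production version would likely store the trie in a compact serialized form rather than as in-memory objects.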
Additional Credits
Several people helped with localization: Mary Pollett, Jonathan Ben-David, Ismail.B, Andrea Brunetti, Thanh Bui, Sujata Dongre, Animesh Dutta, Aida Khosroshahi, Youn Kim, Akshat Kukreti, Vijeth Patil, Chao-Hsin Shih, Ahmed Kamel Taha, and Sugi Widjaja. Thanks to Ravi Dhillon, Akshat Kukreti, Tanmayee Potluri, Shawn Tice, and Sandhya Vissapragada for creating patches for Yioop issues.
Several of my master's students have done projects related to Yioop: Amith Chandranna, Priya Gangaraju, Vijaya Pamidi, Vijeth Patil, Vijaya Sinha, Tarun Pepira, Tanmayee Potluri, and Sandhya Vissapragada. Amith's code relates to an online version of the HITS algorithm. It is not currently in the main branch of Yioop, but it is obtainable from Amith Chandranna's student page. Vijaya Pamidi developed a Firefox web traffic extension for Yioop. Her code is also obtainable from Vijaya Pamidi's master's pages. Her project was later extended by Tarun Ramaswamy. Neither of these projects is currently in the main Yioop repository. Vijeth Patil's project added support for Twitter and RSS feeds, so that real-time search results could supplement the standard search results. This is not currently in the main repository. Tanmayee Potluri's project added log and database archive iterators for Yioop. It is currently not in the main branch. Vijaya Sinha's project concerned using OpenStreetMap data in Yioop. This code is also not currently in the main branch.
Priya Gangaraju's code served as the basis for the plugin feature currently in Yioop. Shawn Tice's CS288 project served as the basis of a rewrite of the archive crawl feature of Yioop for the multi-queue server setting. Sandhya Vissapragada's master's project served as the basis for the autosuggest and spell checking functionality in Yioop. The following other students have created text processors for Yioop: Nakul Natu (pptx), Vijeth Patil (epub), and Tarun Ramaswamy (xlsx).
Akshat Kukreti created the Italian language stemmer based on the Snowball version at http://tartarus.org/.
