Simple search engine

# Installing

It is recommended to use [virtualenv](https://virtualenv.pypa.io).

```
pip install -r requirements.txt
```

## Testing

If you just want to test and don't want to install a PostgreSQL database, but have Docker installed, just use the `docker-compose.yml` file.

This is for testing only: don't use the docker-compose file in production!

## Sphinx-search / Manticore-search

You can use [Sphinx-search](http://sphinxsearch.com/), but [Manticore-search](https://manticoresearch.com/) is recommended, since the latest version of Sphinx-search (3.x) is distributed as closed source instead of open source.

All explanations are for Manticore-search for the moment, but the term `sphinx` appears in many places in the code because Manticore-search aims to stay compatible with Sphinx-search.

# Configuration

## Database

This project uses a PostgreSQL database. You can update the login information in the `config.py` file.

## Sphinx-search / Manticore-search

The configuration is in the `sphinx_search.conf` file. To update this file, see the documentation of [Sphinx-search](http://sphinxsearch.com/docs/manual-2.3.2.html) or [Manticore-search](https://docs.manticoresearch.com).

Keep in mind that `config.py` must be kept up to date with `sphinx_search.conf`.

# Crawling

For now there is one example spider, for the neodarz website. To launch all the crawlers, use the following command:

```
python app.py crawl
```

# Indexing

Before launching the indexing or searching commands, you must verify that the folder of the `path` option exists on your system (Warning: the last word of the `path` option is the value of the `source` option; don't create this folder, only its parent folder).
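The folder rule described above can be sketched in Python. This is only an illustration, not part of the project: the helper name `ensure_index_parent` is hypothetical, and it assumes the `path` value is passed in by hand (for example the `/tmp/data/neodarznet` value used in this README) rather than read from `sphinx_search.conf`.

```python
from pathlib import Path


def ensure_index_parent(index_path: str) -> Path:
    """Create the parent folder of an index `path` value and return it.

    Hypothetical helper: only the parent folder is created. The last
    component of `index_path` is left untouched, as the warning requires.
    """
    parent = Path(index_path).parent
    # parents=True also creates missing intermediate folders;
    # exist_ok=True makes the call safe when the folder already exists.
    parent.mkdir(parents=True, exist_ok=True)
    return parent
```

Calling `ensure_index_parent("/tmp/data/neodarznet")` creates `/tmp/data` if it is missing and does not create `/tmp/data/neodarznet` itself.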
Example configuration for the `neodarznet` index:

```
index neodarznet {
    source = neodarznet
    path = /tmp/data/neodarznet
}
```

Here the folder is `/tmp/data/`.

The command for indexing is:

```
indexer --config sphinx_search.conf --all
```

Don't forget to launch the crawling command before this ;)

# Searching

Before you can search, you must launch the search server:

```
searchd -c sphinx_search.conf
```

## Enjoy

You can now launch the server!

```
python app.py
```

To start searching, send a `GET` request to the following address with your own values for `search` and `index` (without `<` and `>`):

```
127.0.0.1:5000/?search=&index=
```

Results are in JSON format.

If you want to know which websites are indexed, look in the file [sphinx_search.conf](https://git.khaganat.net/neodarz/khanindexer/blob/master/sphinx_search.conf) for all the lines that start with `index`.
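As an illustration of the `GET` request described above, here is a minimal Python sketch that uses only the standard library. The helper names `build_search_url` and `search` are hypothetical, and the shape of the returned JSON is not documented here, so the sketch simply returns whatever the server sends back.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address taken from this README; adjust it if the server runs elsewhere.
BASE_URL = "http://127.0.0.1:5000/"


def build_search_url(query: str, index: str, base_url: str = BASE_URL) -> str:
    """Return the GET URL with URL-encoded `search` and `index` parameters."""
    return base_url + "?" + urlencode({"search": query, "index": index})


def search(query: str, index: str, base_url: str = BASE_URL):
    """Send the GET request and decode the JSON response body.

    Requires `searchd` and `python app.py` to be running.
    """
    with urlopen(build_search_url(query, index, base_url)) as response:
        return json.load(response)
```

For example, `search("hello", "neodarznet")` queries the `neodarznet` index, assuming it has been crawled and indexed beforehand.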