.. _elasticsearch:

=============
Elasticsearch
=============

.. note::
    The following documentation is deprecated. The approved installation is :ref:`via Docker `.

Elasticsearch is a search server. Documents (key-values) get stored,
configurable queries come in, Elasticsearch scores these documents, and returns
the most relevant hits.

Also check out `elasticsearch-head `_, a plugin with a web front end to
Elasticsearch that can be easier than talking to Elasticsearch over curl, or
`Marvel `_, which includes a query editor with autocompletion.

Installation
------------

Elasticsearch is available from most package managers::

    brew install elasticsearch  # or whatever your package manager is called.

If Elasticsearch isn't packaged for your system, you can install it
manually; `here are some good instructions on how to do so `_.

On Ubuntu, you should just download and install a .deb from the
`download page `_.

Launching and Setting Up
------------------------

Launch the Elasticsearch service. If you used homebrew, ``brew info
elasticsearch`` will show you the commands to launch. If you used aptitude,
Elasticsearch will come with a start-stop daemon in ``/etc/init.d``.
On Ubuntu, if you have installed from a .deb, you can type::

    sudo service elasticsearch start

Olympia has commands that set up mappings and index objects such as add-ons
and apps for you. Setting up the mappings is analogous to defining the
structure of a table; indexing is analogous to storing rows.

For AMO, this will set up all indexes and start the indexing processes::

    ./manage.py reindex

Or you could use the makefile target::

    make reindex

If you need to add arguments::

    make ARGS='--with-stats --wipe --force' reindex

Indexing
--------

Olympia has other indexing commands. It is worth noting that the index is
maintained incrementally through ``post_save`` and ``post_delete`` hooks::

    ./manage.py cron reindex_addons        # Index all the add-ons.

    ./manage.py index_stats                # Index all the update and download counts.

    ./manage.py cron reindex_collections   # Index all the collections.

    ./manage.py cron reindex_users         # Index all the users.

    ./manage.py cron compatibility_report  # Set up the compatibility index.

    ./manage.py weekly_downloads           # Index weekly downloads.

Querying Elasticsearch in Django
--------------------------------

For now, we have our own query builder (which is a historical clone of
`elasticutils `_), but we will switch to the official one very soon.

We attach elasticutils to Django models with a mixin. This lets us call
``.search()``, which returns an object that acts a lot like a Django ORM
object manager. ``.filter(**kwargs)`` can be run on this search object::

    query_results = list(
        MyModel.search().filter(my_field=a_str.lower())
        .values_dict('that_field'))

Testing with Elasticsearch
--------------------------

All test cases using Elasticsearch should inherit from ``amo.tests.ESTestCase``.
All such tests are marked with the ``es_tests`` pytest_ marker. To run only
those tests::

    pytest -m es_tests

or ::

    make test_es

.. _pytest: http://pytest.org/

Troubleshooting
---------------

*I got a CircularReference error on .search()* - check that a whole object is
not being passed into the filters, but rather just a field's value.

*I indexed something into Elasticsearch, but my query returns nothing* - check
whether the query contains upper-case letters or hyphens. If so, try
lowercasing your query filter. For hyphens, set the field's mapping to not be
analyzed::

    'my_field': {'type': 'string', 'index': 'not_analyzed'}

Try running ``.values_dict`` on the query as mentioned above.
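
The upper-case and hyphen advice above follows from how an analyzed field is
stored: the value is broken into lowercased tokens at index time, while a
filter value is compared as-is. The sketch below is a rough illustration of
that behaviour in plain Python. It does not talk to Elasticsearch, and
``simple_analyze`` and ``term_filter_matches`` are hypothetical helpers that
only approximate what a real analyzer does.

```python
import re


def simple_analyze(text):
    """Hypothetical stand-in for an analyzer.

    Lowercases the text and splits it on non-word characters (so hyphens
    separate tokens). Real Elasticsearch analyzers are more involved.
    """
    return [token.lower() for token in re.findall(r"\w+", text)]


def term_filter_matches(stored_value, filter_value, analyzed=True):
    """Check an exact filter value against what the index would hold."""
    if analyzed:
        indexed_terms = simple_analyze(stored_value)
    else:
        # A not_analyzed field keeps the original value as a single term.
        indexed_terms = [stored_value]
    return filter_value in indexed_terms


# Analyzed field: the original casing and the hyphen are gone from the index.
print(term_filter_matches("My-Addon", "My-Addon"))                  # False
print(term_filter_matches("My-Addon", "my"))                        # True

# Not analyzed: the exact original value matches.
print(term_filter_matches("My-Addon", "My-Addon", analyzed=False))  # True
```

Under these assumptions, lowercasing the filter value helps when the field is
analyzed, and marking the field as ``not_analyzed`` helps when the value must
match exactly, hyphens included.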