Article catalogue
- 1, Principles of full-text retrieval and search engine
- 2, Introduction to Elasticsearch
- 3, Docker installation Elasticsearch
- 4, haystack extension indexing
- 5, Custom page access
1, Principles of full-text retrieval and search engine
1. Product search demand
When a user enters a product keyword in the search box, we return the products that match it.
2. Product search implementation
One option is a fuzzy query with the SQL LIKE keyword.
However, LIKE queries are very inefficient.
The search also has to cover several fields at once, which is awkward to express with LIKE.
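To make the drawback concrete, here is a minimal sketch of a LIKE-based search over two fields, using Python's built-in sqlite3 module. The sku table, its columns and the sample rows are assumptions made up for illustration; they are not the project's real schema.

```python
import sqlite3


def like_search(conn, keyword):
    """Return the names of SKUs whose name or caption contains the keyword."""
    # A pattern with a leading wildcard generally cannot use an ordinary
    # index, so the database ends up scanning every row.
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT name FROM sku WHERE name LIKE ? OR caption LIKE ? ORDER BY id",
        (pattern, pattern),
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical table and sample data, for illustration only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sku (id INTEGER PRIMARY KEY, name TEXT, caption TEXT)")
conn.executemany(
    "INSERT INTO sku (name, caption) VALUES (?, ?)",
    [
        ("Apple iPhone 8", "64GB smartphone"),
        ("Huawei P10", "Android smartphone"),
        ("Canvas bag", "Fits an iPhone and a tablet"),
    ],
)
print(like_search(conn, "iPhone"))  # ['Apple iPhone 8', 'Canvas bag']
```

Every extra field to search adds another OR ... LIKE clause, and each clause is matched against every row.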
3. Full text search scheme
We introduce the scheme of full-text retrieval to realize commodity search.
Full-text search means searching across any of the specified fields.
A full-text retrieval scheme has to be paired with a search engine.
4. Principle of search engine
Before it can serve full-text queries, the search engine preprocesses the data in the database and builds a separate index structure.
The index structure is similar to the lookup pages of the Xinhua Dictionary: it records the correspondence between keywords and entries, along with the location of each entry.
At query time, the search engine quickly looks up the keywords in the index and then finds where the matching data is actually stored.
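The idea can be sketched as a small inverted index in plain Python. This is only an illustration of the principle, not how Elasticsearch is implemented; the sample documents and the function names build_index and search are made up for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase keyword to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Hypothetical product names standing in for rows of the database
documents = {
    1: "Apple iPhone 8 smartphone",
    2: "Huawei P10 smartphone",
    3: "Apple MacBook Pro laptop",
}
index = build_index(documents)
print(search(index, "apple"))             # [1, 3]
print(search(index, "apple smartphone"))  # [1]
```

The lookup touches only the entries for the query words, so its cost does not grow with the number of rows the way a LIKE scan does.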
2, Introduction to Elasticsearch
Elasticsearch is the preferred search engine for full-text retrieval.
- Elasticsearch is an open source search engine implemented in Java.
- It can store, search and analyze massive amounts of data quickly. Wikipedia, Stack Overflow, GitHub and others all use it.
- Under the hood, Elasticsearch is built on the open source library Lucene. Lucene is only a library, though: to use it directly you have to write your own code against its interface, whereas Elasticsearch wraps it and can be used out of the box.
Word segmentation description
- Search engines need word segmentation when building indexes on data.
- Word segmentation means splitting a sentence into individual characters or words, which become the keywords of the sentence. For example: 我是中国人 ("I am Chinese").
- After word segmentation, 我, 是, 中, 国, 人, 中国 and so on can all be keywords of this sentence.
- Elasticsearch does not support Chinese word segmentation for indexing out of the box. It has to be combined with the elasticsearch-analysis-ik extension to handle Chinese word segmentation.
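As a rough illustration of what segmentation produces, the sketch below lists every run of one or two consecutive characters in the sentence 我是中国人 ("I am Chinese"). This naive n-gram approach is an assumption chosen for simplicity. A dictionary-based analyzer such as IK is far more selective, keeping real words like 中国 and discarding meaningless runs like 是中.

```python
def ngram_tokens(sentence, max_len=2):
    """Return every run of 1..max_len consecutive characters as a candidate keyword."""
    chars = [c for c in sentence if not c.isspace()]
    tokens = []
    for size in range(1, max_len + 1):
        for start in range(len(chars) - size + 1):
            tokens.append("".join(chars[start:start + size]))
    return tokens


print(ngram_tokens("我是中国人"))
# ['我', '是', '中', '国', '人', '我是', '是中', '中国', '国人']
```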
3, Docker installation Elasticsearch
Pull the image over the network:
docker image pull delron/elasticsearch-ik:2.4.6-1.0
Or use the image file pulled by yourself:
docker load -i elasticsearch-ik-2.4.6_docker.tar
Modify the Elasticsearch configuration file elasticsearch-2.4.6/config/elasticsearch.yml: on line 54, change the IP address to that of the local machine.
network.host: 127.0.0.1
Create and run the Docker container:
docker run -dti --network=host --name=elasticsearch -v /home/python/elasticsearch-2.4.6/config:/usr/share/elasticsearch/config delron/elasticsearch-ik:2.4.6-1.0
Once the container is up, the Elasticsearch service should be reachable on port 9200 of the configured address.
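One way to confirm that the service is answering is to request its root URL, which returns a small JSON document that includes the server version. The sketch below uses only the standard library. The helper names parse_version and elasticsearch_version are made up for this example, and the default URL assumes the network.host value configured above.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def parse_version(body):
    """Extract the version number from the JSON returned by the root endpoint."""
    info = json.loads(body)
    return info["version"]["number"]


def elasticsearch_version(url="http://127.0.0.1:9200/", timeout=3):
    """Return the server's version string, or None if it cannot be reached."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return parse_version(response.read())
    except (URLError, OSError, ValueError, KeyError):
        # Connection refused, timeout, or a body that is not the expected JSON
        return None


if __name__ == "__main__":
    print(elasticsearch_version() or "Elasticsearch is not reachable")
```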
4, haystack extension indexing
1. Haystack introduction and installation configuration
1.1 introduction to haystack
- Haystack is a framework for connecting Django to search engines; it acts as a bridge between the application and the search engine.
- In Django, we can call the Elasticsearch search engine through Haystack.
- Haystack can switch between search back ends (such as Elasticsearch, Whoosh, Solr, etc.) without changes to the application code.
1.2 Haystack installation
pip install django-haystack
pip install elasticsearch==2.4.6
1.3 Haystack registration application and routing
Register the application and add the Haystack settings to the project configuration:
INSTALLED_APPS = [
    'haystack',  # Full-text search
]

# Haystack
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        # IP address of the server running Elasticsearch; the port is fixed at 9200
        'URL': 'http://127.0.0.1:9200/',
        # Name of the index library created in Elasticsearch
        'INDEX_NAME': 'xxshopping',
    },
}

# Update the index automatically whenever data is added, modified or deleted
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
Create search_indexes.py in the directory of the goods application:
from haystack import indexes

from apps.goods.models import SKU


class SKUIndex(indexes.SearchIndex, indexes.Indexable):
    # Every SearchIndex needs exactly one field with document=True.
    # It tells Haystack and the search engine which field is the primary
    # field to search in. By convention this field is named text.
    # use_template=True lets us build the document that the search engine
    # indexes from a data template (rather than error-prone concatenation);
    # here the template covers name, caption and id.
    text = indexes.CharField(document=True, use_template=True)

    def get_model(self):
        # The model this index is built for
        return SKU

    def index_queryset(self, using=None):
        # The records to index: only launched SKUs
        return self.get_model().objects.filter(is_launched=True)
        # Alternative: index every SKU
        # return self.get_model().objects.all()


# An index for SPU would follow the same pattern:
# class SPUIndex(indexes.SearchIndex, indexes.Indexable):
#     text = indexes.CharField(document=True, use_template=True)
Create a new sku_text.txt file in the templates directory. Haystack looks for it at templates/search/indexes/goods/sku_text.txt:
{# Here we specify which fields of the model to index #}
{# object can be understood as an instance of SKU #}
{{ object.name }}
{{ object.caption }}
{{ object.id }}
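Conceptually, with use_template=True the text field of each record is the rendered template, that is, the name, caption and id of one SKU joined into a single block of text. The sketch below imitates this in plain Python. The SKU dataclass and build_document are hypothetical stand-ins for illustration, not part of Haystack or of the project.

```python
from dataclasses import dataclass


@dataclass
class SKU:
    """Hypothetical stand-in for the SKU model, with only the templated fields."""
    id: int
    name: str
    caption: str


def build_document(obj):
    """Join the templated fields into the text that would be indexed."""
    return "\n".join([obj.name, obj.caption, str(obj.id)])


sku = SKU(id=1, name="Apple iPhone 8", caption="64GB smartphone")
print(build_document(sku))
```

A query then matches this SKU if its keyword appears anywhere in that combined text, whichever field it came from.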
Add the route to the project's root URL configuration:
re_path(r'^search/', include('haystack.urls')),
Add the search results page template, search.html:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
    <title>Xiaoxu mall-Product search</title>
    <link rel="stylesheet" type="text/css" href="{{ static('css/jquery.pagination.css') }}">
    <link rel="stylesheet" type="text/css" href="{{ static('css/reset.css') }}">
    <link rel="stylesheet" type="text/css" href="{{ static('css/main.css') }}">
    <script type="text/javascript" src="{{ static('js/jquery-1.12.4.min.js') }}"></script>
    <script type="text/javascript" src="{{ static('js/vue-2.5.16.js') }}"></script>
    <script type="text/javascript" src="{{ static('js/axios-0.18.0.min.js') }}"></script>
</head>
<body>
    <div id="app">
        <div class="header_con">
            <div class="header" v-cloak>
                <div class="welcome fl">Welcome to Xiaoxu mall!</div>
                <div class="fr">
                    <div v-if="username" class="login_btn fl">
                        Welcome:<em>[[ username ]]</em>
                        <span>|</span>
                        <a href="#">Log out</a>
                    </div>
                    <div v-else class="login_btn fl">
                        <a href="#">Log in</a>
                        <span>|</span>
                        <a href="#">Register</a>
                    </div>
                    <div class="user_link fl">
                        <span>|</span>
                        <a href="#">User Center</a>
                        <span>|</span>
                        <a href="#">My shopping cart</a>
                        <span>|</span>
                        <a href="#">My orders</a>
                    </div>
                </div>
            </div>
        </div>
        <div class="search_bar clearfix">
            <a href="{{ url('contents:index') }}" class="logo fl"><img src="{{ static('images/logo.png') }}"></a>
            <div class="search_wrap fl">
                <form method="get" action="/search/" class="search_con">
                    <input type="text" class="input_text fl" name="q" placeholder="Search for products">
                    <input type="submit" class="input_btn fr" name="" value="search">
                </form>
                <ul class="search_suggest fl">
                    <li><a href="#">Sony mirrorless camera</a></li>
                    <li><a href="#">15 yuan discount</a></li>
                    <li><a href="#">Beauty care</a></li>
                    <li><a href="#">Buy 2, get 1 free</a></li>
                </ul>
            </div>
        </div>
        <div class="main_wrap clearfix">
            <div class=" clearfix">
                <ul class="goods_type_list clearfix">
                    {% for result in page %}
                    <li>
                        {# result.object is the SKU object #}
                        <a href="#"><img src="{{ result.object.default_image.url }}"></a>
                        <h4><a href="#">{{ result.object.name }}</a></h4>
                        <div class="operate">
                            <span class="price">¥{{ result.object.price }}</span>
                            <span>{{ result.object.comments }} reviews</span>
                        </div>
                    </li>
                    {% else %}
                    <p>The item you searched for was not found.</p>
                    {% endfor %}
                </ul>
                <div class="pagenation">
                    <div id="pagination" class="page"></div>
                </div>
            </div>
        </div>
        <div class="footer">
            <div class="foot_link">
                <a href="#">About us</a>
                <span>|</span>
                <a href="#">Contact us</a>
                <span>|</span>
                <a href="#">Careers</a>
                <span>|</span>
                <a href="#">Links</a>
            </div>
            <p>CopyRight © 2016 Xiao Xu All Rights Reserved</p>
            <p>Tel: 010-****888 Beijing ICP No. *******8</p>
        </div>
    </div>
    <script type="text/javascript" src="{{ static('js/common.js') }}"></script>
    <script type="text/javascript" src="{{ static('js/search.js') }}"></script>
    <script type="text/javascript" src="{{ static('js/jquery.pagination.min.js') }}"></script>
    <script type="text/javascript">
        $(function () {
            $('#pagination').pagination({
                currentPage: {{ page.number }},
                totalPage: {{ paginator.num_pages }},
                callback: function (current) {
                    // Example of the resulting URL: /search/?q=iphone&page=1
                    window.location.href = '/search/?q={{ query }}&page=' + current;
                }
            })
        });
    </script>
</body>
</html>
Finally, build the index data:
python manage.py rebuild_index
Enter y when prompted to confirm.
At this point, the data in our database has been indexed.
1.4 Testing
Request /search/?q=keyword in the browser, replacing keyword with the term to search for.
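When the test URL is built in code rather than typed by hand, the keyword should be percent-encoded so that spaces and non-ASCII characters survive. A minimal sketch with the standard library follows. build_search_url is a hypothetical helper, and the q and page parameter names follow the form and pagination script in the template above.

```python
from urllib.parse import urlencode


def build_search_url(keyword, page=1):
    """Return the search URL with the keyword safely percent-encoded."""
    return "/search/?" + urlencode({"q": keyword, "page": page})


print(build_search_url("iphone"))  # /search/?q=iphone&page=1
print(build_search_url("华为 手机", page=2))
```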
5, Custom page access
1. Create index class
2. Create serializer
3. Finally, create the indexed data
python manage.py rebuild_index
Enter y when prompted to confirm.
4. Create a view
5. Create a serializer for the index
6. Register in the routing of our application
The last step is to set up the front-end search HTML page and the JavaScript file it loads.