Elasticsearch -- REST-style operations on indexes, mappings, and documents

Posted by MikeUK on Mon, 08 Nov 2021 12:32:41 +0100

Elasticsearch provides a REST API that is accessed over HTTP, with JSON request and response bodies.

1. Index operations

Basic index operations:

PUT /<index name>       Create an index
DELETE /<index name>    Delete an index
DELETE /*               Delete all indexes
GET /<index name>       View information about the specified index
GET /_cat/indices?v     View information about all indexes; v adds a header row to the output

Create index

Command: PUT /<index name>. Index names must not contain uppercase letters.

You can also specify configuration settings when creating an index.
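
As an illustration, the request below creates an index with explicit settings. It is only a minimal sketch: the index name my_index and the shard and replica counts are assumptions chosen for the example, not values taken from the original setup.

```console
# Hypothetical index name and illustrative settings
PUT /my_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}
```

Settings that are not listed keep their defaults, so the body can be left out entirely when the defaults are acceptable.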

View all index information

Command: GET /_cat/indices?v

To view the information of one specific index, use GET /<index name>.

Delete index

To delete a single index: DELETE /<index name>

To delete all indexes: DELETE /*

2. Mapping operations

A mapping is the schema of the documents stored in an index. It defines the data type of each field.

Field data types:

Core types

  • String types: text, keyword (string in versions before 5.0)
  • Integer types: integer, long, short, byte
  • Floating-point types: double, float, half_float, scaled_float
  • Boolean type: boolean
  • Date type: date
  • Range types: integer_range, long_range, float_range, double_range, date_range
  • Binary type: binary

Composite types

  • Array type: array (there is no dedicated array type; any field can hold several values of the same type)
  • Object type: object
  • Nested type: nested
  • Geo-point type (geographic coordinates): geo_point
  • Geo-shape type: geo_shape

Special types

  • IP type: ip
  • Completion (auto-complete suggestion) type: completion
  • Token count type: token_count
  • Attachment type: attachment
  • Percolator type: percolator

Create mapping

The mapping is specified in the request body when the index is created.
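
A minimal sketch of such a request, assuming the ems index does not exist yet and using the typeless mapping format of Elasticsearch 7.x (the same format as the testdb example later in this article). The fields name and age are hypothetical; substitute the fields your documents actually have.

```console
# Hypothetical fields: name (full-text) and age (numeric)
PUT /ems
{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "age":  { "type": "integer" }
    }
  }
}
```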

After creation, view the ems index with the GET /ems command; the response contains both the mapping and the settings.

To view only the mapping, use the GET /ems/_mapping command.

If we do not specify a mapping, Elasticsearch infers the field types automatically.
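
The automatic type inference can be observed with a short experiment. This is a sketch under assumptions: dynamic_demo is a hypothetical index that does not exist yet, and the cluster uses the default dynamic mapping rules.

```console
# Index a document into an index that has no mapping yet
PUT /dynamic_demo/_doc/1
{
  "name": "test",
  "age": 3
}

# Inspect the mapping that Elasticsearch generated
GET /dynamic_demo/_mapping
```

With the default rules, a JSON string such as name is mapped as text with a keyword sub-field, and a whole number such as age is mapped as long. This is also why a number sent in quotes, as in the test data later in this article, ends up as text with an age.keyword sub-field.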

3. Document operations

3.1 Basic document operations

The Elasticsearch REST API provides operations for creating, deleting, modifying, and querying documents. A request sent to an index adds or updates a JSON document in that index, using the mapping of the index.

Method   URL                                                            Description
PUT      localhost:9200/<index name>/<type name>/<document id>          Create a document (with the specified document id)
POST     localhost:9200/<index name>/<type name>                        Create a document (with a random document id)
POST     localhost:9200/<index name>/<type name>/<document id>/_update  Modify a document
DELETE   localhost:9200/<index name>/<type name>/<document id>          Delete a document
GET      localhost:9200/<index name>/<type name>/<document id>          Query a document by document id
POST     localhost:9200/<index name>/<type name>/_search                Query all data

Create document (specify document id)

Start Elasticsearch, elasticsearch-head, and Kibana.

1. Create a document with the document id 1
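
The request could look like the following sketch. It assumes the ems index and uses _doc as the type name, which is the convention in Elasticsearch 7.x; the field names and values are made up for illustration.

```console
# Create (or overwrite) the document with id 1; field values are hypothetical
PUT /ems/_doc/1
{
  "name": "Wanli",
  "age": 18
}
```

The response reports "result": "created" for a new document and "result": "updated" when a document with that id already existed.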

Automatic index creation: when a request adds a JSON document to an index that does not exist yet, Elasticsearch automatically creates the index and a mapping derived from that document.

Versioning: by default Elasticsearch uses internal versioning. The version starts at 1 and is incremented on every update, including deletes.

Versioning is applied in real time and is not affected by the near-real-time behavior of search.

Operation type: the op_type parameter forces a create operation, which fails if the document already exists and so helps to avoid overwriting existing documents.
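
A short sketch of the operation type in use, again assuming the ems index, the _doc type of Elasticsearch 7.x, and hypothetical field values:

```console
# Succeeds only if no document with id 1 exists yet
PUT /ems/_doc/1?op_type=create
{
  "name": "Wanli",
  "age": 18
}

# Equivalent shorthand for the same create-only operation
PUT /ems/_create/1
{
  "name": "Wanli",
  "age": 18
}
```

If a document with id 1 is already present, either request is rejected with a version conflict (HTTP 409) instead of overwriting it.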

2. View created documents and indexes

Create document (random document id)

Elasticsearch automatically generates an id for the document when no id is specified in the index operation.
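
As a sketch, leaving the id out and using POST lets Elasticsearch pick the id. The ems index, the _doc type, and the field values are assumptions for the example.

```console
# No id in the path: Elasticsearch generates one and returns it in the "_id" field of the response
POST /ems/_doc
{
  "name": "Wanli",
  "age": 18
}
```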

View document data

The get API retrieves a JSON document by sending a GET request for a specific document.

Command: GET /<index name>/<type name>/<document id>

This operation is real-time and is not affected by the index refresh rate.

You can also specify a version; Elasticsearch then returns the document only if its current version matches the one given.

In older versions that support several types per index, you can also specify _all as the type in the request, so that Elasticsearch searches for the document id in every type and returns the first matching document.

You can also request only specific fields of that particular document.
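
The variants above can be sketched as follows, assuming a document with id 1 in the ems index (7.x _doc type) that has a hypothetical name field:

```console
# Fetch the whole document
GET /ems/_doc/1

# Return the document only if its current version is 1
GET /ems/_doc/1?version=1

# Return only the name field of the document
GET /ems/_doc/1?_source=name
```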

Query documents by condition:

For example, to find the documents of type type1 in the text1 index whose name contains "Wanli":

Command: GET /text1/type1/_search?q=name:Wanli

Delete document

You can delete a specified index, mapping, or document by sending an HTTP DELETE request to Elasticsearch (deleting a mapping on its own is only possible in early versions).

You can see that the document with id 1 has been deleted.
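
A sketch of the delete request and a follow-up check, assuming the document lives in the ems index under the 7.x _doc type:

```console
# Delete the document with id 1
DELETE /ems/_doc/1

# Verify: the response should now contain "found": false
GET /ems/_doc/1
```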

Modify document

First create a document in the ems index.

Modification method 1:

Change the values directly in the create-document command and run it again. This overwrites the whole document.
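
For example, assuming document 1 was created with the hypothetical fields name and age, re-running the create command with a changed age replaces the stored document with exactly this body:

```console
# Same command as for creation, with age changed; any field left out here would be lost
PUT /ems/_doc/1
{
  "name": "Wanli",
  "age": 20
}
```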

Modification method 2 (recommended):

Use a POST request to the _update endpoint to modify the document while retaining the original data. Elasticsearch first reads the original document and then replaces its fields with the new values given in doc; fields that are not listed in doc are kept.
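
A minimal sketch of such a partial update, assuming the ems index and a hypothetical age field. The path shown is the typeless form of Elasticsearch 7.x; the older form with a type is POST /ems/_doc/1/_update.

```console
# Only age changes; all other fields of document 1 are kept
POST /ems/_update/1
{
  "doc": {
    "age": 20
  }
}
```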

After modification, view the modified document information.

Use a POST request without _update to modify the document without retaining the original data.

After this modification, view the document again: you can see that only the name field is left in the document.
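
The kind of request that leads to this result could look like the sketch below. It assumes the ems index with the 7.x _doc type, and the value of name is made up.

```console
# The body replaces the whole document, so only name remains afterwards
POST /ems/_doc/1
{
  "name": "Wanli"
}

# Check the stored document
GET /ems/_doc/1
```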

3.2. Complex document queries

3.2.1 Environment setup

Add four documents:

PUT /wanli/user/1
{
  "name":"Wan Li Gu Yicheng",
  "age":"3",
  "desc":"senior Java Development Engineer",
  "tags":["technical nerd","Delve into","motion"]
}


PUT /wanli/user/2
{
  "name":"a great distance",
  "age":"18",
  "desc":"primary Java Development Engineer",
  "tags":["technical nerd","Straight man","motion"]
}


PUT /wanli/user/3
{
  "name":"gods",
  "age":"6",
  "desc":"ELK engineer",
  "tags":["game","music","Shy"]
}


PUT /wanli/user/4
{
  "name":"Tang Jing Fang",
  "age":"12",
  "desc":"Database Administrator",
  "tags":["honest","Steadfast","enthusiastic"]
}

3.2.2. Search on a specified field

Request statement:

#In the user type of the wanli index, query the documents whose name matches "a great distance"
GET /wanli/user/_search
{
  "query":{
    "match": {
      "name": "a great distance"
    }
  }
}

Return result:

{
  "took" : 23,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2, #Total number of documents matched
      "relation" : "eq"
    },
    "max_score" : 1.605183,  #Highest score among the matched documents; the best match is listed first
    "hits" : [
      {
        "_index" : "wanli", #Index
        "_type" : "user", #Type
        "_id" : "2", #Document id
        "_score" : 1.605183,  #Relevance score
        "_source" : {   #Document data
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        }
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0892314,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "age" : "3",
          "desc" : "senior Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Delve into",
            "motion"
          ]
        }
      }
    ]
  }
}

Return result description:

  • took: how many milliseconds it took to execute the entire search request.

  • timed_out: whether the query timed out.

  • _shards: the total number of shards involved in the query, and how many of them succeeded, were skipped, or failed.

  • max_score: the highest _score among the documents matching the query.

  • hits: the most important part of the returned result. Its total field gives the total number of documents matched.

  • hits array: the inner hits array contains the first ten documents of the query results by default. Each result contains the document's _index, _type, and _id, plus _source (the document data). This means that we can use the whole document directly from the returned search results.

  • _score: each result also has a _score, which measures how well the document matches the query. The higher the score, the better the match, and the earlier the document is listed.

Return only the specified fields of _source (result filtering):

Request statement:

GET /wanli/user/_search
{
  "query":{
    "match": {
      "name": "a great distance"
    }
  },
  "_source":["name","desc"]
}

Return result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.605183,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.605183,
        "_source" : {
          "name" : "a great distance",
          "desc" : "primary Java Development Engineer"
        }
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0892314,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "desc" : "senior Java Development Engineer"
        }
      }
    ]
  }
}
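
Besides a plain list of field names, _source also accepts an object with includes and excludes. The following sketch reuses the wanli test data from above; which fields to include or exclude is only an example.

```console
GET /wanli/user/_search
{
  "query": {
    "match": { "name": "a great distance" }
  },
  "_source": {
    "includes": ["name", "desc"],
    "excludes": ["tags"]
  }
}
```

Setting "_source": false instead suppresses the document body altogether, which can be useful when only the ids and scores are needed.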

3.2.3 Sorting

Request statement:

#In the user type of the wanli index, query the documents whose name matches "a great distance" and sort them by age
GET /wanli/user/_search
{
  "query":{
    "match": {
      "name":"a great distance"
    }
  },
  "sort": [
    {
      "age.keyword": {
        "order":"desc"
      }
    }
  ]
}

Return result: when results are sorted by a field, scores are not computed, so _score is null. Because age was indexed as a string, age.keyword is sorted as text, which is why "3" comes before "18" in descending order.

{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "age" : "3",
          "desc" : "senior Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Delve into",
            "motion"
          ]
        },
        "sort" : [
          "3"
        ]
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        },
        "sort" : [
          "18"
        ]
      }
    ]
  }
}
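
If numeric ordering is wanted, one option is to map age as a number before indexing. The following is a minimal sketch under assumptions: wanli_v2 is a hypothetical new index, types are omitted as in the testdb example later in this article, and age is sent as a JSON number rather than a string.

```console
# Hypothetical index with age mapped as a number
PUT /wanli_v2
{
  "mappings": {
    "properties": {
      "age": { "type": "integer" }
    }
  }
}

PUT /wanli_v2/_doc/1
{ "name": "Wan Li Gu Yicheng", "age": 3 }

PUT /wanli_v2/_doc/2
{ "name": "a great distance", "age": 18 }

# Sort numerically on age; no .keyword sub-field is needed
GET /wanli_v2/_search
{
  "sort": [
    { "age": { "order": "desc" } }
  ]
}
```

With this mapping the comparison is numeric, so the document with age 18 is listed before the one with age 3.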

3.2.4 Paging

Request statement:

#In the user type of the wanli index, query the documents whose name matches "a great distance", sort them by age, and page the results, one document per page
GET /wanli/user/_search
{
  "query":{
    "match": {
      "name":"a great distance"
    }
  },
  "sort": [
    {
      "age.keyword": {
        "order":"desc"
      }
    }
  ],
  "from": 0, #Offset of the first result to return (0 is the first match)
  "size": 1  #Number of results per page
}

Return result:

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "age" : "3",
          "desc" : "senior Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Delve into",
            "motion"
          ]
        },
        "sort" : [
          "3"
        ]
      }
    ]
  }
}
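
Further pages are requested by moving the offset: for page number p (counting from 1), from is (p - 1) * size. As a sketch, the second page of the same query would be:

```console
# Page 2 with one result per page: from = (2 - 1) * 1 = 1
GET /wanli/user/_search
{
  "query": {
    "match": { "name": "a great distance" }
  },
  "sort": [
    { "age.keyword": { "order": "desc" } }
  ],
  "from": 1,
  "size": 1
}
```

By default, from + size may not exceed 10000 (the index.max_result_window setting), so this style of paging is meant for the first pages of a result rather than for deep scrolling.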

3.2.5 bool (Boolean) query

must: all conditions must be met (AND)

In the user type of the wanli index, query the documents whose name matches "a great distance" and whose age is 18:

GET /wanli/user/_search
{
  "query":{
    "bool": {
      "must": [  #must: all conditions must be met (AND)
        {
          "match": {
            "name": "a great distance"
          }
        },
        {
          "match": {
            "age": 18
          }
        }
      ]
    }
  }
}

Return result:

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 2.809156,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : 2.809156,
        "_source" : {
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        }
      }
    ]
  }
}

should: a document is returned as long as it meets at least one of the conditions (OR)

In the user type of the wanli index, query the documents whose name matches "a great distance" or whose age is 3:

GET /wanli/user/_search
{
  "query":{
    "bool": {
      "should": [
        {
          "match": {
            "name": "a great distance"
          }
        },
        {
          "match": {
            "age": 3
          }
        }
      ]
    }
  }
}

Return result:

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.2932043,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : 2.2932043,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "age" : "3",
          "desc" : "senior Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Delve into",
            "motion"
          ]
        }
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.605183,
        "_source" : {
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        }
      }
    ]
  }
}

must_not: returns the documents that do not meet the conditions (NOT)

In the user type of the wanli index, query the documents whose name does not match "a great distance":

GET /wanli/user/_search
{
  "query":{
    "bool": {
      "must_not": [
        {
          "match": {
            "name": "a great distance"
          }
        }
      ]
    }
  }
}

Return result:

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "3",
        "_score" : 0.0,
        "_source" : {
          "name" : "gods",
          "age" : "6",
          "desc" : "ELK engineer",
          "tags" : [
            "game",
            "music",
            "Shy"
          ]
        }
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "4",
        "_score" : 0.0,
        "_source" : {
          "name" : "Tang Jing Fang",
          "age" : "12",
          "desc" : "Database Administrator",
          "tags" : [
            "honest",
            "Steadfast",
            "enthusiastic"
          ]
        }
      }
    ]
  }
}
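
The three clause types can also be combined in a single bool query. The following is a sketch against the same wanli test data; the particular conditions are chosen only for illustration.

```console
GET /wanli/user/_search
{
  "query": {
    "bool": {
      "must":     [ { "match": { "name": "a great distance" } } ],
      "must_not": [ { "match": { "age": 3 } } ],
      "should":   [ { "match": { "tags": "motion" } } ]
    }
  }
}
```

Here must and must_not decide which documents are returned. Because a must clause is present, the should clause is optional by default: it does not exclude anything and only raises the score of documents that also match it.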

3.2.6. Range query

In the user type of the wanli index, query the documents whose name is exactly "a great distance" and whose age is greater than 10. Note that age.keyword holds strings, so the range is evaluated as a string comparison rather than a numeric one.

GET /wanli/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name.keyword": "a great distance"
          }
        }
      ],
      "filter": {
        "range": {
          "age.keyword": {
            "gt": 10
          }
        }
      }
    }
  }
}

  • gte: greater than or equal to

  • gt: greater than

  • lte: less than or equal to

  • lt: less than

Return result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.2039728,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.2039728,
        "_source" : {
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        }
      }
    ]
  }
}
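
The operators can be combined to express a closed interval. The sketch below assumes a hypothetical index wanli_v2 in which age is mapped as integer, so that the comparison is numeric; with the string-typed age of the wanli index the bounds would be compared as text.

```console
# Documents with 10 <= age <= 20, assuming age is a numeric field
GET /wanli_v2/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "age": { "gte": 10, "lte": 20 }
        }
      }
    }
  }
}
```

Placing the range under filter means it only includes or excludes documents and does not contribute to the score.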

3.2.7. Match on multiple terms

Request statement:

#Query the documents whose tags match "Male" or "Technology"
GET /wanli/user/_search
{
  "query" :{
    "match":{
      "tags": "Male Technology"  #Search terms are separated by a space
    }
  }
}

Return result: the more of the search terms a document matches, the higher its score.

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.511242,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : 2.511242,
        "_source" : {
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        }
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.3440006,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "age" : "3",
          "desc" : "senior Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Delve into",
            "motion"
          ]
        }
      }
    ]
  }
}
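
By default the terms of a match query are combined with OR, which is why a document that matches only one of them is still returned. If every term has to match, the operator can be set explicitly, as in this sketch on the same data:

```console
# Only documents whose tags contain both terms are returned
GET /wanli/user/_search
{
  "query": {
    "match": {
      "tags": {
        "query": "Male Technology",
        "operator": "and"
      }
    }
  }
}
```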

3.2.8. Exact match query

term: performs an exact match. The search term is not analyzed (split into words) before the search.

term matches only the complete search term, exactly as given.

Create a mapping:

PUT testdb
{
  "mappings": {
    "properties": {
      "name":{
        "type": "text"
      },
      "desc":{
        "type": "keyword"
      }
    }
  }
}

In the above mapping, the name field is of type text and the desc field is of type keyword.

A text field is processed by the analyzer (tokenizer) when it is indexed; a keyword field is not.
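
The difference can be made visible with the _analyze API. This is a sketch that assumes the default setup, in which text fields use the standard analyzer; the keyword analyzer is used here only to imitate how a keyword field treats its value.

```console
# How a text field sees the value: split into lowercase tokens (wan, li, gu, yicheng)
GET /_analyze
{
  "analyzer": "standard",
  "text": "Wan Li Gu Yicheng"
}

# How a keyword field sees the value: one single, unchanged token
GET /_analyze
{
  "analyzer": "keyword",
  "text": "When that day comes"
}
```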

Add document data:

PUT testdb/_doc/1
{
  "name":"Wan Li Gu Yicheng",
  "desc":"When that day comes"
}

PUT testdb/_doc/2
{
  "name":"Wan Li Gu Yicheng",
  "desc":"When that day really comes"
}

Use term to query the name field:

GET testdb/_search
{
  "query": {
    "term": {
      "name": "wan"
    }
  }
}

Return result: two results are returned. Because the name field is of type text, the analyzer splits its value into separate lowercase tokens at index time, and the term query returns every document whose name contains the token "wan".

{
  "took" : 1086,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.18232156,
    "hits" : [
      {
        "_index" : "testdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "desc" : "When that day comes"
        }
      },
      {
        "_index" : "testdb",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "desc" : "When that day really comes"
        }
      }
    ]
  }
}

Use term to query the desc field:

GET testdb/_search
{
  "query": {
    "term": {
        "desc": "When that day comes"
    }
  }
}

Return result: only one result is returned. Because the desc field is of type keyword, its value is not analyzed, so term matches only the documents whose desc is exactly the search term.

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "testdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6931471,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "desc" : "When that day comes"
        }
      }
    ]
  }
}

The difference between term and match

match first analyzes the search text, splitting it into terms, and then searches with those terms, while term searches for the given value directly without analyzing it. Generally, match is used for full-text search, while term is used for exact-value search.

Exact match query with multiple values

GET testdb/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "name": "wan"
          }
        },
        {
          "term": {
            "name": "li"
          }
        }
      ]
    }
  }
}
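
For several exact values on the same field, the terms query is a more compact alternative to a list of term clauses. A sketch on the keyword field desc of the testdb index:

```console
# Matches documents whose desc equals any one of the listed values
GET testdb/_search
{
  "query": {
    "terms": {
      "desc": ["When that day comes", "When that day really comes"]
    }
  }
}
```

Against the two test documents above, both values occur, so both documents should be returned.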

3.2.9 Highlight query

Default highlight

Query request statement: use the highlight attribute to highlight the matched text in the results. Add the names of the fields to highlight under fields, and Elasticsearch highlights the matches automatically.

GET wanli/user/_search
{
  "query":{
    "match": {
      "name": "a great distance"
    }
  },
  "highlight":{
    "fields": {
      "name": {}
    }
  }
}

Return result:

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 61,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.605183,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.605183,
        "_source" : {
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        },
        "highlight" : {
          "name" : [
            "<em>ten thousand</em><em>in</em>"
          ]
        }
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0892314,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "age" : "3",
          "desc" : "senior Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Delve into",
            "motion"
          ]
        },
        "highlight" : {
          "name" : [
            "<em>ten thousand</em><em>in</em>Gu Yicheng"
          ]
        }
      }
    ]
  }
}

Elasticsearch automatically wraps the matched terms in <em> tags so that they can be rendered as highlighted text on a page.

Custom highlighting

If we do not want the em tag and would rather use, for example, a p tag, we can customize the tags.

Query request statement: pre_tags defines the opening part of the custom tag; attributes and styles can also be added here. post_tags defines the closing part, so that together they form a complete tag. The fields to highlight are still listed under fields.

GET wanli/user/_search
{
  "query":{
    "match": {
      "name": "a great distance"
    }
  },
  "highlight":{
    "pre_tags": "<p class='key',style:'color=red'>",
    "post_tags": "</p>",
    "fields": {
      "name": {}
    }
  }
}

Return result:

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.605183,
    "hits" : [
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "2",
        "_score" : 1.605183,
        "_source" : {
          "name" : "a great distance",
          "age" : "18",
          "desc" : "primary Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Straight man",
            "motion"
          ]
        },
        "highlight" : {
          "name" : [
            "<p class='key',style:'color=red'>ten thousand</p><p class='key',style:'color=red'>in</p>"
          ]
        }
      },
      {
        "_index" : "wanli",
        "_type" : "user",
        "_id" : "1",
        "_score" : 1.0892314,
        "_source" : {
          "name" : "Wan Li Gu Yicheng",
          "age" : "3",
          "desc" : "senior Java Development Engineer",
          "tags" : [
            "technical nerd",
            "Delve into",
            "motion"
          ]
        },
        "highlight" : {
          "name" : [
            "<p class='key',style:'color=red'>ten thousand</p><p class='key',style:'color=red'>in</p>Gu Yicheng"
          ]
        }
      }
    ]
  }
}
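
Elasticsearch inserts the tag text exactly as given and does not check it, so the attributes have to be valid HTML for a browser to apply them. The comma and the style:'color=red' notation used above are not valid attribute syntax. A sketch with well-formed attributes follows; the span element, the class name, and the color are arbitrary choices for the example.

```console
GET wanli/user/_search
{
  "query": {
    "match": { "name": "a great distance" }
  },
  "highlight": {
    "pre_tags": ["<span class='key' style='color:red'>"],
    "post_tags": ["</span>"],
    "fields": {
      "name": {}
    }
  }
}
```

pre_tags and post_tags are written as arrays here; a single string, as in the request above, is accepted as well.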

Topics: ElasticSearch RESTful search engine