ES Introduction Trilogy: Index Operations, Mapping Operations, Document Operations

Posted by EmperorDoom on Tue, 01 Feb 2022 16:48:09 +0100

1, Index operation

1. Create an index

# Syntax
PUT /{index}
{
  "settings": {
    "setting_name": "setting_value"
  }
}
# Example
PUT /es_index

Note: settings holds the index-level configuration and can define various properties. It can usually be omitted, in which case the defaults apply.
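
To show how a console line such as PUT /es_index maps onto plain HTTP, here is a minimal sketch using only Python's standard library. The address http://localhost:9200 and the number_of_shards value are assumptions for illustration, and the request is only built, not sent.

```python
import json
import urllib.request

# Assumption: a local single-node cluster; adjust for your environment.
ES_URL = "http://localhost:9200"


def create_index_request(index, settings=None):
    """Build (but do not send) the HTTP request for `PUT /{index}`."""
    # With no settings the body is empty and the defaults apply.
    body = {"settings": settings} if settings else {}
    return urllib.request.Request(
        url=f"{ES_URL}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = create_index_request("es_index", {"number_of_shards": 1})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect exactly what would go over the wire before touching a real cluster.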

2. Check whether an index exists

# Syntax
HEAD /{index}

# Example
HEAD /es_index

3. View an index

# Syntax
GET /{index}

# Example
GET /es_index

# View several indexes at once
GET /{index1},{index2},{index3},...

# View all indexes
GET /_all
# or
GET /_cat/indices?v

4. Open an index

# Syntax
POST /{index}/_open

# Example
POST /es_index/_open

5. Close an index

# Syntax
POST /{index}/_close

# Example
POST /es_index/_close

6. Delete an index

# Syntax
DELETE /{index1},{index2},{index3},...
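
The index-level operations in this part differ only in HTTP method and path suffix. The sketch below collects them in one hypothetical helper, index_request; the function name and the index names index_a and index_b are illustrative, not part of any client library.

```python
def index_request(action, *indexes):
    """Return the (HTTP method, path) pair for an index-level action."""
    # Several index names are sent as one comma-separated path segment.
    names = ",".join(indexes)
    routes = {
        "exists": ("HEAD", f"/{names}"),
        "view": ("GET", f"/{names}"),
        "open": ("POST", f"/{names}/_open"),
        "close": ("POST", f"/{names}/_close"),
        "delete": ("DELETE", f"/{names}"),
    }
    return routes[action]


print(index_request("close", "es_index"))
print(index_request("delete", "index_a", "index_b"))
```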

2, Mapping operation

Once an index has been created, it plays the role of a database in a relational database. Elasticsearch 7.x removes the index type setting: a type can no longer be specified and the default is _doc. Fields still exist, however, and we need to define their constraints, which is called field mapping

Field constraints include but are not limited to:

  • The data type of the field
  • Whether to store it
  • Whether to index it
  • The analyzer (tokenizer)

1. Create field mappings

Syntax

PUT /{index}/_mapping
{
  "properties": {
    "field_name": {
      "type": "type",
      "index": true,
      "store": true,
      "analyzer": "analyzer"
    }
  }
}

field_name: any field name you choose. Each field can specify several attributes, for example:

  • type: the data type, which can be text, long, short, date, integer, object, etc.
  • index: whether to index the field. The default value is true
  • store: whether to store the field separately. The default value is false
  • analyzer: specifies the analyzer (tokenizer)

Example

PUT /es_index/_mapping
{
  "properties": {
    "name": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "job": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "logo": {
      "type": "keyword",
      "index": false
    },
    "payment": {
      "type": "float"
    }
  }
}
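
When mappings are generated from code, a small builder helps keep only the attributes that differ from their defaults. This is a sketch with a hypothetical helper named field; it reproduces the es_index example and assumes the defaults stated earlier (index true, store false).

```python
import json


def field(type_, index=True, store=False, analyzer=None):
    """Describe one field, omitting attributes left at their defaults."""
    spec = {"type": type_}
    if not index:
        spec["index"] = False  # default is true
    if store:
        spec["store"] = True  # default is false
    if analyzer is not None:
        spec["analyzer"] = analyzer
    return spec


mapping = {
    "properties": {
        "name": field("text", analyzer="ik_max_word"),
        "job": field("text", analyzer="ik_max_word"),
        "logo": field("keyword", index=False),
        "payment": field("float"),
    }
}
# The printed JSON can be used as the body of PUT /es_index/_mapping.
print(json.dumps(mapping, indent=2))
```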

2. Mapping attributes in detail

1.type

The key points:

  • String: there are two types

    • text: analyzed (split into terms); not used for aggregations
    • keyword: not analyzed. The value is matched as a whole and can be used for aggregations
  • Numerical: numeric types, in two categories

    • Basic data types: long, integer, short, byte, double, float, half_float
    • High-precision floating-point type: scaled_float
      • You must specify a scaling factor (scaling_factor), such as 10 or 100. Elasticsearch multiplies the real value by this factor, stores the result, and divides it back out when the value is read.
  • Date: date type

    Elasticsearch can store a date as a formatted string, but storing it as a millisecond timestamp in a long is recommended to save space

  • Array: array type

    • When matching, the document matches if any one element satisfies the condition
    • When sorting, ascending order uses the minimum value in the array and descending order uses the maximum value
  • Object: object type

    {"name": "Amy", "age": 25, "friend": {"name": "DaMing", "age": 25}}

    When an object such as friend above is stored in the index, it is flattened into two fields: friend.name and friend.age
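
Three of the behaviours described in this list (the scaled_float round trip, the value an array sorts by, and object flattening) can be imitated in a few lines of plain Python. This is only a simulation of the described semantics, not Elasticsearch code; the function names are hypothetical, and the exact rounding rule used here (Python's round) is an assumption.

```python
def scaled_float_roundtrip(value, scaling_factor):
    """Store value * factor as an integer, then divide it back out on read."""
    stored = round(value * scaling_factor)
    return stored, stored / scaling_factor


def array_sort_value(values, order):
    """Ascending order sorts by the minimum element, descending by the maximum."""
    return min(values) if order == "asc" else max(values)


def flatten(obj, prefix=""):
    """Turn nested objects into dotted field names, e.g. friend.name."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, f"{path}."))
        else:
            flat[path] = value
    return flat


print(scaled_float_roundtrip(12.34, 100))
print(array_sort_value([3, 1, 2], "asc"), array_sort_value([3, 1, 2], "desc"))
print(flatten({"name": "Amy", "age": 25, "friend": {"name": "DaMing", "age": 25}}))
```

The round trip also shows why the factor matters: any digits beyond the precision implied by the factor are lost when the value is stored.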

2.index

index controls whether the field is indexed.

  • true: the field is indexed and can be used for search. This is the default

  • false: the field is not indexed and cannot be used for search

    Because index defaults to true, every field is indexed unless configured otherwise.
    For fields we do not want indexed, such as the address of a company's logo image, index must be set to false manually.

3.store

Whether to store the field's data separately.
The original document is stored in _source. By default, individual fields are not stored separately but are extracted from _source. You can also store a field separately by setting "store": true.
Reading a separately stored field is faster than parsing it out of _source, but it also takes up more space, so set it according to actual business needs.
The default is false.

4.analyzer: specifies the analyzer (tokenizer)

For Chinese text we generally use the IK analyzer, with ik_max_word or ik_smart

3. View mappings

  • View the mapping of a single index

    # Syntax
    GET /{index}/_mapping
    # Example
    GET /es_index/_mapping

  • View the mappings of all indexes

    # Syntax
    GET /_mapping
    # or
    GET /_all/_mapping

  • Modify the mapping of an index

    # Syntax
    PUT /{index}/_mapping
    {
      "properties": {
        "field_name": {
          "type": "type",
          "index": true,
          "store": true,
          "analyzer": "analyzer"
        }
      }
    }

    Note: modifying a mapping can add new fields. Any other change, such as changing the type of an existing field, requires deleting the index and re-creating the mapping
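
As a rule of thumb, adding fields to an existing mapping is accepted, while changing or removing an existing field definition calls for re-creating the index. The sketch below applies that simplified rule to two "properties" dictionaries before a change is attempted. The helper name is hypothetical, and the rule is deliberately stricter than the server, which may accept some in-place parameter updates.

```python
def classify_mapping_change(old_props, new_props):
    """Return 'additive' if only new fields were added, else 'recreate'."""
    for name, old_spec in old_props.items():
        # A removed field or a changed definition is not an additive change.
        if new_props.get(name) != old_spec:
            return "recreate"
    return "additive"


current = {"name": {"type": "text", "analyzer": "ik_max_word"}}
added = {**current, "age": {"type": "integer"}}
retyped = {"name": {"type": "keyword"}}

print(classify_mapping_change(current, added))
print(classify_mapping_change(current, retyped))
```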

4. Create an index and its mapping in one step

Besides creating the index and the mapping separately, you can also create both in a single request

# Syntax
PUT /{index}
{
  "settings": {
    "setting_name": "setting_value"
  },
  "mappings": {
    "properties": {
      "field_name": {
        "mapping_attribute": "attribute_value"
      }
    }
  }
}
# Example
PUT /es_index
{
  "settings": {},
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

3, Document addition, deletion, modification and partial update

1. Add a document

  • Add a document (specify the id manually)

    # Syntax
    POST /{index}/_doc/{id}
    # Example
    POST /es_index/_doc/1
    {
      "name": "ZAE",
      "job": "java development",
      "payment": "1000",
      "logo": "http://www.lgstatic.com/thubnail_120x120/i/image/M00/21/3E/CgpFT1kVdzeAJNbUAABJB7x9sm8374.png"
    }

  • Add a document (automatically generated id)

    # Syntax
    POST /{index}/_doc
    {
      "field": "value"
    }

    After the document is created, the response contains an _id field, which uniquely identifies the document.
    All subsequent updates and deletes rely on this id. In this case the id is randomly generated by Elasticsearch
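
The two ways of adding a document differ only in whether the id appears in the path, and the _id in the response is what later requests must address. The following sketch shows both; the function names are hypothetical, and the response dictionary is a trimmed, made-up example with a placeholder id rather than captured output.

```python
import json


def add_document_request(index, doc, doc_id=None):
    """Return (method, path, body) for adding a document."""
    # Without an id, Elasticsearch generates one and returns it as _id.
    path = f"/{index}/_doc" if doc_id is None else f"/{index}/_doc/{doc_id}"
    return "POST", path, json.dumps(doc)


def document_path(response):
    """Build the path of the new document from the _id in the response."""
    return f"/{response['_index']}/_doc/{response['_id']}"


print(add_document_request("es_index", {"name": "ZAE"}, doc_id=1))
print(add_document_request("es_index", {"name": "ZAE"}))

# Hypothetical, trimmed response to the second request; the _id is a placeholder.
response = {"_index": "es_index", "_id": "GENERATED_ID", "result": "created"}
print(document_path(response))
```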

2. View a single document

  • Syntax and example

    # Syntax
    GET /{index}/_doc/{id}
    # Example
    GET /es_index/_doc/1

  • Document metadata explained

    • _index: the index the document belongs to
    • _type: the document type; in Elasticsearch 7.x the default type is _doc
    • _id: the unique identifier of the document. Together with _index and _type it uniquely identifies and locates a document
    • _version: the version number of the document. Elasticsearch uses versions to make sure conflicting changes do not cause data loss: a modification can state which version it expects to change, and the request fails if that is not the current version
    • _seq_no: a strictly increasing sequence number. Every write operation (index, create, update and delete) generates a _seq_no, and it increases strictly at the shard level, so a document written later always has a larger _seq_no than one written earlier
    • _primary_term: the primary term of the shard, an integer that increases each time a new primary shard is promoted
    • found: true/false, whether the document was found
    • _source: the original document as stored
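
The sequence number and primary term returned with a document can be sent back on a later write so that it only succeeds if nobody changed the document in between. The sketch assumes the if_seq_no and if_primary_term query parameters of the 7.x index API. The response shown is illustrative, with made-up values, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Illustrative response for GET /es_index/_doc/1 (values are made up).
response = {
    "_index": "es_index",
    "_type": "_doc",
    "_id": "1",
    "_version": 3,
    "_seq_no": 7,
    "_primary_term": 1,
    "found": True,
    "_source": {"name": "ZAE", "job": "java development"},
}


def conditional_update_path(resp):
    """Build a PUT path that only succeeds if the document is unchanged."""
    query = urlencode({
        "if_seq_no": resp["_seq_no"],
        "if_primary_term": resp["_primary_term"],
    })
    return f"/{resp['_index']}/_doc/{resp['_id']}?{query}"


if response["found"]:
    print(conditional_update_path(response))
```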

3. View all documents

# Syntax
POST /{index}/_search
{
  "query": {
    "match_all": {}
  }
}

4._source: customizing the returned fields

In some business scenarios we do not need every field of _source to be returned. The _source parameter selects the fields to return,
with multiple fields separated by commas

# Example
GET /es_index/_doc/1?_source=name,job
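
The effect of _source=name,job can be pictured as a filter over the stored document. The function below is a client-side imitation that handles only exact top-level field names. It is not how the server implements filtering, and the sample document is made up.

```python
def filter_source(source, fields):
    """Keep only the requested top-level fields, like ?_source=name,job."""
    wanted = [name.strip() for name in fields.split(",")]
    return {key: value for key, value in source.items() if key in wanted}


doc = {"name": "ZAE", "job": "java development", "payment": "1000"}
print(filter_source(doc, "name,job"))
```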

5. Update a document (full update)

Use the same syntax as adding a document, but change the request method to PUT and specify the id of the document to modify

# Syntax
PUT /{index}/_doc/{id}
  • If the id exists, the document is modified
  • If the id does not exist, the document is created

6. Update a document (partial update)

Elasticsearch can use PUT or POST to update a document in full: if a document with the specified id already exists, an update is performed.
Note: when Elasticsearch performs an update, it first marks the old document as deleted and then adds the new one. The old document does not disappear immediately, but it can no longer be accessed. As you keep adding data, Elasticsearch cleans up the documents marked as deleted in the background.
A full update marks the old data as deleted and then adds the updated document (using PUT or POST).
A partial update modifies only the specified fields (using POST).

# Syntax
POST /{index}/_update/{id}
{
  "doc": {
    "field": "value"
  }
}
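
The difference between a full update and a partial update is easiest to see on plain dictionaries. The sketch below imitates the two outcomes in memory: a full update replaces the document, while a partial update merges the fields given under "doc" into it. The recursive merge of nested objects reflects my understanding of the update API; the function names and sample data are hypothetical.

```python
import copy


def full_update(existing, new_doc):
    """PUT /{index}/_doc/{id}: the new body replaces the old document."""
    return copy.deepcopy(new_doc)


def partial_update(existing, partial_doc):
    """POST /{index}/_update/{id}: merge the "doc" fields into the old document."""
    merged = copy.deepcopy(existing)
    for key, value in partial_doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)  # objects merge recursively
        else:
            merged[key] = copy.deepcopy(value)  # scalars and arrays are replaced
    return merged


old = {"name": "ZAE", "job": "java development", "friend": {"name": "DaMing", "age": 25}}
print(full_update(old, {"name": "ZAE2"}))
print(partial_update(old, {"job": "go development", "friend": {"age": 26}}))
```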

7. Delete documents

  • Delete by id

    # Syntax
    DELETE /{index}/_doc/{id}

  • Delete by query

    # Syntax
    POST /{index}/_delete_by_query
    {
      "query": {
        "match": {
          "field_name": "search keyword"
        }
      }
    }
    # Example: delete the documents whose name field matches 2
    POST /es_index/_delete_by_query
    {
      "query": {
        "match": {
          "name": "2"
        }
      }
    }

  • Delete all documents

    POST /{index}/_delete_by_query
    {
      "query": {
        "match_all": {}
      }
    }
    

8. Full replacement and forced creation of documents

  • Full replacement

    • The syntax is the same as creating a document. If the document id does not exist, the document is created; if it already exists, the JSON content of the document is replaced in full
    • Documents are immutable. To modify a document's content, the first option is full replacement: re-index the document and replace all of its content. Elasticsearch marks the old document as deleted and adds the document we supply. As more and more documents are created, Elasticsearch automatically removes the documents marked as deleted in the background at an appropriate time
  • Forced creation

    # Forced creation
    PUT /{index}/_doc/{id}?op_type=create
    {}
    # or
    PUT /{index}/_create/{id}
    {}
    # Note: with forced creation, an error is reported if the id already exists
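
The contrast between full replacement and forced creation can be imitated with a tiny in-memory store. The classes TinyIndex and VersionConflict are hypothetical stand-ins, not Elasticsearch APIs. The sketch only models the behaviour described above: replacement bumps the version, and forced creation fails when the id is already taken.

```python
class VersionConflict(Exception):
    """Stands in for the conflict error returned when the id already exists."""


class TinyIndex:
    def __init__(self):
        self.docs = {}  # id -> (version, source)

    def index(self, doc_id, source):
        """Full replacement: create if absent, otherwise replace and bump the version."""
        version = self.docs.get(doc_id, (0, None))[0] + 1
        self.docs[doc_id] = (version, dict(source))
        return "created" if version == 1 else "updated"

    def create(self, doc_id, source):
        """Forced creation: refuse to touch an existing id."""
        if doc_id in self.docs:
            raise VersionConflict(f"document {doc_id!r} already exists")
        return self.index(doc_id, source)


store = TinyIndex()
print(store.create("1", {"name": "ZAE"}))
print(store.index("1", {"name": "ZAE2"}))
try:
    store.create("1", {"name": "ZAE3"})
except VersionConflict as err:
    print("conflict:", err)
```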