Concepts

Routing

shard_num = hash(_routing) % num_primary_shards

  • This formula reveals why the number of shards of an index cannot be changed after indexing documents

    • If we want to change the number of shards, we have to create a new index and re-index old document to the new one

Reading/Writing Documents

Reading

  1. Coordinating node receives a GET request (e.g. GET /prod/_doc/100)

  2. Routing

  3. Adaptive replica selection (find the best performance)

  4. Retrieve data in a replication group

  5. Coordinating node respond with the requested document

Writing

  1. Coordinating node receives a PUT request (e.g. PUT /prod/_doc/100)

  2. Routing

  3. Primary shard validate the operation and write the document locally

  4. Primary shard replicates the document to replicas

Here are 2 terms to remember:

  1. _primary_term: If the primary shard is taken place by another one, its value increments

  2. _seq_no: Number of write operation together with the primary term (incremented for each write operation by primary shard)

Global and local checkpoints

  • Essentially sequence numbers

  • Each replication group has a global checkpoint

    • The sequence number that all active shards within a replication group have been aligned at least up to

  • Each replica shard has a local checkpoint

    • The sequence number for the last write operation that was performed

Document update error handling

Use Case: Shopping cart

  • The webapp gets a document containing a number in stock

  • However, when the action is concurrent (e.g. 2 users add the same item at the same time), the number of stock may not be accurate.

    • e.g. In stock 5 originally but it is still 4 after the two actions from webapp

To overcome, we have to make use of primary_term and seq_no

GET /prod/_doc/100
POST /prod/_update/100?if_primary_term=1&if_seq_no=71

In this way, the webapp should be able to know if the document is the latest one thanks to the seq_no field. Note this relies on the webapp to do the check.

If the POST action fails, the webapp will receive version_conflict_engine_exception.

Last updated