Concepts
Routing
shard_num = hash(_routing) % num_primary_shards
This formula reveals why the number of shards of an index cannot be changed after indexing documents
If we want to change the number of shards, we have to create a new index and re-index old document to the new one
Reading/Writing Documents
Reading
Coordinating node receives a GET request (e.g.
GET /prod/_doc/100
)Routing
Adaptive replica selection (find the best performance)
Retrieve data in a replication group
Coordinating node respond with the requested document
Writing
Coordinating node receives a PUT request (e.g.
PUT /prod/_doc/100
)Routing
Primary shard validate the operation and write the document locally
Primary shard replicates the document to replicas
Here are 2 terms to remember:
_primary_term
: If the primary shard is taken place by another one, its value increments_seq_no
: Number of write operation together with the primary term (incremented for each write operation by primary shard)
Global and local checkpoints
Essentially sequence numbers
Each replication group has a global checkpoint
The sequence number that all active shards within a replication group have been aligned at least up to
Each replica shard has a local checkpoint
The sequence number for the last write operation that was performed
Document update error handling
Use Case: Shopping cart
The webapp gets a document containing a number in stock
However, when the action is concurrent (e.g. 2 users add the same item at the same time), the number of stock may not be accurate.
e.g. In stock 5 originally but it is still 4 after the two actions from webapp
To overcome, we have to make use of primary_term
and seq_no
In this way, the webapp should be able to know if the document is the latest one thanks to the seq_no
field. Note this relies on the webapp to do the check.
If the POST action fails, the webapp will receive version_conflict_engine_exception.
Last updated