Q83. Partitions and queries?

Azure Table storage scales well largely because tables are partitioned. In our example, Item Category is the PartitionKey and Item Code is the RowKey. The table contains two partitions: Raw Material and Finished Product.


The best-performing query specifies both the Partition Key and the Row Key. The system uses the partition key to locate the partition server that holds the data, then uses the row key to find and retrieve the entity.
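As a rough illustration, a query that names both keys can be written as an OData-style filter string, which is the filter syntax the Table service accepts. The helper names below and the item code "RM-001" are made up for this sketch; only the PartitionKey and RowKey property names come from the service.

```python
def quote(value: str) -> str:
    """Quote a string literal for a filter; embedded single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def point_query_filter(partition_key: str, row_key: str) -> str:
    """Filter naming both keys, so at most one entity can match."""
    return f"PartitionKey eq {quote(partition_key)} and RowKey eq {quote(row_key)}"


def partition_query_filter(partition_key: str) -> str:
    """Filter confined to a single partition."""
    return f"PartitionKey eq {quote(partition_key)}"


# "RM-001" is a hypothetical Item Code used only for illustration.
print(point_query_filter("Raw Material", "RM-001"))
# prints: PartitionKey eq 'Raw Material' and RowKey eq 'RM-001'
```

Either string could then be passed as the filter of a query in whichever client library is in use.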

The best approach when building a query is to understand how the data is located. A query performs well when it names the Partition Key. What happens if we provide only the Row Key? The system cannot tell which partition server to query, so it performs a scan of the entire table. Because the table may span many servers, the scan spans all of them.

The correct way is to provide a Partition Key in the query. To summarize: provide both the partition key and the row key if you have them; provide at least the partition key if that is all you have; and if you would otherwise know only the row key, design your tables so that the partition key is also known at query time.
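The difference between the two kinds of lookup can be sketched with an in-memory stand-in for the table. This is only a toy model, not how the storage service is implemented; the item codes and names are invented for the example.

```python
# Toy table: partition key -> {row key -> entity}. All item data is made up.
TABLE = {
    "Raw Material": {
        "RM-001": {"name": "Steel sheet"},
        "RM-002": {"name": "Copper wire"},
    },
    "Finished Product": {
        "FP-001": {"name": "Gearbox"},
        "FP-002": {"name": "Motor"},
    },
}


def point_lookup(partition_key, row_key):
    """Both keys known: go straight to one partition, then to one row."""
    entity = TABLE.get(partition_key, {}).get(row_key)
    return entity, 1  # exactly one partition is touched


def row_key_only_lookup(row_key):
    """Only the row key known: every partition has to be checked.

    A row key is unique only within its partition, so the scan cannot
    stop at the first match.
    """
    matches, partitions_touched = [], 0
    for rows in TABLE.values():
        partitions_touched += 1
        if row_key in rows:
            matches.append(rows[row_key])
    return matches, partitions_touched


print(point_lookup("Raw Material", "RM-002"))
print(row_key_only_lookup("FP-001"))
```

With only two partitions the extra work is small, but the number of partitions touched by the second function grows with the table, while the first always touches one.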

Role of Partition server
  • All entities belonging to a single partition in a table are accessed via the same partition server.
  • A partition server has information about a partition and the data that exists in it.
  • The number of partitions a partition server serves is determined by the size of the partitions and the number of times they are accessed.
  • The Windows Azure Storage system keeps a map of partition servers and the partitions they are serving.

In our example given above, when we query for all of the items in the “Raw Material” category, the system will execute this query against a single partition server because all data in the Raw Material category resides in a single partition server.
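The routing described above can be pictured with a small partition map. The server names "server-A" and "server-B" and the function name are hypothetical; the real map is internal to the storage system and is not something client code reads.

```python
# Hypothetical partition map: which server currently serves which partition.
PARTITION_MAP = {
    "Raw Material": "server-A",
    "Finished Product": "server-B",
}


def servers_for_query(partition_key=None):
    """Return the set of servers a query would have to contact."""
    if partition_key is not None:
        # Partition key given: the map points to exactly one server.
        return {PARTITION_MAP[partition_key]}
    # No partition key: every server holding part of the table is involved.
    return set(PARTITION_MAP.values())


print(servers_for_query("Raw Material"))
print(sorted(servers_for_query()))
```

The first call returns a single server, matching the "Raw Material" example; the second returns every server in the map.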

Continuation Tokens
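A continuation token is returned when:

1. A query has to read from multiple partition servers.
2. A query result is too large to return in one response (the Table service returns at most 1,000 entities per response), so the storage service returns a token along with the partial result.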

For most purposes, however, it is enough to know that continuation tokens exist.
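For readers who want to see the mechanics, the paging loop can be sketched as follows. This is a simulation under stated assumptions: query_page is a stand-in for a real service call, the row keys are invented, the page size is set to 3 so that several pages appear, and the token here is a plain integer offset, whereas a real token is an opaque value supplied by the service.

```python
ROWS = [f"RM-{n:03d}" for n in range(1, 8)]  # seven made-up row keys
PAGE_SIZE = 3  # deliberately tiny so the example produces several pages


def query_page(token=None):
    """Return one page of results plus a token, or None when nothing is left."""
    start = token or 0
    page = ROWS[start:start + PAGE_SIZE]
    more_left = start + PAGE_SIZE < len(ROWS)
    next_token = start + PAGE_SIZE if more_left else None
    return page, next_token


def query_all():
    """Keep requesting pages until no continuation token comes back."""
    results, token = [], None
    while True:
        page, token = query_page(token)
        results.extend(page)
        if token is None:
            return results


print(query_all())
```

The caller never interprets the token; it only hands it back with the next request until none is returned.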