Clustering And Partitioning¶
Multitable Clustering¶
Multitable clustering is great for storing tables that are often joined with each other. In a cluster, tables records are stored interleaved with each other.
dept_name | building | budget |
---|---|---|
CSE | Taylor | 100000 |
PHY | Watson | 70000 |
ID | name | dept_name | salary |
---|---|---|---|
100 | Hector | CSE | 65000 |
101 | Robin | PHY | 57000 |
102 | Seth | CSE | 58000 |
103 | Genny | CSE | 70000 |
These two tables will be merged into the cluster
__ | __ | __ | __ |
---|---|---|---|
CSE | Taylor | 100000 | |
100 | Hector | CSE | 65000 |
102 | Seth | CSE | 58000 |
103 | Genny | CSE | 70000 |
PHY | Watson | 70000 | |
101 | Robin | PHY | 57000 |
This won't work very well when you're querying for only one table, but makes joins faster. This is stored in variable-length records
Partitioning¶
In partitioning, a single table \(X\) is broken into multiple tables \(\{X_i\}\) . Each table \(X_i\) can be stored separately. * This reduces cost of some operations like free space management. * It allows different partitions to be stored on different devices, like storing more important partitions on an SSD and less important ones on a magnetic disk.