Unit III: Storage & Indexing
Database Systems - MTech First Semester
Cloud & Software Operations Cluster | SOCS | UPES
Dr. Mohsin Dar
Assistant Professor
| Index File (Sparse) | |
|---|---|
| Primary Key (First Record) | Block Pointer |
| K1 | → Block 1 |
| K10 | → Block 2 |
| K20 | → Block 3 |
| ... | ... |
A clustering index is created when the data file is ordered on a non-key attribute that may have duplicate values.
Consider an employee database ordered by Department_ID (non-key field):
| Employee_ID | Name | Department_ID | Salary |
|---|---|---|---|
| 101 | Alice | 10 | 50000 |
| 105 | Bob | 10 | 55000 |
| 102 | Charlie | 20 | 60000 |
| 108 | David | 20 | 62000 |
| 103 | Eve | 30 | 58000 |
| Department_ID | Block Pointer |
|---|---|
| 10 | → Block 1 |
| 20 | → Block 2 |
| 30 | → Block 3 |
Note: Only 3 index entries for potentially hundreds of employee records!
A secondary index provides an alternate access path to the data file. It can be created on any field (key or non-key) that is different from the primary ordering field.
Primary Index: On Employee_ID (primary key)
Secondary Index: On Salary (for frequent salary-based queries)
| Emp_ID | Name | Salary |
|---|---|---|
| 101 | Alice | 50000 |
| 102 | Bob | 60000 |
| 103 | Charlie | 55000 |
| 104 | David | 50000 |
| Salary | Record Pointer |
|---|---|
| 50000 | → Record 1 (Alice) |
| 50000 | → Record 4 (David) |
| 55000 | → Record 3 (Charlie) |
| 60000 | → Record 2 (Bob) |
Note: Secondary index has one entry for each record (Dense Index)
Design and analyze the secondary index for this file.
Blocking Factor (bfr): Number of records that can fit in one block
Calculation:
bfr = ⌊1,024 / 100⌋ = ⌊10.24⌋ = 10 records/block
Interpretation: Each data block can store exactly 10 records.
Number of Blocks: Total blocks needed to store all records
Calculation:
b = ⌈30,000 / 10⌉ = 3,000 blocks
Storage for Data File:
Total Space = 3,000 blocks × 1,024 bytes = 3,072,000 bytes ≈ 3 MB
Secondary Index Entry Components:
Calculation:
Index Entry Size = 9 + 6 = 15 bytes
Important Difference: Secondary index uses record pointers (pointing to individual records), not block pointers. This is because the file is unordered.
Index Blocking Factor: Number of index entries per block
Calculation:
bfri = ⌊1,024 / 15⌋ = ⌊68.27⌋ = 68 entries/block
Interpretation: Each index block can store 68 index entries.
Number of Index Entries: One entry per record = 30,000 entries (Dense Index)
Calculation:
bi = ⌈30,000 / 68⌉ = ⌈441.18⌉ = 442 index blocks
Storage for Secondary Index File:
Index Space = 442 blocks × 1,024 bytes = 452,608 bytes ≈ 442 KB
Average block accesses: b / 2 = 3,000 / 2 = 1,500 blocks
Worst case: 3,000 blocks
Index search: ⌈log2(442)⌉ = 9 block accesses
Data record access: 1 block access
Total: 9 + 1 = 10 block accesses
Speedup: 1,500 / 10 = 150× faster on average!
| Characteristic | Primary Index | Clustering Index | Secondary Index |
|---|---|---|---|
| Ordering Field | Primary Key | Non-key attribute | Any field (key/non-key) |
| File Organization | Ordered (sorted) | Ordered (sorted) | Unordered |
| Index Type | Sparse | Sparse/Dense | Dense |
| No. of Entries | = No. of blocks | = No. of unique values | = No. of records |
| Pointer Type | Block pointer | Block pointer | Record pointer |
| Multiple Allowed | No (only one) | No (only one) | Yes (multiple) |
| Space Overhead | Low | Low | High |
In dense index, there is an entry in the index file for every search key value in the main file. This makes searching faster but requires more space to store index records itself.
If an index entry is created only for some records of the main file, then it is called sparse index.
Note: Dense and sparse are not complementary to each other. Sometimes it is possible that an index is both dense and sparse (e.g., clustering index).
| Index (Sparse) | |
|---|---|
| 10 | → Block 1 |
| 30 | → Block 2 |
| 50 | → Block 3 |
Only first record of each block indexed
| Index (Dense) | |
|---|---|
| 10 | → Record 1 |
| 15 | → Record 2 |
| 20 | → Record 3 |
| 30 | → Record 4 |
| ... | ... |
Every record/search key indexed