Need for Distributed DBMS & Architecture
M.Tech - First Semester
Mohsin F. Dar
Assistant Professor
Cloud & Software Operations Cluster | SOCS
University of Petroleum and Energy Studies (UPES)
A Distributed Database is a collection of multiple, logically interrelated databases distributed over a computer network.
Data is stored at different geographic locations (sites/nodes). A relation may exist fully at one site, or be fragmented and stored across sites, or replicated at multiple sites.
Example: A bank stores Customer data at HQ, while each branch stores local Accounts and Transactions for faster access.
Sites communicate over LAN/WAN to exchange data and coordinate queries/transactions. Network quality affects response time and reliability.
Example: An airline reservation system connects airport sites so seat availability updates propagate across regions.
Each site can execute local queries/transactions and enforce local policies (e.g., access rules) even while participating in global operations.
Example: A university’s multi-campus setup lets each campus manage lab inventories locally while the university runs centralized analytics.
Users/applications see the distributed database as one logical database. A global query may access multiple sites, combine partial results, and return one answer.
Example: An e-commerce “track order” query may join warehouse inventory + shipping updates from different sites.
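The "track order" example above can be sketched as one sub-query per site followed by a merge step. This is a minimal illustration, not how any particular DDBMS does it: the two dictionaries stand in for remote sites, and the order IDs, field names, and the `track_order` function are hypothetical.

```python
# Hypothetical in-memory stand-ins for two remote sites.
# A real DDBMS would reach these over the network.
WAREHOUSE_SITE = {
    5001: {"item": "keyboard", "warehouse": "Delhi"},
    5002: {"item": "monitor", "warehouse": "Hyderabad"},
}
SHIPPING_SITE = {
    5001: {"status": "in transit"},
    5002: {"status": "delivered"},
}


def track_order(order_id):
    """Run one sub-query per site, then merge the partial results."""
    inventory_part = WAREHOUSE_SITE.get(order_id)  # sub-query at site 1
    shipping_part = SHIPPING_SITE.get(order_id)    # sub-query at site 2
    if inventory_part is None or shipping_part is None:
        return None  # the join has no matching row
    # Combine both partial rows into the single answer the user sees.
    return {"order_id": order_id, **inventory_part, **shipping_part}


if __name__ == "__main__":
    print(track_order(5001))
```

The caller only sees `track_order`; which site supplied which column stays hidden.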
Users can access data without knowing its physical location
• Location transparency: user queries a table without knowing which site stores it
• Replication transparency: multiple copies exist, but the system picks a copy for reads and propagates updates to all copies automatically
• Fragmentation transparency: data is split (horizontal/vertical) but appears as one logical table
User Query: SELECT * FROM Student WHERE RollNo = 101;
Reality: Student records may be horizontally fragmented by campus (Dehradun/Delhi/Hyderabad), and a hot subset may be replicated at two sites for availability.
The DDBMS uses the global dictionary to find the right fragment/replica and returns a single answer.
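The lookup described above can be sketched as a small catalog that maps each fragment to the sites holding a copy. Everything here is an assumption made for illustration: the catalog layout, the roll-number ranges used to pick a campus fragment, the student names, and the `select_student` function are all hypothetical.

```python
# Hypothetical global dictionary: which roll numbers belong to which
# fragment, and which sites hold a copy of that fragment.
CATALOG = [
    (range(1, 200), ["Dehradun", "Delhi"]),  # hot fragment, replicated
    (range(200, 400), ["Delhi"]),
    (range(400, 600), ["Hyderabad"]),
]

# Hypothetical local storage at each site (RollNo -> Name).
SITE_DATA = {
    "Dehradun": {101: "Asha"},
    "Delhi": {101: "Asha", 250: "Ravi"},
    "Hyderabad": {450: "Meera"},
}


def select_student(roll_no, down_sites=frozenset()):
    """Answer SELECT * FROM Student WHERE RollNo = roll_no."""
    for roll_range, sites in CATALOG:
        if roll_no not in roll_range:
            continue  # fragmentation: skip fragments that cannot match
        for site in sites:
            if site in down_sites:
                continue  # replication: fall back to the next copy
            name = SITE_DATA[site].get(roll_no)
            if name is None:
                return None
            return {"RollNo": roll_no, "Name": name}
        raise RuntimeError("no replica of the fragment is reachable")
    return None  # no fragment covers this roll number


if __name__ == "__main__":
    print(select_student(101))
    print(select_student(101, down_sites={"Dehradun"}))
```

The caller never names a site, a fragment, or a replica, which is the point of the three transparencies.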
Match database distribution to organizational structure
Data is closer to the point of use
Cost-effective compared to centralized systems
No single point of failure: if one site goes down, the remaining sites can keep operating
Easy expansion by adding new sites
Sites maintain control over local data
| Advantages | Disadvantages |
|---|---|
| ✓ Reflects organizational structure naturally | ✗ Increased complexity in design and management |
| ✓ Improved shareability and local autonomy | ✗ Security and integrity control more difficult |
| ✓ Better availability and reliability | ✗ Lack of experienced personnel and standards |
| ✓ Enhanced performance through parallelism | ✗ Database design more complex |
| ✓ Economies of scale with smaller systems | ✗ Higher operational and maintenance costs |
| ✓ Modular growth and scalability | ✗ Additional hardware and network costs |
Client-server: separation of client processes and server processes
Peer-to-peer: all nodes have equal status and capabilities
Multi-database: integration of multiple heterogeneous databases
Fragmentation: dividing the database into smaller pieces
Replication: storing copies of data at multiple sites
Allocation: deciding where to place data fragments
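Dividing a relation into pieces can be sketched with plain Python rows. The sample `STUDENTS` rows and the helper names are hypothetical. The sketch shows the two standard checks: horizontal fragments must union back to the original relation, and vertical fragments must rejoin on the key.

```python
# Hypothetical sample relation.
STUDENTS = [
    {"RollNo": 101, "Name": "Asha", "Campus": "Dehradun", "Fee": 1000},
    {"RollNo": 250, "Name": "Ravi", "Campus": "Delhi", "Fee": 1200},
    {"RollNo": 450, "Name": "Meera", "Campus": "Hyderabad", "Fee": 1100},
]


def horizontal_fragments(rows, column):
    """Split rows into groups by the value of one column."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_fragment(rows, columns, key="RollNo"):
    """Keep only some columns; the key is repeated in every fragment."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def rejoin(left, right, key="RollNo"):
    """Rebuild full rows by joining two vertical fragments on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


if __name__ == "__main__":
    by_campus = horizontal_fragments(STUDENTS, "Campus")
    print(sorted(by_campus))  # one fragment per campus

    personal = vertical_fragment(STUDENTS, ["Name", "Campus"])
    finance = vertical_fragment(STUDENTS, ["Fee"])
    print(rejoin(personal, finance) == STUDENTS)
```

Placing each fragment at a site is then a separate decision; for instance, each campus fragment could be stored at its own campus.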
Concurrency control: managing concurrent access to distributed data
Commit protocols: ensuring atomicity of distributed transactions
Recovery: handling failures in a distributed environment
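One widely taught way to get atomicity across sites is two-phase commit (2PC). The slides do not name a protocol, so the sketch below is an illustration under that assumption. The `Participant` class and `two_phase_commit` function are hypothetical, and the sketch leaves out timeouts, logging, and coordinator failure, which a real protocol must handle.

```python
class Participant:
    """A site taking part in a distributed transaction (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a local success or failure
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit everywhere or abort everywhere."""
    # Phase 1 (voting): ask every site whether it can commit.
    votes = [site.prepare() for site in participants]
    # Phase 2 (decision): a single "no" vote aborts the transaction.
    if all(votes):
        for site in participants:
            site.commit()
        return "committed"
    for site in participants:
        site.abort()
    return "aborted"


if __name__ == "__main__":
    sites = [Participant("HQ"), Participant("Branch-A", can_commit=False)]
    print(two_phase_commit(sites), [s.state for s in sites])
```

Because the decision is taken only after all votes are in, no site ends up committed while another has aborted.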
✓ Banking systems across branches
✓ Airline reservation systems
✓ E-commerce platforms
✓ Social media networks
✓ Content delivery networks
Thank You!
Questions and Discussion