Dr. Mohsin Dar
Assistant Professor | Cloud & Software Operations Cluster | SOCS | UPES
Query Languages for the Relational Model
By the end of this lecture, you will:
- Understand the fundamentals of Relational Algebra
- Master basic and extended operations
- Learn Tuple Relational Calculus (TRC)
- Comprehend Domain Relational Calculus (DRC)
- Compare procedural vs declarative approaches
Key Concepts:
- Set-based operations
- Query optimization foundations
- Logical query formulation
- Mathematical foundations of databases
- SQL theoretical underpinnings
Procedural Languages
Relational Algebra: User specifies what data is required and how to get it step by step.
How: σ(condition)(R) → π(attributes)(result)
Declarative Languages
Relational Calculus: User specifies what data is required without specifying how to retrieve it.
What: {t | condition(t)}
Both are mathematically equivalent in expressive power!
Students Relation
| SID | Name | Age | GPA |
| 101 | Alice | 20 | 3.8 |
| 102 | Bob | 21 | 3.2 |
| 103 | Carol | 19 | 3.9 |
| 104 | Dave | 22 | 3.1 |
Enrollment Relation
| SID | CID | Grade |
| 101 | CS101 | A |
| 102 | CS101 | B |
| 103 | CS102 | A+ |
| 101 | CS102 | A |
Definition
A procedural query language that consists of a set of operations that take one or two relations as input and produce a new relation as output.
Basic Operations
- Select (σ)
- Project (π)
- Union (∪)
- Set Difference (−)
- Cartesian Product (×)
- Rename (ρ)
Extended Operations
- Natural Join (⋈)
- Outer Joins
- Intersection (∩)
- Division (÷)
- Assignment (←)
- Aggregate Functions
Selection
Selects tuples (rows) that satisfy a given predicate condition.
σcondition(R)
Examples:
σAge > 20(Students)
Result: Students with age greater than 20
| SID | Name | Age | GPA |
| 102 | Bob | 21 | 3.2 |
| 104 | Dave | 22 | 3.1 |
σGPA ≥ 3.5 AND Age < 21(Students)
Result: High-performing young students
Projection
Selects specified columns from a relation and eliminates duplicates.
πattribute_list(R)
Examples:
πName, GPA(Students)
Result: Only Name and GPA columns
| Name | GPA |
| Alice | 3.8 |
| Bob | 3.2 |
| Carol | 3.9 |
| Dave | 3.1 |
Projection automatically removes duplicate rows!
Union (∪)
Combines tuples from two compatible relations.
R ∪ S
Requirements: Same arity and compatible domains
Intersection (∩)
Returns tuples present in both relations.
R ∩ S
Derived: R ∩ S = R − (R − S)
Set Difference (−)
Returns tuples in first relation but not in second.
R − S
Note: Not commutative (R − S ≠ S − R)
Cartesian Product (×)
Combines every tuple from first relation with every tuple from second.
R × S
Result size: |R| × |S| tuples
Natural Join (⋈)
Combines relations based on common attributes with same names.
R ⋈ S
Equivalent to: πR.attrs, S.attrs(σR.common = S.common(R × S))
Example: Students ⋈ Enrollment
| SID | Name | Age | GPA | CID | Grade |
| 101 | Alice | 20 | 3.8 | CS101 | A |
| 101 | Alice | 20 | 3.8 | CS102 | A |
| 102 | Bob | 21 | 3.2 | CS101 | B |
| 103 | Carol | 19 | 3.9 | CS102 | A+ |
Query: Find names of students with GPA > 3.5 who are enrolled in CS101
Step-by-step Solution:
Step 1: R1 ← σGPA > 3.5(Students)
Step 2: R2 ← σCID = 'CS101'(Enrollment)
Step 3: R3 ← R1 ⋈ R2
Step 4: Result ← πName(R3)
Single Expression:
πName(σGPA > 3.5(Students) ⋈ σCID = 'CS101'(Enrollment))
Step 1: Filter high-performing students (Alice, Carol)
Step 2: Filter CS101 enrollments (Alice, Bob)
Step 3: Join results (Alice)
Step 4: Project name (Alice)
Division
Finds tuples in R that are associated with ALL tuples in S.
R ÷ S
Use case: "Find students enrolled in ALL courses"
Division Algorithm:
R ÷ S = πR-S(R) − πR-S((πR-S(R) × S) − R)
Interpretation:
If R(A, B) and S(B), then R ÷ S returns all A values that appear with every B value in S.
Division is the most complex operation but very powerful for "for all" queries!
Definition
A non-procedural query language where each query is of the form:
{t | P(t)}
Where t is a tuple variable and P(t) is a predicate
Components
- Tuple variables: Range over relations
- Predicates: Logical conditions
- Quantifiers: ∃ (exists), ∀ (for all)
- Connectives: ∧ (and), ∨ (or), ¬ (not)
Atomic Formulas
- t ∈ R (t is in relation R)
- t.A op s.B (attribute comparison)
- t.A op constant
Where op ∈ {=, ≠, <, ≤, >, ≥}
Example 1: Students with GPA > 3.5
{t | t ∈ Students ∧ t.GPA > 3.5}
Example 2: Names of students enrolled in CS101
{t | ∃s (s ∈ Students ∧ t.Name = s.Name ∧
∃e (e ∈ Enrollment ∧ e.SID = s.SID ∧ e.CID = 'CS101'))}
Example 3: Students enrolled in ALL courses
{t | t ∈ Students ∧ ∀c (c ∈ Courses →
∃e (e ∈ Enrollment ∧ e.SID = t.SID ∧ e.CID = c.CID))}
TRC Query Builder
Query: Find students with high GPA in CS courses
TRC: {t | t ∈ Students ∧ t.GPA > 3.5 ∧ ∃e (e ∈ Enrollment ∧ e.SID = t.SID ∧ e.CID LIKE 'CS%')}
Definition
A non-procedural query language where queries are expressed in terms of domain variables:
{⟨x₁, x₂, ..., xₙ⟩ | P(x₁, x₂, ..., xₙ)}
Where x₁, x₂, ..., xₙ are domain variables and P is a predicate
Key Differences from TRC
- Uses domain variables instead of tuple variables
- Variables range over attribute domains
- More granular control over individual attributes
- Basis for QBE (Query By Example)
Atomic Formulas
- ⟨x₁, x₂, ..., xₙ⟩ ∈ R
- x op y (domain variable comparison)
- x op constant
Where op ∈ {=, ≠, <, ≤, >, ≥}
Example 1: Students with GPA > 3.5
{⟨s, n, a, g⟩ | ⟨s, n, a, g⟩ ∈ Students ∧ g > 3.5}
Returns: SID, Name, Age, GPA for qualifying students
Example 2: Names of students with high GPA
{⟨n⟩ | ∃s, a, g (⟨s, n, a, g⟩ ∈ Students ∧ g > 3.5)}
Returns: Only names of high-performing students
Example 3: Students enrolled in CS101
{⟨s, n, a, g⟩ | ⟨s, n, a, g⟩ ∈ Students ∧
∃c, gr (⟨s, c, gr⟩ ∈ Enrollment ∧ c = 'CS101')}
DRC provides fine-grained control over output format!
| Aspect |
Relational Algebra |
Tuple Relational Calculus |
Domain Relational Calculus |
| Nature |
Procedural |
Declarative |
Declarative |
| Approach |
Specifies HOW |
Specifies WHAT |
Specifies WHAT |
| Variables |
Relations |
Tuple variables |
Domain variables |
| Syntax |
σ, π, ⋈, etc. |
{t | P(t)} |
{⟨x₁,...,xₙ⟩ | P(x₁,...,xₙ)} |
| Query Optimization |
Direct |
Requires transformation |
Requires transformation |
| Expressive Power |
Equivalent |
Equivalent |
Equivalent |
| Real-world Use |
Query optimization |
SQL foundation |
QBE systems |
Safety Problem
Some calculus expressions can produce infinite results!
Unsafe: {t | ¬(t ∈ Students)}
This would return all tuples NOT in Students - potentially infinite!
Safety Conditions
- All variables must be domain-independent
- Every variable must appear in a positive relational atom
- Variables in negated subformulas must be limited
- Result must be from a finite active domain
Safe Equivalent
{t | t ∈ SomeRelation ∧ ¬(t ∈ Students)}
This limits the domain to a specific relation
Only safe expressions are meaningful in practical database systems!
From Relational Algebra to TRC
Relational Algebra:
πName(σGPA > 3.5(Students))
Tuple Relational Calculus:
{t | ∃s (s ∈ Students ∧ s.GPA > 3.5 ∧ t.Name = s.Name)}
From TRC to DRC
Tuple Relational Calculus:
{t | t ∈ Students ∧ t.GPA > 3.5}
Domain Relational Calculus:
{⟨s, n, a, g⟩ | ⟨s, n, a, g⟩ ∈ Students ∧ g > 3.5}
Query Optimization
Relational algebra provides a foundation for query optimization in modern DBMS.
- Selection pushdown
- Join reordering
- Index usage strategies
SQL Translation
Understanding calculus helps in writing complex SQL queries.
- Nested queries ↔ Existential quantification
- NOT EXISTS ↔ Universal quantification
- Complex JOINs ↔ Multiple relations
SQL to TRC Example
SQL:
SELECT Name FROM Students
WHERE GPA > 3.5 AND
EXISTS (SELECT * FROM Enrollment
WHERE SID = Students.SID)
TRC:
{t | ∃s (s ∈ Students ∧ s.GPA > 3.5 ∧
∃e (e ∈ Enrollment ∧ e.SID = s.SID) ∧
t.Name = s.Name)}
Key Takeaways
- Relational Algebra: Procedural, step-by-step query specification
- TRC: Declarative using tuple variables
- DRC: Declarative using domain variables
- Equivalence: All three have same expressive power
- Safety: Critical for practical implementations
Real-world Impact
- Foundation of SQL query processing
- Basis for query optimization algorithms
- Theoretical framework for new query languages
- Essential for database research and development
Next Lecture Preview
SQL: Structured Query Language - Putting theory into practice!
Question 1
Write a relational algebra expression to find students who are enrolled in both CS101 and CS102.
Question 2
Express in TRC: "Find names of students who have never enrolled in any course."
Question 3
Convert the following SQL to DRC: SELECT DISTINCT Age FROM Students WHERE GPA > 3.0;
Try It Yourself!
Solution 1: πSID(σCID='CS101'(Enrollment)) ∩ πSID(σCID='CS102'(Enrollment))
Solution 2: {t | ∃s (s ∈ Students ∧ t.Name = s.Name ∧ ¬∃e (e ∈ Enrollment ∧ e.SID = s.SID))}
Solution 3: {⟨a⟩ | ∃s, n, g (⟨s, n, a, g⟩ ∈ Students ∧ g > 3.0)}
Textbooks
- Database System Concepts - Silberschatz, Korth, Sudarshan
- Fundamentals of Database Systems - Elmasri & Navathe
- Database Management Systems - Raghu Ramakrishnan
Online Resources
- Stanford CS145 Database Systems
- MIT 6.830 Database Systems
- GeeksforGeeks Database Tutorials
Contact Information
Dr. Mohsin Dar | mohsin.dar@ddn.upes.ac.in
Office Hours: By appointment | SOCS Department, UPES
Thank you for your attention! Questions?