LAB-5: SQL Sub-Query

Database Systems - MTech First Semester
Course: Database Systems
Instructor: Dr. Mohsin Dar
Department: Cloud & Software Operations Cluster | SOCS | UPES
Lab Session: 5

Experiment Objective

Title: To understand and use SQL Sub-Query

Objective: To understand the use of SQL subquery and implement various subquery operations on relational database tables.

Introduction to SQL Sub-Queries

A subquery (also known as an inner query or nested query) is a query within another SQL query. The subquery is executed first, and its result is used by the outer query. Subqueries can be used in SELECT, INSERT, UPDATE, and DELETE statements.

Types of Subqueries:

  • Single-row subqueries: Return exactly one row
  • Multiple-row subqueries: Return one or more rows
  • Correlated subqueries: Reference columns from the outer query
  • Non-correlated subqueries: Independent of the outer query
Important Note:
Subqueries must be enclosed in parentheses and are often used with operators like IN, EXISTS, ANY, ALL, etc.

Step 1: Database Schema Creation

We need to create three tables as specified in the experiment:

Table 1: Supplier

Supplier(scode, sname, scity, turnover)

Table 2: Part

Part(pcode, weigh, color, cost, sellingprice)

Table 3: Supplier_Part (Junction Table)

Supplier_Part(scode, pcode, qty)

SQL Commands for Table Creation:

-- Create Supplier table CREATE TABLE Supplier ( scode INT PRIMARY KEY, sname VARCHAR(50) NOT NULL, scity VARCHAR(30), turnover DECIMAL(10,2) ); -- Create Part table CREATE TABLE Part ( pcode INT PRIMARY KEY, weigh DECIMAL(8,2), color VARCHAR(20), cost DECIMAL(8,2), sellingprice DECIMAL(8,2) ); -- Create Supplier_Part junction table CREATE TABLE Supplier_Part ( scode INT, pcode INT, qty INT, FOREIGN KEY (scode) REFERENCES Supplier(scode), FOREIGN KEY (pcode) REFERENCES Part(pcode), PRIMARY KEY (scode, pcode) );

Step 2: Data Population

Let's populate the tables with sample data:

-- Insert data into Supplier table INSERT INTO Supplier VALUES (1, 'ABC Suppliers', 'Bombay', 50.00), (2, 'XYZ Corp', 'Delhi', 75.50), (3, 'PQR Industries', 'Bombay', 50.00), (4, 'LMN Traders', 'Chennai', NULL), (5, 'RST Suppliers', 'Pune', 120.75); -- Insert data into Part table INSERT INTO Part VALUES (1, 15.5, 'Red', 25.00, 35.00), (2, 30.0, 'Blue', 20.00, 28.00), (3, 45.2, 'Green', 40.00, 55.00), (4, 28.8, 'Yellow', 30.00, 42.00), (5, 52.3, 'Black', 60.00, 78.00); -- Insert data into Supplier_Part table INSERT INTO Supplier_Part VALUES (1, 2, 100), (2, 1, 50), (3, 2, 75), (4, 3, 200), (5, 4, 150), (1, 4, 80), (2, 5, 120);

Sample Data Overview:

Supplier Table:

scodesnamescityturnover
1ABC SuppliersBombay50.00
2XYZ CorpDelhi75.50
3PQR IndustriesBombay50.00
4LMN TradersChennaiNULL
5RST SuppliersPune120.75

Step 3: SQL Query Solutions

Query 1: Get the supplier number and part number in ascending order of supplier number
SELECT scode, pcode FROM Supplier_Part ORDER BY scode ASC;
Explanation: This is a simple query without subquery. It retrieves supplier and part numbers from the junction table, sorted by supplier number in ascending order.
Query 2: Get the details of supplier who operate from Bombay with turnover 50
SELECT * FROM Supplier WHERE scity = 'Bombay' AND turnover = 50;
Explanation: Direct query filtering suppliers based on city and turnover conditions.
Query 3: Get the total number of suppliers
SELECT COUNT(*) AS total_suppliers FROM Supplier;
Explanation: Uses COUNT aggregate function to count total number of supplier records.
Query 4: Get the part number weighing between 25 and 35
SELECT pcode FROM Part WHERE weigh BETWEEN 25 AND 35;
Explanation: Uses BETWEEN operator to find parts with weight in the specified range.
Query 5: Get the supplier number whose turnover is null
SELECT scode FROM Supplier WHERE turnover IS NULL;
Explanation: Uses IS NULL to find suppliers with missing turnover values.
Query 6: Get the part number that cost 20, 30 or 40 rupees
SELECT pcode FROM Part WHERE cost IN (20, 30, 40);
Explanation: Uses IN operator to match parts with specific cost values.
Query 7: Get the total quantity of part 2 that is supplied
SELECT SUM(qty) AS total_quantity_part2 FROM Supplier_Part WHERE pcode = 2;
Explanation: Uses SUM aggregate function to calculate total quantity for a specific part.
Query 8: Get the name of supplier who supply part 2 (Using Subquery)
SELECT sname FROM Supplier WHERE scode IN ( SELECT scode FROM Supplier_Part WHERE pcode = 2 );
Subquery Explanation: This demonstrates a classic subquery usage. The inner query finds all supplier codes that supply part 2, and the outer query retrieves the names of these suppliers.
Query 9: Get the part number whose cost is greater than the average cost (Using Subquery)
SELECT pcode FROM Part WHERE cost > ( SELECT AVG(cost) FROM Part );
Subquery Explanation: This uses a subquery with aggregate function. The inner query calculates the average cost of all parts, and the outer query finds parts with cost greater than this average.
Query 10: Get the supplier number and turnover in descending order of turnover
SELECT scode, turnover FROM Supplier WHERE turnover IS NOT NULL ORDER BY turnover DESC;
Explanation: Retrieves supplier information ordered by turnover in descending order, excluding NULL values.

Additional Subquery Examples

Here are some advanced subquery examples to further understand the concept:

Example 1: Correlated Subquery - Find suppliers who supply more than one part
SELECT s.scode, s.sname FROM Supplier s WHERE ( SELECT COUNT(*) FROM Supplier_Part sp WHERE sp.scode = s.scode ) > 1;
Correlated Subquery: The inner query references the outer query's table (s.scode), making it dependent on each row of the outer query.
Example 2: EXISTS Subquery - Find parts that are supplied by at least one supplier
SELECT p.pcode, p.color FROM Part p WHERE EXISTS ( SELECT 1 FROM Supplier_Part sp WHERE sp.pcode = p.pcode );
EXISTS Subquery: Returns TRUE if the subquery returns at least one row. More efficient than IN for existence checks.

Key Learning Outcomes

After completing this lab, students should be able to:
  • Understand the concept and types of SQL subqueries
  • Write single-row and multiple-row subqueries
  • Use subqueries with IN, EXISTS, and comparison operators
  • Implement correlated and non-correlated subqueries
  • Apply aggregate functions within subqueries
  • Optimize queries using appropriate subquery techniques
  • Debug and troubleshoot complex nested queries

Subquery Performance Tips

Performance Considerations:
  • Use EXISTS instead of IN when checking for existence, especially with large datasets
  • Avoid correlated subqueries when possible, as they execute once for each row in the outer query
  • Consider JOINs as alternatives to subqueries for better performance
  • Index columns used in subquery WHERE clauses
  • Limit subquery results when appropriate using TOP, LIMIT, or similar clauses

JOIN vs Subquery Comparison:

Subquery Approach:
-- Using Subquery SELECT sname FROM Supplier WHERE scode IN ( SELECT scode FROM Supplier_Part WHERE pcode = 2 );
JOIN Approach (Alternative):
-- Using JOIN (often more efficient) SELECT DISTINCT s.sname FROM Supplier s JOIN Supplier_Part sp ON s.scode = sp.scode WHERE sp.pcode = 2;

Common Subquery Patterns

1. Finding Maximum/Minimum Values:

-- Find suppliers with highest turnover SELECT scode, sname, turnover FROM Supplier WHERE turnover = ( SELECT MAX(turnover) FROM Supplier WHERE turnover IS NOT NULL );

2. Finding Second Highest Value:

-- Find second highest cost part SELECT pcode, cost FROM Part WHERE cost = ( SELECT MAX(cost) FROM Part WHERE cost < ( SELECT MAX(cost) FROM Part ) );

3. Finding Records Not in Another Table:

-- Find parts that are not supplied by any supplier SELECT pcode, color FROM Part WHERE pcode NOT IN ( SELECT pcode FROM Supplier_Part WHERE pcode IS NOT NULL );

4. Using ALL and ANY Operators:

-- Find parts with cost greater than ALL costs of red parts SELECT pcode, cost, color FROM Part WHERE cost > ALL ( SELECT cost FROM Part WHERE color = 'Red' ); -- Find parts with cost greater than ANY cost of blue parts SELECT pcode, cost, color FROM Part WHERE cost > ANY ( SELECT cost FROM Part WHERE color = 'Blue' );

Practice Exercises

Additional Practice Problems:

Try solving these problems using subqueries:

Exercise 1: Find suppliers who supply the most expensive part
-- Your solution here SELECT s.scode, s.sname FROM Supplier s WHERE s.scode IN ( SELECT sp.scode FROM Supplier_Part sp WHERE sp.pcode = ( SELECT pcode FROM Part WHERE cost = (SELECT MAX(cost) FROM Part) ) );
Exercise 2: Find suppliers whose turnover is above average
-- Your solution here SELECT scode, sname, turnover FROM Supplier WHERE turnover > ( SELECT AVG(turnover) FROM Supplier WHERE turnover IS NOT NULL );
Exercise 3: Find parts supplied by suppliers from Bombay
-- Your solution here SELECT DISTINCT p.pcode, p.color FROM Part p WHERE p.pcode IN ( SELECT sp.pcode FROM Supplier_Part sp WHERE sp.scode IN ( SELECT s.scode FROM Supplier s WHERE s.scity = 'Bombay' ) );

Troubleshooting Common Subquery Issues

1. NULL Value Handling:

Problem:

Using IN with subqueries that may return NULL values can produce unexpected results.

-- Problematic query SELECT * FROM Supplier WHERE scode NOT IN (SELECT scode FROM Supplier_Part); -- Better approach SELECT * FROM Supplier WHERE scode NOT IN ( SELECT scode FROM Supplier_Part WHERE scode IS NOT NULL );

2. Correlated Subquery Performance:

Problem:

Correlated subqueries can be slow with large datasets.

-- Slower correlated subquery SELECT s.sname FROM Supplier s WHERE EXISTS ( SELECT 1 FROM Supplier_Part sp WHERE sp.scode = s.scode AND sp.qty > 100 ); -- Faster JOIN alternative SELECT DISTINCT s.sname FROM Supplier s JOIN Supplier_Part sp ON s.scode = sp.scode WHERE sp.qty > 100;

Lab Assessment Criteria

CriteriaExcellent (4)Good (3)Satisfactory (2)Needs Improvement (1)
Query Correctness All queries execute correctly and produce expected results Most queries correct with minor syntax issues Some queries correct, basic understanding shown Few queries work, fundamental issues
Subquery Understanding Demonstrates clear understanding of different subquery types Good grasp of basic subquery concepts Basic understanding, some confusion Limited understanding of subqueries
Code Quality Well-formatted, commented, efficient queries Good formatting and structure Acceptable formatting Poor formatting, hard to read
Problem Solving Creative solutions, handles edge cases Good analytical approach Basic problem-solving skills Struggles with complex problems

Summary and Next Steps

In this lab, we have covered:

  • Creation of relational database tables with proper constraints
  • Data insertion and population techniques
  • Various types of SQL subqueries and their applications
  • Performance considerations and optimization techniques
  • Common patterns and troubleshooting approaches
For Next Lab Session:

Students should review:

  • Advanced JOIN operations
  • Window functions and analytical queries
  • Stored procedures and functions
  • Database indexing strategies

References and Further Reading:

  • Database System Concepts by Silberschatz, Korth, and Sudarshan
  • SQL: The Complete Reference by James Groff
  • Learning SQL by Alan Beaulieu
  • Online SQL practice platforms: SQLBolt, W3Schools, SQLiteOnline