Chapter :- 3 (English)
| Site: | AssessmentKaro |
| Course: | Database Management System |
| Book: | Chapter :- 3 (English) |
| Printed by: | Guest user |
| Date: | Saturday, 18 April 2026, 6:54 PM |
Description
Topic Wise Chapter (English)
1. Relational Data Model
Relational Data Model
The Relational Data Model (RDM) is a way of structuring data using tables (called relations).
Each table consists of rows (tuples) and columns (attributes).
The model is based on set theory and first-order logic, and it uses keys to uniquely identify rows and enforce relationships between tables.
ü Advantages
1. Simplicity
- Data is stored in intuitive table structures that are easy to understand and work with.
- Reduces complexity compared to hierarchical or network models.
2. Data Independence
- Physical data storage details are separate from logical data structures.
- Applications do not need to change if underlying storage changes.
3. Flexibility in Querying (via SQL)
- SQL allows complex queries (joins, aggregations, filtering) with simple syntax.
- Query optimization improves efficiency.
4. Normalization Reduces Redundancy
- Organizing data into multiple related tables minimizes duplication.
- Helps maintain consistency.
5. Strong Data Integrity
- Supports constraints (primary keys, foreign keys, unique, check).
- Ensures accuracy and consistency.
6. Security
- Fine-grained access control (table-level or column-level).
- Mature security models in commercial RDBMSs.
ü Disadvantages
1. Costly for Very Large or Distributed Data
- Scaling horizontally (across servers) is difficult.
- Not ideal for big data or high-velocity applications.
2. Performance Overhead with Complex Joins
- Queries involving many joins can be slow.
- Not suitable for deep or highly interconnected data.
3. Requires Structured, Predefined Schema
- Schema must be defined before inserting data.
- Not flexible for unstructured or semi-structured data (unlike NoSQL).
4. Normalization Can Make Queries Complex
- Splitting data into many tables reduces redundancy but increases query complexity.
5. Less Suitable for Hierarchical or Graph-like Data
- Modeling networks or hierarchical relationships requires multiple join tables.
2. Relational Database Constraints
Relational Database Constraints
Constraints in a relational database are rules applied to table columns to ensure data accuracy, integrity, and reliability. They prevent invalid data from being inserted into the database.
Types of Relational Database Constraints
1. PRIMARY KEY Constraint
· Uniquely identifies each row in a table.
· Cannot be NULL.
Advantages
· Ensures each record is unique.
· Speeds up searching and indexing.
Disadvantages
· Requires careful design to avoid complex composite keys.
· Changing a primary key can be difficult.
2. FOREIGN KEY Constraint
· Ensures relationship between two tables.
· Enforces referential integrity.
Advantages
· Prevents orphan records.
· Maintains consistency across tables.
Disadvantages
· Can make deletion or updating of records difficult due to dependency.
· Slows down operations if many relationships exist.
3. UNIQUE Constraint
· Ensures all values in a column are unique.
· Allows only one NULL (varies by DBMS).
Advantages
· Prevents duplicate entries for important fields (e.g., email).
· Helps maintain data accuracy without needing a primary key.
Disadvantages
· May slow down insert operations due to uniqueness checks.
· Not suitable for fields that may need duplicates.
4. NOT NULL Constraint
· Ensures a column cannot store NULL values.
Advantages
· Guarantees required fields are always filled.
· Prevents accidental missing data.
Disadvantages
· Reduces flexibility if NULL values might be meaningful.
· Poorly applied NOT NULL constraints can result in dummy or invalid values.
5. CHECK Constraint
· Validates data based on a logical expression.
Example: age > 18
Advantages
· Enforces complex business rules.
· Prevents invalid data that meets the datatype but fails rules.
Disadvantages
· Poorly designed check constraints can cause errors.
· Complex checks may add performance overhead.
6. DEFAULT Constraint
· Sets a default value for a column when no value is provided.
Advantages
· Eliminates need to manually insert common values.
· Ensures fields are never left empty.
Disadvantages
· Might unintentionally fill values the user never intended.
· Overuse may hide missing data patterns.
ü Overall Advantages of Constraints
· Improve data integrity.
· Prevent invalid or inconsistent data.
· Enforce business rules automatically.
· Reduce the need for application-level validations.
· Maintain data consistency across related tables.
ü Overall Disadvantages of Constraints
· Can make database design more complex.
· May reduce performance for large-scale insert/update operations.
· Hard-to-change constraints may slow down development.
· Overuse can make database structures rigid and less adaptable.
3. ER/EER to Relational Model Mapping
ER/EER to Relational Model Mapping
ER (Entity–Relationship) and EER (Enhanced Entity–Relationship) diagrams are conceptual models.
To implement them in a database, we convert (map) them into the Relational Model (tables).
1. ER to Relational Model Mapping Steps
Step 1: Mapping Strong Entities
· Each strong entity becomes a table.
· Key attribute → Primary Key (PK).
Example:
Entity: Student(ID, Name, Age)
Table: Student(ID PK, Name, Age)
Step 2: Mapping Weak Entities
· Weak entity → table
· Includes its partial key + primary key of owner entity.
· Combined key becomes the primary key.
Example:
Dependent(Name, Age) weak entity owned by Employee(EID).
Table: Dependent(Name, EID PK, Age)
Step 3: Mapping Relationships
A. 1:1 (One-to-One)
Add PK of one table as foreign key in the other.
B. 1:N (One-to-Many)
Add PK of the “one” side as FK in the “many” side.
C. M:N (Many-to-Many)
Create a new relation (table):
· Primary keys of both entities become composite PK.
· Include any relationship attributes.
Example:
Table: Enroll(StudentID PK/FK, CourseID PK/FK, Grade)
Step 4: Mapping Multivalued Attributes
· Each multivalued attribute becomes a separate table.
· Attribute + primary key of the main entity = composite key.
Example:
Skill(skillName, EmployeeID PK)
Step 5: Mapping Derived Attributes
· Usually not stored, calculated when needed.
2. EER to Relational Model Mapping
EER extends ER with:
✔ Specialization
✔ Generalization
✔ Categories (Union types)
✔ Inheritance constraints
Mapping Specialization/Generalization
There are 4 main strategies:
A. Top-Down Mapping (One table for superclass + one per subclass)
Superclass table → common attributes
Subclass tables → specific attributes + PK of superclass as FK/PK.
B. Single-Table (One table for entire hierarchy)
All attributes (superclass + subclass) kept in one large table.
Unused attributes are NULL.
C. Separate Tables for Each Subclass
Superclass table is removed; only subclass tables created.
Subclass tables include all inherited attributes.
D. Mixed Strategy
Depends on constraints like disjoint/overlapping and total/partial.
Advantages of ER/EER to Relational Mapping
✔ 1. Simplifies Database Implementation
ER/EER diagrams give a clear structure; mapping converts them into tables easily.
✔ 2. Maintains Data Integrity
Relationships (1:1, 1:N, M:N) become foreign keys, ensuring consistency.
✔ 3. Supports Normalization
Reduces redundancy and anomalies.
✔ 4. Provides a Blueprint for Developers
Easy to understand and implement in SQL.
✔ 5. EER Mapping supports Inheritance
More efficient representation of real-world hierarchical data.
Disadvantages of ER/EER to Relational Mapping
1. Loss of Some Semantics
Relational models cannot directly represent:
· inheritance
· overlapping categories
· constraints (e.g., conditional constraints)
2. Mapping Can Become Complex
Especially for:
· M:N relationships
· weak entities
· multivalued attributes
· large inheritance hierarchies (EER)
3. Increased Number of Tables
Multivalued attributes, M:N relationships create more tables → more joins.
4. Performance Issues
More tables and joins may slow queries.
5. Some EER Features Are Hard to Implement
E.g.,:
· partial vs total specialization
· disjoint vs overlapping constraints
These require complex rules or triggers.
4. Relational Algebra
Relational Algebra
Relational Algebra is a procedural query language used in relational database theory. It operates on relations (tables) and uses a set of operations to produce new relations as output. It forms the mathematical foundation for SQL and query processing in DBMS.
Basic Relational Algebra Operations
|
Operation |
Symbol |
Description |
|
Selection |
σ |
Selects rows based on a condition |
|
Projection |
π |
Selects specific columns |
|
Union |
∪ |
Combines tuples from two relations |
|
Set Difference |
− |
Returns tuples in the first relation but not in the second |
|
Cartesian Product |
× |
Combines all tuples from two relations |
|
Rename |
ρ |
Renames relation or attributes |
|
Join operations |
⋈ |
Combines related tuples from two relations |
Advantages of DBMS
✔ 1. Data Redundancy Control
Minimizes duplication of data through normalization and centralized storage.
✔ 2. Data Consistency
Because redundancy is reduced, the chances of inconsistent data are also minimized.
✔ 3. Improved Data Security
DBMS provides:
· Authentication
· Authorization
· Access control
to protect sensitive data.
✔ 4. Data Independence
Changes in data structure do not affect application programs.
✔ 5. Efficient Data Access
Indexing, query optimization, and concurrency control improve performance.
✔ 6. Backup and Recovery
Automatic backup and recovery ensure data is protected from failures.
✔ 7. Multi-User Access
Supports concurrent access with proper isolation and locking mechanisms.
✔ 8. Centralized Management
Allows centralized control over data and policies.
Disadvantages of DBMS
1. High Cost
DBMS software, hardware requirements, and trained personnel can be expensive.
2. Complexity
Requires specialized knowledge to design, manage, and maintain.
3. High Resource Usage
DBMS consumes significant memory, storage, and CPU resources.
4. Performance Overhead
For small applications, DBMS may slow things down due to:
· Communication overhead
· Security checks
· Concurrency controls
5. Risk of System Failure
If the central DBMS fails, the entire system may stop functioning until recovery.
5. Relational Calculus
Relational Calculus
Relational Calculus is a non-procedural (declarative) query language used in relational databases.
It tells what data to retrieve, not how to retrieve it.
It is based on mathematical predicate logic and forms a theoretical foundation for SQL.
There are two types:
1. Tuple Relational Calculus (TRC)
Uses tuple variables that represent rows.
Example:
{ t | t ∈ Student AND t.age > 18 }
This means:
"Find all tuples t from Student where age > 18"
2. Domain Relational Calculus (DRC)
Uses domain variables that represent column values.
Example:
{ <name, age> | ∃ id (Student(id, name, age) AND age > 18) }
Advantages of Relational Calculus
✔ 1. Non-Procedural Language
Users specify only what they want, making it easier to write and understand.
✔ 2. Basis for SQL
SQL's declarative nature comes from relational calculus.
✔ 3. Higher Level of Abstraction
Focuses on logic, not operations—ideal for designing queries conceptually.
✔ 4. Concise and Clean
Calculus expressions are often shorter than relational algebra expressions.
✔ 5. DBMS Query Optimization
Since users don’t specify the execution steps, the DBMS can optimize the query on its own.
Disadvantages of Relational Calculus
1. Not User-Friendly for Complex Queries
Logical expressions can become difficult when queries involve many conditions.
2. May Produce Infinite Results
Unsafe expressions can generate infinite sets, so safety rules are needed.
3. Not Used Directly in DBMS
It is mainly theoretical; users work with SQL, not calculus.
4. Harder to Implement Internally
DBMS must convert calculus expressions into relational algebra, adding complexity.
5. Less Intuitive for Beginners
Logical notation can confuse students unfamiliar with predicate logic.
6. Relational Algebra
Relational Algebra
Relational Algebra is a procedural query language used in relational databases.
It uses a set of mathematical operations to manipulate relations (tables) and produce new relations as output.
Here we cover four basic operations:
1. Union ( ∪ )
Definition:
Combines tuples (rows) from two relations and removes duplicates.
Requirements:
· Both relations must be union-compatible (same number of attributes, same data types).
Example:
A ∪ B
Gives all tuples present in either A or B.
2. Intersection ( ∩ )
Definition:
Returns only those tuples that are present in both relations.
Example:
A ∩ B
Gives common tuples of A and B.
3. Set Difference ( − )
Definition:
Returns tuples that are in the first relation but not in the second.
Example:
A − B
Gives tuples that exist in A but not in B.
4. Cartesian Product ( × )
Definition:
Combines every tuple of the first relation with every tuple of the second relation.
Example:
A × B
Produces a large table with all possible combinations.
· .
Used for:
Basis of JOIN operations in databases
7. Additional Relational Algebraic Operations
Additional Relational Algebraic Operations
1. Projection (π)
Definition:
Projection selects specific columns (attributes) from a relation, removing duplicates.
Notation:
π<sub>column_list</sub>(Relation)
Example:
π<sub>Name, Salary</sub>(EMPLOYEE)
Advantages:
- ✔ Reduces the number of columns → smaller and faster intermediate results
- ✔ Helps hide irrelevant or sensitive attributes
- ✔ Simplifies query results
Disadvantages:
- ❌ Removes duplicates automatically → may add extra processing cost
- ❌ Potential loss of information (non-selected attributes gone)
2. Selection (σ)
Definition:
Selection retrieves specific rows (tuples) that satisfy a given condition.
Notation:
σ<sub>condition</sub>(Relation)
Example:
σ<sub>Salary > 50000</sub>(EMPLOYEE)
Advantages:
- ✔ Filters rows → smaller data and faster downstream operations
- ✔ Helps enforce conditions (WHERE clause in SQL)
- ✔ Improves query efficiency when used early
Disadvantages:
- ❌ Can be slow if applied on large relations without indexes
- ❌ Incorrect or complex conditions may result in empty or huge outputs
3. Division (÷)
Definition:
Division returns rows from relation A that match all rows in relation B.
Used for "for all" queries.
Example:
STUDENT_COURSE ÷ COURSE_REQUIRED
→ students who took all required courses
Advantages:
- ✔ Useful for universal quantification ("for all" conditions)
- ✔ Simplifies complex logic
Disadvantages:
- ❌ Hard to understand and implement
- ❌ Computationally expensive (requires multiple intermediate operations)
- ❌ Rarely used compared to joins
4. Rename (ρ)
Definition:
Rename changes the name of a relation or its attributes.
Notation:
ρ<sub>NewName</sub>(Relation)
ρ<sub>NewName(attr1, attr2,...)</sub>(Relation)
Advantages:
- ✔ Essential for self-joins or eliminating ambiguity
- ✔ Improves readability
- ✔ Helps align attribute names for joins
Disadvantages:
- ❌ Adds extra steps without contributing to actual data
- ❌ Misuse may create confusion
5. Join (⨝)
Definition:
Join combines two relations based on a related attribute.
Types:
- Theta join (conditions)
- Equi-join (= only)
- Natural join (automatic match on same attribute names)
- Outer joins (left, right, full — in extended algebra)
Example:
EMPLOYEE ⨝<sub>EMPLOYEE.DID = DEPARTMENT.ID</sub> DEPARTMENT
Advantages:
- ✔ Brings together related data from multiple tables
- ✔ Powerful tool for complex queries
- ✔ Reduces redundancy when relations are normalized
Disadvantages:
- ❌ Expensive operation on large tables
- ❌ Natural join can cause unexpected results if attribute names overlap incorrectly
- ❌ Requires careful attribute matching
Table
|
Operation |
Purpose |
Advantages |
Disadvantages |
|
Projection (π) |
Select columns |
Reduces data, hides irrelevant info |
Loses discarded data, duplicate elimination cost |
|
Selection (σ) |
Select rows |
Efficient filtering, reduces dataset |
Slow on large data without indexes |
|
Division (÷) |
“For all” queries |
Handles universal conditions |
Hard, slow, rarely used |
|
Rename (ρ) |
Rename relations/attributes |
Removes ambiguity, essential for self-joins |
Adds complexity |
|
Join (⨝) |
Combine relations |
Powerful, essential for relational model |
Computationally expensive |