PSU QA
**Ques. State whether the following statements are true or false:
- Intension - represents the number of tuples presented in a database relation (table) at any instance. β
- Extension - gives the name, structure and constraint of a database relation (table). β
- Corrected: β
-
Intension - represents the structure (schema) of a database relation β includes name, attributes, data types, and constraints.
-
Extension - refers to the set of tuples (rows) in a relation at a particular point in time.
-
Ques. What is Inverted File Organisation ββ
Inverted file organization is a file organization technique where separate indexes are built on secondary (non-primary) keys, and each index entry stores a list of pointers to all records having that key value.
Two files are maintained:
- Data file β actual records stored normally
- Inverted index file β key value mapped to a list of record addresses
Key Value β [Record Pointer 1, Record Pointer 2, ...]working
- Choose one or more secondary attributes
- For each distinct value of the attribute, create an index entry
- Each entry points to all records containing that value
- On query, system directly follows pointers instead of scanning data file
example
Occupation indexENG β 2, 7, 15DOC β 1, 9MUS β 4, 10, 21why efficient
- Eliminates full file scan for secondary key searches
- Supports many-to-many relationships
- Multiple secondary keys can be indexed simultaneously
advantages
- Very fast retrieval using secondary keys
- Supports complex queries on non-primary attributes
- Good for read-heavy databases
disadvantages
- Extra storage for pointer lists
- Update overhead on insert/delete
- Pointer lists can become large
why called βinvertedβ
- Traditional file: record β field values
- Inverted file: field value β list of records
**Ques. Which of the following is NOT an aggregate function in SQL? **
Round()-> As it operates on a single value -> β Not a Aggregate FunctionAVG(), SUM(), MAX()-> operate on a set of values and return a single result. -> β Aggregate Function
Ques. Which of the following statements is/are true about secondary index? β
- Secondary index may be created on a key field. β
- Secondary index may be created on non-key field. β
- Because:
- Secondary Index can be built on key or non-key fields.
- Used when the attribute is not sorted or is not a primary key.
- Helps in speeding up search operations on non-clustering attributes.
Indexing in DBMS β
- What is Indexing?
-
Indexing is a data structure technique to efficiently access records in a database file.
-
It reduces the time complexity of searching from O(n) to O(log n) (e.g., using B+ Trees).
-
- How Indexing Works
-
An index is created on one or more columns.
- It maintains pointers to actual data locations (record/block addresses).
- Think of it like a book index: You search via keyword and find the page quickly.
-
- Types of Indexing β
- A. Primary Index
-
Based on primary key (unique, sorted).
- One entry per block (in sparse index).
- Sparse Index β Common for primary index.
-
- B. Secondary Index
-
Based on key or non-key attributes (excluding the primary access path).
-
Can be dense Index (entry for every record). *ultiple secondary indexes possible.
-
- C. Clustering Index
-
Based on non-key field, but records are physically ordered by this field.
- One per table.
- Useful for range-based queries.
-
- D. Dense Index β
-
Contains entry for every record.
- Faster searches, more space required.
-
- E. Sparse Index β
-
Contains entry for only some records.
- Used when records are sorted.
-
- F. Multilevel Index
- Index is large β Build index on index.
- Reduces I/O using B-tree or B+ tree structure.
- G. Composite Index
-
Index on multiple columns.
- E.g.,
INDEX(emp_name, emp_id).
-
- A. Primary Index
- Indexing Data Structures
| Structure | Used for | Notes |
|---|---|---|
| B-Tree | Balanced search | Moderate search time |
| B+ Tree | Modern DBs | Leaf nodes store data, faster range queries |
| Hash Index | Exact match search | Fast lookup, but not for range queries |
- Advantages
- Fast data retrieval
- Efficient range queries
- Reduces CPU and I/O cost
- Helps in ORDER BY, GROUP BY, JOIN operations
- Disadvantages
- Slower INSERT/UPDATE/DELETE (index must be updated)
- Requires extra space
- Can degrade performance if too many indexes
- Commonly Asked Interview/Exam Questions
Ques. In ER Model, an identifying relationship (between weak entity and owner) is represented by:
- Double-lined diamond β
- Identifying relationship
- Double-lined oval β
- Represents multivalued attribute)
- Dashed oval β
- Represents derived attribute)
Entity Relationship Model (Weak Entity Representation)
- Weak Entity: β`
-
A weak entity is an entity that cannot exist without a related strong entity.
- It doesnβt have a primary key of its own.
- Relies on the primary key of the owner entity + its partial key.
- Example :
Dependent(like a child or spouse) cannot exist without its relatedEmployee, and has no unique key of its own.
-
- Identifying Relationship :
-
A special type of relationship that connects a weak entity to its owner (strong entity).
- Called identifying because it helps in identifying instances of the weak entity.
- Drawn using a double-lined diamond in ER diagrams.
- The participation of the weak entity is always total (shown by double line connecting to the relationship).
- Example :
DependentOfis the identifying relationship that links eachDependentto its owningEmployee, shown with a double-lined diamond in ER diagram.
-
- Related Special ER Symbols:
| Symbol | Meaning |
|---|---|
| Double-lined rectangle | Weak Entity β |
| Double-lined diamond | Identifying Relationship β |
| Double-lined oval | Multivalued Attribute |
| Dashed oval | Derived Attribute |
Ques. In a database, a primary index is an _____ file whose records are of ____ length.
- ordered, fixed β
- ordered, variable β
- unordered, fixed β
- unordered, variable β
-
Concept: Primary Index is built on the primary key, and the data file is sorted on that key.
-
Hence, the index file is also in ordered form.
-
Each index entry contains a fixed-length pair: (key, pointer) β making it fixed-length.
-
Efficient searching using binary search== is possible ==due to ordering.
-
Ques. Which of the following is used to represent βtotalβ participation in ER diagram?
- Double Line β (Correct)
- Dashed Line β (Not used in ER diagrams)
- Single Line β (Represents partial participation)
- Double-lined rectangle β (Represents weak entity, not participation)
- Total Participation: β
- Means every record of an entity must be linked to at least one record in the related entity.
- Example: Every
Dependentmust be related to someEmployee. - In ER diagram, it is shown by a double line connecting the entity to the relationship.
- Partial Participation:
- Means some records of the entity may or may not be linked.
- Example: Some
Employeesmay not have anyDependent. - In ER diagram, it is shown by a single line between the entity and relationship.
Q117. Which of the following SQL functions does NOT ignore NULL by default?
COUNT()βMAX()βMIN()βSUM()βCOUNT(*)β Includes NULL values (counts all rows).COUNT(column_name)β Ignores NULLs.MAX(), MIN(), SUM()β Always ignore NULLs by default.- To handle NULLs explicitly, use:
COALESCE(column, default_value)IFNULL(column, default_value) -- MySQLNVL(column, default_value) -- Oracle- Note:
COUNT(*)is the only standard aggregate function that includes NULLs.
Important Concepts
Section titled βImportant ConceptsβB+ Tree in DBMS
Section titled βB+ Tree in DBMSβA B+ Tree is a balanced m-ary search tree used for indexing in DBMS, where all actual data records are stored only at the leaf level==, and ==internal nodes contain only keys.
Properties: of B++ Tree
- All leaves are at same level (balanced tree).
- Internal nodes store only keys, not actual data.
- Leaf nodes contain keys + data pointers.
- Leaf nodes are linked (for fast range queries).
- Tree is always balanced β height is kept minimal.
- Efficient for sequential and random access.
Order of B+ Tree:
- For order
m, each internal node can have:- Max
mchildren - Min
ceil(m/2)children (except root) - Max
m - 1keys - Min
ceil(m/2) - 1keys
- Max
Types of Nodes:
- Internal Node: Contains only keys and pointers to children.
- Leaf Node: Contains keys and pointers to data records.
- Leaf nodes are connected using linked list.
Operations:
- Search(key):
- Start from root, go down the tree using keys
- At leaf node, find exact key and return data pointer
- Insertion(key, data):
- Insert in appropriate leaf node
- If leaf node is full:
- Split the node and promote middle key to parent
- Recursively adjust parent if needed
- Deletion(key):
- Remove key from leaf
- If underflow:
- Try borrow from sibling
- Else, merge with sibling and adjust parent
- Repeat recursively up
Advantages of B+ Tree:
- Fast search, insert, delete:
O(log n) - Efficient range queries (due to linked leaf nodes)
- Reduces disk I/O (used in disk-based systems)
- Keeps data sorted
Difference: B+ Tree vs B-Tree
| Feature | B Tree | B+ Tree |
|---|---|---|
| Data Storage | Internal + leaf nodes | Only leaf nodes |
| Internal Nodes | Keys + data | Only keys |
| Sequential Access | Slower | Faster (linked leaves) |
| Range Queries | Less efficient | Very efficient |
Use in DBMS:
- Used for primary & secondary indexing
- Used in file systems, databases (e.g., MySQL, Oracle)
B Tree in DBMS
Section titled βB Tree in DBMSβDefinition:
A B Tree is a balanced m-ary search tree used in DBMS for indexing and storing sorted data, where data is stored in both internal and leaf nodes.
Properties:
- All leaves are at the same level (balanced).
- A node with
nkeys hasn+1children. - Keys in a node are sorted.
- Every node (except root) has at least
ceil(m/2) - 1keys. - Maximum
m - 1keys andmchildren per node. - Supports dynamic insertions/deletions while maintaining balance.
Order of B Tree:
- For order
m, each node has:- Max
mchildren - Min
ceil(m/2)children - Max
m-1keys - Min
ceil(m/2)-1keys (except root)
- Max
Operations:
- Search(key):
- Start from root, check keys
- If not found, follow the appropriate child pointer
- Repeat until leaf
- Insertion(key, data):
- Insert in correct position in a node
- If node overflows (more than
m-1keys):- Split node
- Promote middle key to parent
- Repeat recursively if needed
- Deletion(key):
- If in leaf node, delete directly
- If in internal node:
- Replace with in-order predecessor/successor
- If node underflows:
- Try borrowing key from sibling
- Else merge with sibling and adjust parent
Advantages of B Tree:
- Maintains sorted data
- Balanced: ensures
O(log n)time - Efficient for insert/delete/search
- Minimizes disk access, suitable for disk-based systems
Difference: B Tree vs B+ Tree
| Feature | B Tree | B+ Tree |
|---|---|---|
| Data Storage | Internal + leaf nodes | Only leaf nodes |
| Internal Nodes | Keys + data | Only keys |
| Range Queries | Less efficient | More efficient (linked leaves) |
| Sequential Access | Slower | Faster |
Use in DBMS:
- Used for indexing where fast search + updates are needed
- Not as common as B+ Tree for range queries