RemNote Community
Community

Database management system Study Guide

Study Guide

📖 Core Concepts Database – Organized collection of data managed by a Database Management System (DBMS) that stores, retrieves, and analyzes data. Database System – DBMS + the actual database + applications that use it. Three‑level Architecture External view – How individual users see the data. Conceptual view – Global logical schema for developers/administrators. Internal view – Physical storage layout, indexes, files. Data Independence – Changes in one level (e.g., storage) do not require changes in higher‑level views. Relational Model – Data in tables (rows = records, columns = attributes). Primary keys uniquely identify rows; foreign keys create relationships. ACID Properties – Guarantees for a transaction: Atomicity – All‑or‑nothing. Consistency – Database stays valid. Isolation – Transactions don’t interfere. Durability – Committed changes survive failures. CAP Theorem – In a distributed system you can have at most two of Consistency, Availability, Partition tolerance. Normalization – Decompose tables so each fact is stored once, eliminating insert/update/delete anomalies. Materialized View – Physically stored query result for fast reads; must be refreshed. Replication – Copies of data for higher availability and load balancing. 📌 Must Remember Primary key = unique row identifier; foreign key links tables. SQL = single language that covers DDL, DML, DQL, DCL. Data definition = CREATE / ALTER / DROP objects. Data manipulation = INSERT, UPDATE, DELETE. Data query = SELECT with WHERE, JOIN, GROUP BY, HAVING. Data control = GRANT / REVOKE permissions. Indexing speeds look‑ups; typical types: B‑tree (row‑oriented) and column‑oriented indexes. Three major DBMS architectures: Shared‑memory – common RAM, multiple CPUs. Shared‑disk – each CPU has own memory, shares disks. Shared‑nothing – each node has its own memory and disk (eliminates contention). Backup vs. Restore – Backup captures full state; Restore re‑creates a prior state (point‑in‑time recovery). Eventual consistency – Guarantees that replicas will converge eventually, used by many NoSQL systems. 🔄 Key Processes Design → Logical → Physical Conceptual model (ER diagram) → Logical schema (tables, keys) → Physical design (storage format, indexes). Transaction Execution Begin → Perform reads/writes → Commit (make durable) or Rollback (undo). Query Optimization Parse → Generate logical plan → Apply rewrite rules → Choose physical plan (indexes, join order) → Execute. Backup Procedure Initiate full or incremental backup → Store backup files → Verify integrity. Restore Procedure Select backup set → Apply logs (if point‑in‑time) → Bring database online. 🔍 Key Comparisons Relational vs. NoSQL (Not‑Only‑SQL) Schema: Fixed vs. flexible. Consistency: ACID vs. eventual. Query: SQL joins vs. key/value or document queries. Shared‑memory vs. Shared‑disk vs. Shared‑nothing Resource sharing: RAM only vs. disk only vs. both private. Scalability: Shared‑nothing scales best, shared‑memory limited by bus contention. Row‑oriented vs. Column‑oriented storage Workload: OLTP (many writes) vs. OLAP (large scans, aggregates). Compression: Column stores compress better due to homogeneous data. ⚠️ Common Misunderstandings “SQL = only SELECT” – SQL also defines schemas (DDL) and controls access (DCL). “NoSQL = no structure” – Document and column‑family stores still enforce a schema at the application level. “Normalization always best” – Over‑normalization can hurt performance; denormalization is common in OLAP and NoSQL. “Replication = backup” – Replication provides live copies for availability; backups are immutable snapshots for recovery. 🧠 Mental Models / Intuition Table as a spreadsheet – rows = records, columns = attributes; primary key = row label. Transaction as a bank transfer – money leaves one account and appears in another or nothing changes (atomic). CAP Triangle – imagine a triangle; you can only pick two corners at a time for a distributed system. Index as a book’s table of contents – lets you jump directly to the data without scanning every page. 🚩 Exceptions & Edge Cases Materialized view staleness – must be refreshed after base‑table updates, otherwise queries return outdated data. Eventual consistency windows – reads may return stale data shortly after a write. Shared‑nothing failures – loss of a node removes both compute and its data slice; replication needed to avoid data loss. Normalization anomalies – only apply up to 3NF/BCNF for typical OLTP; higher normal forms rarely needed. 📍 When to Use Which Choose Relational DB when you need strong ACID guarantees, complex joins, and a fixed schema (e.g., financial systems). Choose Document Store for semi‑structured JSON‑like data with frequent schema changes (e.g., content management). Choose Key‑Value Store for simple look‑ups by a unique key with massive scale (caching, session storage). Choose Column‑Family for wide tables with many columns and analytical workloads (time‑series, log analytics). Use Shared‑nothing architecture for horizontally scalable workloads; use Shared‑disk for smaller clusters needing easy failover. 👀 Patterns to Recognize JOIN on primary‑foreign key → typical relational relationship. Frequent “SELECT … WHERE indexedcolumn = …” → candidate for an index. High write‑to‑read ratio with simple look‑ups → key‑value or document DB may be more efficient. Repeated aggregation over many rows → column‑oriented storage or materialized view improves performance. CAP trade‑off clues – if the question mentions network partitions, decide which two guarantees the system favors. 🗂️ Exam Traps “All NoSQL systems are eventually consistent.” – Some (e.g., NewSQL) provide strong consistency. “Normalization eliminates all redundancy.” – Denormalization is deliberately used for performance; not all redundancy is bad. “Replication automatically provides backup.” – Replicas are live copies; they do not protect against logical errors or accidental deletes. “Shared‑nothing means no communication between nodes.” – Nodes still coordinate (e.g., for distributed transactions) but they do not share memory/disk. “Materialized view can be updated directly.” – Usually read‑only; updates must be propagated to base tables. --- Focus on these high‑yield points during your final review; they cover the core terminology, guarantees, architectures, and decision‑making that most exam questions test.
or

Or, immediately create your own study flashcards:

Upload a PDF.
Master Study Materials.
Start learning in seconds
Drop your PDFs here or
or