Database management system - Foundations of Database Systems
Understand core database concepts, the evolution from relational to NoSQL models, and their key use cases and standards.
Summary
Read Summary
Flashcards
Save Flashcards
Quiz
Take Quiz
Quick Practice
What is the definition of a database?
1 of 16
Summary
Overview of Databases
What Is a Database?
A database is an organized collection of data stored and accessed through a database management system (DBMS). Think of a database as a digital filing system that's far more sophisticated than a simple spreadsheet. The DBMS is the software that sits between you and the data, handling all the complex tasks of storing, organizing, and retrieving information.
The DBMS enables three critical functions:
Capture: Recording new data into the system
Storage: Organizing data efficiently so it can be found quickly
Retrieval and Analysis: Getting the data back out and making sense of it
When we talk about a database system, we mean the complete package: the database itself, the DBMS software, and all the applications that use it together.
What Does a DBMS Actually Do?
A modern DBMS handles several essential responsibilities that make data management possible at scale:
Data Security and Integrity: The DBMS enforces security rules so that only authorized people can access sensitive data. It also maintains data integrity, meaning it ensures the data stays accurate and consistent.
Performance Monitoring: The DBMS watches system performance and identifies bottlenecks before they become problems.
Concurrency Control: In real-world systems, many users or applications access the database simultaneously. The DBMS manages this controlled access so that everyone's operations don't interfere with each other or corrupt the data.
Recovery from Failures: If the system crashes unexpectedly, the DBMS has mechanisms to recover data and return to a consistent state without losing information.
These responsibilities are collectively called database design concerns, and they guide how databases are built and maintained.
History and Evolution of Databases
Understanding how databases evolved helps you appreciate why modern databases are designed the way they are. Each major evolution solved problems that earlier systems couldn't handle.
The Relational Model: A Revolutionary Approach (1970s)
The most important moment in database history came in 1970 when computer scientist Edgar Frank Codd proposed the relational model. This simple but powerful idea transformed data management: organize all data as tables with rows and columns, just like a spreadsheet.
The image above shows a real SQL query in action—notice how the result appears as a table with columns like title, releaseyear, length, and replacementcost. This is the relational model in practice.
Primary Keys and Relationships
In a relational database, every row in a table needs a unique identifier called a primary key. For example, if you have a table of customers, you might use a customer ID number as the primary key—no two customers can share the same ID.
Primary keys do more than just identify rows; they also enable relationships between tables. Here's how it works:
Notice the "key" label pointing to login, and the related table below showing the connection through login and phone. This demonstrates how a primary key from one table (the login column) can connect to another table to show that the phone number belongs to that specific login. This linking mechanism is fundamental to relational databases and lets you break data into separate tables without losing the ability to connect related information.
Why Separate Tables? Normalization
You might wonder: why not just put all the data in one giant table? The answer is normalization, a key principle of relational design. Normalization means organizing data so that each fact is stored in only one place.
For example, suppose you have a table of movie rentals. Without normalization, you might repeat the director's name dozens of times for each movie they've directed. With normalization, you put director information in a separate table and reference it using primary keys. This has several advantages:
Easier updates: If a director's information changes, you update it once instead of in hundreds of rows
Smaller storage: You're not duplicating data
Consistency: You're guaranteed that all references to the same director use the same information
Query Languages and Views
The relational model introduced declarative queries—you describe what data you want, not how to get it. The DBMS figures out the most efficient way to retrieve it. This is revolutionary because it separates what you want to know from how the computer finds the answer.
The relational model also introduced views, which are virtual tables that present data in different ways. A view might combine information from multiple tables or show only certain columns. Unlike a regular table, a view doesn't store its own data—it just shows you a particular arrangement of data from underlying tables.
<extrainfo>
Before the Relational Model: The Hierarchical Database (1960s)
Before Codd's innovation, databases used a hierarchical model, organizing data in a tree-like structure where each record has a single parent record.
This image shows a hierarchical structure: notice how each record points to a parent above it, creating a tree. While this worked for some situations, it was inflexible. If you wanted to query data in ways that didn't fit the hierarchy, you were stuck. You had to navigate down the tree manually, rather than asking for data directly. The relational model's flexibility was a massive improvement.
</extrainfo>
<extrainfo>
Later Developments (1980s–Present)
Object-Relational Mismatch and Solutions
When object-oriented programming became popular, developers faced a problem: their code used objects (with methods and properties), but databases stored tables (with rows and columns). This object-relational impedance mismatch meant tedious conversion work to move data between the two formats.
Two solutions emerged:
Object Databases: Purpose-built to store objects directly, avoiding the conversion step.
Object-Relational Mapping (ORM) Libraries: Software libraries that automatically convert between objects in code and rows in relational tables, allowing developers to keep using relational databases while writing object-oriented code.
The Rise of NoSQL and NewSQL (2000s–Present)
As the internet and big data arrived, traditional relational databases struggled with scale. Not Only Structured Query Language (NoSQL) databases emerged as alternatives offering:
Fast key-value stores: Look up data by a simple key, very quickly
Flexibility: No fixed schema required—different records can have different structures
Availability at scale: Trade off strict consistency for guaranteed availability across distributed systems
However, NoSQL systems forced developers to give up guarantees that relational databases provided. Enter NewSQL databases, which aim to preserve relational guarantees (atomicity, consistency, isolation, durability—ACID properties) while delivering the scalability of NoSQL systems.
An important theoretical limitation exists here: the CAP theorem (Consistency, Availability, Partition tolerance) states that distributed systems can guarantee at most two of these three properties simultaneously. Understanding this tradeoff is crucial for modern database architecture.
</extrainfo>
Real-World Applications
Operational Databases
Most organizations use databases to run their day-to-day operations. An operational database stores detailed transaction data: customer contact information, employee records, product details, inventory levels, and financial transactions. These databases need to be reliable, available, and secure since organizations depend on them constantly.
<extrainfo>
Specialized Database Uses
Data Warehouses serve a different purpose: they aggregate, transform, and combine data from multiple operational databases into a central repository. Data warehouses are optimized for analysis and reporting, allowing managers to make decisions based on historical patterns rather than just today's transactions.
Real-Time Databases are built for speed—they process transactions fast enough that systems can take immediate action based on results. For example, telecommunications switching systems use real-time databases to route calls instantly based on current network conditions.
</extrainfo>
Standards: SQL and XML
Structured Query Language (SQL) and Extensible Markup Language (XML) are the foundational standards for relational databases. SQL is the language you use to query and manipulate data in relational systems—it's the tool that makes those declarative queries possible. XML is a standard format for representing and exchanging structured data.
These standards ensure that databases from different vendors can work together and that data can be portable across systems.
Flashcards
What is the definition of a database?
An organized collection of data stored and accessed through a database management system.
What is the primary function of a database management system (DBMS)?
It is software that enables users and applications to capture, store, retrieve, and analyze data.
How is data structured in the hierarchical model used in the 1960s?
In a tree-like structure where each record has a single parent.
Who proposed the relational model in 1970?
Edgar Frank Codd
How does the relational model describe data structures?
As tables consisting of rows and columns.
What is the purpose of a primary key in a relational table?
To uniquely identify rows and establish cross-table relationships.
What is the goal of normalization in database design?
To split data into separate tables so each fact is stored only once, simplifying updates.
What are views in a relational database?
Virtual tables that present data in alternative ways but cannot be directly updated.
What is the function of query optimization?
It allows the DBMS to find efficient access paths for declarative queries.
What specific problem were object databases designed to solve?
The object-relational impedance mismatch between programmed objects and relational tables.
What is the purpose of object-relational mapping (ORM) libraries?
To automatically map objects in code to relational tables.
What consistency model do NoSQL systems often use to achieve availability?
Eventual consistency.
What is the primary objective of NewSQL databases?
To retain relational semantics and ACID guarantees while delivering NoSQL-like scalability.
What does the CAP theorem state regarding distributed systems?
A system can provide at most two of the three guarantees (Consistency, Availability, Partition tolerance) simultaneously.
What is the function of a data warehouse?
To aggregate, transform, and load data from multiple sources for managerial analysis.
Which two languages are considered core standards for data manipulation and representation in relational systems?
Structured Query Language (SQL)
Extensible Markup Language (XML)
Quiz
Database management system - Foundations of Database Systems Quiz Question 1: What term describes the combination of a database, its DBMS, and associated applications?
- Database system (correct)
- Data warehouse
- Network topology
- Operating system
Database management system - Foundations of Database Systems Quiz Question 2: How does the hierarchical database model organize data?
- In a tree‑like structure where each record has a single parent (correct)
- As a flat file with no relationships
- Using a network of interlinked nodes without hierarchy
- Through tables with rows and columns
Database management system - Foundations of Database Systems Quiz Question 3: Who proposed the relational model and in what year?
- Edgar Frank Codd in 1970 (correct)
- Michael Stonebraker in 1985
- James Gosling in 1995
- Donald Knuth in 1968
Database management system - Foundations of Database Systems Quiz Question 4: What is the purpose of query optimization?
- To find efficient access paths for declarative queries (correct)
- To compress query text for storage
- To translate queries into natural language
- To generate graphical charts from query results
Database management system - Foundations of Database Systems Quiz Question 5: Which consistency model is commonly used by NoSQL systems to achieve availability and partition tolerance?
- Eventual consistency (correct)
- Strong consistency with immediate replication
- Transactional consistency across all nodes
- Read‑only consistency
Database management system - Foundations of Database Systems Quiz Question 6: Which two standards are core for data manipulation and representation in relational database systems?
- SQL and XML (correct)
- HTML and CSS
- JSON and SOAP
- SMTP and POP3
Database management system - Foundations of Database Systems Quiz Question 7: According to its definition, which two characteristics best describe a database?
- An organized collection of data accessed through a DBMS (correct)
- A collection of unrelated files stored on a computer
- A software program that compiles source code
- A network protocol for transmitting multimedia
Database management system - Foundations of Database Systems Quiz Question 8: What is the primary purpose of operational databases within an organization?
- Supporting day‑to‑day transaction processing (correct)
- Aggregating historical sales data for reporting
- Storing backup copies of archived data
- Hosting the company’s public website
Database management system - Foundations of Database Systems Quiz Question 9: Which responsibility is performed by a DBMS to ensure reliable data handling?
- Maintain data integrity (correct)
- Encrypt network traffic
- Provide user interface design
- Manage hardware fan speed
Database management system - Foundations of Database Systems Quiz Question 10: Real‑time databases are commonly used in which of the following applications?
- Telecommunications switching (correct)
- Batch payroll processing
- Static reporting
- Archival storage
What term describes the combination of a database, its DBMS, and associated applications?
1 of 10
Key Concepts
Database Fundamentals
Database
Database Management System
Relational Model
Normalization (database)
SQL
Object-Relational Mapping
Advanced Database Concepts
NoSQL
CAP Theorem
Data Warehouse
Real-time Database
Definitions
Database
An organized collection of data stored electronically and accessed via a database management system.
Database Management System
Software that enables users and applications to create, read, update, and delete data in a database.
Relational Model
A data model that represents information as tables (relations) of rows and columns, based on set theory.
Normalization (database)
The process of organizing data to reduce redundancy and improve integrity by dividing it into related tables.
NoSQL
A class of database systems that provide flexible schemas and scalable performance, often using key‑value, document, or graph storage.
CAP Theorem
A principle stating that a distributed system can simultaneously provide at most two of three guarantees: consistency, availability, and partition tolerance.
Data Warehouse
A centralized repository that aggregates and stores data from multiple sources for analytical querying and reporting.
Real-time Database
A database designed to process transactions with minimal latency to support immediate decision‑making or control.
SQL
A standardized language for defining, manipulating, and querying relational databases.
Object-Relational Mapping
A programming technique that maps objects in code to relational database tables, facilitating data persistence.