RemNote Community

Introduction to Data Structures

Understand the core data structures, their performance trade‑offs, and how to select the right one for a given problem.

Summary

Introduction to Data Structures

What Is a Data Structure?

A data structure is a systematic way of organizing and storing data in a computer's memory. Think of it as a specialized container that not only holds your data, but also defines the rules for how you can add, remove, find, and modify items within it. Just like choosing the right filing system affects how quickly you can retrieve a document, choosing the right data structure affects how efficiently your program runs.

Why Data Structures Matter

The choice of data structure can dramatically affect your program's performance. Consider a simple task: finding a specific value in a collection of data.

- Using a simple list: you'd have to scan through items one by one, taking linear time $O(n)$, where $n$ is the number of items.
- Using a binary search tree: you could eliminate half the remaining items with each comparison, taking logarithmic time $O(\log n)$.
- Using a hash table: you could find the item almost instantly, in constant time $O(1)$ on average.

This isn't a minor difference: with one million items, a linear scan might require one million operations, while a hash table would need just one or two.

Data structures also influence memory usage and implementation difficulty. The fundamental trade-off is this: faster operations often require additional memory, and simpler structures may use less memory but have slower operations. Part of programming skill is recognizing these trade-offs and choosing wisely for your specific problem.

Fundamental Data Structures

Arrays

An array is a collection of elements stored in a contiguous block of memory, where each element can be accessed by its numerical position, or index.
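As a quick sketch of how array access behaves in practice (using a Python list, which acts as a dynamic array; the values here are made up for illustration):

```python
# Python's list is a dynamic array: contiguous storage, indexed access.
items = [10, 20, 30, 40, 50]

# Access by index is O(1): the position maps directly to a memory offset.
third = items[2]          # 30

# Appending at the end is O(1) amortized.
items.append(60)

# Inserting anywhere else is O(n): every later element must shift right.
items.insert(0, 5)        # shifts all six existing elements

print(third)   # 30
print(items)   # [5, 10, 20, 30, 40, 50, 60]
```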
Key characteristics:

- Access by index: $O(1)$; you can instantly reach any element if you know its position.
- Insert or delete at the end: $O(1)$, a quick operation.
- Insert or delete elsewhere: $O(n)$, since all subsequent elements must shift.
- Fixed size: once created, an array's size typically cannot change (though some languages provide dynamic arrays that handle this automatically).

When to use arrays: simple lists, static lookup tables, or situations where you frequently need random access by position.

Linked Lists

A linked list is a chain of nodes, where each node contains data and a pointer (reference) to the next node in the sequence. Instead of storing elements in contiguous memory, linked lists store them wherever memory is available and link them together.

Key characteristics:

- Insert or delete at the front: $O(1)$; just update the pointer at the beginning.
- Insert or delete elsewhere: $O(n)$; you must traverse the list to find the position.
- Access by position: $O(n)$; you must follow pointers from the start.
- Dynamic size: easily grows or shrinks as needed.
- Memory overhead: requires extra memory for pointers.

When to use linked lists: queues, stacks, or any situation where you're frequently adding or removing from the front and don't need random access by position.

Stacks

A stack is a specialized data structure that follows last-in, first-out (LIFO) order. Imagine a stack of plates: the last plate you put on is the first one you remove.

Key operations:

- Push (add): $O(1)$; add an element to the top.
- Pop (remove): $O(1)$; remove the top element.
- Peek: $O(1)$; view the top element without removing it.

When to use stacks: managing function calls in programming languages, implementing undo features in applications, or solving problems that require backtracking.

Queues

A queue is a specialized data structure that follows first-in, first-out (FIFO) order. Think of a line at a store: the first person in line is the first person served.
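The backtracking use case for stacks can be sketched with Python's list as a LIFO stack (append pushes, pop removes from the top). This hypothetical helper, written for illustration, checks whether brackets in a string are balanced:

```python
def is_balanced(text: str) -> bool:
    """Check bracket balance using a stack (LIFO order)."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in text:
        if ch in '([{':
            stack.append(ch)          # push: O(1)
        elif ch in pairs:
            # A closer must match the most recently pushed opener.
            if not stack or stack.pop() != pairs[ch]:  # pop: O(1)
                return False
    return not stack                  # balanced only if nothing is left

print(is_balanced("(a[b]{c})"))   # True
print(is_balanced("(]"))          # False
```

Each closing bracket must match the most recent unmatched opening bracket, which is exactly what LIFO order gives you.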
Key operations:

- Enqueue (add): $O(1)$; add an element to the back.
- Dequeue (remove): $O(1)$; remove the front element.

When to use queues: processing items in order (like print job scheduling), breadth-first search in graphs, or any situation where fairness demands "first come, first served."

Binary Search Trees

A binary search tree (BST) is a hierarchical structure where each node has up to two children (left and right) and follows an ordering rule: all values in the left subtree are less than the node's value, and all values in the right subtree are greater.

Key characteristics:

- Search, insert, delete: $O(\log n)$ when the tree is balanced (roughly equal branches).
- Search, insert, delete: $O(n)$ when the tree becomes unbalanced (degenerating into a linked list).
- Ordered iteration: you can traverse the tree to get values in sorted order.

When to use binary search trees: problems requiring both fast lookup and the ability to retrieve data in sorted order. Balanced variants (like AVL trees or red-black trees) guarantee logarithmic performance.

Hash Tables

A hash table uses a hash function to convert keys into array indices, which allows direct access to data without searching through it. Here's how it works: when you insert a key-value pair, the hash function processes the key and produces an index. The value is then stored at that index (or in a bucket at that index if multiple keys hash to the same location).

Key characteristics:

- Insert, search, delete: $O(1)$ average case; the hash function ideally maps each key to a unique index.
- Insert, search, delete: $O(n)$ worst case, which occurs when many keys hash to the same location (a collision).
- No ordering: hash tables don't maintain sorted order.
- Memory: typically uses more memory than strictly needed for the data.

When to use hash tables: dictionary-like structures where you need fast lookup by key, caching frequently accessed data, or counting occurrences of items. They're one of the most practically useful data structures.
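Two of these structures map directly onto Python built-ins: collections.deque gives $O(1)$ enqueue and dequeue, and dict (and its Counter subclass) is a hash table. A short sketch of the "first come, first served" and counting use cases, with made-up example data:

```python
from collections import Counter, deque

# Queue (FIFO): deque supports O(1) appends and pops at both ends.
jobs = deque()
jobs.append("report.pdf")    # enqueue at the back
jobs.append("photo.png")
first = jobs.popleft()       # dequeue: the first job submitted is served first

# Hash table: Python's dict looks up keys in O(1) on average.
# Counting occurrences is one of the classic hash-table use cases.
words = ["tree", "hash", "tree", "queue", "tree"]
counts = Counter(words)      # dict subclass mapping word -> count

print(first)                 # report.pdf
print(counts["tree"])        # 3
```

Note that a plain Python list makes a poor queue: popping from the front is $O(n)$ because every remaining element shifts, which is exactly why deque exists.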
Graphs

A graph is a collection of vertices (nodes) connected by edges (links). Graphs can be directed (edges have direction) or undirected (edges go both ways), and weighted (edges have values) or unweighted. Graphs are typically represented using either an adjacency list (each vertex maintains a list of its neighbors) or an adjacency matrix (a grid showing which vertices are connected).

Key operations:

- Add or remove vertices or edges: $O(1)$ to $O(V)$, depending on the representation.
- Traverse using depth-first search or breadth-first search: $O(V + E)$, where $V$ is the number of vertices and $E$ the number of edges.

When to use graphs: modeling networks (social networks, computer networks), route planning, dependency tracking, or any problem involving connections between entities.

Choosing the Right Data Structure

The key to selecting an appropriate data structure is understanding the problem you're solving and matching it to a structure's strengths:

- Need fast lookup by key? Use a hash table.
- Need ordered data or range queries? Use a binary search tree.
- Need LIFO behavior? Use a stack.
- Need FIFO behavior? Use a queue.
- Need to model connections? Use a graph.
- Need to frequently modify the beginning? Use a linked list.
- Need fast random access? Use an array.

Always learn the Big-O time and space costs for each operation of the data structures in your toolbox. This knowledge lets you make informed decisions that keep your algorithms running efficiently, even as your data grows.
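To tie several of these pieces together, here is a sketch of an undirected graph stored as an adjacency list and traversed with breadth-first search, which itself relies on a FIFO queue. The five-vertex graph below is a made-up example:

```python
from collections import deque

# Adjacency list: each vertex maps to the list of its neighbors.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def bfs(graph: dict, start: str) -> list:
    """Visit every vertex reachable from start, in O(V + E) time."""
    visited = [start]
    seen = {start}
    queue = deque([start])            # the FIFO queue drives the traversal
    while queue:
        vertex = queue.popleft()      # dequeue the oldest frontier vertex
        for neighbor in graph[vertex]:
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)  # enqueue newly discovered vertices
    return visited

print(bfs(graph, "A"))   # ['A', 'B', 'C', 'D', 'E']
```

Because the queue is FIFO, vertices are visited in order of their distance from the start; swapping the deque for a stack (LIFO) would turn this same loop into depth-first search.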
Flashcards
In terms of rules, what does a data structure act as for items?
A container that defines rules for adding, removing, finding, and changing items.
What is the time complexity of scanning a simple list for a search?
Linear time $O(n)$ (where $n$ is the number of items).
What is a common trade-off when attempting to achieve faster data operations?
Faster operations often require additional memory.
What physical structure does an array use to store data?
A fixed-size contiguous block of memory indexed by position.
What is the time complexity for accessing an array element by its index?
Constant time $O(1)$.
What is the time complexity for inserting or deleting an item at the end of an array?
Constant time $O(1)$.
What is the time complexity for inserting or deleting an item in an array (excluding the end)?
Linear time $O(n)$.
What are the two components of a node in a linked list?
Data and a pointer to the next node.
What is the time complexity for inserting or deleting at the front of a linked list?
Constant time $O(1)$.
What is the time complexity to traverse all elements in a linked list?
Linear time $O(n)$.
For which three scenarios are linked lists typically used?
Queues, stacks, and situations requiring dynamic size.
What specific data ordering principle does a stack follow?
Last-in, first-out (LIFO).
What is the time complexity of the push and pop operations in a stack?
Constant time $O(1)$.
What specific data ordering principle does a queue follow?
First-in, first-out (FIFO).
What is the time complexity of enqueue and dequeue operations in a queue?
Constant time $O(1)$.
How are nodes organized in a binary search tree relative to a parent node?
The left child is less than the parent and the right child is greater.
What is the time complexity for search, insertion, and deletion in a balanced binary search tree?
Logarithmic time $O(\log n)$.
How does a hash table index its array of buckets?
By a hash of the key.
What is the average time complexity for insertion, search, and deletion in a hash table?
Constant time $O(1)$.
What are the two fundamental components of a graph?
Vertices and edges.
What are the two common ways to represent a graph?
Adjacency list or adjacency matrix.
What is the time complexity for traversing a graph using DFS or BFS?
$O(V+E)$ (where $V$ is vertices and $E$ is edges).
Which data structures should be matched to backtracking, fast key lookup, and connectivity problems?
Stack for backtracking; hash table for fast key lookup; graph for connectivity problems.

Key Concepts
Data Structures
Data structure
Array
Linked list
Stack (data structure)
Queue (data structure)
Binary search tree
Hash table
Graph (data structure)
Algorithm Analysis
Big O notation
Algorithmic complexity