Data Structures Study Guide
📖 Core Concepts
Data Structure – An organized way to store and retrieve data efficiently; it couples values, the relationships among them, and the operations you can perform.
Abstract Data Type (ADT) – The logical description (what operations do) that a data structure implements physically.
Memory Access – Implementations rely on pointers (memory addresses) to fetch/store items anywhere in RAM.
Contiguous vs. Linked Storage –
Arrays: data placed in consecutive memory; address computed by arithmetic.
Linked structures: each element stores a pointer to the next, allowing flexible insertion/removal.
Algorithm Design – Choosing the right structure is often the single biggest factor in making an algorithm fast.
---
📌 Must Remember
Array – O(1) random access via index; fixed‑length unless resized.
Linked List – O(1) insert/delete at head or known position; O(n) search/random access.
Hash Table – Average‑case O(1) lookup/insert/delete; collisions handled by chaining or open addressing.
Tree – Hierarchical; balanced trees give O(log n) search/insert/delete.
Graph – Vertices + edges; traversal via BFS (shortest unweighted path) or DFS (depth exploration).
Stack – LIFO; push & pop are O(1).
Queue – FIFO; enqueue & dequeue are O(1).
Trie – Stores strings character‑by‑character; prefix search O(k) where k = length of the prefix.
---
🔄 Key Processes
Creating an Array
Allocate a contiguous block of n × element_size bytes.
Compute element address: addr = base + index × element_size.
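The address formula above can be simulated in Python; the `base` and `element_size` values below are hypothetical examples, not real memory addresses.

```python
# Simulate array address computation: addr = base + index * element_size.
base = 1000          # hypothetical starting address of the array block
element_size = 4     # hypothetical bytes per element (e.g., a 32-bit int)

def element_address(index):
    """Return the simulated memory address of arr[index]."""
    return base + index * element_size

addr0 = element_address(0)   # the first element sits at the base address
addr3 = element_address(3)   # three elements (3 * 4 bytes) further along
```

Because the address is pure arithmetic, access cost does not depend on the index, which is exactly the O(1) random-access guarantee.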
Inserting into a Singly Linked List (at head)
Allocate new node → set newNode.next = head.
Update head = newNode.
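The two-step head insertion can be sketched with a minimal node class (names like `Node` and `insert_at_head` are illustrative, not from any particular library):

```python
class Node:
    """A singly linked list node: a value plus a pointer to the next node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_at_head(head, value):
    """O(1) insertion: the new node's next points at the old head."""
    return Node(value, head)

# Build the list 3 -> 2 -> 1 by repeatedly inserting at the head.
head = None
for v in (1, 2, 3):
    head = insert_at_head(head, v)
```

Note that no elements are shifted; only two pointer assignments happen regardless of list length.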
Hash Table Insertion
Compute h = hash(key) mod tableSize.
If slot empty → store; else resolve collision (chaining: add to list; open addressing: probe).
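A minimal chaining-based sketch of the insertion steps above (class and method names are illustrative):

```python
class ChainedHashTable:
    """Toy hash table resolving collisions by separate chaining."""
    def __init__(self, size=8):
        self.size = size
        self.buckets = [[] for _ in range(size)]  # one chain per slot

    def insert(self, key, value):
        slot = hash(key) % self.size          # h = hash(key) mod tableSize
        bucket = self.buckets[slot]
        for i, (k, _) in enumerate(bucket):
            if k == key:                      # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))           # empty slot or collision: chain

    def lookup(self, key):
        for k, v in self.buckets[hash(key) % self.size]:
            if k == key:
                return v
        return None
```

Lookup cost is O(1) plus the chain length, which is why a heavily collided table degrades toward O(n).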
BST Search
Start at root; go left if key < node.key, right otherwise; repeat until found or leaf.
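The comparison-driven walk can be written iteratively in a few lines (a sketch; `BSTNode` is an illustrative name):

```python
class BSTNode:
    """A binary search tree node with a key and two child pointers."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def bst_search(node, key):
    """Descend: left if key is smaller, right if larger; None if absent."""
    while node is not None:
        if key == node.key:
            return node
        node = node.left if key < node.key else node.right
    return None

# A tiny balanced tree:  5
#                       / \
#                      3   8
root = BSTNode(5, BSTNode(3), BSTNode(8))
```

Each comparison discards one subtree, so a balanced tree needs only about log2(n) steps.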
DFS (recursive)
DFS(v): mark v; for each neighbor u of v if not marked → DFS(u).
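The one-line recursive rule translates almost directly into Python, assuming the graph is an adjacency-list dictionary (the sample graph is illustrative):

```python
def dfs(graph, v, visited=None):
    """Recursive depth-first search; returns the set of reached vertices."""
    if visited is None:
        visited = set()
    visited.add(v)                 # mark v
    for u in graph[v]:             # for each neighbor u of v
        if u not in visited:       # if not marked
            dfs(graph, u, visited)
    return visited

graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
```

The `visited` set is what prevents infinite loops on graphs with cycles.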
BFS (queue‑based)
Enqueue start node; while queue not empty → dequeue, visit, enqueue unvisited neighbors.
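The queue-based loop above can be sketched with `collections.deque` (the sample graph is illustrative):

```python
from collections import deque

def bfs(graph, start):
    """Queue-based BFS; returns vertices in visit (level) order."""
    visited = {start}
    order = []
    queue = deque([start])         # enqueue start node
    while queue:                   # while queue not empty
        v = queue.popleft()        # dequeue
        order.append(v)            # visit
        for u in graph[v]:         # enqueue unvisited neighbors
            if u not in visited:
                visited.add(u)
                queue.append(u)
    return order

graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
```

Because vertices are visited level by level, the first time BFS reaches a vertex is along a shortest unweighted path.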
Stack Push/Pop
Push: top = top + 1; stack[top] = x.
Pop: x = stack[top]; top = top - 1.
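The index-based push/pop above can be sketched with a fixed-capacity stack (the `ArrayStack` name is illustrative; bounds checks are omitted for brevity):

```python
class ArrayStack:
    """Fixed-capacity stack backed by an array and a top index."""
    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = -1              # -1 means the stack is empty

    def push(self, x):
        self.top += 1              # top = top + 1
        self.items[self.top] = x   # stack[top] = x

    def pop(self):
        x = self.items[self.top]   # x = stack[top]
        self.top -= 1              # top = top - 1
        return x
```

Both operations touch only the top slot, so they are O(1).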
Trie Insert (string s)
Start at root; for each character c in s → follow/create child edge c; mark final node as end‑of‑word.
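A minimal sketch of the insert walk, plus a lookup to show the end-of-word flag in action (class and function names are illustrative):

```python
class TrieNode:
    """One trie node: child edges keyed by character, plus a word flag."""
    def __init__(self):
        self.children = {}     # character -> child TrieNode
        self.is_word = False   # marks an end-of-word node

def trie_insert(root, s):
    """Follow or create a child edge per character; mark the final node."""
    node = root
    for c in s:
        node = node.children.setdefault(c, TrieNode())
    node.is_word = True

def trie_contains(root, s):
    """Exact-word lookup in O(k) where k = len(s)."""
    node = root
    for c in s:
        if c not in node.children:
            return False
        node = node.children[c]
    return node.is_word

root = TrieNode()
for word in ("cat", "car"):
    trie_insert(root, word)    # "cat" and "car" share the "ca" prefix
```

Note how "cat" and "car" share the nodes for "c" and "a"; prefix sharing is what makes tries suited to autocomplete.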
---
🔍 Key Comparisons
Array vs. Linked List
Access: Array O(1) by index ↔ Linked List O(n) traversal.
Insert/Delete: Array O(n) (shifts) ↔ Linked List O(1) at known position.
Hash Table vs. Array
Lookup: Hash O(1) avg ↔ Array O(1) only with index, not key‑based.
Memory: Hash table keeps spare slots (a low load factor) to limit collisions ↔ Array uses exactly n slots.
Tree vs. Graph
Structure: Tree = acyclic, single parent ↔ Graph can have cycles & multiple connections.
Search: Balanced tree gives O(log n) ↔ Graph BFS/DFS O(V+E).
Stack vs. Queue
Order: Stack LIFO ↔ Queue FIFO.
Typical Use: Undo/recursion ↔ Scheduling, breadth‑first processing.
Trie vs. Hash Table (for strings)
Prefix queries: Trie O(k) ↔ Hash table O(k) only for exact key, no prefix support.
Space: Trie may use many nodes for sparse prefixes ↔ Hash table generally more compact.
---
⚠️ Common Misunderstandings
“Linked lists give fast random access.” – Access still requires O(n) traversal.
“Hash tables are always O(1).” – Worst‑case (many collisions) degrades to O(n).
“All trees are balanced.” – Unbalanced trees can become linked‑list‑like, losing O(log n) guarantees.
“Pointers guarantee safety.” – Mismanaged pointers cause leaks, dangling references, and bugs.
“Arrays can grow without cost.” – Dynamic arrays need occasional reallocation & copying (amortized O(1) but occasional O(n)).
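The last point can be made concrete with a toy dynamic array that doubles its capacity when full (a sketch; real implementations add bounds checks and shrinking):

```python
class DynamicArray:
    """Toy dynamic array: appends are amortized O(1), occasionally O(n)."""
    def __init__(self):
        self.capacity = 1
        self.count = 0
        self.data = [None] * self.capacity

    def append(self, x):
        if self.count == self.capacity:        # full: the occasional O(n) step
            self.capacity *= 2                 # allocate a block twice as big
            new_data = [None] * self.capacity
            for i in range(self.count):        # copy every existing element
                new_data[i] = self.data[i]
            self.data = new_data
        self.data[self.count] = x              # the usual O(1) step
        self.count += 1
```

Doubling means each element is copied only O(1) times on average over many appends, which is where the amortized O(1) bound comes from.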
---
🧠 Mental Models / Intuition
Array = Row of lockers – Each locker has a fixed number; you can jump directly to any locker.
Linked List = Treasure map – Each clue (node) points to the next location; you follow the path.
Hash Table = Mailroom with bins – A key’s zip code (hash) decides which bin; collisions = multiple letters in same bin.
Tree = Family tree – One ancestor (root) branching into children; depth = generations.
Stack = Plate pile – Only the top plate is reachable.
Queue = Checkout line – First person in line is served first.
Trie = Word ladder – Each rung (character) leads to the next; sharing prefixes saves space.
---
🚩 Exceptions & Edge Cases
Collision handling – Chaining can turn a hash table’s O(1) lookups into O(k) where k is chain length.
Dynamic array resizing – When capacity is exceeded, allocate new block (usually 2×) and copy → occasional O(n) cost.
Unbalanced BST – Degenerates to linked list → O(n) search/insert.
Graph cycles – DFS/BFS must track visited nodes to avoid infinite loops.
Trie memory blow‑up – Sparse alphabets cause many empty child pointers; use compact representations (e.g., ternary search trees).
---
📍 When to Use Which
Need constant‑time index access → Array (or dynamic array if size changes).
Frequent insert/delete in the middle → Linked List (or doubly linked list for O(1) deletion of a node you already hold a reference to).
Key‑value lookups, fast average case → Hash Table.
Ordered data, range queries, or sorted output → Balanced Tree (AVL, Red‑Black, B‑Tree).
Hierarchical relationships (XML/JSON, file systems) → Tree.
Modeling networks, routes, or dependencies → Graph (choose adjacency list for sparse, matrix for dense).
Undo/recursive algorithms → Stack.
Processing tasks in arrival order → Queue (or priority queue if ordering by priority).
Prefix search, autocomplete, spell‑check → Trie.
---
👀 Patterns to Recognize
“base + index × element_size” → Direct array address ⇒ O(1) access.
“pointer → next” chain → Insertion/deletion without shifting ⇒ O(1) at known spot.
“hash(key) mod m” → Constant‑time bucket lookup; watch for clustering.
“balanced height ≈ log₂ n” → Guarantees logarithmic operations in trees.
“BFS level order” → Shortest path in unweighted graphs.
“DFS backtrack” → Detect cycles, topological order, connectivity.
“Trie depth = query length” → Search time depends only on string length, not on number of stored keys.
---
🗂️ Exam Traps
Choice claiming “O(1) random access in linked list” – Wrong; only arrays give true O(1) index access.
Option stating “hash tables never degrade” – Ignoring worst‑case collisions; can become O(n).
Answer that “all trees are binary” – Trees may have any branching factor (B‑trees, multi‑way).
Distractor that “queues are implemented with arrays only” – Queues can be circular arrays or linked lists.
Misleading “trie memory is always less than hash table” – Tries can consume more memory for sparse keys.
“DFS always finds shortest path” – Only BFS guarantees shortest path in unweighted graphs.
---