Subjects/Technology/Software and Web Development/Software Engineering/Data structure

Common Data Structure Types

Understand the core concepts, operations, and typical use cases of common data structures like arrays, linked lists, hash tables, trees, graphs, stacks, queues, and tries.

Summary

Read Summary

Flashcards

Save Flashcards

Quiz

Take Quiz

Quick Practice

What mechanism is used to access a specific element within an array?

1 of 18

Summary

Common Data Structure Types Data structures are fundamental organizational patterns for storing and accessing information in computer programs. Each structure is optimized for different types of operations, and understanding their strengths and weaknesses is crucial for writing efficient code. This overview covers the most important data structures you'll encounter in computer science. Arrays An array is a sequence of elements stored in contiguous memory locations, all typically of the same type. Each element is accessed through an index, which is usually an integer starting from 0. This means you can directly retrieve any element if you know its position—a capability called random access. Arrays come in two varieties: fixed-length arrays, which cannot change size after creation, and resizable arrays (sometimes called dynamic arrays or vectors), which can grow or shrink as needed. The contiguous memory layout makes arrays very efficient for reading data, but insertion or deletion in the middle of an array is costly because it requires shifting all subsequent elements. Linked Lists A linked list is fundamentally different from an array in how it organizes data. Instead of storing elements in consecutive memory locations, a linked list chains elements together through nodes. Each node contains a value and a pointer (a reference) to the next node in the sequence. The key advantage of linked lists is that inserting or removing an element anywhere in the list is efficient—you only need to modify a few pointers, not shift all remaining elements. However, this comes with a trade-off: if you want to access the element at position 5, you must traverse from the beginning through positions 1, 2, 3, and 4 first. This makes random access much slower than with arrays. Key trade-off to remember: Arrays are fast for random access but slow for insertions/deletions. Linked lists are fast for insertions/deletions but slow for random access. Records (Structs or Tuples) A record (sometimes called a struct or tuple depending on the programming language) is a composite data structure that groups multiple pieces of data together. Unlike arrays where elements are accessed by numeric index, records organize data into named fields or members. For example, a record representing a person might have fields like name, age, and address. You access these fields by name rather than by position. Records are essential building blocks for more complex data structures and allow you to represent real-world entities naturally in code. Hash Tables (Hash Maps) A hash table is a powerful data structure that provides remarkably fast lookups. It uses a hash function to convert a key into an array index, allowing you to find values in average constant time—that is, in $O(1)$ time regardless of how many items the table contains. Here's how it works: when you insert a key-value pair, the hash function processes the key and produces an index. The value is then stored at that index in an underlying array. When you later search for that key, you apply the same hash function to instantly locate the value. The diagram above illustrates this process. Three different names are passed through a hash function, which maps them to different array locations (buckets). This allows rapid retrieval without examining all stored values. Hash Collisions: A problem arises when two different keys produce the same index—this is called a collision. Good hash functions minimize collisions, but they cannot eliminate them entirely. Two common solutions exist: Chaining: Each array position stores a linked list of all key-value pairs that hashed to that index. When a collision occurs, the new pair is added to the chain. Open addressing: When a collision occurs, the algorithm searches for the next available empty position in the array to store the new pair. Hash tables are incredibly useful and widely deployed in real-world applications whenever you need to associate keys with values. Trees A tree is a hierarchical data structure consisting of nodes connected by edges. One special node, called the root, sits at the top of the hierarchy. All other nodes are organized into subtrees, each rooted at a child of the parent node. The hierarchical structure is powerful because it naturally represents many real-world relationships. For example, file systems use trees: the root directory contains subdirectories, which contain further subdirectories and files. Binary trees are the most common variant, where each node has at most two children. Other important types include: AVL trees: Self-balancing binary trees that maintain height balance to ensure efficient operations B-trees: Trees with multiple children per node, commonly used in databases for efficient disk access Trees enable efficient searching and sorting operations, and they're fundamental to understanding more advanced topics like binary search trees and heaps. <extrainfo> Why trees matter in computer science: Trees are essential for representing hierarchical data, enabling efficient search algorithms (like binary search), and serving as the backbone for more specialized structures like heaps and binary search trees. </extrainfo> Graphs A graph is a more general structure than a tree, consisting of vertices (nodes) connected by edges. While trees are restricted to a hierarchical parent-child relationship, graphs allow arbitrary relationships between any vertices. Graphs can be directed (edges have a direction, like a one-way street) or undirected (edges work both ways). They can also be acyclic (containing no loops) or contain cycles (paths that loop back on themselves). A tree is actually a special case of a graph—specifically, a connected acyclic graph with a designated root. Common graph traversal algorithms explore all vertices in a graph: Breadth-first search (BFS): Explores all neighbors before moving deeper, useful for finding shortest paths Depth-first search (DFS): Explores as far as possible along each branch before backtracking Graphs are used to represent networks of all kinds: social networks, computer networks, transportation systems, and dependency relationships in software. Stacks A stack is a simple data structure following the Last In, First Out (LIFO) principle. Imagine a stack of plates: you add plates to the top and remove plates from the top. The last plate you added is the first one you remove. A stack supports two primary operations: push: Adds an element to the top of the stack pop: Removes and returns the topmost element Stacks are used throughout computer science, most famously in function call management (the "call stack"), expression evaluation, and backtracking algorithms. Queues A queue is the opposite of a stack, following the First In, First Out (FIFO) principle. Like a line at a grocery store, the first person to enter the line is the first to be served. A queue supports two primary operations: enqueue: Adds an element to the rear of the queue dequeue: Removes and returns the front element Queues are essential for scheduling tasks, managing requests in web servers, simulating real-world waiting scenarios, and as a core component of algorithms like breadth-first search. Tries (Prefix Trees) A trie is a specialized tree designed specifically for storing and searching strings efficiently. Each node in a trie represents a single character, and edges represent transitions from one character to the next. Words are stored by following a path from the root through nodes representing each character in sequence. The power of tries lies in their efficiency for prefix-based operations. You can quickly find all words beginning with a specific prefix by following the path representing that prefix. This makes tries ideal for: Autocomplete: As a user types, the system finds all matching words with that prefix Spell-checking: Determining if a word exists in the dictionary Dictionary implementations: Storing large collections of words efficiently While more specialized than general-purpose data structures, tries are invaluable when working with string data that benefits from prefix operations. <extrainfo> Tries can be memory-intensive compared to other string storage methods, but their speed for prefix searches often justifies this trade-off in applications like search engines and text editors. </extrainfo>

Flashcards

What mechanism is used to access a specific element within an array?

An integer index.

How do typical array implementations allocate memory for their elements?

They allocate contiguous memory.

What two components are contained within each node of a linked list?

A value and a pointer to the next node.

Why are insertion and removal operations efficient in a linked list?

They do not require relocating the rest of the list.

How does the speed of random access in a linked list compare to an array?

It is slower.

What is the common term for the fields within a record?

Members.

How does a hash table map keys to indexes for fast retrieval?

By using a hashing function.

What is the average case access time complexity for a hash table?

Constant time ($O(1)$).

What are two common techniques used to handle hash collisions?

Chaining Open addressing

What is the name of the single top-level node in a tree's hierarchical organization?

The root.

What are the two primary components that make up a graph?

Vertices (nodes) and edges.

In terms of direction and loops, what are the different classifications of graphs?

Directed or undirected, and cyclic or acyclic.

What are the two primary algorithms used for graph traversal?

Breadth-first search (BFS) Depth-first search (DFS)

Which data ordering principle does a stack follow?

Last In, First Out (LIFO).

What are the two primary operations supported by a stack?

Push (adds an element to the top) Pop (removes the topmost element)

Which data ordering principle does a queue follow?

First In, First Out (FIFO).

What are the two primary operations supported by a queue?

Enqueue (adds an element to the rear) Dequeue (removes the front element)

How are strings represented within a trie structure?

Characters are stored as nodes, and each edge represents a character.

Quiz

How are the individual components of a record accessed?

1 of 5

Key Concepts

Linear Data Structures

Array

Linked list

Stack

Queue

Non-Linear Data Structures

Tree

Graph

Hash table

Trie

Data Aggregation

Record (struct)

Definitions

Array

A contiguous memory collection of elements accessed by integer index.

Linked list

A linear sequence of nodes where each node points to the next, allowing efficient insertions and deletions.

Record (struct)

An aggregate data type containing named fields that can hold values of various types.

Hash table

A data structure that maps keys to values using a hash function for near‑constant‑time lookup.

Tree

A hierarchical structure with a root node and child subtrees, enabling efficient search and ordering.

Graph

A set of vertices connected by edges, representing relationships and supporting traversal algorithms.

Stack

A LIFO (last‑in, first‑out) collection supporting push and pop operations.

Queue

A FIFO (first‑in, first‑out) collection supporting enqueue and dequeue operations.

Trie

A prefix tree that stores strings character by character for fast retrieval of words and prefixes.