Computer Science Study Guide
📖 Core Concepts
Computation – the process of transforming inputs to outputs via defined rules (algorithms).
Algorithm – a finite, step‑by‑step procedure that solves a problem; its efficiency is measured in time and space.
Data Structure – organized ways to store data (e.g., arrays, linked lists, trees, graphs) to enable fast access/manipulation.
Computability Theory – studies which problems can be solved by any algorithmic model (e.g., Turing machines).
Computational Complexity – classifies problems by the resources (time, space) needed; central classes: P, NP, NP‑complete, NP‑hard.
Information Theory – quantifies information (bits) and sets limits on compression & reliable transmission (Shannon entropy).
Coding Theory – designs codes for error detection/correction, compression, and cryptographic security.
Concurrency / Parallelism – overlapping execution of multiple computations; concurrency is about structuring and coordinating interacting tasks, while parallelism runs tasks simultaneously for speed‑up.
Distributed Systems – collection of independent computers that communicate to achieve a common goal; issues include consistency, fault‑tolerance, and latency.
Computer Networks – hardware and protocols that move data between nodes; key concerns: scalability, security, reliability.
Computer Security & Cryptography – protects data confidentiality, integrity, and availability; uses encryption, authentication, and secure protocols.
Databases & Data Mining – organized storage (DBMS) + techniques to discover patterns/trends in large datasets.
Artificial Intelligence (AI) – goal‑oriented processes (search, learning, planning); Turing Test evaluates human‑like behavior.
Programming Paradigms – styles of coding (procedural, object‑oriented, functional) often supported together in modern languages.
---
📌 Must Remember
P vs NP: P = problems solvable in polynomial time; NP = problems whose solutions can be verified in polynomial time. The open question: P = NP?
Big‑O notation: captures asymptotic upper bound of algorithm running time; e.g., $O(n\log n)$ for mergesort.
Shannon Entropy: $H = -\sum_i p_i \log_2 p_i$ bits, measures average information per symbol (see the worked example after this list).
Error‑detecting code: a single parity bit detects any odd number of bit errors but cannot correct them.
Error‑correcting code: Hamming distance $d$ allows detection of up to $d-1$ errors and correction of up to $\lfloor (d-1)/2 \rfloor$ errors.
Amdahl’s Law (parallel speed‑up): $S = \frac{1}{(1-P) + \frac{P}{N}}$, where $P$ = parallelizable fraction, $N$ = number of processors.
CAP Theorem (distributed DB): cannot simultaneously guarantee Consistency, Availability, and Partition tolerance; pick two.
OSI Model layers: Physical → Data Link → Network → Transport → Session → Presentation → Application.
Public‑key cryptography: uses a key pair (public for encryption, private for decryption). RSA security relies on difficulty of factoring large integers.
Turing Test: if a human judge cannot reliably distinguish machine from human responses, the machine is said to exhibit intelligence.
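A minimal Python sketch of the Shannon entropy formula above; the example distributions (a fair coin vs. a biased coin) are illustrative only:

```python
import math

def shannon_entropy(probs):
    """Average information per symbol, in bits: H = -sum(p_i * log2(p_i))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin carries 1 bit per toss; a biased coin carries less.
print(shannon_entropy([0.5, 0.5]))   # 1.0
print(shannon_entropy([0.9, 0.1]))   # ~0.469
```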
---
🔄 Key Processes
Algorithm Analysis (Time/Space)
Define input size $n$.
Count dominant operations → express as function of $n$.
Simplify using Big‑O (drop constants, lower‑order terms).
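As a concrete illustration of the three steps, consider a hypothetical pair‑comparison routine: the nested loop dominates, the count is roughly $n(n-1)/2$, and dropping constants gives $O(n^2)$:

```python
def count_close_pairs(values, tol=1e-6):
    """Count pairs (i, j) with |values[i] - values[j]| < tol."""
    n = len(values)                # 1. input size n
    count = 0
    for i in range(n):             # 2. the nested loops dominate:
        for j in range(i + 1, n):  #    ~ n*(n-1)/2 comparisons
            if abs(values[i] - values[j]) < tol:
                count += 1
    return count                   # 3. drop constants/lower-order terms -> O(n^2)
```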
Designing a Data Structure for Fast Lookup
Identify required operations (search, insert, delete).
Choose structure: hash table for $O(1)$ average lookup, balanced BST for $O(\log n)$ ordered operations.
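A small Python sketch of both choices, using the built‑in dict as the hash table and a sorted list with bisect standing in for a balanced BST (the standard library has no BST type):

```python
import bisect

phone_book = {"alice": "555-0100", "bob": "555-0101", "carol": "555-0102"}
print(phone_book["bob"])                  # hash table: O(1) average lookup, no ordering

sorted_keys = sorted(phone_book)          # sorted structure: O(log n) search ...
i = bisect.bisect_left(sorted_keys, "b")  # ... plus ordered queries, e.g. first key >= "b"
print(sorted_keys[i])                     # 'bob'
```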
Proving a Problem is NP‑Complete
Show problem is in NP (solution verifiable in poly time).
Reduce a known NP‑complete problem to it in polynomial time.
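The first step amounts to exhibiting a polynomial‑time verifier. A minimal sketch for SAT, using an illustrative list‑of‑literals encoding (a positive integer means the variable is true, a negative one means false):

```python
def verify_sat(clauses, assignment):
    """Check a proposed assignment in time linear in the formula size.

    clauses: list of clauses, each a list of integer literals.
    assignment: dict mapping variable number -> bool.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 OR NOT x2) AND (x2 OR x3)
print(verify_sat([[1, -2], [2, 3]], {1: True, 2: True, 3: False}))  # True
```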
Constructing a Hamming Code
Determine parity‑bit positions (powers of 2).
Compute each parity bit as XOR of covered data bits.
Transmit codeword; on receipt, recompute parities to locate error.
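A minimal sketch of these steps for the classic Hamming(7,4) code; the data word and the injected error are illustrative:

```python
def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit codeword.

    Positions (1-indexed): 1, 2, 4 are parity bits; 3, 5, 6, 7 carry d[0..3].
    """
    code = [0] * 8                            # index 0 unused for 1-indexing
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]     # parity over positions with bit 1 set
    code[2] = code[3] ^ code[6] ^ code[7]     # parity over positions with bit 2 set
    code[4] = code[5] ^ code[6] ^ code[7]     # parity over positions with bit 4 set
    return code[1:]

def hamming74_correct(word):
    """Recompute parities; a non-zero syndrome is the 1-indexed error position."""
    c = [0] + list(word)
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome] ^= 1                      # flip the erroneous bit
    return c[1:], syndrome

sent = hamming74_encode([1, 0, 1, 1])
received = sent.copy()
received[4] ^= 1                              # corrupt one bit (position 5)
fixed, pos = hamming74_correct(received)
print(pos, fixed == sent)                     # 5 True
```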
Executing a Distributed Transaction (Two‑Phase Commit)
Phase 1: Coordinator sends “prepare” to all nodes; each replies “ready” or “abort”.
Phase 2: If all ready → “commit”; else → “abort”.
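A toy simulation of the two phases; the participant callables are stand‑ins, and real implementations add write‑ahead logging and timeouts:

```python
def two_phase_commit(participants):
    """participants: callables that return True ('ready') or False ('abort')."""
    # Phase 1: coordinator asks every participant to prepare.
    votes = [prepare() for prepare in participants]
    # Phase 2: commit only if every participant voted ready.
    return "commit" if all(votes) else "abort"

print(two_phase_commit([lambda: True, lambda: True]))   # commit
print(two_phase_commit([lambda: True, lambda: False]))  # abort
```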
Parallelizing a Task (Amdahl’s Law)
Identify serial vs parallel portions.
Estimate $P$ (parallel fraction).
Compute expected speed‑up for given $N$ processors.
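A short sketch of the calculation; the 90% parallel fraction is an assumed example and shows the speed‑up plateauing near 10x:

```python
def amdahl_speedup(parallel_fraction, processors):
    """S = 1 / ((1 - P) + P / N)"""
    return 1.0 / ((1 - parallel_fraction) + parallel_fraction / processors)

# 90% of the work parallelizes: the serial 10% caps speed-up at 10x.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))   # 1.82, 4.71, 8.77, 9.91
```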
---
🔍 Key Comparisons
P vs NP – P: solvable quickly; NP: verifiable quickly.
Hash Table vs Balanced BST – Hash: $O(1)$ average lookup, unordered; BST: $O(\log n)$ guaranteed lookup, maintains order.
Symmetric vs Asymmetric Encryption – Symmetric (same key both sides, fast); Asymmetric (public/private keys, slower, enables key exchange).
Concurrency vs Parallelism – Concurrency: multiple tasks progress, may share resources; Parallelism: tasks run simultaneously to reduce runtime.
CAP vs PACELC – CAP focuses on trade‑off among Consistency, Availability, Partition tolerance; PACELC adds latency vs consistency trade‑off when no partition.
---
⚠️ Common Misunderstandings
“NP‑hard means unsolvable.” → NP‑hard includes problems that may be outside NP (e.g., optimization) but not necessarily unsolvable; they’re at least as hard as NP‑complete problems.
“Big‑O is the exact running time.” → It’s an upper bound; constant factors and lower‑order terms are ignored.
“Hash collisions cannot occur.” → Collisions are inevitable with limited hash space; proper handling (chaining, open addressing) is required.
“Encryption guarantees integrity.” → Encryption hides data; integrity needs separate mechanisms such as a MAC or digital signature (see the sketch after this list).
“Distributed systems are always faster.” → Network latency, consistency protocols, and coordination overhead can offset speed gains.
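A minimal sketch of an integrity check with Python's standard‑library hmac module; the key and message are illustrative only:

```python
import hashlib
import hmac

key = b"shared-secret"                    # illustrative key, not a real secret
message = b"transfer $100 to account 42"

# Sender attaches a MAC tag; the receiver recomputes it over the received message.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()
recomputed = hmac.new(key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, recomputed))   # True; any tampering changes the tag
```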
---
🧠 Mental Models / Intuition
“Algorithm as a recipe” – each step must be clear, finite, and lead to a finished dish (solution).
“Complexity class as a fence” – P‑problems stay inside the “easy” fence; NP‑problems may lie outside it, but a proposed solution can still be checked quickly by peeking over the fence (verification).
“Information as water in a pipe” – entropy measures how much water (uncertainty) can flow; compression squeezes the pipe, error‑correction adds extra pipes for leaks.
“Parallelism like adding more workers to a factory line” – speed‑up limited by the portion of work that cannot be divided (serial bottleneck).
“CAP as a three‑leg stool” – you can only keep two legs on the ground at once; the third will wobble.
---
🚩 Exceptions & Edge Cases
Hash tables: performance degrades to $O(n)$ when many collisions or poor hash function.
Amdahl’s Law: assumes fixed problem size; for scaling problems, Gustafson’s Law may be more appropriate.
NP‑completeness reductions: reduction must be polynomial‑time and many‑one (output instance of target problem).
CAP: under certain consistency models (e.g., eventual consistency), systems can appear to provide all three properties over time.
Hamming codes: only correct single‑bit errors; double‑bit errors can be mis‑corrected unless an extra overall parity bit (SECDED) is added for detection.
---
📍 When to Use Which
Choose Data Structure → Need $O(1)$ average lookups? → Hash table. Need ordered traversal? → Balanced BST or B‑tree (disk‑based).
Select Encryption → Large bulk data → Symmetric (AES). Need secure key exchange or digital signatures → Asymmetric (RSA/ECC).
Apply Parallelism → Problem is embarrassingly parallel (independent sub‑tasks) → Use data parallelism; has dependencies? → Consider pipeline or task parallelism with synchronization.
Pick Consistency Model → Real‑time user interaction (e.g., social media) → Eventual consistency; financial transactions → Strong consistency (two‑phase commit).
Algorithm Design → If problem size is huge and exact solution is NP‑hard, consider approximation or heuristic algorithms (greedy, local search).
---
👀 Patterns to Recognize
“Divide and Conquer” → Recurrence $T(n)=aT(n/b)+f(n)$ → look for $O(n\log n)$ or $O(n)$ patterns.
“Greedy‑choice property” → Optimal substructure + locally optimal choice → e.g., Huffman coding, Kruskal’s MST.
“Dynamic Programming” → Overlapping subproblems & optimal substructure → memoization table or bottom‑up array (see the sketch after this list).
“Reduction to known NP‑complete problem” → Spot clause/variable mapping → often SAT, 3‑SAT, or CLIQUE.
“Network protocol layering” → Identify which OSI layer a function belongs to (e.g., routing = Network layer).
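A minimal memoization sketch for the dynamic‑programming pattern, using the classic Fibonacci recurrence as a purely illustrative example:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Overlapping subproblems: fib(n-1) and fib(n-2) share almost all their work.

    Caching each result turns the naive exponential recursion into O(n) calls.
    """
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025
```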
---
🗂️ Exam Traps
Misreading Big‑O – choosing $O(n^2)$ when the algorithm is actually $O(n\log n)$ because its divide‑and‑conquer structure was overlooked.
Confusing “NP” with “NP‑complete” – NP includes many problems that are not NP‑complete; answer choices may blur this.
Assuming “secure” = “encrypted” – exam may list “encryption provides confidentiality only,” and a choice adding integrity is wrong.
CAP Theorem misinterpretation – some questions present “a system can be both consistent and partition‑tolerant,” which is false in the strict CAP sense.
Hash table collision handling – a distractor may claim “open addressing guarantees $O(1)$ worst‑case,” which is incorrect; worst‑case can degrade to $O(n)$.
Parallel speed‑up – forgetting Amdahl’s limit; a choice that suggests infinite speed‑up with more processors is a trap.
Hamming code capabilities – a choice stating “detects and corrects any number of errors” is wrong; only single‑bit correction.
---