Neural Network Study Guide
📖 Core Concepts
Neural network – a collection of interconnected neurons (biological cells or mathematical units) that send signals to one another.
Artificial vs. Biological – Biological networks are real brain circuits; Artificial networks are mathematical models that approximate nonlinear functions.
Neuron role – can be excitatory (amplifies signals) or inhibitory (suppresses signals).
Layered architecture – input layer → hidden layer(s) → output layer; each hidden layer adds depth.
Connection strength (weight) – a scalar that determines how strongly one neuron influences another; the network’s behavior is governed by these weights.
Training objective – adjust weights so the network’s output fits a given dataset (empirical risk minimization).
Backpropagation – the standard algorithm that computes gradients of the loss w.r.t. each weight and updates them.
Deep neural network (DNN) – any network with at least two hidden layers (i.e., four or more layers in total, counting input and output).
Emergence – complex behavior that arises from many simple interacting neurons.
---
📌 Must Remember
Definition – Neural network: group of neurons that exchange signals.
Artificial NN layers – Input → one or more hidden → Output.
Weight update rule – performed via backpropagation & empirical risk minimization.
Deep → ≥ 2 hidden layers (≥ 4 total layers, counting input and output).
Hebbian learning – “neurons that fire together, wire together” – synaptic strengthening with repeated activation.
McCulloch–Pitts neuron (1943) – first mathematical model of an artificial neuron; the perceptron (Rosenblatt, 1958) built on it and is the basis for modern models.
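The Hebbian rule above can be sketched as a weight update proportional to correlated pre- and post-synaptic activity. A minimal illustration (function and variable names are our own, not from the guide):

```python
def hebbian_update(w, x, eta=0.1):
    """Strengthen each weight in proportion to pre-synaptic input x_i
    times post-synaptic activity y ('fire together, wire together')."""
    y = sum(wi * xi for wi, xi in zip(w, x))   # post-synaptic activity
    return [wi + eta * xi * y for wi, xi in zip(w, x)]

w = [0.5, 0.5]
x = [1.0, 1.0]            # both inputs repeatedly active together
for _ in range(3):
    w = hebbian_update(w, x)
print(w)                  # weights grow with each co-activation
```

Note the positive feedback: because the update has no error signal, weights grow without bound unless a normalization term is added, which is one reason Hebbian learning is not a substitute for supervised training.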
---
🔄 Key Processes
Forward Pass
Compute each neuron’s input: $z = \sum_i w_i x_i$ (linear combination of previous layer outputs).
Apply activation function $a = f(z)$ → output passed to next layer.
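The two forward-pass steps above can be sketched for one dense layer; this is a minimal illustration (ReLU activation and the bias term are common choices, and the names are our own):

```python
def relu(z):
    return max(0.0, z)

def forward_layer(inputs, weights, biases, act=relu):
    """One dense layer: per neuron, z = sum_i w_i x_i + b, then a = f(z)."""
    return [act(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

x = [1.0, 2.0]
W1 = [[0.5, -0.25],        # weights into hidden neuron 1
      [1.0, 1.0]]          # weights into hidden neuron 2
b1 = [0.0, -1.0]
hidden = forward_layer(x, W1, b1)
print(hidden)              # [0.0, 2.0] — this output feeds the next layer
```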
Training (Backpropagation)
Compute loss (difference between predicted and true output).
Propagate error backward through layers, calculating gradients $\partial L/\partial w$.
Update weights: $w \leftarrow w - \eta \,\partial L/\partial w$ (where $\eta$ = learning rate).
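The update rule $w \leftarrow w - \eta\,\partial L/\partial w$ can be illustrated for the simplest possible case, a single linear neuron with squared loss (all names and values here are illustrative):

```python
def sgd_step(w, x, t, eta=0.1):
    """One gradient step for L = (w*x - t)^2, so dL/dw = 2*(w*x - t)*x."""
    pred = w * x
    grad = 2 * (pred - t) * x      # dL/dw
    return w - eta * grad          # w <- w - eta * dL/dw

w = 0.0
for _ in range(50):                # repeated steps shrink the loss
    w = sgd_step(w, x=1.0, t=3.0)
print(w)                           # approaches the target weight 3.0
```

In a multi-layer network, backpropagation supplies the same kind of gradient for every weight via the chain rule; the update formula is unchanged.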
Depth Decision
If problem requires learning hierarchical features (e.g., image, speech), add hidden layers → deep network.
---
🔍 Key Comparisons
Biological vs. Artificial NN
Biological: chemical synapses, action potentials, real neurons.
Artificial: mathematical units, weighted sums, activation functions.
Excitatory vs. Inhibitory Neuron
Excitatory: increases downstream activity.
Inhibitory: reduces downstream activity.
Perceptron vs. Deep NN
Perceptron: single layer, linear separability only.
Deep NN: multiple hidden layers, can model highly non‑linear relationships.
---
⚠️ Common Misunderstandings
“More layers = better” – depth helps only when data has hierarchical structure; too many layers cause over‑fitting or vanishing gradients.
“Backpropagation learns the architecture” – it only tunes weights; the network’s layer layout must be chosen beforehand.
“Artificial neurons fire like real neurons” – they compute a deterministic function; there is no electrochemical signaling.
---
🧠 Mental Models / Intuition
Weight as “volume knob” – turning a weight up makes the upstream neuron’s signal louder; turning it down mutes it.
Activation function as “gate” – decides whether the summed signal is strong enough to pass forward (e.g., ReLU: $f(z)=\max(0,z)$).
Training = “learning the right knob settings” – the network tries many knob configurations (weights) to minimize prediction error.
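The knob-and-gate intuition above can be combined in a one-line neuron (a toy sketch; the `neuron` helper is our own illustration):

```python
def neuron(x, w, b=0.0):
    """Volume knob (w) scales the input; the ReLU gate only passes z > 0."""
    z = w * x + b
    return max(0.0, z)

print(neuron(2.0, w=1.5))    # knob up: signal amplified to 3.0
print(neuron(2.0, w=0.0))    # knob at zero: signal muted to 0.0
print(neuron(2.0, w=-1.0))   # negative z: the ReLU gate blocks it, 0.0
```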
---
🚩 Exceptions & Edge Cases
Linear activation – if all layers use a linear activation, the whole network collapses to a single linear transformation, regardless of depth.
Hebbian learning – works only for unsupervised, correlation‑based scenarios; not a replacement for supervised backpropagation.
Single‑layer networks – can solve only linearly separable problems; XOR is a classic failure case.
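The linear-activation edge case above can be verified directly: two stacked linear layers compute exactly the same map as one layer whose matrix is the product of the two. A small pure-Python demo (matrices chosen arbitrarily):

```python
def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

W1 = [[1.0, 2.0], [0.0, 1.0]]
W2 = [[0.5, 0.0], [1.0, 1.0]]
x = [3.0, -1.0]

deep = matvec(W2, matvec(W1, x))       # "two-layer" network, no activation
collapsed = matvec(matmul(W2, W1), x)  # single equivalent linear layer
print(deep == collapsed)               # True: depth added nothing
```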
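The XOR failure case can also be made concrete: a brute-force search over a grid of candidate weights finds no single threshold unit $w_1 x_1 + w_2 x_2 + b > 0$ that reproduces XOR (a sketch; the grid and threshold convention are our own choices, and the full impossibility is a classical result rather than something this search proves):

```python
import itertools

xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
grid = [i / 2 for i in range(-8, 9)]   # candidate values -4.0 ... 4.0

def solves_xor(w1, w2, b):
    """True if the threshold unit matches XOR on all four inputs."""
    return all((w1 * x1 + w2 * x2 + b > 0) == bool(t)
               for (x1, x2), t in xor.items())

found = any(solves_xor(w1, w2, b)
            for w1, w2, b in itertools.product(grid, repeat=3))
print(found)   # False: no linear boundary separates XOR's classes
```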
---
📍 When to Use Which
Perceptron (single layer) → simple binary classification with linearly separable data.
Shallow network (1 hidden layer) → modest non‑linear problems, limited data.
Deep network (≥ 2 hidden layers) → image, speech, text, or any task needing hierarchical feature extraction.
Hebbian update → exploratory, unsupervised learning or modeling synaptic plasticity; not for typical supervised tasks.
---
👀 Patterns to Recognize
“Layer‑wise abstraction” – early hidden layers detect low‑level features (edges), deeper layers combine them into high‑level concepts.
“Vanishing gradient” – training stalls when many layers use saturating activations (sigmoid/tanh) and gradients shrink toward zero.
“Emergent behavior” – complex outputs (e.g., generated images) often arise when many simple neurons interact.
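The vanishing-gradient pattern above follows from the chain rule: backpropagating through a stack of sigmoid layers multiplies the gradient by the sigmoid's derivative at each layer, and that derivative never exceeds 0.25. A toy illustration for 10 layers (best case, $z = 0$):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dsigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)           # maximum value 0.25, at z = 0

grad = 1.0
for _ in range(10):                # chain rule through 10 sigmoid layers
    grad *= dsigmoid(0.0)          # even the best-case factor is 0.25
print(grad)                        # 0.25**10 ≈ 9.5e-07
```

With saturated activations ($|z|$ large) each factor is far below 0.25, so real gradients shrink even faster; this motivates ReLU, whose derivative is 1 for positive inputs.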
---
🗂️ Exam Traps
Distractor: “Backpropagation changes the network architecture.” – Wrong: it only updates weights.
Distractor: “A deep network always outperforms a shallow one.” – Wrong: performance depends on data, regularization, and proper depth.
Distractor: “Biological neurons use the same activation functions as artificial neurons.” – Wrong: biological firing is spike‑based, not a simple mathematical function.
Distractor: “Hebbian learning guarantees optimal classification.” – Wrong: it’s a local, unsupervised rule, not a global error‑minimizing method.