Foundations of Neural Networks
Understand the basics of neural networks, the distinction between biological and artificial types, and how artificial networks are structured, activated, and trained.
Summary
Overview of Neural Networks
Introduction
Neural networks are among the most important tools in modern artificial intelligence and machine learning. Whether you're processing images, understanding language, or making predictions, neural networks likely power the system behind the scenes. But what exactly is a neural network, and why is it so effective?
At their core, neural networks are systems composed of many simple computational units—called neurons—that work together to solve complex problems. The key insight is that while individual neurons are quite simple, when you connect many of them together in the right way, they can learn to perform remarkably sophisticated tasks. To understand how this works, we'll start by looking at how neural networks exist in nature, then explore how we replicate this idea mathematically.
Biological Neural Networks: Inspiration from the Brain
Structure and Communication
The human brain contains roughly 86 billion neurons, and these neurons are the foundation of all thought and behavior. A biological neural network is a connected population of these biological neurons, linked together by structures called synapses.
When you think, learn, or perceive the world, neurons communicate with each other through electrical and chemical signals called action potentials. A neuron receives signals from its neighbors, integrates that information, and then sends its own signal onward to other neurons. This chain of communication across billions of neurons is what creates consciousness and intelligence.
How Individual Neurons Work
Not all neurons do the same thing. When a neuron receives incoming signals, it can respond in two fundamentally different ways:
Excitatory neurons amplify and propagate signals they receive, strengthening the message as it passes forward
Inhibitory neurons suppress signals they receive, weakening or blocking messages
This is crucial: a neural network isn't just about connecting units together. The nature of each connection matters. Some connections strengthen signals, while others weaken them. This balance between excitation and inhibition is what allows biological neural networks to perform complex computations.
Organization at Scale
Neurons don't exist in isolation. In the brain, they're organized into different levels of structure:
Small groups of interconnected neurons form neural circuits, which handle specific tasks (like detecting motion or processing colors)
These circuits connect to form large-scale brain networks, which handle higher-level functions (like memory or decision-making)
<extrainfo>
The hierarchy of neural organization is an interesting detail about how brains are structured, but for most artificial intelligence courses, the key concept is simply that neurons connect together and their connection patterns determine what the network can do.
</extrainfo>
Artificial Neural Networks: The Mathematical Model
Why Artificial Neural Networks?
Biological neural networks are incredibly complex. A human brain has roughly 170 trillion synaptic connections—far too intricate to fully understand or replicate precisely. Instead of copying biology exactly, computer scientists created artificial neural networks: mathematical models that capture the essential ideas while being much simpler to build and work with.
The core idea is elegant: if we can arrange simple mathematical units (our artificial "neurons") and tune the strength of connections between them, we can build a system that learns from data and makes predictions. This is exactly what we do in practice.
Architecture: Layers of Neurons
The most common way to organize an artificial neural network is into layers:
The input layer receives data from the outside world
One or more hidden layers process this information
The output layer produces the final result
Information flows in one direction: from the input layer, through the hidden layers, and finally to the output layer. This unidirectional flow makes the mathematics tractable and allows us to train these networks efficiently.
Think of it like an assembly line in a factory. Raw materials (input) come in, pass through various processing stages (hidden layers), and emerge as finished products (output).
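As a rough sketch of this layered flow (the layer sizes, random weights, and choice of ReLU as the nonlinearity are all illustrative, not prescriptive), the pass from input to output can be written as a short loop:

```python
import numpy as np

def forward(x, layers):
    """Pass input x through each (weights, bias) layer in turn."""
    for W, b in layers:
        x = np.maximum(0.0, W @ x + b)  # weighted sum, then a simple nonlinearity
    return x

rng = np.random.default_rng(seed=0)
# Input of size 4 -> one hidden layer of size 5 -> output of size 2
layers = [(rng.normal(size=(5, 4)), np.zeros(5)),
          (rng.normal(size=(2, 5)), np.zeros(2))]

result = forward(rng.normal(size=4), layers)
print(result.shape)  # (2,)
```

Each iteration of the loop is one "stage" of the assembly line: the data enters, is transformed by that layer, and is handed to the next.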
The Input to a Neuron
Before a neuron in an artificial network can produce output, it must receive input. The input to any neuron is calculated as a linear combination of the outputs from neurons in the previous layer.
In other words, if a neuron receives signals from multiple neurons in the previous layer, we multiply each incoming signal by a number (representing connection strength) and add them all together. Mathematically, this looks like:
$$z = w_1 \cdot x_1 + w_2 \cdot x_2 + \dots + w_n \cdot x_n + b$$
where $x_i$ are the incoming signals, $w_i$ are the weights (connection strengths), and $b$ is a bias term. This linear combination is the raw input that the neuron will process.
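For a concrete example with three inputs (the specific signal, weight, and bias values here are made up for illustration), the weighted sum can be computed directly:

```python
# One neuron's raw input: z = w1*x1 + w2*x2 + w3*x3 + b
x = [0.5, -1.0, 2.0]    # outputs from the previous layer
w = [0.5, 0.25, -0.5]   # connection weights
b = 0.25                # bias term

z = sum(wi * xi for wi, xi in zip(w, x)) + b
print(z)  # 0.25 - 0.25 - 1.0 + 0.25 = -0.75
```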
The Activation Function: Adding Nonlinearity
Here's a critical point that trips up many students: if we only used linear combinations, our entire network would just be one big linear transformation. We wouldn't be able to learn complex, nonlinear patterns in data—which defeats the purpose of having multiple layers.
The solution is the activation function. Each neuron applies an activation function to its input to produce output:
$$\text{output} = f(z)$$
where $f$ is some nonlinear function. Common activation functions include the ReLU (which outputs $\max(0, z)$) or the sigmoid function. The activation function is what allows the network to learn nonlinear relationships and solve complex problems.
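The two activation functions mentioned above are short enough to write out in full; this sketch simply evaluates each at a few points:

```python
import math

def relu(z):
    """ReLU: passes positive inputs through unchanged, zeroes out negatives."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
```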
Connection Weights: The Parameters That Matter
Everything a neural network does depends on its weights—the numbers that control the strength of each connection between neurons. In a large network, there might be millions or even billions of these weights.
The remarkable fact is that we don't manually set these weights. Instead, we have an automatic way to adjust them based on data, which brings us to training.
Training Neural Networks
The Goal of Training
When we first create a neural network, its weights are typically set to random values. Unsurprisingly, a randomly initialized network makes terrible predictions. The purpose of training is to automatically adjust all the weights in the network so that it learns to make good predictions on data we care about.
More formally, training means finding weights that minimize error on a dataset we provide. This is called empirical risk minimization: we minimize the empirical risk, meaning the average error measured on our training examples, as a proxy for the error the network would make on new data.
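To make the idea concrete, here is a small sketch of computing empirical risk; the dataset, candidate model, and choice of squared-error loss are invented for illustration:

```python
def empirical_risk(predict, loss, dataset):
    """Average loss of a model over a finite set of (input, target) pairs."""
    return sum(loss(predict(x), y) for x, y in dataset) / len(dataset)

def squared_error(prediction, target):
    return (prediction - target) ** 2

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0)]  # toy (input, target) pairs
model = lambda x: 2.0 * x                    # a candidate model

risk = empirical_risk(model, squared_error, data)
print(risk)  # (0 + 0 + 1) / 3, about 0.333
```

Training amounts to searching for the weights (here, the constant 2.0) that drive this average as low as possible.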
How Training Works: Backpropagation
The standard algorithm for training neural networks is called backpropagation. Here's the intuition:
We show the network an example from our training data
The network makes a prediction
We measure how wrong that prediction is
We use calculus to figure out how to adjust each weight to make the error smaller
We update the weights slightly
We repeat this process on many examples until the network learns
Backpropagation is efficient because it computes how much each weight contributed to the error, allowing us to update all millions of weights effectively. Without this algorithm, training deep neural networks would be computationally infeasible.
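Full backpropagation for a multi-layer network applies the chain rule across layers, but the weight-update idea in the steps above can be sketched with a single weight. The data and learning rate below are made up; the true relationship is $y = 3x$, so the weight should converge toward 3:

```python
# Fit y = w * x by repeated small weight updates.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w = 0.0      # start from an uninformed weight
lr = 0.01    # learning rate: how much to adjust per step

for _ in range(500):                 # repeat over many examples
    for x, y in data:
        prediction = w * x           # the network makes a prediction
        error = prediction - y       # measure how wrong it is
        gradient = 2 * error * x     # calculus: d/dw of (w*x - y)^2
        w -= lr * gradient           # update the weight slightly

print(round(w, 3))  # approaches 3.0
```

The loop mirrors the six steps listed above: predict, measure the error, use calculus to find the adjustment, nudge the weight, and repeat.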
Deep Neural Networks
What Makes a Network "Deep"?
You've probably heard the term "deep learning." The depth of a network refers to how many layers it has. A deep neural network has more than three layers in total, typically at least two hidden layers in addition to the input and output layers.
Why does depth matter? Networks with more layers can learn more complex hierarchical patterns. The early hidden layers might learn simple features (like edges in an image), middle layers learn combinations of those features (like shapes), and deeper layers learn very abstract concepts (like object categories).
However, deep networks are also harder to train effectively, which is why training algorithms and techniques have been a major focus of research over the past decade.
Flashcards
What is the basic definition of a neural network?
A group of interconnected units called neurons that send signals to one another.
What are the two main types of neural networks?
Biological neural networks
Artificial neural networks
How are biological neurons physically connected to one another?
Through synapses.
What are the electrochemical signals sent and received by biological neurons called?
Action potentials.
What does it mean for a biological neuron to serve an excitatory role?
It amplifies and propagates signals it receives.
What is the term for small groups of interconnected neurons?
Neural circuits.
How are neurons organized within the architecture of an artificial neural network?
They are arranged into layers.
In what order does information pass through the layers of an artificial neural network?
Input layer
One or more hidden layers
Output layer
What numerical value serves as the input to an individual artificial neuron?
A linear combination of the outputs of connected neurons in the previous layer.
What term refers to the strengths of the connections between artificial neurons?
Weights.
What is the primary objective of training an artificial neural network?
To modify connection weights to fit a preexisting dataset.
What common algorithm is used to update connection weights during training?
Backpropagation.
In addition to input and output layers, how many hidden layers does a deep neural network typically contain?
At least two.
Quiz
Foundations of Neural Networks Quiz Question 1: What characterizes a biological neural network?
- A population of neurons chemically linked by synapses (correct)
- A set of artificial nodes connected by wires
- Neurons that communicate only via electrical gaps
- An arrangement of isolated neurons with no connections
Foundations of Neural Networks Quiz Question 2: Which algorithm is commonly used to update weights during training of an artificial neural network?
- Backpropagation (correct)
- Gradient ascent
- K-means clustering
- Monte Carlo simulation
Foundations of Neural Networks Quiz Question 3: What is the primary function of an inhibitory neuron in a biological neural network?
- It suppresses signals it receives (correct)
- It amplifies signals it receives
- It stores long-term memories
- It transmits signals unchanged
Foundations of Neural Networks Quiz Question 4: What are the two primary categories of neural networks?
- Biological neural networks and artificial neural networks (correct)
- Linear regression models and decision trees
- Supervised learning systems and unsupervised learning systems
- Hardware circuits and software algorithms
Foundations of Neural Networks Quiz Question 5: What term describes a small group of interconnected neurons?
- Neural circuit (correct)
- Large-scale brain network
- Synaptic cleft
- Axonal bundle
Foundations of Neural Networks Quiz Question 6: In an artificial neural network, a group of neurons that share the same depth is called a what?
- Layer (correct)
- Cluster
- Node set
- Module
Foundations of Neural Networks Quiz Question 7: How is the value fed into an artificial neuron computed?
- As a linear combination of outputs from the previous layer (correct)
- By selecting a random constant for each neuron
- Through a fixed threshold independent of other neurons
- By reusing the neuron’s own output from the prior time step
Foundations of Neural Networks Quiz Question 8: What term refers to the connection strengths that govern an ANN’s behavior?
- Weights (correct)
- Biases
- Thresholds
- Learning rates
Foundations of Neural Networks Quiz Question 9: Which of the following statements does NOT describe a deep neural network?
- It contains only an input layer and an output layer (correct)
- It has more than three layers with at least two hidden layers
- It includes multiple hidden layers between input and output
- It typically has a hierarchy of layers beyond a shallow network
Key Concepts
Neural Network Fundamentals
Neural network
Neuron
Synapse
Biological neural network
Artificial neural network
Neural circuit
Artificial Neural Network Techniques
Activation function
Backpropagation
Deep neural network
Empirical risk minimization
Definitions
Neural network
A network of interconnected neurons that processes information, encompassing both biological and artificial systems.
Biological neural network
A network of living neurons linked by synapses that transmit electrochemical signals in the brain.
Artificial neural network
A computational model composed of layered artificial neurons that approximates nonlinear functions.
Neuron
The fundamental unit of the nervous system that transmits electrical and chemical signals.
Activation function
A mathematical function applied to a neuron's input to generate its output in an artificial neural network.
Backpropagation
An algorithm that trains artificial neural networks by propagating error gradients backward to update connection weights.
Deep neural network
An artificial neural network with multiple hidden layers, enabling hierarchical feature learning.
Synapse
The junction between two neurons where chemical signals are transmitted.
Neural circuit
A small, interconnected group of neurons that performs a specific functional task.
Empirical risk minimization
A learning principle that selects model parameters to minimize average loss on training data.