Foundations of Neural Networks
Understand the basics of neural networks, the distinction between biological and artificial types, and how artificial networks are structured, activated, and trained.
Summary
Overview of Neural Networks
Introduction
Neural networks are among the most important tools in modern artificial intelligence and machine learning. Whether you're processing images, understanding language, or making predictions, neural networks likely power the system behind the scenes. But what exactly is a neural network, and why is it so effective?
At their core, neural networks are systems composed of many simple computational units—called neurons—that work together to solve complex problems. The key insight is that while individual neurons are quite simple, when you connect many of them together in the right way, they can learn to perform remarkably sophisticated tasks. To understand how this works, we'll start by looking at how neural networks exist in nature, then explore how we replicate this idea mathematically.
Biological Neural Networks: Inspiration from the Brain
Structure and Communication
The human brain contains roughly 86 billion neurons, and these neurons are the foundation of all thought and behavior. A biological neural network is a connected population of these biological neurons, linked together by structures called synapses.
When you think, learn, or perceive the world, neurons communicate with each other through electrical and chemical signals called action potentials. A neuron receives signals from its neighbors, integrates that information, and then sends its own signal onward to other neurons. This chain of communication across billions of neurons is what creates consciousness and intelligence.
How Individual Neurons Work
Not all neurons do the same thing. When a neuron receives incoming signals, it can respond in two fundamentally different ways:
Excitatory neurons amplify and propagate signals they receive, strengthening the message as it passes forward
Inhibitory neurons suppress signals they receive, weakening or blocking messages
This is crucial: a neural network isn't just about connecting units together. The nature of each connection matters. Some connections strengthen signals, while others weaken them. This balance between excitation and inhibition is what allows biological neural networks to perform complex computations.
Organization at Scale
Neurons don't exist in isolation. In the brain, they're organized into different levels of structure:
Small groups of interconnected neurons form neural circuits, which handle specific tasks (like detecting motion or processing colors)
These circuits connect to form large-scale brain networks, which handle higher-level functions (like memory or decision-making)
<extrainfo>
The hierarchy of neural organization is an interesting detail about how brains are structured, but for most artificial intelligence courses, the key concept is simply that neurons connect together and their connection patterns determine what the network can do.
</extrainfo>
Artificial Neural Networks: The Mathematical Model
Why Artificial Neural Networks?
Biological neural networks are incredibly complex. A human brain has roughly 170 trillion synaptic connections—far too intricate to fully understand or replicate precisely. Instead of copying biology exactly, computer scientists created artificial neural networks: mathematical models that capture the essential ideas while being much simpler to build and work with.
The core idea is elegant: if we can arrange simple mathematical units (our artificial "neurons") and tune the strength of connections between them, we can build a system that learns from data and makes predictions. This is exactly what we do in practice.
Architecture: Layers of Neurons
The most common way to organize an artificial neural network is into layers:
The input layer receives data from the outside world
One or more hidden layers process this information
The output layer produces the final result
Information flows in one direction: from the input layer, through the hidden layers, and finally to the output layer. This unidirectional flow makes the mathematics tractable and allows us to train these networks efficiently.
Think of it like an assembly line in a factory. Raw materials (input) come in, pass through various processing stages (hidden layers), and emerge as finished products (output).
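As a rough sketch of this layered flow (the layer sizes, random weights, and choice of ReLU as the nonlinearity are all illustrative, not prescriptive), the pass from input to output can be written as a short loop:

```python
import numpy as np

def forward(x, layers):
    """Pass input x through each (weights, bias) layer in turn."""
    for W, b in layers:
        x = np.maximum(0.0, W @ x + b)  # weighted sum, then a simple nonlinearity
    return x

rng = np.random.default_rng(seed=0)
# Input of size 4 -> one hidden layer of size 5 -> output of size 2
layers = [(rng.normal(size=(5, 4)), np.zeros(5)),
          (rng.normal(size=(2, 5)), np.zeros(2))]

result = forward(rng.normal(size=4), layers)
print(result.shape)  # (2,)
```

Each iteration of the loop is one "stage" of the assembly line: the data enters, is transformed by that layer, and is handed to the next.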
The Input to a Neuron
Before a neuron in an artificial network can produce output, it must receive input. The input to any neuron is calculated as a linear combination of the outputs from neurons in the previous layer.
In other words, if a neuron receives signals from multiple neurons in the previous layer, we multiply each incoming signal by a number (representing connection strength) and add them all together. Mathematically, this looks like:
$$z = w_1 \cdot x_1 + w_2 \cdot x_2 + \dots + w_n \cdot x_n + b$$
where $x_i$ are the incoming signals, $w_i$ are the weights (connection strengths), and $b$ is a bias term. This linear combination is the raw input that the neuron will process.
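For a concrete example with three inputs (the specific signal, weight, and bias values here are made up for illustration), the weighted sum can be computed directly:

```python
# One neuron's raw input: z = w1*x1 + w2*x2 + w3*x3 + b
x = [0.5, -1.0, 2.0]    # outputs from the previous layer
w = [0.5, 0.25, -0.5]   # connection weights
b = 0.25                # bias term

z = sum(wi * xi for wi, xi in zip(w, x)) + b
print(z)  # 0.25 - 0.25 - 1.0 + 0.25 = -0.75
```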
The Activation Function: Adding Nonlinearity
Here's a critical point that trips up many students: if we only used linear combinations, our entire network would just be one big linear transformation. We wouldn't be able to learn complex, nonlinear patterns in data—which defeats the purpose of having multiple layers.
The solution is the activation function. Each neuron applies an activation function to its input to produce output:
$$\text{output} = f(z)$$
where $f$ is some nonlinear function. Common activation functions include the ReLU (which outputs $\max(0, z)$) or the sigmoid function. The activation function is what allows the network to learn nonlinear relationships and solve complex problems.
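The two activation functions mentioned above are short enough to write out in full; this sketch simply evaluates each at a few points:

```python
import math

def relu(z):
    """ReLU: passes positive inputs through unchanged, zeroes out negatives."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
```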
Connection Weights: The Parameters That Matter
Everything a neural network does depends on its weights—the numbers that control the strength of each connection between neurons. In a large network, there might be millions or even billions of these weights.
The remarkable fact is that we don't manually set these weights. Instead, we have an automatic way to adjust them based on data, which brings us to training.
Training Neural Networks
The Goal of Training
When we first create a neural network, its weights are typically set to random values. Unsurprisingly, a randomly initialized network makes terrible predictions. The purpose of training is to automatically adjust all the weights in the network so that it learns to make good predictions on data we care about.
More formally, training means finding weights that minimize error on a dataset we provide. This is called empirical risk minimization: we minimize the empirical risk, meaning the average error measured on our training examples, as a proxy for the error the network would make on new data.
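To make the idea concrete, here is a small sketch of computing empirical risk; the dataset, candidate model, and choice of squared-error loss are invented for illustration:

```python
def empirical_risk(predict, loss, dataset):
    """Average loss of a model over a finite set of (input, target) pairs."""
    return sum(loss(predict(x), y) for x, y in dataset) / len(dataset)

def squared_error(prediction, target):
    return (prediction - target) ** 2

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0)]  # toy (input, target) pairs
model = lambda x: 2.0 * x                    # a candidate model

risk = empirical_risk(model, squared_error, data)
print(risk)  # (0 + 0 + 1) / 3, about 0.333
```

Training amounts to searching for the weights (here, the constant 2.0) that drive this average as low as possible.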
How Training Works: Backpropagation
The standard algorithm for training neural networks is called backpropagation. Here's the intuition:
We show the network an example from our training data
The network makes a prediction
We measure how wrong that prediction is
We use calculus to figure out how to adjust each weight to make the error smaller
We update the weights slightly
We repeat this process on many examples until the network learns
Backpropagation is efficient because it computes how much each weight contributed to the error, allowing us to update all millions of weights effectively. Without this algorithm, training deep neural networks would be computationally infeasible.
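Full backpropagation for a multi-layer network applies the chain rule across layers, but the weight-update idea in the steps above can be sketched with a single weight. The data and learning rate below are made up; the true relationship is $y = 3x$, so the weight should converge toward 3:

```python
# Fit y = w * x by repeated small weight updates.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w = 0.0      # start from an uninformed weight
lr = 0.01    # learning rate: how much to adjust per step

for _ in range(500):                 # repeat over many examples
    for x, y in data:
        prediction = w * x           # the network makes a prediction
        error = prediction - y       # measure how wrong it is
        gradient = 2 * error * x     # calculus: d/dw of (w*x - y)^2
        w -= lr * gradient           # update the weight slightly

print(round(w, 3))  # approaches 3.0
```

The loop mirrors the six steps listed above: predict, measure the error, use calculus to find the adjustment, nudge the weight, and repeat.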
Deep Neural Networks
What Makes a Network "Deep"?
You've probably heard the term "deep learning." The depth of a network refers to how many layers it has. A deep neural network has more than three layers in total, typically at least two hidden layers in addition to the input and output layers.
Why does depth matter? Networks with more layers can learn more complex hierarchical patterns. The early hidden layers might learn simple features (like edges in an image), middle layers learn combinations of those features (like shapes), and deeper layers learn very abstract concepts (like object categories).
However, deep networks are also harder to train effectively, which is why training algorithms and techniques have been a major focus of research over the past decade.
Flashcards
What is the basic definition of a neural network?
A group of interconnected units called neurons that send signals to one another.
What are the two main types of neural networks?
Biological neural networks
Artificial neural networks
How are biological neurons physically connected to one another?
Through synapses.
What are the electrochemical signals sent and received by biological neurons called?
Action potentials.
What does it mean for a biological neuron to serve an excitatory role?
It amplifies and propagates signals it receives.
What is the term for small groups of interconnected neurons?
Neural circuits.
How are neurons organized within the architecture of an artificial neural network?
They are arranged into layers.
In what order does information pass through the layers of an artificial neural network?
Input layer
One or more hidden layers
Output layer
What numerical value serves as the input to an individual artificial neuron?
A linear combination of the outputs of connected neurons in the previous layer.
What term refers to the strengths of the connections between artificial neurons?
Weights.
What is the primary objective of training an artificial neural network?
To modify connection weights to fit a preexisting dataset.
What common algorithm is used to update connection weights during training?
Backpropagation.
In addition to input and output layers, how many hidden layers does a deep neural network typically contain?
At least two.
Quiz
Foundations of Neural Networks Quiz Question 1: What characterizes a biological neural network?
- A population of neurons chemically linked by synapses (correct)
- A set of artificial nodes connected by wires
- Neurons that communicate only via electrical gaps
- An arrangement of isolated neurons with no connections
Foundations of Neural Networks Quiz Question 2: Which algorithm is commonly used to update weights during training of an artificial neural network?
- Backpropagation (correct)
- Gradient ascent
- K-means clustering
- Monte Carlo simulation
Foundations of Neural Networks Quiz Question 3: What is the primary function of an inhibitory neuron in a biological neural network?
- It suppresses signals it receives (correct)
- It amplifies signals it receives
- It stores long-term memories
- It transmits signals unchanged
Foundations of Neural Networks Quiz Question 4: What are the two primary categories of neural networks?
- Biological neural networks and artificial neural networks (correct)
- Linear regression models and decision trees
- Supervised learning systems and unsupervised learning systems
- Hardware circuits and software algorithms
Foundations of Neural Networks Quiz Question 5: What term describes a small group of interconnected neurons?
- Neural circuit (correct)
- Large-scale brain network
- Synaptic cleft
- Axonal bundle
Foundations of Neural Networks Quiz Question 6: In an artificial neural network, a group of neurons that share the same depth is called a what?
- Layer (correct)
- Cluster
- Node set
- Module
Foundations of Neural Networks Quiz Question 7: How is the value fed into an artificial neuron computed?
- As a linear combination of outputs from the previous layer (correct)
- By selecting a random constant for each neuron
- Through a fixed threshold independent of other neurons
- By reusing the neuron’s own output from the prior time step
Foundations of Neural Networks Quiz Question 8: What term refers to the connection strengths that govern an ANN’s behavior?
- Weights (correct)
- Biases
- Thresholds
- Learning rates
Foundations of Neural Networks Quiz Question 9: Which of the following statements does NOT describe a deep neural network?
- It contains only an input layer and an output layer (correct)
- It has more than three layers with at least two hidden layers
- It includes multiple hidden layers between input and output
- It typically has a hierarchy of layers beyond a shallow network
Key Concepts
Neural Network Fundamentals
Neural network
Neuron
Synapse
Biological neural network
Artificial neural network
Neural circuit
Artificial Neural Network Techniques
Activation function
Backpropagation
Deep neural network
Empirical risk minimization
Definitions
Neural network
A network of interconnected neurons that processes information, encompassing both biological and artificial systems.
Biological neural network
A network of living neurons linked by synapses that transmit electrochemical signals in the brain.
Artificial neural network
A computational model composed of layered artificial neurons that approximates nonlinear functions.
Neuron
The fundamental unit of the nervous system that transmits electrical and chemical signals.
Activation function
A mathematical function applied to a neuron's input to generate its output in an artificial neural network.
Backpropagation
An algorithm that trains artificial neural networks by propagating error gradients backward to update connection weights.
Deep neural network
An artificial neural network with multiple hidden layers, enabling hierarchical feature learning.
Synapse
The junction between two neurons where chemical signals are transmitted.
Neural circuit
A small, interconnected group of neurons that performs a specific functional task.
Empirical risk minimization
A learning principle that selects model parameters to minimize average loss on training data.