RemNote Community

Markov chain - Advanced Theory and Continuous‑Time Chains

Understand ergodic behavior and convergence of discrete‑time chains, the core theory of continuous‑time Markov chains, and key concepts such as hitting times and reversibility.


Summary

Ergodicity and Convergence

What Makes a State Ergodic?

An ergodic state is one that has two key properties: it is aperiodic (meaning the chain doesn't return to it at regular intervals) and positive recurrent (meaning the chain returns to it in finite expected time). When all states in a chain are ergodic, the chain itself is called ergodic. This is important because ergodic chains have the nice property of eventually "forgetting" their initial state and settling into a long-run behavior.

Convergence to the Stationary Distribution

One of the most important results in Markov chain theory concerns what happens as time goes on. For an irreducible, aperiodic chain with a finite state space, the powers of the transition matrix converge to a limiting distribution. Specifically, as $n \to \infty$, each row of the matrix $P^n$ converges to the same stationary distribution $\pi$. This means that no matter where you start, the probability of being in any particular state eventually becomes independent of your starting state: the chain "mixes" and reaches equilibrium.

How fast does convergence happen? The rate of convergence is determined by the second-largest eigenvalue magnitude of the transition matrix. If $\lambda_1 = 1$ is the largest eigenvalue and $\lambda_2$ is the second-largest in absolute value, then the convergence speed is proportional to:

$$\left(\frac{|\lambda_2|}{|\lambda_1|}\right)^n$$

This tells us that eigenvalues close to 1 (in absolute value) result in slow convergence, while eigenvalues close to 0 result in rapid mixing.

The Perron–Frobenius Theorem

The Perron–Frobenius theorem is a powerful result that guarantees the existence and uniqueness of a stationary distribution.
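
The convergence claim is easy to check numerically. The sketch below uses an invented 3-state transition matrix (the numbers are not from the text): it extracts the stationary distribution as the left eigenvector for eigenvalue 1, confirms that every row of $P^n$ approaches it, and reads off $|\lambda_2|$.

```python
import numpy as np

# A small convergence sketch; the 3-state transition matrix is invented.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Stationary distribution: left eigenvector of P for eigenvalue 1, normalized.
eigvals, eigvecs = np.linalg.eig(P.T)
idx = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, idx])
pi = pi / pi.sum()

# Each row of P^n approaches pi; the error decays like |lambda_2|^n.
Pn = np.linalg.matrix_power(P, 50)
assert np.allclose(Pn, np.tile(pi, (3, 1)))

# The second-largest eigenvalue magnitude sets the mixing rate.
lam2 = sorted(np.abs(eigvals), reverse=True)[1]
```

For this particular matrix the eigenvalues are 1, 0.3, and 0.2, so the distance to equilibrium shrinks roughly like $0.3^n$: fast mixing.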
For a finite, irreducible, aperiodic Markov chain, the theorem guarantees:

- There exists a unique positive stationary distribution $\pi$ (all components are positive)
- This distribution satisfies $\pi = \pi P$
- The chain will converge to this distribution regardless of initial conditions

This theorem eliminates any ambiguity: for well-behaved chains, there is exactly one long-run behavior, and the chain will reach it.

Continuous-Time Markov Chains

Continuous-time Markov chains (CTMCs) extend the discrete-time framework to processes that can change state at any point in time, not just at discrete intervals. They are essential for modeling real-world phenomena like customer arrivals, equipment failures, and chemical reactions.

The Transition Rate Matrix

Instead of a transition matrix, continuous-time chains are characterized by a transition rate matrix (also called the generator matrix), denoted $Q$. The off-diagonal elements $q_{ij}$ represent the instantaneous rate at which the chain transitions from state $i$ to state $j$.

Key property: each row of $Q$ sums to zero. The diagonal elements are given by:

$$q_{ii} = -\sum_{j \neq i} q_{ij}$$

This ensures that the total rate of leaving any state equals the sum of the rates going to all other states.

The Infinitesimal Definition

The heart of continuous-time chains lies in how we define transitions over small time increments. For a small time interval $h$:

$$P(X_{t+h}=j \mid X_t=i) = q_{ij}h + o(h) \quad \text{for } i \neq j$$

$$P(X_{t+h}=i \mid X_t=i) = 1 - \left(\sum_{j \neq i} q_{ij}\right)h + o(h)$$

The $o(h)$ term represents "higher-order small quantities": terms that become negligible compared to $h$ as $h \to 0$. Because the transition rates are constant over time, CTMCs have the Markov property at all times, not just at specific intervals.
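
A minimal sketch of the two generator-matrix facts above, using an invented 3-state rate matrix: the rows sum to zero, and for small $h$ the approximation $P(h) \approx I + Qh$ turns rates into transition probabilities.

```python
import numpy as np

# Hypothetical 3-state generator matrix Q; the rates are invented for illustration.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -4.0,  3.0],
              [ 2.0,  2.0, -4.0]])

# Defining property: every row of Q sums to zero.
assert np.allclose(Q.sum(axis=1), 0.0)

# Infinitesimal definition: for small h, P(h) is approximately I + Q*h.
h = 1e-4
P_h = np.eye(3) + Q * h
# Off-diagonal entries are q_ij * h; the diagonal entry is 1 minus the total
# rate of leaving the state times h, so each row of P_h sums to 1.
assert np.allclose(P_h.sum(axis=1), 1.0)
```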
Jump Chain and Holding Times

One of the most useful ways to understand a continuous-time chain is to decompose it into two parts:

The Jump Chain (Embedded Markov Chain): This is the discrete-time chain formed by recording only the sequence of states the process visits. Each time a transition occurs, the jump chain records which state was entered. The transition probabilities of the jump chain are:

$$s_{ij} = \frac{q_{ij}}{-q_{ii}} \quad \text{for } i \neq j$$

Note that $s_{ii} = 0$: the chain must move to a different state at each jump.

Holding Times: Between jumps, the process stays in the current state for a random duration. The holding time in state $i$ follows an exponential distribution with rate $-q_{ii}$. This exponential holding time is a defining feature of CTMCs and reflects the "memoryless" nature of the process: the remaining holding time doesn't depend on how long the process has already been in that state.

<extrainfo> Finding the Stationary Distribution of a CTMC The stationary distribution of a continuous-time chain solves $\pi Q = 0$ (together with $\sum_i \pi_i = 1$). You can also work with the embedded jump chain: solve $\psi S = \psi$ for the jump chain's stationary distribution $\psi$, then weight each component by the expected holding time, $\pi_i \propto \psi_i / (-q_{ii})$, and normalize. Note that $\psi$ itself is generally not stationary for the continuous-time chain; the two coincide only when all holding rates are equal. </extrainfo>

The Forward Equation (Kolmogorov Differential Equation)

The evolution of transition probabilities in a continuous-time chain is governed by a differential equation. Let $P(t)$ be the matrix of transition probabilities at time $t$ (where $P(t)_{ij}$ is the probability of being in state $j$ at time $t$ given that you started in state $i$). The forward equation is:

$$\frac{dP(t)}{dt} = P(t)Q$$

with initial condition $P(0) = I$ (the identity matrix, since you start in your initial state with probability 1). There is also an equivalent backward equation:

$$\frac{dP(t)}{dt} = QP(t)$$

Both equations have the same solution form: $P(t) = e^{Qt}$, where $e^{Qt}$ is the matrix exponential.
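
The jump-chain/holding-time decomposition doubles as a simulation recipe: draw an exponential holding time at rate $-q_{ii}$, then jump according to $S$. The sketch below (the generator matrix is invented) simulates a CTMC this way and compares the fraction of time spent in each state against the exact stationary distribution from $\pi Q = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state generator matrix; the rates are invented for illustration.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -4.0,  3.0],
              [ 2.0,  2.0, -4.0]])
rates = -np.diag(Q)                  # holding-time rates -q_ii

# Jump chain: s_ij = q_ij / (-q_ii) for i != j, and s_ii = 0.
S = Q / rates[:, None]
np.fill_diagonal(S, 0.0)

# Simulate: exponential holding time in the current state, then jump via S.
state = 0
time_in_state = np.zeros(3)
for _ in range(100_000):
    time_in_state[state] += rng.exponential(1.0 / rates[state])
    state = rng.choice(3, p=S[state])
empirical = time_in_state / time_in_state.sum()

# Exact stationary distribution: solve pi Q = 0 together with sum(pi) = 1.
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
```

After 100,000 jumps the time-weighted occupancy `empirical` should sit within a percent or two of `pi`; note that it is the time-weighted fraction, not the raw visit counts of the jump chain, that converges to the CTMC's stationary distribution.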
The forward equation is the continuous-time analogue of iterating the transition matrix $P$ repeatedly. It tells us exactly how the transition probabilities change over time, making it possible to compute probabilities at any time $t$.

Key Additional Concepts

Hitting Times

A hitting time is the first moment at which a Markov chain enters a particular set of states. Formally, if $A$ is a set of states, the hitting time is:

$$\tau_A = \min\{t : X_t \in A\}$$

We often want to know the expected hitting time: on average, how long until the chain reaches set $A$ starting from state $i$? To find expected hitting times, denote by $k_i$ the expected time to reach set $A$ from state $i$ (where $i \notin A$). These values satisfy a system of linear equations:

$$k_i = 1 + \sum_j p_{ij} k_j$$

where the sum is over all states $j \notin A$, and $k_j = 0$ for $j \in A$. This equation says: from state $i$, it takes 1 step plus the expected remaining time after that step.

Time Reversal and Reversibility

Here's a surprising fact: if you observe a Markov chain running backward in time, the resulting process is also a Markov chain. This reversed process has different transition probabilities in general, but it maintains the Markov property.

A chain is reversible (or time-reversible) if the reversed chain has the same transition probabilities as the forward chain. This occurs when the detailed-balance equations hold:

$$\pi_i p_{ij} = \pi_j p_{ji}$$

for all pairs of states $i$ and $j$. These equations have a natural interpretation: the probability of being in state $i$ and transitioning to state $j$ equals the probability of being in state $j$ and transitioning to state $i$. In other words, the flow from $i$ to $j$ balances the flow from $j$ to $i$ at equilibrium.

A key advantage: if you can verify the detailed-balance equations for a candidate distribution, you automatically know both the stationary distribution and that the chain is reversible.
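
The hitting-time system is just a small linear solve. A sketch, using an invented 3-state transition matrix and target set $A = \{2\}$: restrict $P$ to the states outside $A$ and solve $(I - P_{\text{rest}})\,k = \mathbf{1}$, which is the matrix form of $k_i = 1 + \sum_{j \notin A} p_{ij} k_j$.

```python
import numpy as np

# Hypothetical chain and target set A = {2}; both are invented for illustration.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
rest = [0, 1]                          # the states outside A

# k_i = 1 + sum_{j not in A} p_ij * k_j  <=>  (I - P_rest) k = 1
P_rest = P[np.ix_(rest, rest)]
k = np.linalg.solve(np.eye(len(rest)) - P_rest, np.ones(len(rest)))
# k[0], k[1] are the expected numbers of steps to reach A from states 0 and 1.
```

For this matrix both expected hitting times come out to exactly 5 steps, which you can confirm by substituting back: $5 = 1 + 0.5 \cdot 5 + 0.3 \cdot 5$.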
<extrainfo> Examples of Continuous-Time Markov Processes Two important examples of continuous-time Markov processes are the Wiener process (standard Brownian motion) and the Poisson process. The Wiener process has continuous sample paths with Gaussian increments, while the Poisson process counts discrete arrivals with exponentially distributed inter-arrival times. Both are fundamental in probability theory and have the Markov property at all times. </extrainfo>
Flashcards
What two conditions must a state satisfy to be considered ergodic?
Aperiodic and positive recurrent
To what does the transition matrix power converge for an irreducible, aperiodic chain with finite state space?
A matrix whose rows are the stationary distribution
What value governs the rate of convergence to the stationary distribution?
The magnitude of the second-largest eigenvalue ($|\lambda_2|$)
What is the formula for the convergence speed of the transition matrix, where $\lambda_1 = 1$?
$(|\lambda_2| / |\lambda_1|)^n$
What does the Perron–Frobenius theorem guarantee for a finite, irreducible, aperiodic chain?
A unique positive stationary distribution
What does the off-diagonal element $q_{ij}$ represent in a transition rate matrix?
The instantaneous rate of moving from state $i$ to state $j$
In a Continuous-Time Markov Chain, what is the infinitesimal probability $P(X_{t+h}=j \mid X_t=i)$ for a small time $h$ where $i \neq j$?
$q_{ij}h + o(h)$
What is the name of the discrete-time chain that records the sequence of states visited at each jump?
The jump chain (or embedded Markov chain)
What probability distribution does the holding time in state $i$ follow?
Exponential distribution with rate $-q_{ii}$
What is the initial condition $P(0)$ for the Kolmogorov Differential Equation?
$I$ (the identity matrix)
How is the hitting time of a set $A$ defined for a Markov chain?
The first time the chain enters $A$ starting from a given state
What linear system do the expected hitting times $k_i$ satisfy for states $i \notin A$?
$k_i = 1 + \sum_j p_{ij} k_j$ (sum over $j \notin A$, with $k_j = 0$ for $j \in A$)
What specific equations must hold for a Markov chain to be considered reversible?
Detailed-balance equations: $\pi_i p_{ij} = \pi_j p_{ji}$
How are the transition probabilities $s_{ij}$ of the embedded chain calculated from the rate matrix elements $q_{ij}$ and $q_{ii}$?
$s_{ij} = q_{ij} / (-q_{ii})$ (for $i \neq j$)
How can the stationary distribution of a continuous-time chain be derived using its embedded chain $S$?
By solving $\psi S = \psi$ for the jump chain, weighting each component by the expected holding time ($\pi_i \propto \psi_i / (-q_{ii})$), and normalizing

Key Concepts
Markov Chains and Properties
Ergodic state
Stationary distribution
Perron–Frobenius theorem
Continuous‑time Markov chain
Transition rate matrix
Embedded (jump) chain
Kolmogorov forward equation
Hitting time
Reversibility (detailed balance)
Stochastic Processes
Brownian motion
Poisson process