The idea is to unroll the equality a random variable at a time, working backwards from t to 1. The probability
is equivalent to the conditional probability
by the Markov property. The equality
provides a tool to pick apart the conditional probability by defining
B = { Xt = st }
C = X1 = s1
intersection(A, B) represents the intersection of all Markov-chain instances that have st as their t-th element and all Markov-chain instances that have s2 through st-1 as their second through t-1th elements; that is, it's the Markov chain { X2, ..., Xt } = { s2, ..., st }.
The conditional probability can be rewritten as
By the Markov property,
and P(Xt = st | Xt-1 = st-1) is equal to the single-step transition probability pst-1st. The conditional probability
can now be rewritten as
and that's the first turn of the crank. Turning the crank t-2 more times reduces the conditional probability to
P(X1 = s1) is given by the Markov chain's initial distribution.
This page last modified on 2 March 2005.