One of the classes that I’m taking this semester is on stochastic processes. Stochastic processes are sequences of random variables $X_0, X_1, X_2, …$, where the index $n$ of $X_n$ represents a specific moment in time.[^1] The first topic we’ve focused on this semester has been Markov chains, stochastic models in which the next state depends only on the current state. They are unique in that future events are only affected by where you currently are, not where you have been in the past. Markov chains build on the law of total probability, which expresses the probability of an event as a sum over the separate cases that could lead to it. In general, when dealing with a stochastic process, you have to condition the probability of an event on all previous events, as they can all have some effect on whether or not you reach that state. Mathematically, this can be expressed as
$$P[X_{n+1} = m \mid X_0 = i, X_1 = j, …, X_n = k]$$
However, with Markov chains, we only need to focus on the most recent event.
$$P[X_{n+1} = m \mid X_n = k]$$
This is a remarkable simplification. We are able to ignore all previous events and condition our probability on $X_n$. This greatly simplifies both the conceptual understanding of each probability and the number of calculations required to obtain it.
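To make the simplification concrete, here is a minimal sketch of simulating a chain in Python. The three dinner states and every probability in `TRANSITIONS` are made up for illustration, not measured from anything. The point is that `step` looks only at the current state, never at the rest of the path.

```python
import random

# Hypothetical states and made-up probabilities (assumptions for illustration).
# TRANSITIONS[current][nxt] = P[X_{n+1} = nxt | X_n = current]
TRANSITIONS = {
    "ramen": {"ramen": 0.2, "pasta": 0.5, "salad": 0.3},
    "pasta": {"ramen": 0.4, "pasta": 0.1, "salad": 0.5},
    "salad": {"ramen": 0.6, "pasta": 0.3, "salad": 0.1},
}


def step(current, rng):
    """Sample the next state using only the current state."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]


def simulate(start, n_steps, seed=0):
    """Return a path of n_steps + 1 states, beginning at `start`."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    path = [start]
    for _ in range(n_steps):
        # Only path[-1] is passed along: earlier history is ignored.
        path.append(step(path[-1], rng))
    return path


print(simulate("ramen", 7))
```

A process without the Markov property would need the whole `path` passed into `step`, which is exactly the extra conditioning the first formula above requires.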
Markov chains appear everywhere: the Wikipedia page for Markov chains lists several general examples, including random walks, board games played with dice (assuming players have no agency), weather predictions, and the stock market.[^2] Other examples we’ve gone over in class include bacteria reproduction, the PageRank algorithm, gambler’s ruin, and random walks. However, I’ve become curious: what are some nonobvious things in my life specifically that I could potentially model with Markov chains?
Some qualifiers first: I will focus only on chains with discrete states that I can quantify (Snakes & Ladders has 100 squares, gamblers have $N$ dollars, etc.). I also want to pick things about which I can answer a meaningful question. Some examples of these questions could include:
Below are some events I came up with:
After going through all of these examples, one might see what they have in common: human behavior. I am a human, and though I could model any one of these processes with a Markov chain, some would fit better than others (and none would be perfect). Just to pick apart two examples, what I eat for dinner is affected by more than what I ate the night before: it depends on what food I have left in the fridge, where I am eating dinner, who I am eating with, whether I’ve eaten lunch, etc. Additionally, what I choose to study might be more directly affected by what is due soon than by what I studied the day before. If I have a test in a class the next day, I’ll be more focused on that than on the class I’m usually doing homework for that day.
This is why Markov chains are often more useful when there are fewer factors affecting the outcome of an event. If this is the case, it is more likely that the previous event is the determining factor. The closer a situation is to a board game relying on the outcome of a dice roll, the better it can be modeled by these types of Markov chains. There are other Markov processes, such as hidden Markov models, that are built on different assumptions (for example, what if you can’t observe the states of the process directly?).
Note also that some questions one might answer with Markov chains in general are not useful here. Rarely do I have an activity where I reach an absorbing state, that is, a state I never leave once I enter it. The probability that I end up in a particular state at some point is not as useful as the probability that I do so before a certain time. Also, the expected time to complete some process is not useful for any of the examples I have picked. When will I “finish” running, or eating dinner?
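The time-bounded question above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the states and probabilities in `P` are invented, and `hit_within` is a hypothetical helper name. It tracks the probability mass that has not yet reached the target and, at each step, moves whatever arrives at the target into a running total.

```python
# Hypothetical states and made-up probabilities (assumptions for illustration).
# P[current][nxt] = P[X_{n+1} = nxt | X_n = current]
P = {
    "ramen": {"ramen": 0.2, "pasta": 0.5, "salad": 0.3},
    "pasta": {"ramen": 0.4, "pasta": 0.1, "salad": 0.5},
    "salad": {"ramen": 0.6, "pasta": 0.3, "salad": 0.1},
}


def hit_within(P, start, target, n_steps):
    """Probability of visiting `target` at some step 1..n_steps, given X_0 = start."""
    # Mass on each state for paths that have not yet reached the target.
    alive = {s: 0.0 for s in P}
    alive[start] = 1.0
    hit = 0.0
    for _ in range(n_steps):
        nxt = {s: 0.0 for s in P}
        for s, mass in alive.items():
            for t, p in P[s].items():
                if t == target:
                    hit += mass * p  # first arrival: stop tracking this mass
                else:
                    nxt[t] += mass * p
        alive = nxt
    return hit


# Chance of eating ramen on at least one of the next two nights after pasta.
print(round(hit_within(P, "pasta", "ramen", 2), 2))  # 0.74
```

With these invented numbers the result is 0.4 (ramen tomorrow) plus 0.1 × 0.4 and 0.5 × 0.6 (ramen the night after, via pasta or salad), which gives 0.74.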
Though Markov chains are models (not exact calculations), they can prove very useful in other situations, such as applications of random walks, modeling Brownian motion, and many other areas. However, it is important not to forget that they are a tool that can be applied to whatever we choose, depending on what process we are trying to model, even when that means trying to figure out how likely I am to eat ramen tomorrow.
[^1]: One way to understand what the term stochastic process refers to is to break it down word by word. If something is stochastic, it means that it is inherently random, and that it can be described by a probability distribution. A process occurs over time. Putting those together, you get a sequence of events that are time-dependent, where each event can be modeled by some probability distribution (and represented as a random variable). Because each event is time-dependent, future states can be dependent on current and past states, allowing one to model the whole system.
[^2]: Note how each row of the transition matrix sums to 1. This is because of the law of total probability: given that we are in state $X_n = j$, the chain must be in exactly one state $i$ at the next step, so the events $X_{n+1} = i$ partition all the possibilities. That is, $\sum_i P[X_{n+1} = i \mid X_n = j] = 1$.