Each year, some fraction of Seattle's population moves to Portland. And some fraction of Portland's population moves back to Seattle. If this keeps up year after year, what happens in the long run? Will the populations reach a steady state, or will one city empty out into the other?
These are questions we can answer with simple linear algebraic models. In this post I'll explain how to set up models for problems like this, and how to solve them for their long-run values.
Setting Up the Problem
Here are some basic facts. Imagine Seattle and Portland both have populations of 500,000. Each year, 5 percent of Seattle's population moves to Portland and 20 percent of Portland's population moves to Seattle. The rest don't move. This pattern repeats every year, forever.
The first step is to write this as a linear system. The idea is that we have a system -- in this case, the population in both cities -- that starts at an initial state u0 at time t = 0. At time t = 1, some transformation happens, taking us to state u1. Then the transformation is repeated at time t = 2, 3, ... bringing us to states u2, u3, and so on. The question we're after is what does the system look like after k steps, and what happens in the long run when k gets large?
In our problem, we can write the initial state u0 as a vector of populations in the two cities. The first entry is Seattle's population and the second entry is Portland's:
The population shifts between the two cities -- that is, the transformation that happens at time t = 1, 2, ... -- can be described by this matrix:
Here's how to read this. The first column and first row describe Seattle's population, and the second column and second row describe Portland's. Reading down the first column, it says 95 percent of Seattle's population stays in Seattle, while 5 percent moves to Portland. Similarly, the second column tells us that 20 percent of Portland's population moves to Seattle, while 80 percent stays at home. Since the columns of A all sum to 1, this is what's known as a Markov matrix.
Putting these together, our model works like this. Start with the vector describing the initial state, u0. To move to state u1, we multiply by the matrix A. That is,
To move to state u2, u3, ... we again multiply by A, or
That's our basic model describing the population in the two cites after k steps. In the rest of this post, I'll explain how to solve it for the long-run equilibrium.
For any linear system, the key to understanding how it behaves over time is through eigenvalues and eigenvectors. We need to get to the bottom of what the matrix A is doing to the system each time it is multiplied. The way to see that is by examining A's eigenvalues and eigenvectors.
The first step is to find the eigenvalues of A. Since we've got a 2x2 matrix, we'll normally expect to find two of them.
Here's how to find them. Start with the definition of eigenvalues as,
To solve for we first rewrite this as
Looking at this expression, we definitely know that the matrix isn't invertible for any non-zero vector x. If isn't invertible, that means its determinant must be zero. Using that fact, we can solve for the eigenvalues.
In our case, we have a 2x2 matrix which has a pretty simple determinant,
That last equation is called the "characteristic polynomial" of A. It's what we solve to find the eigenvalues. In this case, it's a quadratic since we're working with a 2x2 matrix, which will have two solutions. If the problem was for n cities instead, we'd end up with a characteristic polynomial of degree n.
In this case, we have an easy solution to the polynomial:
These two s are the two eigenvalues of A.
It's no coincidence that one of our eigenvalues is 1. That's always the case with Markov matrices. And that makes the characteristic polynomial easy to solve. We always know that the trace of A -- the sum of A's elements down the diagonal -- is equal to the sum of the eigenvalues. If one of them is 1, then the other must be 1.75 minus 1 or .75. Those are our two solutions, so we can immediately factor and solve.
Now that we have the two eigenvalues, the next step is to find the eigenvectors x1 and x2. Here's how we do that. For each eigenvalue and , we find the null space of the matrix . That is, we find the vectors x that make .
We do this by using row reduction on until we get it in row echelon form. Then we can pick off the pivots and free variables, and work out the eigenvectors.
First take the case of . Then the matrix becomes
Adding row one to row two, we get
So our equation becomes,
Since -.05 is attached to our "pivot" and .20 is attached to our "free" variable, we set our free variable x2 equal to 1 and solve for the pivot variable x1, or
So our first eigenvector x1 is
Following the same procedure for the other eigenvalue , we find the second eigenvector x2 is
Creating S and Λ Matrices
The next step is to form a new matrix S which has the eigenvectors found above as its columns. That is, let
Also, let's form a new matrix which has the eigenvalues we found above along the diagonal and zeros elsewhere. That is, let
We know from the definition of eigenvalues and eigenvectors, . Now that we've got our eigenvectors in the columns of the matrix S, and our eigenvalues are along the diagonal of , we can rewrite our basic relationship as
Solving this for A, we can see an important relationship as the system moves from one state to the next:
Note what happens to this equation if we multiply by A over and over, moving from the first state u1 to the second, third, and so on:
Our original model was in the form . But now we've found that just equals . So we can rewrite our model as
But we can compress this model even further. It's possible to write the initial state vector u0 as a linear combination of the two eigenvectors x1 and x2. That is, we can write
which can be solved for c as
This expression for c can now be substituted back into the model above to get
This is the model we've been working to find. It says the kth state of our model is equal to the matrix of eigenvectors S times the matrix of eigenvalues raised to the power of k, times some vector c that gives combinations of them.
Interpreting the Model
Expanding out the expression , we can see better why this is a useful way to model our population problem,
What this is saying is that the kth state of our model is just the sum of each of the eigenvectors, times their corresponding eigenvalues raised to the power of k, times some constant c. Written this way, it's easy to see where this model is headed in the long run. All changes as we move forward in time are determined by the eigenvalues and .
If the absolute value of an eigenvalue is less than one -- that is, if -1 < < 1 -- it will rapidly decay and go to zero as k grows. If the absolute value is > 1, it will explode and grow without bound as k grows. And if it equals one, it's what we call a "steady state" of the model since it remains unchanged as k grows.
In our case, because we set up our problem using population proportions that always sum to one, our matrix A was a Markov matrix. It always has an eigenvalue that equals one, and the rest have absolute values less than one. That means our model is guaranteed to have a steady state, as any other factors influencing the model will decay and disappear over time as k grows.
Plugging in the values for our population problem above, here's our actual solution,
At this point, it's easy to see what will happen to this model in the long run. Let k grow very large. The term on the right will rapidly shrink to zero -- this is the "transitory" component of the model. The term on the left just gets multiplied over and over by 1. So that's our steady state, or
This is the long-run equilibrium for our model. What it means is that starting with populations of 500,000 each in Seattle and Portland, migration slowly eroded Portland and expanded Seattle. And in the long run, Seattle's population levels off at 800,000 and Portland's shrinks to 200,000.
For those who want to see how this works in an Excel spreadsheet, here's a file with the above model.
[Here's a post where I explain how to use this method to generate the (k+1)th term of the Fibonacci sequence given some input k.]