Developing a good model for population growth is a pretty common problem faced by economists doing applied work. In this post, I'll walk through the derivation of a simple but flexible model I've used in the past known as the "logistic" model of population growth.
Some Background
I first ran into the problem of modeling population growth when building a gas tax forecasting model for the City of Seattle. In Washington State, gas taxes are first collected at the state level, and revenue is then distributed to cities based on population. From the standpoint of a particular city, population shifts between areas can have a big impact on gas tax revenue. This may not matter much for short-term forecasts, but it can have a huge effect on 20-year forecasts that serve as the basis for the city's long-term transportation plans.
To address this issue, we developed a model of city-level population growth as part of the larger tax revenue forecasting model, dramatically improving the quality of our long-term forecasts. In the rest of this post, I'll walk through the derivation of the population model we ended up using.
The Naïve Model: Exponential Growth
The simplest way of modeling population is to assume "exponential" growth. That is, just assume population grows at some constant annual rate, forever. If we let "y" be a city's population and "k" be the annual growth rate, the exponential growth model is given by
$$\frac{dy}{dt} = ky$$
This is a simple first-order differential equation. We can solve this for "y" by using a technique called "separation of variables". First, we separate variables like this:
$$\frac{dy}{y} = k\,dt$$
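Then we integrate both sides and solve for y, as follows:
$$\int \frac{dy}{y} = \int k\,dt \quad\Longrightarrow\quad \ln y = kt + C \quad\Longrightarrow\quad y = e^{kt + C} = e^{C}e^{kt}$$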
Since C is just an arbitrary constant, e^C is just an arbitrary positive constant, so we can relabel e^C as C, which gives us
$$y = Ce^{kt}$$
where k is the annual growth rate, t is the number of years from today, and C is the population at time t=0. This is the famous "exponential growth" model.
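As a quick sanity check on the exponential solution, here is a minimal Python sketch that compares the closed form y = C·e^(kt) against a brute-force numerical integration of dy/dt = ky. The function names and the input numbers (600,000 residents, 1.5% annual growth, a 20-year horizon) are hypothetical and chosen only for illustration; they are not taken from the actual forecasting model.

```python
import math


def exponential_population(c, k, t):
    """Closed-form solution y(t) = C * e^(k t), where C is the population at t = 0."""
    return c * math.exp(k * t)


def euler_exponential(c, k, t, steps=100_000):
    """Integrate dy/dt = k * y with forward Euler as a numerical cross-check."""
    y = c
    dt = t / steps
    for _ in range(steps):
        y += k * y * dt
    return y


# Hypothetical inputs, for illustration only.
closed_form = exponential_population(600_000, 0.015, 20)
numeric = euler_exponential(600_000, 0.015, 20)
print(round(closed_form), round(numeric))
```

The two numbers should agree closely, which suggests the separation-of-variables solution really does satisfy the original differential equation.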
While the exponential model is useful for short-term forecasts, it gives unrealistic estimates for long time periods. Because growth compounds forever, the projected population grows without bound and eventually exceeds anything a city could realistically hold. A more realistic model should capture the idea that population does not grow forever, but instead levels off around some long-term level. This leads us to our second model.
A Better Model: Logistic Growth
We can improve the above model by making a simple adjustment. Let "A" be the maximum long-term population a city can reasonably sustain. Then multiply the right-hand side of the exponential model by a factor (1 - y/A), giving us
$$\frac{dy}{dt} = ky\left(1 - \frac{y}{A}\right)$$
In this model, a population that is small relative to "A" starts out growing almost exponentially, because the factor (1 - y/A) is close to one. But as "y" approaches the maximum level "A", the term (1 - y/A) approaches zero, slowing down the growth rate. In the long run, growth will slow to a crawl as cities approach their maximum sustainable size, which is a much more reasonable way to model population growth. This is known as the "logistic" model.
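To see the leveling-off behavior before solving anything analytically, the logistic equation dy/dt = ky(1 - y/A) can be stepped forward numerically. The sketch below uses simple forward Euler integration; the function names and parameter values (starting population 600,000, k = 0.05, A = 1,000,000) are hypothetical and only meant to illustrate the shape of the curve.

```python
def logistic_rate(y, k, a):
    """Right-hand side of the logistic model: dy/dt = k * y * (1 - y / A)."""
    return k * y * (1.0 - y / a)


def simulate_logistic(y0, k, a, years, steps_per_year=100):
    """Forward-Euler path of the logistic model, sampled once per year."""
    dt = 1.0 / steps_per_year
    y = y0
    path = [y]
    for _ in range(years):
        for _ in range(steps_per_year):
            y += logistic_rate(y, k, a) * dt
        path.append(y)
    return path


# Hypothetical inputs, for illustration only.
path = simulate_logistic(600_000, 0.05, 1_000_000, years=200)
for year in (0, 20, 50, 100, 200):
    print(year, round(path[year]))
```

The printed values should rise quickly at first and then flatten out just below A, which is exactly the behavior the exponential model cannot produce.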
To solve for "y," we can again use separation of variables. However, we'll first need to use a trick from algebra known as the "partial fractions decomposition."
An Aside: The Partial Fractions Decomposition
The partial fractions decomposition is a theorem about rational functions of the form P(x)/Q(x). Here is what it says. If P(x) and Q(x) are polynomials, P(x)/Q(x) is "proper" (that is, the degree of P(x) is less than the degree of Q(x)), and Q(x) has n distinct roots, then we can "decompose" P(x)/Q(x) as follows:
$$\frac{P(x)}{Q(x)} = \frac{C_1}{x - a_1} + \frac{C_2}{x - a_2} + \cdots + \frac{C_n}{x - a_n}$$
where a1...an are the n distinct roots of the polynomial Q(x), and C1...Cn are constants. (Repeated roots require extra terms, but we won't need that case here.) Using this theorem, we can decompose hard-to-handle rational functions into much simpler pieces, which is something we'll need to do to solve the logistic population model above.
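For readers who want to experiment, here is a small Python sketch of how the constants in a partial fractions decomposition can be computed when the roots of Q(x) are distinct. It uses the standard "cover-up" formula C_i = P(a_i) / (q · ∏_{j≠i}(a_i - a_j)), where q is the leading coefficient of Q(x). The helper names and the example fraction (3x + 5)/((x - 1)(x + 2)) are my own illustrative choices, not part of the population model.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients from highest to lowest degree) via Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def partial_fraction_constants(p_coeffs, roots, leading=1.0):
    """Constants C_i for P(x)/Q(x), with Q(x) = leading * prod(x - a_i) and distinct roots a_i."""
    constants = []
    for i, a_i in enumerate(roots):
        denom = leading
        for j, a_j in enumerate(roots):
            if j != i:
                denom *= a_i - a_j
        constants.append(poly_eval(p_coeffs, a_i) / denom)
    return constants


# Illustrative example: (3x + 5) / ((x - 1)(x + 2)).
roots = [1.0, -2.0]
constants = partial_fraction_constants([3.0, 5.0], roots)

# Compare the original fraction with its decomposition at an arbitrary point.
x = 4.0
original = poly_eval([3.0, 5.0], x) / ((x - 1.0) * (x + 2.0))
decomposed = sum(c / (x - a) for c, a in zip(constants, roots))
print(constants, original, decomposed)
```

The formula breaks down if two roots coincide (the denominator becomes zero), which is why the sketch assumes distinct roots.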
Back to the Model: Solving the Logistic Equation
Recall that the logistic population model is given by:
$$\frac{dy}{dt} = ky\left(1 - \frac{y}{A}\right)$$
Separating variables, we have:
$$\frac{dy}{y\left(1 - \frac{y}{A}\right)} = k\,dt$$
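The term on the left-hand side is hard to integrate as written. Since it's a proper rational function of y whose denominator has two distinct roots (y = 0 and y = A), we can now use the partial fractions decomposition to simplify it. By the theorem above, we can rewrite it as:
$$\frac{1}{y\left(1 - \frac{y}{A}\right)} = \frac{C_1}{y} + \frac{C_2}{1 - \frac{y}{A}}$$
(The second denominator is written as 1 - y/A rather than y - A; the two differ only by a constant factor, which is simply absorbed into C2.)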
To solve for C1 and C2, first multiply both sides by y(1 - y/A) to clear the denominators, like this:
$$1 = C_1\left(1 - \frac{y}{A}\right) + C_2\,y$$
This equation is true for all values of y. To solve for C1 and C2, simply plug in values for y that isolate each constant. To solve for C1, let y = 0. This "zeros out" C2 in the equation and lets us solve for C1, as follows:
$$1 = C_1\left(1 - \frac{0}{A}\right) + C_2 \cdot 0 \quad\Longrightarrow\quad C_1 = 1$$
To solve for C2 we repeat the process, plugging in a value for y that "zeros out" C1. To do this, let y = A, and solve for C2 as follows:
$$1 = C_1\left(1 - \frac{A}{A}\right) + C_2\,A \quad\Longrightarrow\quad C_2 = \frac{1}{A}$$
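Using these constants, we can now rewrite our original function using the partial fractions decomposition as follows:
$$\frac{1}{y\left(1 - \frac{y}{A}\right)} = \frac{1}{y} + \frac{1/A}{1 - \frac{y}{A}} = \frac{1}{y} + \frac{1}{A - y}$$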
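A decomposition like this is easy to get wrong by a sign or a factor of A, so it is worth checking numerically. The sketch below assumes the decomposition has the form C1/y + C2/(1 - y/A) with C1 = 1 and C2 = 1/A, and compares it against the original fraction 1/(y(1 - y/A)) at a handful of sample points. The value of A and the sample populations are hypothetical.

```python
def original_term(y, a):
    """The fraction that appears after separating variables: 1 / (y * (1 - y / A))."""
    return 1.0 / (y * (1.0 - y / a))


def decomposed_term(y, a):
    """Partial fractions form with C1 = 1 and C2 = 1 / A."""
    return 1.0 / y + (1.0 / a) / (1.0 - y / a)


# Hypothetical ceiling and sample populations (one sample is above A on purpose).
a = 1_000_000.0
samples = [1_000.0, 250_000.0, 600_000.0, 999_000.0, 1_500_000.0]
max_rel_error = max(
    abs(original_term(y, a) - decomposed_term(y, a)) / abs(original_term(y, a))
    for y in samples
)
print(max_rel_error)
```

If the constants were wrong, the relative error would be large at most sample points rather than down at the level of floating-point rounding.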
This simpler function can then be plugged into our integration problem above, allowing us to integrate the logistic model and solve for y. Returning to our problem, we have:
$$\left(\frac{1}{y} + \frac{1}{A - y}\right)dy = k\,dt$$
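Integrating both sides and solving for y, we have:
$$\ln|y| - \ln|y - A| = kt + c \quad\Longrightarrow\quad \frac{y}{y - A} = Ce^{kt} \quad\Longrightarrow\quad y = \frac{ACe^{kt}}{Ce^{kt} - 1}$$
Here c is the constant of integration, and C absorbs both e^c and the sign dropped with the absolute values.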
Dividing the numerator and denominator by Ce^kt, we get an easier-to-handle version of the population equation:
$$y = \frac{A}{1 - \frac{1}{C}e^{-kt}}$$
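This is the famous "logistic model" of population growth. To solve for "C" in the equation, let t=0 and rearrange, which gives C = y0/(y0 - A), where y0 is the beginning population. (Note that C is negative whenever y0 is below A.) Substituting back, the model can be written directly in terms of the starting population:
$$y = \frac{A}{1 + \frac{A - y_0}{y_0}\,e^{-kt}}$$
And we're done. This basic model can then be used to develop pretty reasonable long-term forecasts for city populations.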
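To make the final formula concrete, here is a minimal Python sketch of the logistic solution written in terms of the starting population, y(t) = A / (1 + ((A - y0)/y0)·e^(-kt)), which is what you get after eliminating C. It also includes a finite-difference check that the curve satisfies dy/dt = ky(1 - y/A). The function names and the inputs (y0 = 600,000, k = 0.05, A = 1,000,000) are hypothetical; in practice k and A would need to be estimated from data.

```python
import math


def logistic_population(y0, k, a, t):
    """Logistic solution y(t) = A / (1 + ((A - y0) / y0) * e^(-k t)), with y(0) = y0."""
    return a / (1.0 + ((a - y0) / y0) * math.exp(-k * t))


def logistic_rate(y, k, a):
    """Right-hand side of the logistic model: k * y * (1 - y / A)."""
    return k * y * (1.0 - y / a)


# Hypothetical inputs, for illustration only.
y0, k, a = 600_000.0, 0.05, 1_000_000.0
forecast = [logistic_population(y0, k, a, t) for t in range(0, 21, 5)]
print([round(y) for y in forecast])

# Central-difference check that the closed form satisfies the differential equation.
h, t = 1e-4, 10.0
slope = (logistic_population(y0, k, a, t + h) - logistic_population(y0, k, a, t - h)) / (2 * h)
residual = abs(slope - logistic_rate(logistic_population(y0, k, a, t), k, a))
print(residual)
```

The forecast starts at y0 and bends toward A, and the residual should be tiny compared with the growth rate itself (thousands of people per year for these inputs).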
Posted by Andrew on Saturday March 13, 2010