(See this post at Columbia Economics, LLC for the related comment thread.)
A common problem with time-series data is getting them into the right time interval. Some data are daily or weekly, while others are in monthly, quarterly or annual intervals. Since most regression models require consistent time intervals, an econometrician's first job is usually getting the data into the same frequency.
In this post I'll explain how to solve a common problem I've run into: how to divide quarterly data into monthly data. To do so, we'll use a method known as "cubic spline interpolation." In the example below we use MATLAB and Excel. For Stata users, I've posted a Stata do file that illustrates how to work through the example below in Stata.
Cubic Spline Interpolation
One of the most widely used data sources in economics is the National Income and Product Accounts (NIPAs) from the U.S. Bureau of Economic Analysis. They're the official source for U.S. GDP, personal income, trade flows and more. Unfortunately, most of these data are published only quarterly or annually. So if you're hoping to run a regression using monthly observations -- for example, this simple estimate of the price elasticity of demand for gasoline -- you'll need to split these quarterly data into monthly ones.
A common way to do this is by "cubic spline interpolation." Here's how it works. We start with n quarterly data points. That means we have n-1 spaces between them. Across each space, we draw a unique 3rd-degree (or "cubic") polynomial connecting the two points. Taken together, these n-1 cubics form what is called a "piecewise polynomial" function.
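To make sure our connecting lines form a smooth curve, we force the first and second derivatives to be continuous; that is, at each interior connecting point the first and second derivatives of the polynomial on the left must equal those of the polynomial on the right. Each cubic has four unknown coefficients, so there are 4(n-1) = 4n-4 unknowns. Requiring each polynomial to pass through its two end points gives 2(n-1) equations, the derivative conditions give 2(n-2) more, and a couple of end-point conditions you can read about here supply the last two. The result is a (4n-4) x (4n-4) linear system that can be solved for the coefficients of all n-1 cubic polynomials.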
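For readers who want to see that linear system spelled out, here is a minimal sketch in plain Python (it is not the MATLAB or Excel material from this post). It assumes "natural" end conditions, meaning the second derivative is zero at the first and last points; the end-point conditions linked above may differ, in which case only the last two rows of the system would change. The function names `solve` and `spline_coefficients` and the sample y values are made up for illustration.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def spline_coefficients(x, y):
    """Return one (a, b, c, d) tuple per interval.

    Piece i is a + b*s + c*s**2 + d*s**3 with s = t - x[i].
    ASSUMPTION: natural end conditions (second derivative = 0 at both ends).
    """
    m = len(x) - 1          # number of cubic pieces, n - 1
    size = 4 * m            # 4n - 4 unknowns
    A = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    row = 0
    for i in range(m):
        h = x[i + 1] - x[i]
        k = 4 * i           # column of a_i; b_i, c_i, d_i follow
        # Piece i passes through its left point: a_i = y_i
        A[row][k] = 1.0
        rhs[row] = y[i]
        row += 1
        # Piece i passes through its right point
        A[row][k:k + 4] = [1.0, h, h ** 2, h ** 3]
        rhs[row] = y[i + 1]
        row += 1
        if i < m - 1:
            # First derivatives match at the interior point x[i+1]
            A[row][k + 1:k + 4] = [1.0, 2 * h, 3 * h ** 2]
            A[row][k + 5] = -1.0
            row += 1
            # Second derivatives match at the interior point x[i+1]
            A[row][k + 2:k + 4] = [2.0, 6 * h]
            A[row][k + 6] = -2.0
            row += 1
    # Natural end conditions: zero second derivative at both ends
    A[row][2] = 2.0
    row += 1
    h_last = x[-1] - x[-2]
    A[row][size - 2] = 2.0
    A[row][size - 1] = 6 * h_last
    z = solve(A, rhs)
    return [tuple(z[4 * i:4 * i + 4]) for i in range(m)]


# Four quarterly points (illustrative values only) at mid-quarter months
x = [2, 5, 8, 11]
y = [100.0, 104.0, 103.0, 108.0]
coeffs = spline_coefficients(x, y)
print(4 * (len(x) - 1))  # 12 unknowns, so a 12 x 12 system
for a, b, c, d in coeffs:
    print(round(a, 4), round(b, 4), round(c, 4), round(d, 4))
```

Writing each piece in terms of s = t - x[i] rather than t itself keeps the numbers in the matrix small, which is a design choice of this sketch and not something the method requires.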
Once we have these n-1 piecewise polynomials, we can plug in x values for whatever time intervals we want: monthly, weekly or even daily. The polynomials will give us a pretty good interpolation between our known quarterly data points.
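Evaluating the piecewise function is mostly a matter of finding which interval a given x value falls in and then plugging it into that interval's cubic. Here is a small Python sketch of that lookup. The function name `evaluate` is hypothetical, and the coefficients are made up so that both pieces reproduce the single curve t^3, which makes the results easy to check by hand; they are not taken from any real data.

```python
from bisect import bisect_right


def evaluate(knots, coeffs, t):
    """Evaluate a piecewise cubic at t.

    coeffs[i] = (a, b, c, d) describes a + b*s + c*s**2 + d*s**3
    on [knots[i], knots[i+1]], with s = t - knots[i].
    """
    # Index of the interval containing t, clamped to the first/last piece
    i = bisect_right(knots, t) - 1
    i = max(0, min(i, len(coeffs) - 1))
    a, b, c, d = coeffs[i]
    s = t - knots[i]
    return a + s * (b + s * (c + s * d))  # Horner form of the cubic


# Two pieces that both equal t**3 (illustrative coefficients)
knots = [0.0, 1.0, 2.0]
coeffs = [(0.0, 0.0, 0.0, 1.0), (1.0, 3.0, 3.0, 1.0)]

for t in (0.5, 1.0, 1.5):
    print(t, evaluate(knots, coeffs, t))  # each value equals t**3
```

The same lookup works for monthly, weekly or daily x values; only the list of t values changes.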
An Example Using MATLAB
While the above method seems simple, doing cubic splines by hand is not. A spline for just four data points requires setting up and solving a 12 x 12 linear system, then manually evaluating three different polynomials at the desired x values. That's a lot of work. To get a sense of how hard this is, here's my own Excel file showing what's involved in fitting a cubic spline to four data points by hand.
In practice, the best way to do a cubic spline is to use MATLAB. It takes about five minutes. Here's how to do it.
MATLAB has a built-in "spline()" function that does the dirty work of cubic spline interpolation for you. It requires three inputs: a list of x values from the quarterly data you want to split; a list of y values from the quarterly data; and a list of x values for the monthly time intervals you want. The spline() function formulates the n-1 cubic polynomials, evaluates them at your desired x values, and gives you a list of interpolated monthly y values.
Here's an Excel file showing how to use MATLAB to split quarterly data into monthly. In the file, the first two columns are quarterly values from BEA's Personal Income series. Our goal is to convert these into monthly values. The next three columns (highlighted in yellow) are the three inputs MATLAB needs: the original quarterly x values (x); the original quarterly y values (y); and the desired monthly x values (xx).
In the Excel file, note that the first quarter is listed as month 2, the second quarter as month 5, and so on. Why is this? BEA's quarterly data represent an average value over the three-month quarter. That means they should be treated as a mid-point of the quarter. For Q1 that's month 2, for Q2 that's month 5, and so on.
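The mid-point rule amounts to placing quarter q at month 3q - 1, counting months from the first month of the first quarter. A short Python sketch of building the x and xx columns that way follows. The function names are hypothetical, and the choice to run xx only from the first mid-point to the last one is an assumption made here to avoid extrapolating beyond the known data; the xx column in the Excel file may cover a different range.

```python
def quarter_midpoints(n_quarters):
    """Month index of the middle of each quarter: 2, 5, 8, ..."""
    return [3 * q - 1 for q in range(1, n_quarters + 1)]


def monthly_grid(n_quarters):
    """Every month from the first mid-point to the last (ASSUMED range)."""
    x = quarter_midpoints(n_quarters)
    return list(range(x[0], x[-1] + 1))


x = quarter_midpoints(4)   # one year of quarterly data
xx = monthly_grid(4)
print(x)    # [2, 5, 8, 11]
print(xx)   # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```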
The next step is to open MATLAB and paste in these three columns of data. In MATLAB, type " x = [ ", copy and paste the column of x values in from Excel, type " ] " and hit return. This creates an n x 1 vector with the x values. Repeat this for the y and xx values in the Excel file.
Once you have x, y, and xx defined in MATLAB, type "yy = spline(x,y,xx)" and hit return. This will create a new vector yy with the interpolated monthly y values we're looking for. Each entry in yy will correspond to one of the x values you specified in the xx vector.
Copy these yy values from MATLAB, paste them into Excel, and we're done. We now have an estimated monthly Personal Income series.
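If you don't have MATLAB handy, the same x, y, xx in, yy out workflow can be sketched in plain Python. This is a rough stand-in and not a copy of MATLAB's spline(): it uses "natural" end conditions (zero second derivative at both ends), whereas MATLAB's spline() uses not-a-knot end conditions by default, so the two will generally differ somewhat, mostly near the first and last points. The function name `natural_spline` and the y values are hypothetical.

```python
from bisect import bisect_right


def natural_spline(x, y, xx):
    """Interpolate (x, y) at the points xx with a natural cubic spline."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    m = [0.0] * n  # second derivatives at the knots; ends stay 0 (natural)
    if n > 2:
        # Tridiagonal system for the interior second derivatives
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
        rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
               for i in range(1, n - 1)]
        # Forward elimination (Thomas algorithm)
        for k in range(1, n - 2):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        # Back substitution
        m[n - 2] = rhs[-1] / diag[-1]
        for k in range(n - 4, -1, -1):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]
    yy = []
    for t in xx:
        # Interval containing t, clamped to the first/last interval
        i = max(0, min(bisect_right(x, t) - 1, n - 2))
        left, right = x[i + 1] - t, t - x[i]
        yy.append(m[i] * left ** 3 / (6 * h[i])
                  + m[i + 1] * right ** 3 / (6 * h[i])
                  + (y[i] / h[i] - m[i] * h[i] / 6) * left
                  + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6) * right)
    return yy


x = [2, 5, 8, 11]                      # mid-quarter months
y = [100.0, 104.0, 103.0, 108.0]       # illustrative quarterly values
xx = list(range(2, 12))                # months 2 through 11
yy = natural_spline(x, y, xx)
for month, value in zip(xx, yy):
    print(month, round(value, 3))
```

As with yy in MATLAB, each entry of the result lines up with the corresponding entry of xx, and the interpolated series passes exactly through the original quarterly values at months 2, 5, 8 and 11.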
Here's an Excel file summarizing the above example for splitting quarterly Personal Income data into monthly using MATLAB. Also, here's a MATLAB file with the x, y, xx, and yy vectors from the above exercise.
Note: For Stata users, here's a "do" file with an example that performs the above cubic spline interpolation in Mata.