Optimization resources for econometrics

Several interesting econometric methods, including empirical likelihood and MPEC ("Mathematical Programming with Equilibrium Constraints") approach to structural estimation, can lead to constrained nonlinear programming problems with thousands of independent variables and constraints. Modern nonlinear programming packages can solve these problems quite quickly. Unfortunately, these methods are not that widely used in econometrics, and they often require the use of software other than standard statistical packages. This page is an attempt to collect some information and links, with a focus on AMPL and Matlab.

Large-scale nonlinear programming prerequisites

A good book on modern nonlinear programming techniques is Numerical Optimization by Nocedal and Wright. For a shorter introduction, check out the slides from the Chicago-Argonne Institute for Computational Economics, particularly the introduction by Ken Judd.

Sparse matrices: While there are many important features for large-scale nonlinear programming, one critical aspect is that the software should support sparse matrices (matrices in which only nonzero elements are explicitly stored). It is common to find that the vast majority of the elements in the Hessian of your objective function are 0. In empirical likelihood estimation, for example, the Hessian of has a block of size N x N (where N is the number of observations in your sample), but only its diagonal elements are nonzero. So, for a sample of 10000 observations, a sparse matrix algorithm will only look at the diagonal, doing about 10000 times less work than a "dense" matrix algorithm.

So, if you come across a nonlinear programming package not listed here, at least check to see that it supports sparse matrices for the Hessian and the Jacobian of the constraints. If not, it probably won't be a good fit for models like MPEC and empirical likelihood.

Matlab has particularly good sparse matrix support (see the official docs). Unfortunately, not all Matlab-based optimization codes exploit sparsity.

Computing derivatives: Most of the methods discussed below can take advantage of explicitly-computed first and second derivatives for both the objective function and any nonlinear constraints. While it is always possible to use finite difference methods to compute derivatives numerically, analytical derivatives often lead to much faster and more reliable solutions. Calculating and coding second derivatives of complex functions is painful to say the least. However, there are some good systems available for smart "automatic differentiation" (AD).

AMPL has excellent AD built-in, and you don't need to do anything to exploit it. The TOMLAB package for Matlab includes MAD (Matlab Automatic Differentiation), which can perform AD for first derivatives. See its manual for more information. Note that TOMLAB and MAD are commercial products.

If you don't use TOMLAB, you can download INTLAB for free. While derivatives are not the focus of INTLAB, it actually contains a great, fast AD package that can take second derivatives and exploit sparsity. Basically, after you've loaded the package, you can write code like this:

  x1 = ones(5,1);
  x1_ad = hessianinit(x1);
  myresult = sum( x1_ad .^ 2 );
  H = myresult.hx;
and then H will contain the second derivative of myresult, evaluated at x1 (a 5 x 5 matrix). When you download the package, look in the demos directory for "dgradient.m" and "dhessian.m" for examples.

Links:


Matlab optimization

There are several worthwhile options for optimization in Matlab, although the best ones all require additional (expensive) software:

Optimization toolbox: The "official" toolbox from Mathworks isn't free, but it's very cheap for academic use and included in some student editions. I believe that the version you can download from Harvard FAS includes the toolbox. The main function for solving constrained optimization problems is fmincon.

Before version 4.0 of the toolbox (Matlab R2008a), the constrained optimization solvers were very limited. It was not possible to provide an analytic Hessian for a model if nonlinear constraints were also present. Starting in R2008a, fmincon includes a new "interior-point" algorithm, which does allow you to include both nonlinear constraints and an analytic Hessian. The Matlab documentation includes an example using the interior-point solver which could be helpful. More documentation on the algorithm can be found here

The interior-point algorithm looks like a modern approach that should be a good fit for the types of large-scale problems considered here. In my limited experience (applying fmincon to empirical likelihood problems), I've found it to be slower and less reliable than the other methods listed below. I would appreciate feedback and tips from people who have had better luck with it.

Note also that the Matlab installation on the econ servers (e.g. econws3) is an older version that does not include the interior-point algorithm. The HBS grid (researchgrid.hbs.edu) does include R2008a, however.

TOMLAB: There are dozens of different nonlinear solver algorithms available, most of which are only available in Fortran or C, and each of which has its own confusing interface. TOMLAB provides a wrapper around many of these solvers, allowing them to be used within Matlab, all with the same confusing interface. It is a commercial package, available from tomopt.com (not to be confused with tomlab.com, a record label that is home to the band "The Books.")

I don't know of any servers at Harvard that have TOMLAB already installed. However, you can download a free 21-day trial here. They will ask you to choose which products to download from a large list. Make sure you include the base module, MAD, and KNITRO, or just download everything. Installing TOMLAB is not too painful, but be sure to read the installation instructions, as you will have to set some environment variables to help your system find the TOMLAB installation. Before running TOMLAB in any given Matlab session, remember that you need to run the "startup.m" file in the directory of your TOMLAB installation.

The TOMLAB functions can be somewhat error-prone, as they typically require you to pass in a dozen or more arguments, and it's easy to get two out of order. You can find the user's guide here. The most important section is this one for nonlinear problems. Also remember that you can type doc conAssign at the Matlab prompt to pull up documentation on the main function used to set up constrained nonlinear optimization problems in TOMLAB.

Because TOMLAB includes many different solvers, you will have to choose which one to use for a particular problem. The solver name is the first argument to tomRun. If you have it installed, I recommend KNITRO as a first line of attack. It is a very powerful nonlinear solver with several different algorithms, support for sparse Hessians, etc. You can find the TOMLAB/KNITRO manual here. Much more documentation on KNITRO is available from its developer, Ziena, here

TOMLAB's integration with the MAD automatic differentiation library is one of its most helpful features. MAD can automatically take one level of derivatives for you, so if you want it to generate a Hessian, you need to provide the gradient yourself. I have a very simple example using MAD with TOMLAB that you might find helpful.

KNITRO/ktrlink: The Matlab optimization toolbox in R2008a includes the function ktrlink, which allows the use of the advanced KNITRO solver. It requires you to have a licensed copy of KNITRO from Ziena. You can get a trial copy of KNITRO for Matlab here. I have not had a chance to try out ktrlink, although I've found KNITRO to be excellent. Please let me know if you have tips or experiences with ktrlink to add here.

IPOPT Most sophisticated nonlinear solvers are commercial products, but IPOPT is free and open source. It is written in C++ and is a particularly good option if you're coding in that language and want something free. I have used IPOPT for empirical likelihood estimation and a large-scale conditional logit problem with good success. It's not as flexible as KNITRO, though, since KNITRO includes several different algorithms and can switch between them easily.

There is a free Matlab interface available for IPOPT, which you can find here. Be forewarned: IPOPT and the Matlab interface can be difficult to compile and install unless you're very familiar with building C++ code on your system. I've gotten a version to work on the econws3 econ servers. If you're interested in experimenting with it, please let me know.

AMPL

The AMPL modeling language is popular among operations research types. Like TOMLAB, it relies on external solvers, such as KNITRO, to solve optimization problems, and most high-quality solvers are available for AMPL. The AMPL programming language was designed specifically for optimization problems, so it can express problems very concisely. AMPL integrates excellent automatic differentiation capabilities and sparsity support.

AMPL is available from its main developer (ampl.com). They provide a free "student" version that supports up to 300 variables and constraints and also free trials of the full version. To make AMPL work, you'll need to download both the main AMPL system and a solver that works with it. The KNITRO solver also has a free 300-variable student version that you can download here.

Using AMPL without installing it:AMPL is already installed on the Harvard econ server econws3. It includes the KNITRO solver, so this is a great option if you don't want to deal with installing it on your own system. There is also a free web interface to AMPL on the "NEOS server". It allows you to upload an AMPL model and get back the results a minute later. Obviously it's not a good option if you have a huge dataset, but it can be good for learning. Here is a direct link to the NEOS page for AMPL+KNITRO.

Learning AMPL: The best AMPL resource is the AMPL Book. Unfortunately, only the first chapter is available online. There is a short tutorial from a Berkeley class that isn't bad. I'll try to post a couple of simple econometric AMPL examples here soon.

Note that AMPL does not include any kind of graphical interface like Matlab's desktop. You edit your AMPL files in the text editor of your choice, then run the ampl program on the command line. If you're not comfortable working this way, learning AMPL can be a frustrating experience.

AMPL "gotchas": There's is no room here for a full guide to AMPL, but here are a few tips to help get you started.

File structure: AMPL models are frequently split up into three files. A model file with the ".mod" extension includes the mathematical core of the problem, defining variables, constraints, and the objective function. A data file with the ".dat" extension contains the data for the problem or loads the data from somewhere else (e.g. a database). Finally, a command file with an extension like ".ampl" or sometimes ".txt" sets options for the solver, points AMPL to the model and data files and displays the results. You could put everything into a single file, but this split-up structure can be helpful. For instance, it's easy to try several different datasets without modifying the model file.

Finding the solver: The command file typically contains a line like: option solver "c:/ziena/knitroampl/knitroampl"; This line tells AMPL where to find the solver that you want to use (Knitro in this example). You need to modify it to reflect the location of the solver on your computer. (On econws3, this should be option solver "/usr/local/bin/knitro".)

Other software

Empirical likelihood: The direct estimation of of empirical likelihood (EL) models leads to a very elegant, but large, constrained maximization problem with at least one parameter per observation in the dataset. Many transformations and alternatives have been proposed to reduce the dimensionality of the parameters, although these tend to turn a large, simple objective function into a medium-size, difficult objective. With modern constrained optimization approaches, we can instead solve the EL problem directly. This method is robust even for complicated nonlinear moment conditions, and it provides a unified system to solve other models in the Generalized Empirical Likelihood class. You can download my Matlab empirical likelihood package matElike and also short document here describing its use. The package uses INTLAB for automatic differentiation of nonlinear moment conditions and works with either IPOPT or a built-in solver.

MPEC / Methodology

MPEC is a general methodology for solving structural econometric models, particularly those with a large number of nuisance parameters. Many of these (such as BLP) can be solved with a nested fixed point approach, in which a fixed point procedure solves for the nuisance parameters for each "guess" of the structural parameters. MPEC instead rephrases the model as a constrained nonlinear program, where some equality constraints hold for the nuisance parameters. In BLP demand estimation, for example, the nuisance parameters are the "xi" unobserved qualities, and the market share equation is a set of equalities defining xi.

See
Constrained Optimization Approaches to Estimation of Structural Models (Su and Judd, 2008)

See also the Chicago ICE slides from Judd and Su, which include AMPL examples.

Che-Lin Su has also posted an AMPL example of the BLP (Berry-Levinsohn-Pakes 1995) procedure for demand estimation and a version of Rust's 1987 bus engine replacement example.

General optimization tips

Problem scaling: Bad scaling is one of the most common problems in optimization. If the derivatives of your function are very large or very small, significant rounding error can enter and make the problem difficult to solve. Try printing out the gradient of your objective at some "reasonable" points. If the elements tend to be smaller than about 0.01 or larger than about 100, rescale the problem by multiplying the objective by a constant. (This tip comes from the Ipopt tips page.)

If the variables are of very different scales, you can run into even worse problems, as some of the variables could be nearly ignored in the optimization. You can often improve this situation by changing the units of one or more variables so that their estimated coefficients will be closer to the same order of magnitude.

FAQ

No questions yet...