
Closed-Form Maximum Likelihood Estimates for Spatial Problems

R. Kelley Pace

LREC Endowed Chair of Real Estate

Department of Finance

E.J. Ourso College of Business Administration

Louisiana State University

Baton Rouge, LA 70803-6308

OFF: (225)-388-6256, FAX: (225)-334-1227

kelley@pace.am, www.spatial-statistics.com

James P. LeSage

University of Toledo

Department of Economics

Toledo, OH 43606

jlesage@spatial-econometrics.com

September 15, 2000

Abstract

This manuscript introduces the matrix exponential as a way of specifying spatial transformations of the data. The matrix exponential spatial specification (MESS) simplifies the log-likelihood, leading to a closed-form maximum likelihood solution. The computational advantages of this model make it ideal for applications involving large data sets such as census and real estate data. The manuscript demonstrates the utility of the techniques by estimating a model for housing prices across 57,647 census tracts. Amazingly, the MESS autoregression can take under a second to compute, despite the large sample size.

	JEL: C29, R15
	KEYWORDS: spatial statistics, spatial autoregression, nearest neighbor, maximum likelihood, sparse matrices, log-determinants, matrix exponentials

1 Introduction
Recent technology has increased the ability to analyze data, but has simultaneously increased the amount of data available for analysis. Spatial data technologies such as global positioning systems (GPS), geographic information systems (GIS), and address geocoding have created an explosion in the size of these data sets. For example, commercial vendors sell data on millions of housing sales across the US with address information that can be easily geocoded using GIS software to produce very large spatial data sets. Analysis of real estate transactions for even a single county may yield more than one hundred thousand annual observations. Not surprisingly, such data or functions of such data (e.g., regression residuals) exhibit a high degree of spatial dependence (e.g., Bell and Bockstael (2000) and Pace and Gilley (1997)). Although spatial location is important when analyzing these data, direct estimation via maximum likelihood of spatial models requires computation of a determinant involving an $n \times n$ matrix. Brute force implementations of maximum likelihood methods become prohibitively expensive for these large data sets.

One approach to overcoming these problems was proposed by Kelejian and Prucha (1998, 1999), who set forth a generalized-moments (GM) estimation technique. Bell and Bockstael (2000) compare this GM estimation methodology to maximum likelihood methods, concluding that GM estimation may represent a low-cost means of obtaining estimates that are comparable to those from maximum likelihood.

However, spatial maximum likelihood may not prove as difficult as initially thought. For the particular case of nearest neighbor spatial dependence, Pace and Zou (2000) provide a closed-form solution that produces maximum likelihood estimates and illustrate their technique on sample sizes of up to 500,000 observations. As an alternative approach also leading to closed-form maximum likelihood estimates, this paper adapts the matrix exponential covariance specification introduced by Chiu, Leonard, and Tsui (1996). Specifically, this paper investigates the use of matrix exponentials for spatially transforming the dependent variable. Amazingly, common ways of specifying the spatial transformation ensure the determinant of the matrix exponential transformation identically equals 1, eliminating the log-determinant term from the log-likelihood. Elimination of the log-determinant term reduces maximum likelihood estimation to minimizing a quadratic form subject to a polynomial constraint. Further, this minimization problem has a unique, closed-form interior solution. Thus, maximum likelihood for this specification reduces to a particularly tractable form of non-linearly constrained least squares.
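The unit-determinant property can be checked numerically. It follows from the general identity det(exp(A)) = exp(tr(A)): a spatial weight matrix with zeros on its diagonal has zero trace, so the determinant of its matrix exponential is exp(0) = 1. The sketch below is an illustration under stated assumptions rather than the manuscript's own code: the symbols `W`, `alpha`, and `S`, the line-shaped neighbor layout, and the value of `alpha` are all hypothetical choices made for the example.

```python
import numpy as np

def line_neighbor_weights(n):
    """Row-standardized weights for n points on a line (hypothetical layout).

    Observation i has neighbors i-1 and i+1 where they exist; the diagonal
    stays zero because no observation is its own neighbor.
    """
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                W[i, j] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def matrix_exponential(A, terms=30):
    """Truncated power series I + A + A^2/2! + ... with `terms` terms in total."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k      # term now holds A^k / k!
        result = result + term
    return result

alpha = -0.8                      # hypothetical spatial parameter
W = line_neighbor_weights(6)
S = matrix_exponential(alpha * W) # spatial transformation exp(alpha * W)

# slogdet returns the sign and the log of the absolute determinant.
sign, logdet = np.linalg.slogdet(S)
print("trace of W:", np.trace(W))
print("sign of det(S):", sign)
print("log|det(S)|:", logdet)     # should be zero up to rounding error
```

Because the log-determinant vanishes for any value of `alpha`, it contributes nothing to the log-likelihood, which is why the estimation problem collapses to a least-squares-type minimization.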

This approach to spatial estimation, which we label the matrix exponential spatial specification (MESS), possesses several outstanding advantages. First, the matrix exponential spatial specification can exhibit an operation count whose order is as low as that of OLS. Second, the usual diagnostics and other useful tools associated with least squares easily transfer to spatial maximum likelihood estimation. Finally, the availability of the likelihood greatly facilitates both classical and Bayesian inference (see LeSage 1997, 2000 for Bayesian variants of spatial models). Hence, users do not need to adopt another inferential paradigm to overcome computational difficulties arising during analysis of problems involving large samples.

To illustrate the efficacy of these techniques, the MESS is estimated using nationwide housing data from 57,647 census tracts. Any individual MESS autoregression takes under a second to compute. The ensemble of finding the neighbors from the locational coordinates, calculating 203 spatial autoregressions (to estimate hyperparameters), and computing the likelihood ratio tests associated with variable deletions takes under four minutes on a 600 MHz PC-compatible machine.

Section 2 provides the theory underlying spatial estimation with matrix exponentials, section 3 applies the MESS model to US census tract data, and section 4 summarizes the key results.

The first author gratefully acknowledges research support from Louisiana State University and the University of Connecticut. In addition, the authors would like to thank Ming-Long Lee, Carlos Slawson, and Dongya Zou for their comments. Please Do Not Quote Without Permission.
