m0000003.htm

2.2 Computational considerations and comparisons
If the magnitude of the elements of the powers of

do not rise with the power, the power series converges rapidly. Note, row-stochastic, non-negative

can have a maximum of 1 in any row. Hence, the magnitude of the elements in the powers of

does not grow with the power. Given the rapid decline in the coefficients in the power series, achieving a satisfactory progression with six or seven terms seems feasible.

Straightforward implementation of the definition of

by a power series expansion truncated after

terms with

as the highest degree term would require

operations for dense

, as matrix multiplication requires

operations for dense matrices. This would not be practical and computing the matrix exponential using eigenvalues and eigenvectors would also be impractical for sufficiently large dense matrices, since this requires

operations.

If the graph of

is strongly connected, meaning that a path exists between every pair of observations, then

will be dense (all non-zeros) for positive

(Horn and Johnson, p. 361-362). Hence,

will be dense. Computing

separately would thus require prohibitive amounts of memory for large

. Fortunately, one does not need to compute

separately, as

always appears in conjunction with

This allows computation of

operations for dense

by sequential left-multiplication of

to form

element vectors, (i.e.,

, and so on).

For sparse

, that would result from a spatial weight matrix based on nearest neighbors computed using Delaunay triangles, or a sparse covariance structure based on the spherical variogram, the number of operations required to compute

declines dramatically.

The number of operations required drops to

, where

denotes the number of non-zeros. For the nearest neighbor spatial weight matrix approach with

non-zero entries in each row, the operation count associated with computing

would decline to

. This results in an operation count for computing

in nearest neighbor specifications of

that is linear in

To illustrate this computation approach in detail, we define the

matrix

comprised of powers of

times

in ( 5 ).

(5)

Note that, this simple form of sparse matrix-vector multiplication can be implemented without explicit use of sparse matrix multiplication algorithms. Given a list of

neighbors for observation

(i.e.,

and given equal weights for the neighbors, the product

for observation

is merely (

. That is, the sparse matrix-vector products needed to compute

in ( 5 ) require only indexing and addition, two of the fastest operations possible on digital computers. This means that the MESS model can be implemented in a variety of computing languages such as FORTRAN or C, and various statistical software environments. In fact, to demonstrate the feasibility of estimating MESS using standard software we coded the estimator in Fortran 90 as well as in Matlab. Employing a compiled language (Fortran 90) plus using indexing as opposed to sparse matrices increased the speed by more than 6 times.

We provide a comparison of maximum likelihood estimates and the time required to solve the MESS model and a traditional spatial autoregressive model (SAR) taking the form:

, with a spatial weight matrix based on 30 nearest neighbors making it an asymmetric row-stochastic matrix. These comparative results were based on a dependent variable representing housing prices across 57,647 census tracts along with five explanatory variables described in section 3 . This comparison makes the point that the parameter estimates for

in traditional spatial econometric models as well as inferences regarding the magnitude and significance of these parameters can be replicated using the MESS model. In this application the MESS model took the form:

, where the explanatory variables matrix

was not transformed for comparability with the SAR model.

The estimates from both models are presented in Table 1 along with computed deviances that would be used to draw inferences about the statistical significance of each variable. These deviances reflect the change in the log likelihood arising from sequential deletion of each variable from the model.

>From the table we see that the parameter magnitudes are very similar from both models. Since the model is in log form, the coefficients can be interpreted as the percentage change in housing prices that would result from a one percent change in the explanatory variables. The deviances would produce the same inferences regarding the significance or lack thereof for all variables. (In this case, all of the explanatory variables are significant at the 0.01 level.)

The table also shows a timing comparison for computing estimates using the two methods, where we see a dramatic improvement in speed associated with the MESS model which is over 1,000 times faster than the more traditional SAR model. This speed gain was obtained in Matlab. The Fortran coding accelerated this already fast implementation by over a factor of 6.

In section 2.5 we motivate that the computational speed of the MESS model allows quick hypothesis testing of alternative specifications for the functional form taken by the spatial weight matrix, as well as more traditional specification tests used in regression models. These testing methods are illustrated in an application presented in section 3 .

Sidje (1998) has also used this as a point of departure in the computation of matrix exponentials. In addition, Sidje provides other algorithms for computing the matrix exponential right multiplied by a vector. Finally, Sidje provides software implementing these algorithms in both Matlab and Fortran 77.

Myers (1997, p. 276) indicates the spherical model is the one most often used in geostatistical practice. See Barry and Pace (1997) for more about the use of sparsity with the spherical model.

The use of an asymmetric 30 neighbor spatial weight matrix poses a substantial computational challenge to computing the log-determinant term used in maximum likelihood. The times reported in Table 1 could be improved to around 400 seconds by using symmetric spatial weight matrix. The times would improve further if fewer neighbors were used or if the Monte Carlo Log-determinant estimator proposed by Barry and Pace (1999) were employed to compute the SAR estimates.