1/24/2007

Problems with the latest Miller, Hemenway, Azrael study on guns

The New York Times reported yesterday that a new study from Miller, Hemenway, and Azrael claims: "States with the greatest number of guns in the home also have the highest rates of homicide, a new study finds. . . . " Well, I have spent only a short time looking at the study, but some things are pretty obvious: 1) they excluded the District of Columbia without any explanation, 2) they use other crime rates to explain the homicide rate (by the way, they don't use anything like an arrest or conviction rate, nothing to do with law enforcement), 3) they use purely cross-sectional data, which never allows one to properly control for what may cause differences in crime rates, and 4) they use data from different years without any explanation (for the sake of argument I will use what they did, but it is weird to have the unemployment rate from 2000 explain the homicide rate from 2001 to 2003, etc.). The data for a panel test on this are readily available from the sources used in their paper, though I have only collected the data to redo the estimates for 2001 that they use (why is it that these papers, where one can put together the data in an afternoon, get any serious attention?). Why they looked only at the CDC data for 2001 when it is available for many other years is a bit of a puzzle. Since Miller and Hemenway have refused in the past to let me look at their data, I didn't bother asking this time and simply put the data together myself.

The bottom line is that their result comes from two factors: the exclusion of DC and the use of other crime rates to explain the murder rate. Changing these two factors causes their result to go from positive and significant to negative and significant at the 10 percent level (p = 0.059). I also ran these regressions on the robbery rate, and doing so produced a negative effect that is statistically significant at least at the 10 percent level whether or not DC was excluded. Using arrest rate data, not shown, also caused the results to be more significantly negative. If I had the necessary panel data handy, my strong presumption is that it would also reverse their result whether or not DC was included.

It is problematic to include the other crime rates in these regressions, particularly since they must believe that guns cause robbery as well as homicide. The results below indicate that more guns mean fewer robberies (again, this is using their flawed setup, though I believe that this would continue to be observed with panel data).

The general rule when you are doing this type of empirical work is to use all the data available. When I have done my empirical work on guns, I have used all the data available for all jurisdictions for all the years available. In this case, the CDC survey data are available for many years after 1995, not just 2001, and they are not using all the jurisdictions. If you selectively pick years or places, you should have a good explanation for why you are doing that, and I don't see any such explanation in the paper. The regressions reported by Miller et al. are also not the type of regression estimates that any economist would run. What I try to show below is how sensitive the results are to what I would consider the most obvious corrections: including all jurisdictions and making the estimates slightly more consistent with the way an economist would look at it, without even having to add new variables.
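
To make the cross-section versus panel distinction concrete, here is a minimal sketch in Python. Every number in it is invented for illustration; none of it comes from the state data or from the Miller, Hemenway, and Azrael paper, and the place labels and the helper `ols_slope` are hypothetical. It shows how a regression across places in a single year can pick up permanent differences between places, while subtracting each place's own average first (the within, or fixed-effects, approach that panel data allow) looks only at changes over time.

```python
def ols_slope(xs, ys):
    """Slope from a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Invented panel (assumption, not real data): three hypothetical places,
# three years each, stored as (gun ownership share, crime rate) pairs.
panel = {
    "A": [(0.30, 10.0), (0.32, 9.8), (0.34, 9.6)],
    "B": [(0.50, 14.0), (0.52, 13.8), (0.54, 13.6)],
    "C": [(0.70, 18.0), (0.72, 17.8), (0.74, 17.6)],
}

# Cross-section: one year only, one observation per place.
first_year = [obs[0] for obs in panel.values()]
cross_slope = ols_slope([x for x, _ in first_year], [y for _, y in first_year])

# Within (fixed-effects) version: subtract each place's own mean first,
# so permanent differences between places drop out.
dx, dy = [], []
for obs in panel.values():
    mean_x = sum(x for x, _ in obs) / len(obs)
    mean_y = sum(y for _, y in obs) / len(obs)
    dx += [x - mean_x for x, _ in obs]
    dy += [y - mean_y for _, y in obs]
within_slope = ols_slope(dx, dy)

print(f"cross-section slope: {cross_slope:.2f}")
print(f"within-place slope:  {within_slope:.2f}")
```

With these made-up numbers the single-year slope is positive while the within-place slope is negative. Whether the real state data behave this way is an empirical question that the sketch cannot answer.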

In any case, noting that this is purely cross-sectional data and not very useful, here is an attempt to redo their estimates by regressing the homicide rate from 2001 to 2003 on the gun ownership rate from the CDC and the other variables that they use (I wasn't able to find their Gini coefficient, so that is the only one of their variables that isn't included). Here are some very simple linear regressions that I put together fairly quickly:

DC excluded (used all the variables in their Table 3, except for the Gini coefficient)

Homcide01to03 = average homicide rate from 2001 to 2003.
I think that the other variables should be clear.

. reg Homcide01to03 gunownershiprate2001 percenturban medianfamilyincome1999 percentbelowpovertylevel percentblack percentsinglefemaleparenthouseho unemploymentrate2000census percentdivorced percentpop18342001 aggrivatedassaultrate2001 robberyrate2001 southerncensusregion alcoholconsumption2001 if notDC==0

      Source |          SS    df          MS        Number of obs =      50
-------------+------------------------------        F( 13, 36)    =   21.98
       Model |  275.288226    13  21.1760174        Prob > F      =  0.0000
    Residual |  34.6827793    36  .963410535        R-squared     =  0.8881
-------------+------------------------------        Adj R-squared =  0.8477
       Total |  309.971006    49  6.32593889        Root MSE      =  .98153

------------------------------------------------------------------------------
Homcide01~03 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |   6.158754   2.575103     2.39   0.022     .9362022    11.38131
percenturban |   -1.20992   2.421382    -0.50   0.620     -6.12071     3.70087
medianf~1999 |    .000102    .000079     1.29   0.205    -.0000581    .0002622
percentbel~l |   40.05939   19.33717     2.07   0.046     .8417922    79.27699
percentblack |   .1185185   .0484017     2.45   0.019     .0203554    .2166816
percentsin~o |  -3.773734   39.70597    -0.10   0.925    -84.30117    76.75371
unemployme~s |  -26.08681   26.27778    -0.99   0.327    -79.38061    27.20699
percentdiv~d |   27.83938   17.55642     1.59   0.122     -7.76669    63.44544
per~18342001 |   12.88474   13.88689     0.93   0.360    -15.27917    41.04865
aggriva~2001 |   .0016147   .0016653     0.97   0.339    -.0017627    .0049922
robbery~2001 |   .0243026   .0056717     4.28   0.000     .0127999    .0358053
southernce~n |  -1.351635    .599814    -2.25   0.030    -2.568114   -.1351559
alcohol~2001 |   .0742161   .3756206     0.20   0.844    -.6875778      .83601
       _cons |   -14.5245   5.782964    -2.51   0.017     -26.2529   -2.796107
------------------------------------------------------------------------------
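
For readers less used to this kind of output, the summary figures in the table above can be rebuilt from the numbers it reports. The short Python check below is only a sketch of the standard OLS formulas: R-squared as model over total sum of squares, the degrees-of-freedom adjustment, F as the ratio of mean squares, and t as coefficient over standard error. The inputs are copied from the table; the variable names are illustrative labels, not Stata's.

```python
# Figures copied from the regression table above (DC excluded, all controls).
n_obs = 50
n_regressors = 13            # explanatory variables, not counting the constant
ss_model = 275.288226
ss_residual = 34.6827793
ss_total = ss_model + ss_residual

# Share of the variation in the homicide rate that the model accounts for.
r_squared = ss_model / ss_total

# Adjusted R-squared penalizes the fit for the number of regressors used.
df_residual = n_obs - n_regressors - 1
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual

# Mean squares, the overall F statistic, and the root mean squared error.
ms_model = ss_model / n_regressors
ms_residual = ss_residual / df_residual
f_stat = ms_model / ms_residual
root_mse = ms_residual ** 0.5

# t statistic for the gun ownership coefficient: estimate / standard error.
coef_gun, se_gun = 6.158754, 2.575103
t_gun = coef_gun / se_gun

print(f"R-squared {r_squared:.4f}, adjusted {adj_r_squared:.4f}")
print(f"F {f_stat:.2f}, Root MSE {root_mse:.5f}, t {t_gun:.2f}")
```

The same arithmetic applies to every table in this post, so any of them can be checked this way.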

DC excluded (did not include their variables for other crimes)

. reg Homcide01to03 gunownershiprate2001 percenturban medianfamilyincome1999 percentbelowpovertylevel percentblack percentsinglefemaleparenthouseho unemploymentrate2000census percentdivorced percentpop18342001 southerncensusregion alcoholconsumption2001 if notDC==0

      Source |          SS    df          MS        Number of obs =      50
-------------+------------------------------        F( 11, 38)    =   14.32
       Model |     249.711    11      22.701        Prob > F      =  0.0000
    Residual |  60.2600055    38  1.58578962        R-squared     =  0.8056
-------------+------------------------------        Adj R-squared =  0.7493
       Total |  309.971006    49  6.32593889        Root MSE      =  1.2593

------------------------------------------------------------------------------
Homcide01~03 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |    2.69241   3.090395     0.87   0.389    -3.563767    8.948587
percenturban |   5.193162   2.623195     1.98   0.055    -.1172174    10.50354
medianf~1999 |   .0000198   .0000975     0.20   0.840    -.0001776    .0002172
percentbel~l |   25.22912    24.1867     1.04   0.303     -23.7343    74.19253
percentblack |   .2104145   .0536538     3.92   0.000     .1017981    .3190309
percentsin~o |   10.48135   48.55617     0.22   0.830    -87.81547    108.7782
unemployme~s |   1.005869   32.85402     0.03   0.976    -65.50361    67.51534
percentdiv~d |   50.45611   21.41619     2.36   0.024     7.101307    93.81091
per~18342001 |   6.999652   17.28577     0.40   0.688    -27.99356    41.99286
southernce~n |  -1.131898   .7236749    -1.56   0.126    -2.596902     .333105
alcohol~2001 |   .0678944   .4816396     0.14   0.889    -.9071341    1.042923
       _cons |  -13.31319   7.321042    -1.82   0.077    -28.13387    1.507483
------------------------------------------------------------------------------


Same as above, but DC is included
. reg Homcide01to03 gunownershiprate2001 percenturban medianfamilyincome1999 percentbelowpovertylevel percentblack percentsinglefemaleparenthouseho unemploymentrate2000census percentdivorced percentpop18342001 southerncensusregion alcoholconsumption2001

      Source |          SS    df          MS        Number of obs =      51
-------------+------------------------------        F( 11, 39)    =   31.88
       Model |  1620.08306    11  147.280278        Prob > F      =  0.0000
    Residual |  180.146769    39  4.61914793        R-squared     =  0.8999
-------------+------------------------------        Adj R-squared =  0.8717
       Total |  1800.22983    50  36.0045966        Root MSE      =  2.1492

------------------------------------------------------------------------------
Homcide01~03 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |  -9.199294   4.729762    -1.94   0.059    -18.76614    .3675525
percenturban |  -3.598846   4.131027    -0.87   0.389    -11.95464    4.756945
medianf~1999 |   .0000194   .0001664     0.12   0.908    -.0003172     .000356
percentbel~l |   39.06187   41.19014     0.95   0.349    -44.25305    122.3768
percentblack |   .4766173   .0751993     6.34   0.000     .3245123    .6287222
percentsin~o |  -201.1131   71.71166    -2.80   0.008    -346.1636   -56.06257
unemployme~s |   98.52408   52.70362     1.87   0.069    -8.079052    205.1272
percentdiv~d |   94.91258   35.49413     2.67   0.011     23.11892    166.7062
per~18342001 |    95.1942   23.88845     3.98   0.000     46.87524    143.5132
southernce~n |  -3.159236   1.169235    -2.70   0.010    -5.524236   -.7942356
alcohol~2001 |   1.496186   .7727291     1.94   0.060    -.0668065    3.059178
       _cons |  -25.89853   12.24821    -2.11   0.041    -50.67287   -1.124194
------------------------------------------------------------------------------


DC excluded, not using their selective set of control variables
. reg Homcide01to03 gunownershiprate2001 if notDC==0

      Source |          SS    df          MS        Number of obs =      50
-------------+------------------------------        F(  1, 48)    =    0.00
       Model |   .00402852     1   .00402852        Prob > F      =  0.9802
    Residual |  309.966977    48  6.45764536        R-squared     =  0.0000
-------------+------------------------------        Adj R-squared = -0.0208
       Total |  309.971006    49  6.32593889        Root MSE      =  2.5412

--------------------------------------------------------------------------------------
       Homcide01to03 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
gunownershiprate2001 |  -.0743955   2.978593    -0.02   0.980    -6.063259    5.914468
               _cons |   4.707644     1.0878     4.33   0.000     2.520475    6.894813
--------------------------------------------------------------------------------------

Same with DC included
. reg Homcide01to03 gunownershiprate2001

      Source |          SS    df          MS        Number of obs =      51
-------------+------------------------------        F(  1, 49)    =    5.18
       Model |  172.063659     1  172.063659        Prob > F      =  0.0273
    Residual |  1628.16617    49   33.227881        R-squared     =  0.0956
-------------+------------------------------        Adj R-squared =  0.0771
       Total |  1800.22983    50  36.0045966        Root MSE      =  5.7644

--------------------------------------------------------------------------------------
       Homcide01to03 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
gunownershiprate2001 |  -14.46889   6.358312    -2.28   0.027    -27.24639    -1.69138
               _cons |   10.34603   2.299427     4.50   0.000     5.725162     14.9669
--------------------------------------------------------------------------------------




What it means: Again, this uses purely cross-sectional data, but accepting that, their result depends on excluding DC and on including other crime rates to explain the murder rate. With DC included and the other crime rates dropped, the estimate implies more guns, less homicide. In the simple regression with no controls, the coefficient is negative even when DC is excluded, though not at all statistically significant.
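
Why can dropping or adding one jurisdiction matter so much? With only 50 or 51 observations, a single extreme point can dominate a least-squares slope. The Python sketch below illustrates the mechanics with invented numbers only; the values are not the state data, the "extreme" point is not meant to be DC's actual figures, and `ols_fit` is a hypothetical helper written for this example.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) from a one-variable least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented sample (assumption): five places where the outcome rises
# steadily with the explanatory variable.
ownership = [0.2, 0.3, 0.4, 0.5, 0.6]
homicide = [4.0, 4.5, 5.0, 5.5, 6.0]
slope_without, _ = ols_fit(ownership, homicide)

# Add one invented extreme place: very low ownership, very high homicide.
slope_with, _ = ols_fit(ownership + [0.05], homicide + [40.0])

print(f"slope without the extreme point: {slope_without:.2f}")
print(f"slope with the extreme point:    {slope_with:.2f}")
```

In this toy sample the slope is positive without the extreme point and negative with it, so the sign of the estimate is decided entirely by whether that one observation is in the sample. That is why an exclusion like this needs to be explained.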

Just for the sake of argument, I did the same regressions for robbery (though I only took the time to put together the robbery rates for 2001).


DC Excluded
. reg robberyrate2001 gunownershiprate2001 percenturban percentdivorced medianfamilyincome1999 percentbelowpovertylevel percentsinglefemaleparenthouseho percentblack southerncensusregion percentpop18342001 unemploymentrate2000census alcoholconsumption2001 if notDC==0

      Source |          SS    df          MS        Number of obs =      50
-------------+------------------------------        F( 11, 38)    =   14.80
       Model |  151143.145    11  13740.2859        Prob > F      =  0.0000
    Residual |   35287.596    38  928.620948        R-squared     =  0.8107
-------------+------------------------------        Adj R-squared =  0.7559
       Total |  186430.741    49    3804.709        Root MSE      =  30.473

------------------------------------------------------------------------------
robbery~2001 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |   -148.547    74.7843    -1.99   0.054    -299.9399    2.845877
percenturban |   220.1914   63.47854     3.47   0.001     91.68583     348.697
percentdiv~d |   940.7374   518.2491     1.82   0.077    -108.4031    1989.878
medianf~1999 |  -.0024856   .0023595    -1.05   0.299    -.0072621    .0022909
percentbel~l |  -425.7565   585.2927    -0.73   0.471     -1610.62    759.1066
percentsin~o |   99.18109   1175.008     0.08   0.933    -2279.498    2477.861
percentblack |   3.950401   1.298365     3.04   0.004     1.321999    6.578804
southernce~n |   .8315924   17.51217     0.05   0.962    -34.61994    36.28313
per~18342001 |   -100.722   418.2974    -0.24   0.811    -947.5208    746.0768
unemployme~s |   892.2601   795.0325     1.12   0.269    -717.1991    2501.719
alcohol~2001 |  -.6820588   11.65517    -0.06   0.954    -24.27672     22.9126
       _cons |   11.46862   177.1615     0.06   0.949    -347.1761    370.1133
------------------------------------------------------------------------------


DC included
. reg robberyrate2001 gunownershiprate2001 percenturban percentdivorced medianfamilyincome1999 percentbelowpovertylevel percentsinglefemaleparenthouseho percentblack southerncensusregion percentpop18342001 unemploymentrate2000census alcoholconsumption2001

      Source |          SS    df          MS        Number of obs =      51
-------------+------------------------------        F( 11, 39)    =   34.80
       Model |  468437.017    11  42585.1833        Prob > F      =  0.0000
    Residual |  47727.1118    39   1223.7721        R-squared     =  0.9075
-------------+------------------------------        Adj R-squared =  0.8815
       Total |  516164.128    50  10323.2826        Root MSE      =  34.982

------------------------------------------------------------------------------
robbery~2001 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |  -269.6794   76.98545    -3.50   0.001    -425.3971   -113.9616
percenturban |   130.6335   67.23995     1.94   0.059    -5.372167    266.6391
percentdiv~d |   1393.584   577.7313     2.41   0.021     225.0122    2562.156
medianf~1999 |  -.0024894   .0027086    -0.92   0.364     -.007968    .0029893
percentbel~l |   -284.852   670.4441    -0.42   0.673    -1640.953    1071.249
percentsin~o |  -2056.182   1167.237    -1.76   0.086    -4417.142    304.7783
percentblack |   6.662021   1.224005     5.44   0.000     4.186237    9.137804
southernce~n |  -19.81946   19.03141    -1.04   0.304    -58.31413     18.6752
per~18342001 |   797.6534   388.8279     2.05   0.047     11.17482    1584.132
unemployme~s |   1885.609   857.8469     2.20   0.034       150.45    3620.768
alcohol~2001 |   13.86693   12.57757     1.10   0.277     -11.5736    39.30746
       _cons |  -116.7293   199.3618    -0.59   0.562    -519.9766    286.5179
------------------------------------------------------------------------------

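For robbery, whether or not DC is included, there is a negative relationship between the CDC's measure of gun ownership in 2001 and robbery rates in that year; it is significant at the 10 percent level with DC excluded (p = 0.054) and at the 1 percent level with DC included (p = 0.001).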

Sorry about the typos. I was working on this pretty late.


8 Comments:

Anonymous Anonymous said...

I wonder who funded that study?

1/24/2007 3:46 PM  
Blogger John Lott said...

Dear Anonymous:

They were funded as they frequently are by the Joyce Foundation, but I think that it is a big mistake to make this an issue.

1/25/2007 12:44 AM  
Anonymous MaverickNH said...

In industry, the marketing folk often compare their product to the competition in 100 ways and then publish the 5 ways their product is better, ignoring the 95 ways it's not. Perhaps the authors did statistics on all the factors they could think of and could only make a case for robbery w/o DC?

When I looked at their 9 states with > 1SD more guns, I see they comprised less than 6% of the US population. Even if they believe themselves to state a truth, their truth is a pretty small one ....

1/25/2007 7:21 PM  
Anonymous Anonymous said...

Jon you are the best!!!

1/25/2007 11:48 PM  
Anonymous Chuck Bloomer said...

Dear Mr. Lott,
I am not a statistician or research specialist. Can you provide an interpretation of your findings in simplified terms that even I can understand?
Thank you,
Chuck Bloomer
www.libertycall.us

1/26/2007 12:28 PM  
Anonymous Anonymous said...

Why of course they would exclude Washington D.C., they know that guns are prohibited there, so there cannot be any gun related deaths in D.C.

1/26/2007 6:18 PM  
Anonymous blcjr said...

Chuck Bloomer said: "I am not a statistician..."

Well, I am, of sorts, but the formatting -- or lack thereof -- makes the stats John posted all but indecipherable, without a lot of effort.

Plus, I've taken a look at the CDC site that John links to, and I cannot find any meaningful data on firearms usage, at least not for 2001. And since I don't have access to the study, or their data, I really feel frustrated by this. I'd love to crunch the numbers for myself, but I don't know what the numbers are, and John's description doesn't really tell me what they are in any sense that I can independently verify.

1/26/2007 10:14 PM  
Blogger John Lott said...

Dear blcjr:

The data is on the CDC website, though it takes some time to put it together. Of course, that is only one part of this data set. Overall, it didn't take more than a few hours to put the "data set" that they used together. In any case, there is no reason to be frustrated by this. I would be very happy to email you the data set in STATA format if you send me an email.

As to the results I showed, if you took a minute, my guess is that you would see what there is to see. There are spaces between the columns and they line up with the column headings. This is typical regression output.

1/27/2007 2:01 PM  
