The Significance of Changes in the Gender Happiness Gap

A misunderstanding. I suspect that the
claim that happiness did not significantly change from 1972-2006 comes from the
fact that we did not include stars when reporting the implied gender gaps in Table
1 of our paper. Thus, the claim that

the ordered probit analysis found that the "Gender
happiness gap" was not statistically significant, either in 1972 or in
2006, even at the 0.10 level

is simply untrue. Here’s the relevant part of Table 1, which is
an ordered probit regression of happiness on time trends by gender:

The right way to test whether women
were, on average, happier at the start of the sample is to look at the “Female
dummy”, which is clearly significant. The right way to ask whether this gender gap has changed is to look at
the difference in trends, which is also clearly significant. The last two rows are regression-based predicted
values, so we didn’t think we should put stars next to these numbers.

Statistical mischief: When you want to make a result go away, throw away enough data,
and the result will become insignificant. For instance, pooling all of the data gives us a useful 46,303
observations. Analyze any specific year,
and you are left with only 1,500-3,000 data points. Even so, let’s analyze only data from 1972
and 2006:
- %Very happy = 28.7 + 3.1*Female + 1.6*(Year2006) - 2.4*(Female in 2006)
- %Not too happy = 18.1 - 3.2*Female - 5.5*(Year2006) + 4.1*(Female in 2006)

In the first case, no coefficients are
statistically significant, and in the latter, all are. In both cases, the estimates say that women
were once a fair bit happier than men, and this is no longer true. Comparing this regression with those in our
paper, we simply learn that a smaller sample yields similar estimates, but they
are less likely to be statistically significant.
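To make the arithmetic behind "once a fair bit happier, and this is no longer true" explicit, here is a minimal Python sketch that plugs the four combinations of gender and year into the two equations above. The coefficients are copied from those equations; the dictionary layout and function names are my own illustrative choices, not anything from the paper.

```python
# Coefficients (in percentage points) copied from the two 1972-and-2006
# regressions quoted above. The names and layout here are illustrative only.
COEFS = {
    "very_happy": {
        "const": 28.7, "female": 3.1, "year2006": 1.6, "female_2006": -2.4,
    },
    "not_too_happy": {
        "const": 18.1, "female": -3.2, "year2006": -5.5, "female_2006": 4.1,
    },
}


def predicted_share(outcome, female, year2006):
    """Predicted percentage for a group; female and year2006 are 0/1 dummies."""
    c = COEFS[outcome]
    return (
        c["const"]
        + c["female"] * female
        + c["year2006"] * year2006
        + c["female_2006"] * female * year2006
    )


def gender_gap(outcome, year2006):
    """Women's predicted share minus men's, in percentage points."""
    return predicted_share(outcome, 1, year2006) - predicted_share(outcome, 0, year2006)


for outcome in COEFS:
    for year2006 in (0, 1):
        year = 2006 if year2006 else 1972
        print(f"{outcome:>13} gap in {year}: {gender_gap(outcome, year2006):+.1f}")
```

The implied gap in the share "very happy" shrinks from +3.1 points in 1972 to +0.7 in 2006, and the gap in the share "not too happy" moves from -3.2 to +0.9. The sketch says nothing about standard errors; it only reads the point estimates off the equations.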

Looking for a masterpiece, when we are doing collage. Sometimes studying social
phenomena is hard, and one draws on many data sources to put together a collage
of evidence. Our paper finds declining
happiness among women relative to men in: the General Social Survey (n=46,303
from 1972-2006); the Virginia Slims Poll (n=26,701 from 1972-2000); among U.S.
12th graders (Monitoring the Future; n=433,906 from 1976-2005); in the
United Kingdom (British Household Panel Study data from 1991-2004; n=121,135);
in Europe (the Eurobarometer analysis has n=636,400 from 1973-2002, covering 15
countries); and across developed countries (the International Social Survey
Program contains surveys of 35 countries from 1991-2001, yielding n=97,462). The only dataset that does not yield clear
results of a decline in women’s happiness relative to men’s is the World Values
Survey, and even there, the data do not speak clearly.

Let me try to give a particularly transparent description of the data,
simply splitting the GSS data into two periods, 1972-1989 v. 1990-2006. There was a clear gender happiness gap in the
earlier period (34.3% of women were very happy v. 31.8% of men). This difference is clearly statistically
significant (t=4.1). In the later
period, 30.9% of women were very happy, compared with 31.1% of men. This recent gender happiness gap is
insignificant (t=-0.3). The decline in
the share of women who were very happy (34.3% v. 30.9%) is clearly significant
(t=5.9), while the corresponding change for men was not (t=-1.1). The decline in the share of women who were
very happy relative to men is also significant (t=-3.1). Analyzing the share who are “not too happy”
yields a roughly similar pattern (but in reverse): an insignificant “unhappiness
gap” in the earlier period, but a significant gap emerged in the later period. Interestingly, the “unhappiness gap” emerged
because men became less likely to be unhappy, while women’s unhappiness
remained largely stable. The ordered
probit is a regression technique that allows one to make these happiness and
unhappiness comparisons all at the same time; these regressions tell us that
there was a gender happiness gap favoring women in the earlier period, and it
now favors men. For the
regression-heads: if your library subscribes, you can download the GSS data from the
ICPSR here. I’ll post some Stata code in the comments.
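For readers who want to see how t-statistics like the ones above are built, here is a minimal Python sketch of the standard unpooled test for a difference between two independent proportions. This is not the Stata code mentioned above, and it ignores survey weights. The shares (34.3% and 31.8%) come from the text, but the group sizes are HYPOTHETICAL: this post reports only the pooled n=46,303, so the output is not meant to reproduce the t-statistics quoted here.

```python
import math


def two_proportion_t(p1, n1, p2, n2):
    """Unpooled t (z) statistic for p1 - p2 from two independent samples."""
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / standard_error


# Shares "very happy" in 1972-1989, as quoted in the text.
women_early, men_early = 0.343, 0.318

# HYPOTHETICAL sample sizes per gender, chosen only to show how the same
# 2.5-point gap looks in a single-year-sized sample versus a pooled one.
for n_per_group in (1_000, 10_000):
    t = two_proportion_t(women_early, n_per_group, men_early, n_per_group)
    verdict = "significant" if abs(t) > 1.96 else "not significant"
    print(f"n per group = {n_per_group:>6}: t = {t:.2f} ({verdict} at 5%)")
```

With identical shares, the gap falls short of conventional significance at 1,000 respondents per group and clears it easily at 10,000, because the standard error shrinks with the square root of the sample size. That is the same point as the single-year regressions: throwing away data changes the stars, not the estimates.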

This post only deals with whether the effects we describe in the paper are
statistically significant. The other complaint is that our results are too
small to matter. Later today, I’ll turn to how we think about whether these
are large or small effects.

[Written jointly with my coauthor Betsey Stevenson]

UPDATE: See discussion of "economic significance" here.