
The age-performance profile in baseball

[Economist Ray Fair] asked... Which players have exhibited the most unusual age-performance profiles? Specifically, are there any players who got better with age?

Over the entire period between 1921-’04, Fair found only 18 hitters who appear to have defied Mother Nature, logging four or more seasons after the age of 28 in which their OPS (on-base plus slugging average) exceeded their age-specific expected level by more than one standard error.

And you know what? Except for good ol' Charlie Gehringer (1939), they all come from 1987 or later. Goodness gracious! Who would have thought? Check here for the list and further discussion, and thanks to John de Palma for the pointer.

Posted by Tyler Cowen on December 14, 2007 at 09:55 AM in Sports | Permalink

Comments

Obviously one should look at steroids as a cause of the performance. But other medical advances, such as in training, diet, exercise, surgery, and so on would mean that players of more recent vintage would have a better chance of good performance as they got older, just as we've seen lots of older non-athletes continue to perform in other fields.

Posted by: Dennis Mangan at Dec 14, 2007 10:38:59 AM

Tyler, please read the comments on that thread (it's been up for a while). In particular, note Guy's comment near the front. I'll republish it here:

The impact of age on performance is an important and interesting topic, and it would be great to have someone of Fair’s talent provide a good analysis of it. However, he has failed to account for two factors (one that he acknowledges, the other not) that explain most of the post-peak over-performers.

First, there are a few extreme ballparks that have such a large impact that you can’t ignore it. One is CO, and another is AZ. Four of your 15 “suspects” posted their suspicious seasons in these parks. Fair notes the possibility of a park impact, but with the easy availability of park adjustments there’s really no reason to ignore this.

More importantly — and unfortunately, this really destroys the whole paper — Fair has not adjusted for scoring levels by year. There was a huge increase in scoring that began in 1993. Any player who played both before and after 1993 is therefore likely to have a tendency to over-perform in the later part of his career, simply because his later years came in a high-OPS period. Not coincidentally, nearly all of the “suspects” fall into this category.

Similarly, Fair’s ranking of pitchers is dominated by pitchers with the good fortune to have pitched in low-scoring periods, while pitchers like Clemens and Randy Johnson should be ranked higher.

Someone should suggest to Fair that he rerun his analysis using Baseball-Reference’s OPS+ and ERA+ metrics. These conveniently adjust for run scoring context and park. Then, we might learn something.

(Fair also treats a percentage change in all sports metrics as equivalent. So, because pitchers decline 9.5% in ERA at age 37, but hitters decline 5.6% in OPS, he concludes pitchers decline more quickly. However, a 5.6% drop in OPS is actually equivalent to about a 10% drop in runs scored, so the declines are comparable. He really should convert everything to SDs.)

Because Fair did not adjust for park or era, his rankings are biased such that we would expect them to be filled with players who overlapped a large increase in the run-scoring environment. This is exactly what we find.

What's frustrating is how easy this would be to correct. OPS+, a statistic freely available for all major league player-seasons in history at baseball-reference.com, adjusts OPS for both era and park. If, after making those corrections, Fair still came up with the same list, that would be very notable.
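For readers who want to see what an era adjustment does, here is a minimal Python sketch. It uses the commonly cited form of OPS+, 100 × (OBP/lgOBP + SLG/lgSLG − 1), but leaves out the park factor that Baseball-Reference also applies, so treat it as an approximation and not the site's exact calculation. The hitter's .350/.450 line is hypothetical; the NL league averages for 1992 and 1994 are the ones quoted further down this thread.

```python
def era_adjusted_index(obp, slg, lg_obp, lg_slg):
    """OPS+-style index where 100 means league average.

    Simplification: no park factor is applied here.
    """
    return 100.0 * (obp / lg_obp + slg / lg_slg - 1.0)


# Hypothetical hitter who posts the same raw line in both seasons.
PLAYER_OBP, PLAYER_SLG = 0.350, 0.450

# NL league (OBP, SLG) averages for the two seasons being compared.
NL_LEAGUE = {1992: (0.315, 0.368), 1994: (0.333, 0.415)}

index_by_year = {
    year: era_adjusted_index(PLAYER_OBP, PLAYER_SLG, lg_obp, lg_slg)
    for year, (lg_obp, lg_slg) in NL_LEAGUE.items()
}

for year in sorted(index_by_year):
    print(f"{year}: raw OPS {PLAYER_OBP + PLAYER_SLG:.3f}, index {index_by_year[year]:.1f}")
```

The raw OPS is identical in both years, yet the index drops once the league gets better at hitting. A study built on raw OPS would read that same flat line as a late-career improvement relative to an expected decline.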

Posted by: Kyle S at Dec 14, 2007 10:41:02 AM

Which economists have exhibited the most unusual age-performance profiles?

Posted by: GVV at Dec 14, 2007 10:49:21 AM

But wait. We have the expected (mean) peak performance age (27/28), and an expected rate of decline. But what's the variation in that? What's the s.d. of peak performance age? Looking at Fair's paper, I had immense difficulty extracting information on the variation in peak age or in performance decline.

Fair has data on 441 hitters. Let's suppose the s.d. of peak performance age is 2.5 years. Then about 16% of hitters (about 70) would have peak performances after about age 30.5; about 5% (20) would peak after age 33, and about 1% (4) would peak after age 35/36. So we have some late peaks. Isn't that what a normal distribution would tell us? Why are we surprised that some players have late peaks? Why do we immediately attribute that to performance-enhancing drugs?
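The back-of-envelope figures above can be checked with a short Python sketch. It assumes peak age is normally distributed, takes 27.5 as the mean (an assumed midpoint of the 27/28 range), and uses the supposed s.d. of 2.5 years; none of these values are estimated from Fair's data, and the cutoffs are simply one, two and three s.d. above the assumed mean.

```python
import math


def share_peaking_after(age, mean_peak, sd_peak):
    """One-sided normal tail: probability that peak age exceeds `age`."""
    z = (age - mean_peak) / sd_peak
    return 0.5 * math.erfc(z / math.sqrt(2.0))


N_HITTERS = 441   # number of hitters in Fair's data, per the comment
MEAN_PEAK = 27.5  # assumption: midpoint of the 27/28 peak age
SD_PEAK = 2.5     # assumption: the s.d. supposed in the comment

for cutoff in (30.0, 32.5, 35.0):
    share = share_peaking_after(cutoff, MEAN_PEAK, SD_PEAK)
    print(f"peak after {cutoff:4.1f}: {share:6.1%} of hitters, "
          f"about {share * N_HITTERS:.0f} players")
```

Under these assumptions the far tails come out thinner than the round 5% and 1% figures, but the basic point holds: chance alone would put dozens of the 441 hitters' peaks after age 30.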

Just asking.

Posted by: Donald A. Coffin at Dec 14, 2007 10:55:19 AM

Kyle S is right.
There was such a jump in offensive levels between 1992 and 1994, preceded and followed by the usual small amounts of year-to-year variation around these different levels, that PEDs are not a very good explanation unless all PED users started using them at once. If you look at the list, there are only a couple of players who did not have some "bar setting" seasons before the jump and their expectation-exceeding seasons after it. Given that this was the single largest jump in run scoring environment in baseball history, these are exactly the players you would expect to have the largest increase in unadjusted performance.

As for the other players on the list, Bob Boone at age 40 had a fairly lucky season where he hit .310 on balls in play without his peripheral numbers changing at all. The next year he went back to being craptacular, and the next he was out of baseball. This season probably only makes the list because not many players are usually around to have fluky seasons at age 40, and most players drop significantly each season in their late thirties, while Boone was able to eke out a season slightly better than before.

The other two non-Gehringer entries are both from 1987, a year where both leagues experienced a large jump in run scoring out of line with anything between WW2 and 1994 (I wonder if it wasn't a test run). Again, you'd expect the largest "overperformance" in a year like that.

I believe if some kind of park and era adjustments were made, Bonds and Sosa's seasons would still probably rank as out of line, but I would like to know exactly how much of this list is exactly what we should have expected.

Posted by: josh at Dec 14, 2007 11:56:58 AM

Here's some league average data. Notice how on-base and slugging vary randomly in each league, except that both leagues experience a big single season jump in 1987, and then a permanent jump to a new level in 1994.

NL    AVG   OBP   SLG
1980  .259  .320  .374
1982  .258  .319  .373
1984  .255  .319  .369
1986  .253  .322  .380
1987  .261  .328  .404
1988  .248  .310  .363
1990  .256  .321  .383
1992  .252  .315  .368
1994  .267  .333  .415
1996  .262  .330  .408
1998  .262  .331  .410
2000  .266  .342  .432
2002  .259  .331  .410

AL    AVG   OBP   SLG
1980  .269  .331  .399
1982  .264  .328  .402
1984  .264  .326  .398
1986  .262  .330  .408
1987  .265  .333  .425
1988  .259  .324  .391
1990  .259  .327  .388
1992  .259  .328  .385
1994  .273  .345  .434
1996  .277  .350  .445
1998  .271  .340  .432
2000  .276  .349  .443
2002  .264  .331  .424
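To put a rough number on the jump, the Python sketch below adds OBP and SLG into a league OPS for each listed season and compares the average before 1994 with the average from 1994 on. It only uses the seasons in the table above, so the means are illustrative and not full-era averages; the 1994 break year is taken from the comment, not estimated.

```python
from statistics import mean

# (OBP, SLG) by season, copied from the league tables above.
NL = {1980: (.320, .374), 1982: (.319, .373), 1984: (.319, .369),
      1986: (.322, .380), 1987: (.328, .404), 1988: (.310, .363),
      1990: (.321, .383), 1992: (.315, .368), 1994: (.333, .415),
      1996: (.330, .408), 1998: (.331, .410), 2000: (.342, .432),
      2002: (.331, .410)}
AL = {1980: (.331, .399), 1982: (.328, .402), 1984: (.326, .398),
      1986: (.330, .408), 1987: (.333, .425), 1988: (.324, .391),
      1990: (.327, .388), 1992: (.328, .385), 1994: (.345, .434),
      1996: (.350, .445), 1998: (.340, .432), 2000: (.349, .443),
      2002: (.331, .424)}


def league_ops(table):
    """League OPS (OBP + SLG) for each season in the table."""
    return {year: obp + slg for year, (obp, slg) in table.items()}


def era_means(table, break_year=1994):
    """Mean league OPS before and from `break_year` onward."""
    ops = league_ops(table)
    before = mean(v for y, v in ops.items() if y < break_year)
    after = mean(v for y, v in ops.items() if y >= break_year)
    return before, after


for name, table in (("NL", NL), ("AL", AL)):
    before, after = era_means(table)
    print(f"{name}: pre-1994 OPS {before:.3f}, 1994+ OPS {after:.3f}, "
          f"shift {after - before:+.3f}")
```

A hitter whose career straddles 1994 inherits that shift in his raw OPS whether or not anything about him changed.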

Posted by: josh at Dec 14, 2007 12:13:00 PM

This may be indicative of steroid use, but we should at least consider the alternative explanation that older players are willing to take up an extremely rigorous training regimen just to stay in the game because the financial incentives nowadays are enormous. And perhaps in a few cases this newfound seriousness and sense of purpose can lead not just to maintaining skills but actually improving them.

Baseball salaries have gone up far, far faster than inflation. Tens of millions of dollars are at stake if a top player can remain employed for even a year or two longer.

In 1965, the minimum salary for a baseball player was a mere $6000. In 1969, Curt Flood was offered a mere $5000 salary increase and held out for more. Free agency was not introduced until 1975. [link] Or see another link for a history of average player salaries.

In 1987, the top annual salary was $2 million. The 2007 top salary is $23 million.

Posted by: at Dec 14, 2007 12:19:21 PM

I can't believe Charlie Gehringer used steroids. I'm totally devastated.

Posted by: David at Dec 14, 2007 12:44:35 PM

Of that list, though, one of the more recent ones was B.J. Surhoff, whose numbers might be explained by the fact that his early career was replete with examples of failing to live up to expectations.

And I don't think even 'roids could explain Julio Franco. That's either just divine providence or a deal with the devil, one or the other.

Posted by: Michael Fisk at Dec 14, 2007 12:58:52 PM

It's somewhat interesting to note that some of the players at the bottom of the list did not play "full seasons" in their biggest outlier years. For example, Molitor played in only 118 games in 1987 (in most of the seasons under study a "full season" is either 154 games or 162). Molitor was also oft-injured early in his career, so his established early-career levels are likely not indicative of his actual ability, especially once they took him off the field and made him a full-time designated hitter. Gehringer played in only 118 in 1939, Surhoff only 117 in 1995, Chili Davis played in 108 in 1994, and Julio Franco played in 125 (but with only 360 or so plate appearances) in 2004 (plus the fact that I can't believe one could credibly establish any level of play for a 45-year-old player). It's possible, at least for these particular seasons, that the players just didn't have time to return to their historical norms. It's like they were riding a hot streak and then were hurt before turning cold.

That basically leaves the Coors Field effect guys (Larry Walker and Andres Galarraga) and the top couple of guys on the list (Bonds, McGwire, Sosa, Luis Gonzalez, whom no one ever mentions, and Ken Caminiti, who admitted using steroids during 1996). Even Bonds' biggest outlier year is a semi-"fluke" because he was intentionally walked 120 times: those are 120 plate appearances where he had no control over what happened (at least during those at-bats; he had control over what happened in previous years), which in a way will "artificially" drive up a player's OPS. As for Gary Gaetti, Bill James (I think it is in the New Baseball Historical Abstract, the book where he ranks the top 100 players at each position by his Win Shares method) noted that he had a remarkably slow decline over the course of his career, way slower than just about anyone who had his initial skill set.

I would also throw in the fact that there has been an acceptance of the usefulness of walks as part of a player's offensive repertoire, thus possibly leading to players who walk a lot staying in the game longer, especially if they have power or some other skill. Rickey Henderson, by the "traditional" metric of batting average, may have been finished in the mid-to-late 1990s because he was batting about .240. But his OBP was up around .400, and even around .370 in his last few years, and he could still run, and managers saw value in that. Also in that Bill James book, in the Roy Cullenbine entry, there was a story about Cullenbine being the laziest SOB on the planet because all he did was wait around for a walk. He had the 3rd highest OBP in the AL that year and the 2nd most walks (behind Ted Williams) and even received 3 points in MVP voting despite batting just .224, yet he played no more after that year (he was only 33). The fact that it is now recognized that these players with high walk totals are helping the club (call it the Moneyball effect if you want, even though it happened before Moneyball) may lead to some of them staying in the game longer and having "abnormal" seasons after their 27th or 28th birthday. I really don't understand how Albert Belle is on the list with his biggest outlier year at age 28, given that 28 is supposed to be a hitter's peak year. Maybe I'm missing something.

And I'd really like to see Hank Aaron's numbers in this study. His two highest OPS numbers were 1.079 and 1.045, which happened in 1971 and 1973, when he was 37 and 39 years old. He also had his career high in home runs at age 37. Most of this information you can find from the stats on www.baseball-reference.com. Actually, in 4 of his 5 years from age 35-39 he had an OPS over .950, and in the other year it was over .900. I can't possibly see how he didn't make this list, even given his established level of play early in his career.

Posted by: AZ at Dec 14, 2007 2:48:52 PM

AZ,

Aaron played his whole career in a fairly uniform run scoring environment, so even if he had an actual career curve similar to a guy like Palmeiro or Sosa, it wouldn't show up as having as large a magnitude. Palmeiro gets credit not only for maintaining a high level of performance with age, but also for moving from a neutral ballpark to an extreme power hitter's ballpark, and for the introduction of juiced baseballs in 1993.

http://www.insidethebook.com/ee/index.php/site/comments/the_juiced_ball/

http://highboskage.com/juiced-ball.shtml

Just eyeballing the actual versus predicted performance data, all of these players seem to have negative residuals just about every year before 1993-94, and positive ones afterward (some of the players also had positive residuals in '87, for reasons I noted earlier). This holds just about perfectly for every player whose career spanned those years except for Larry Walker, Luis Gonzalez, Steve Finley, Gary Gaetti and Barry Bonds.

Larry Walker had large negative residuals before moving to Coors and positive residuals afterwards. Luis Gonzalez's positive residuals start the day he arrived in Arizona, as do Finley's. Gary Gaetti had an up-and-down, injury-riddled career. He always had good power, and his big year on this list didn't come from a power surge but from a fluky high batting average. That leaves Bonds, but since he's the only data point left, we can't really extrapolate anything about how to determine a steroid user (only Bonds seems to have had such a large surge in production out of this group).

Posted by: josh at Dec 14, 2007 3:46:54 PM

Interesting that age 27 or 28 seems to be players' peak. I think this accords well with some of Bill James' work. What does that say about free agent contracts, which get pretty big around that time? What does it say about the frenzy over getting Johan Santana (age 27) in trade for good young prospects?

Posted by: Bernard Yomtov at Dec 14, 2007 9:51:03 PM

Economist Ray Fair writes: "No attempt has been [made] in this study to adjust for different ball parks."

This is another example of the arrogance of freakonomizing economists who parachute into a field well-studied by noneconomists, crunch some numbers, and then proclaim a finding that overlooks a basic factor, such as park effects, that the experts in the field all know about. Baseball statistics analysts have developed sophisticated techniques for adjusting for park effects, and the adjusted data is readily available. Just use the *OPS+ column at BaseballReference.com to get park-adjusted Onbase Percentage + Slugging relative to league norms.

I'm the world's least sophisticated baseball analyst, but I at least knew enough about sabermetrics to use that statistic back in 2004 to show that Barry Bonds's age profile of hitting could only be explained by juicing:

http://isteve.blogspot.com/2004/12/barry-bonds-batting-by-age.html

Posted by: Steve Sailer at Dec 15, 2007 1:20:28 AM

Babe Ruth's *OPS+ was slightly higher from age 35-38 than it was from age 27-30. The reason was that after Ruth's disastrous 1925 season at age 30, caused by his Britney Spears-level hedonism, he hired a personal trainer and, to the surprise of everybody, worked out impressively during the off-season for the next decade.

Honus Wagner's four highest *OPS+ seasons were from 32-35 (1906-1909). Wagner was one of the very few players of his day to stay in shape by lifting weights.

The moral is that it can be difficult to tell the difference between a player who works out and a player who works out and uses steroids. Since economist Ray Fair is doing a statistical analysis that threatens to impugn the reputations of individuals by implying they are cheating, he owes it to the players and to the world to use the best analytical tools available, even if that requires him to study the work of non-economists.

Posted by: Steve Sailer at Dec 15, 2007 1:32:31 AM

Here's a classic example of the arrogance of freakonomists: When baseball statistics-savvy commenters pointed out that economist Ray Fair didn't use the best available data, economist Dave Berri replied:

"So often non-economists commenting on research think they have discovered something that only a “true fan” of the sport could have seen. More often than not, though, the non-economist simply failed to understand something."

http://dberri.wordpress.com/2007/06/04/rocket-science-clemens-and-%E2%80%98roids/

You don't have to be a fanatical fan to know that implying, as the inaptly named Ray Fair did, that the reason Larry Walker had better hitting statistics after he moved from Montreal to Denver was because of juicing is a travesty. I have no idea if Walker was on the juice, but at least I know that there's a massive park illusion inherent in comparing raw statistics from the cold and damp Montreal ballpark to high and dry Coors Field.

Posted by: Steve Sailer at Dec 15, 2007 1:48:23 AM

Why do economists so often pronounce such complete emotional nonsense?

Immigration, IQ and race, now baseball, and that's just on this blog.

There has got to be a systemic error.

Perhaps economics adequately describes a much smaller domain than economists think.

Posted by: mik at Dec 15, 2007 7:34:26 PM

Economists are the hot social scientists of the moment, just like cultural anthropologists (e.g., Margaret Mead) were in the 1950s and 1960s. Nobody pays attention to cultural anthropologists anymore, even though it's an important field that we need to know about. Economists need to combat the symptoms of smugness within their profession or they'll go the way of the now-lowly anthropologists.

Posted by: Steve Sailer at Dec 16, 2007 5:15:59 AM

I'll agree about the arrogance of economists. But when it comes to performance-enhancing drugs in sports, the numbers, or for that matter the look of the bodies, make it pretty clear that park effects, or whatever, don't make it go away. I think some of you are confusing good debating points with an effective response on the substance.

Posted by: Tyler Cowen at Dec 16, 2007 7:41:31 AM

Tyler,
Of course performance-enhancing drugs are affecting the game. I'm only trying to say that this particular study doesn't help us figure out who or how. The idea of comparing aging trends would have been interesting if done right. If the study had taken into account park effects (Coors Field in the 90s [they keep balls in a humidor now] inflated run scoring by 50%! 50%! It helped high walk/high power type hitters even more!) and era (there was a 15% increase in run scoring between pre-1994 and post-1994 and nothing that can't be categorized as random variance within these eras!) and still found an unusually high number of players in the modern era, that would really tell us something. However, it's not clear, given the data that the study does provide, that this era would necessarily come out as unusual, so we really haven't learned anything at all.

As for more debating points, you say that it's clear from "the numbers" that park effects don't make the effect of PEDs go away. However, it's not clear that we know how steroids have affected the overall level of play or how to identify a user based on performance alone. If PEDs increase home runs or extra-base hits, shouldn't we have seen a gradual increase in the rates of occurrence of these events, not a one-time jump?

From an individual performance standpoint, if you admit that something non-steroidal and permanent happened to increase run scoring in 1994, you're not left with a whole lot of very unusual performances. There's really just Barry Bonds, but Bonds' best years were farther ahead offensively of the next best guy in the league than any other player has ever been, so you really have to wonder why PEDs seem to help Bonds so much more than other players.

Posted by: josh at Dec 16, 2007 1:51:43 PM

Last post, I promise. Larry Walker's career splits (AVG / OBP / SLG):
At Coors Field: .381 .462 .710
On the road: .278 .370 .495

The year in question:
At Coors Field: .461 .531 .879
On the road: .286 .375 .519

Good gracious, that even shocked me.

(Also Todd Helton, career: .367 .465 .663 at Coors, .295 .394 .502 on the road.)
I think we can stop using Walker as a data point.
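Turning the splits quoted above into OPS makes the size of the park effect easier to see. This is a small Python sketch that only adds OBP and SLG and subtracts road from home; the dictionary labels are made up for illustration, and nothing here is a formal park adjustment.

```python
# (OBP, SLG) home/road splits, copied from the figures quoted above.
SPLITS = {
    "Walker, career": {"coors": (.462, .710), "road": (.370, .495)},
    "Walker, year in question": {"coors": (.531, .879), "road": (.375, .519)},
    "Helton, career": {"coors": (.465, .663), "road": (.394, .502)},
}


def ops(obp_slg):
    """OPS is simply on-base percentage plus slugging average."""
    obp, slg = obp_slg
    return obp + slg


def coors_gap(label):
    """OPS at Coors Field minus OPS on the road for one split."""
    split = SPLITS[label]
    return ops(split["coors"]) - ops(split["road"])


for label in SPLITS:
    home = ops(SPLITS[label]["coors"])
    away = ops(SPLITS[label]["road"])
    print(f"{label}: Coors {home:.3f}, road {away:.3f}, gap {coors_gap(label):+.3f}")
```

Each gap is more than 200 points of OPS, several times larger than the league-wide shift around 1994, which is why raw OPS for a Coors hitter says so little about aging.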

Luis Gonzalez hit the hell out of the ball in 2001, on the road and at home, but the weird thing is, his performance in 2002 is essentially the same as it was in 2000. It was really just one fluke year. Maybe steroid-enabled, but why just the one year?

Posted by: josh at Dec 16, 2007 3:37:39 PM

"I'll agree about the arrogance of economists."

Is it possible to apply some of that newfound humility to this blog?

Posted by: mik at Dec 16, 2007 7:09:18 PM

Tyler, it's not "debating points."

For example, I've been complaining that steroids impact sports statistics since the 1990s. And I did exactly the same type of age-profile analysis several years ago on my blog and came to the same conclusion: that Barry Bonds had started juicing. But at least I had enough respect for the accomplishments of sabermetricians to use a statistic they had invented to overcome league and park effects.

The effects of drug cheating are so massive that you can see them even using the wrong statistics, but to accuse individual players of cheating based on using the wrong statistics because you are too arrogant and lazy to learn what noneconomists know about what tools you should use for the job is disgraceful.

Unfortunately, it's symptomatic of the whole freakonomics fad, going back to its seminal publicity coup in 1999 when the newspapers breathlessly reported that Steven Levitt had shown that legalizing abortion in 1970-1973 had cut the crime rate in 1997 versus 1985, which were the two data points he looked at. He had taken his theory around to seminars at prestigious economics departments, and apparently nobody knew enough about criminology to point out that the homicide rate among 14-17 year olds had tripled between 1985 and 1993 (i.e., between the last cohort born before legalization and the first cohort born afterwards).

The moral of the story is that economists can't keep parachuting into other fields and expect to do good work without making a major effort to learn what the experts in the field already know.

Posted by: Steve Sailer at Dec 16, 2007 7:24:05 PM

Can somebody please do an analysis on the total amount of productivity being wasted in the U.S. on the discussion of steroids in Hit-a-ball-with-a-bat, the game?

If you weren't aware of the steroids in sports fifteen years ago, you should just be ashamed of your general observational skills, and take note that you're not going to educate anyone about anything much when you comment in blogs.

Posted by: infopractical at Dec 17, 2007 12:30:04 AM

Steroids (which are a form of synthetic sex hormones) are an interesting and important topic because they shed much light on the differences between the sexes, and thus on feminist controversies such as whether women should be allowed in combat units of the military.

For an analysis of how steroids affected the gender gap in Olympic running, and the implications for the women-in-combat controversy, see my 1997 National Review article "Track and Battlefield:"

http://www.isteve.com/gendrgap.htm

Posted by: Steve Sailer at Dec 17, 2007 1:03:45 AM
