Hot Stove U: WAR: What is it Good For?

The Setup

It’s generally pretty easy to tell who is good at baseball. You don’t need to be a rocket scientist to realize that Joe Mauer’s .365 batting average last year was tremendous, especially for a catcher. Likewise, pretty much anyone can recognize greatness in Prince Fielder’s 46 home runs, Zack Greinke’s 2.16 ERA or Tim Lincecum’s 261 strikeouts.

However, as baseball fans, we were born with the desire to argue over whether one player is better than another, and these numbers do not lend themselves to easy comparison. Mauer doesn’t have an ERA, because he’s not a pitcher. The Giants don’t care that Lincecum failed to hit a home run last year. Even comparing offensive players to other hitters can be a problem; Fielder would be a disaster at shortstop, so stacking his numbers up against Troy Tulowitzki’s is comparing a massively large apple to oranges.

Thankfully, we now have a metric that allows for comparison among players across positions, and even between pitchers and hitters, totaling up all the things each does to help a team win, no matter what his particular skill is. Hitters, defenders, pitchers — everyone is graded on the same scale. This is why we love Wins Above Replacement.

The Proof

WAR, as it is often abbreviated, is fairly simple in theory. The idea is to take a player’s total contribution in creating runs (hitting and baserunning), as well as preventing them (pitching and defense), and then compare those totals to what a team would have expected to get if they had spent the league minimum on some randomly available Triple-A player (the so-called “replacement player”).

By measuring all contributions by the run value they create (or save), we can measure widely different things, such as strikeouts and home runs. For example, a single is worth, on average, about half a run, a stolen base is worth about 0.2 runs, and a strikeout takes away approximately 0.3 runs. So, if Derek Jeter is 2-for-4 with two singles, a stolen base and two strikeouts in a particular game, he has created approximately 0.6 runs on offense.
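To make the arithmetic concrete, here is a minimal sketch of that tally in Python. The three run values are the rough averages quoted above; the names `RUN_VALUES` and `runs_created` are made up for illustration and are not part of any official WAR implementation.

```python
# Approximate average run values quoted in the article.
RUN_VALUES = {
    "single": 0.5,
    "stolen_base": 0.2,
    "strikeout": -0.3,
}


def runs_created(events):
    """Sum the run value of each event type times how often it happened."""
    return sum(RUN_VALUES[name] * count for name, count in events.items())


# Jeter's hypothetical game: 2-for-4, two singles, a steal, two strikeouts.
jeter_game = {"single": 2, "stolen_base": 1, "strikeout": 2}
jeter_runs = runs_created(jeter_game)

print(f"Jeter: {jeter_runs:.1f} runs")
```

Two singles add about 1.0 runs, the steal adds 0.2, and the two strikeouts take away 0.6, which is where the 0.6 total comes from.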

Because every action on the field affects run-scoring to one degree or another, we can then compare that total to other players’ performances, even if they didn’t have any singles, stolen bases or strikeouts. For example, if Mark Teixeira went 1-for-4 with a home run in that same game, he would create very similar offensive value to Jeter’s, even though he had one fewer hit and made an extra out. His long ball had more impact than any one thing that his speedier teammate did, and the trade-off between quantity and quality essentially cancels out.
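The comparison can be sketched the same way. The single, stolen base and strikeout values come from the article, but the home run value (about 1.4 runs) and the value of a generic out (treated here as equal to a strikeout, about -0.3 runs) are assumptions added for illustration. Treat the result as a ballpark figure, not an official calculation.

```python
# Run values: the first three are quoted in the article.
# "home_run" and "out" are ASSUMED values used only for this illustration.
RUN_VALUES = {
    "single": 0.5,
    "stolen_base": 0.2,
    "strikeout": -0.3,
    "home_run": 1.4,  # assumption: rough average value of a home run
    "out": -0.3,      # assumption: a generic out costs about what a strikeout does
}


def offensive_runs(events):
    """Total run value of a batting line given as {event name: count}."""
    return sum(RUN_VALUES[name] * count for name, count in events.items())


# Jeter: 2-for-4 with two singles, a stolen base and two strikeouts.
jeter = offensive_runs({"single": 2, "stolen_base": 1, "strikeout": 2})

# Teixeira: 1-for-4 with a home run and three outs.
teixeira = offensive_runs({"home_run": 1, "out": 3})

print(f"Jeter:    {jeter:.1f} runs")
print(f"Teixeira: {teixeira:.1f} runs")
```

Under these assumed values the two lines land within about a tenth of a run of each other, which is the sense in which quantity and quality cancel out.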

We can apply this concept to all aspects of the game, not just offense. Each out created by a pitcher or defender also saves runs, and once we translate their numbers into a total of runs saved, we can then compare those numbers across positions. (Due to the particular challenges of quantifying a catcher’s defensive value, all catchers are assumed to be equally average behind the plate, so your favorite good defensive catcher will be underrated by WAR. This is the stat’s biggest flaw.)

Without getting into all of the calculations (you can find a 14-part, in-depth series on how WAR is calculated in the glossary at FanGraphs if you’re curious), WAR then takes those total values of runs saved and created, adjusts for relative scarcity between different positions, and converts the runs into wins. The result is the number of wins a team would expect to lose if that player got hurt and had to be replaced by some veteran minor leaguer or journeyman bench guy.

That guy is the baseline because he represents the expected value that could be had for no real cost. For instance, a year ago, the Mariners signed Mike Sweeney to a minor league contract and gave him a part-time job as their designated hitter against left-handers. He made no real money, produced just a fraction of a Win Above Replacement, and is now looking for work again. At this point in his career, Sweeney is the epitome of a replacement-level player. He costs nothing, produces at a level good enough to hang around without being overly useful, and bounces from one club to another looking for work each year.

In reality, WAR could be named “Wins Above Mike Sweeney,” because players just like him are the baseline against which all players are compared.
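For readers who want to see the shape of that last step, here is a rough sketch of turning runs into wins above replacement. It is not the FanGraphs formula. It assumes the common rule of thumb of about 10 runs per win, and the player and every number in his line are hypothetical.

```python
# ASSUMPTION: roughly 10 runs equal one win (a common rule of thumb;
# the real conversion varies by season and run environment).
RUNS_PER_WIN = 10.0


def wins_above_replacement(batting_runs, fielding_runs,
                           positional_runs, replacement_runs):
    """Convert a player's run totals into wins above replacement.

    batting_runs, fielding_runs: runs above (or below) league average.
    positional_runs: adjustment for the scarcity of his position.
    replacement_runs: gap between an average player and a replacement
        player over the same playing time.
    """
    total_runs = (batting_runs + fielding_runs
                  + positional_runs + replacement_runs)
    return total_runs / RUNS_PER_WIN


# Hypothetical full-time player; every number here is made up.
war = wins_above_replacement(
    batting_runs=25.0,      # well above average at the plate
    fielding_runs=5.0,      # a little above average in the field
    positional_runs=-5.0,   # plays an easier-to-fill position
    replacement_runs=20.0,  # credit for being average rather than replacement
)

print(f"WAR: {war:.1f}")
```

The replacement term is what separates "above average" from "above Mike Sweeney": even a perfectly average regular has value, because the freely available alternative is worse than average.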

The Conclusion

Bill James once said that if a metric always gives surprising results, it is probably wrong, and if it never gives surprising results, it’s useless. WAR succeeds marvelously on this account. In 2009, for example, it matched quite well with the players we would expect to have been the best (Zack Greinke, Albert Pujols, Tim Lincecum, Joe Mauer) and worst (Yuniesky Betancourt, Jose Guillen, Aubrey Huff), while surprising us enough to be useful (Ben Zobrist’s outstanding season, Jermaine Dye’s decline). Of course, given the small difference in WAR between Zobrist, Pujols and Mauer, along with the catcher-defense flaw in the stat, it is reasonable to conclude that Mauer was the most valuable everyday player.

WAR is not perfect, but it does a very good job of grading an individual player’s contribution, crediting him for what he produces on the field. Replacement level is a good baseline that accounts for how the baseball market actually works, and it enables teams and fans to better evaluate contracts and trades. It takes into account all aspects of a position player’s game rather than just his obvious strength or weakness. And finally, it is measured on the scale of wins, which every fan can understand is the whole point of playing the game in the first place.





Matt Klaassen reads and writes obituaries in the Greater Toronto Area. If you can't get enough of him, follow him on Twitter.

One Response to “Hot Stove U: WAR: What is it Good For?”

  1. Scott says:

    Nice Seinfeld reference. Definitely got my attention.