Saturday, September 22, 2012

PTOLEMAIC MVP UPDATE 9/20/12

Let's wade in hip-deep with what will be the raging question of the early off-season...the question of the AL MVP.

But while we get our waders on, let's discuss what the Ptolemaic MVP method is and what it isn't, just to clarify for those of you who might want to know.

The method is intended, in a manner similar to what Bill James was doing with Hall of Fame voting, to predict what the actual voters for the MVP will do when they cast their ballots. It is a work in progress, because we haven't been able to take it back through history yet to determine how well it lines up with actual MVP voting.

The method doesn't try to factor in defense, baserunning, positional adjustment, or any of the other nuances that it could, for two reasons: first, these tend to be transient aspects of MVP voting in the real world, and second, because these nuances are not nearly so well-defined as their proponents claim (often falling victim to our cautionary observation that these nuances are being wielded like sledgehammers).

So, for those who are still here, let's recap how the Ptolemaic MVP data is collected and assembled. Every 5-7 days after we get two months into the season, we start taking snapshots of batting performance. (We can do this thanks to David Pinto's Day-By-Day Database. Thanks again, David, for all that you do.) From May 31st to September 20th, we've got 19 two-month snapshots that have been "graded" in seven offensive categories: four rate stats (OPS, OBP, SLG, and BA) and three counting stats (R, RBI, HR).

Mike Trout: Don't worry, we'll be getting to him soon...

We add up the point totals in each category, add the categories together, and voila--a Ptolemaic MVP score for each "epicycle" (see our earlier essay on Ptolemy for a more detailed explanation of that term). We then add each snapshot total together over the course of the season, and the MVP race is on.
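
For those who think better in code, here is a minimal sketch of that tallying. The method described here doesn't spell out how points are awarded within a category, so the "top N in each category earn N, N-1, ... 1 points" scheme below is purely an assumption, as are the function names and the made-up stat lines for three placeholder players.

```python
from collections import defaultdict

# The seven graded categories: four rate stats and three counting stats.
CATEGORIES = ["OPS", "OBP", "SLG", "BA", "R", "RBI", "HR"]


def grade_snapshot(stats, top_n=10):
    """Score one two-month snapshot.

    stats maps player -> {category: value}.
    ASSUMPTION: the top_n players in each category earn
    top_n, top_n - 1, ..., 1 points. The real grading scale may differ.
    Ties are broken by input order, which is also an assumption.
    """
    scores = defaultdict(int)
    for cat in CATEGORIES:
        ranked = sorted(stats, key=lambda p: stats[p][cat], reverse=True)
        for rank, player in enumerate(ranked[:top_n]):
            scores[player] += top_n - rank
    return dict(scores)


def season_totals(snapshots, top_n=10):
    """Add each snapshot's score together over the course of the season."""
    totals = defaultdict(int)
    for snap in snapshots:
        for player, points in grade_snapshot(snap, top_n).items():
            totals[player] += points
    return dict(totals)


# Made-up numbers for illustration only; these are not real stat lines.
sample_snapshot = {
    "Player A": {"OPS": 1.000, "OBP": .420, "SLG": .580, "BA": .330,
                 "R": 40, "RBI": 45, "HR": 12},
    "Player B": {"OPS": .950, "OBP": .400, "SLG": .550, "BA": .340,
                 "R": 42, "RBI": 38, "HR": 14},
    "Player C": {"OPS": .800, "OBP": .350, "SLG": .450, "BA": .290,
                 "R": 30, "RBI": 30, "HR": 8},
}

print(grade_snapshot(sample_snapshot, top_n=2))
print(season_totals([sample_snapshot, sample_snapshot], top_n=2))
```

Note that a player who misses the top N in every category simply doesn't appear in the result; under this assumed scheme nobody is ever charged points.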

Above you've got the current update for the National League. (Yes, we're still putting those waders on...) What we see in the Ptolemaic MVP data is that players with a solid run of two-month snapshots can linger on the list even after they experience a performance decline and/or suffer an injury (e.g. Carlos Gonzalez and Joey Votto). That can work against the customary ranking methods in a number of ways, and will produce one or two results in a season that will make some folks scratch their heads. For example, Matt Holliday, who rode a half-season hot streak (in 83 games from April 23 to August 1, he went .354/.442/.608, with 67 RBI), scores well in the Ptolemaic MVP sweepstakes.

So one shortcoming in the method is that it doesn't adjust downward when players subside. What they've earned they keep. There is no "negative" function to adjust for such a phenomenon. But we could argue that the sustained two-month peak is something worth taking into account; exactly how to do that without creating some distortions is a matter to grapple with.
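
To make the "what they've earned they keep" point concrete, here is a small sketch of how running totals behave when a player goes cold. The per-snapshot point values are invented for illustration, and the function name is hypothetical; the only thing taken from the method itself is that snapshot scores are added together and never subtracted.

```python
from itertools import accumulate


def running_totals(per_snapshot_points):
    """Cumulative score after each snapshot; points are only ever added."""
    return list(accumulate(per_snapshot_points))


# Invented point values: one hitter is hot for three snapshots and then
# drops off the board, the other scores modestly in every snapshot.
hot_then_cold = [12, 14, 13, 0, 0, 0]
steady = [7, 7, 7, 7, 7, 7]

print(running_totals(hot_then_cold))
print(running_totals(steady))
```

With these made-up numbers the hot-then-cold hitter stays ahead through the fifth snapshot despite earning nothing after the third, which is the lingering effect described above.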

Miguel Cabrera: the man whose run at the Triple Crown may well provoke some sustained tantrums from numberologists come the middle of November...

We've also (for the first time) broken out the totals by category and displayed them. This adds another shape component to the data (and those who used to read BBBA know that we have always defended "shape" stats against those who would focus solely on "value"; the new componentizing of WAR at least attempts to address the idea of "shape," but its current application runs a grave risk of distortion and apples-oranges conundrums).

We're seeing more of that than we'd like in the AL MVP race--particularly as it's being framed by WAR proponents, whose goal is to change the landscape of the voting to match their numerical results. (They would argue that what they are really doing is to better reflect reality, to account for defense, baserunning, positional adjustment; the question is whether their methods do that in a sufficiently reliable way to supplement or supplant more traditional methods.)

At this point in time, the answer to that question is a resounding "no." This is not the place for a detailed examination of the reasons for arriving at that conclusion, but defensive metrics are plainly and simply a mess--inconsistently thought out, peculiarly applied, with confounding results. (The progenitors of the method, in attempting to defend it, have been trying to give it "oomph" by putting it into a framework comparable to the Pythagorean Winning Percentage--which is showing some of its own anomalies again in 2012. They've also suggested that fielding results are as variable on a year-by-year basis for individuals as is the case for batting results--an assertion that is laughable on its face. Translations into runs above average for fielding results use interpolations of events that can't be measured in anything remotely as precise as what's used for offense or pitching, and the attempt to tease out the "significant" fielding events--those that are crucial either contextually or in terms of degree of difficulty--has resulted in overstating the level of difference between best and worst.)

So when a system such as WAR incorporates a defensive method that has a built-in "overstatement danger", it can produce an aggregate result that makes a very fine season (like the one Mike Trout is turning in this year for the Angels) into one that seems as though it is "historic" (a favorite term of those following in the footsteps of the neos: it's one of their uses of "narrative" as they look to control the rhetoric of baseball discourse in the Icarus-like arc of the "consultancy culture" that's producing increasingly mixed results in MLB's cauldron of insiderdom).

Trout is having a great year, no doubt. And it is "historic"--at least in the sense that he's added his name to the list of stellar 20-year-old players who've entered the pantheon of wunderkinder. He looks certain to be the seventh player of that age to post an OPS+ of 160 or higher (the other six, in order of accomplishment: Ty Cobb, Mel Ott, Ted Williams, Mickey Mantle, Al Kaline, and Alex Rodriguez).

But guess what--none of those players were MVPs in their respective 20-year-old seasons. (Yes, two of them--Cobb and Ott--played at times when such an award didn't exist.) Williams finished fourth; Mantle third; Kaline and A-Rod finished second. Now you can argue that all of these results were incorrect; but in the cases where that argument actually holds, there's a player with at least as much of an argument for the award as the 20-year-old phenom (in 1939, it's Joe DiMaggio and Jimmie Foxx; in 1952, it's Bobby Shantz and Larry Doby--remember, there was no Cy Young Award in 1952; in 1955, it's Mantle; in 1996, it's Ken Griffey, Jr.).

And in 2012, Miguel Cabrera is the potential obstacle keeping Trout from joining Fred Lynn and Ichiro Suzuki as the only players to win the MVP and the Rookie of the Year Award in the same season. Going by offensive data, it's an extremely close race between the two.

Proponents of WAR will spend a good bit of time between now and mid-November arguing for the efficacy of those ancillary measures, which appear to propel Trout into a level of achievement virtually beyond the realm of understanding. We suspect, however, that the voting will look a good bit like what the numbers in the AL Ptolemaic MVP race are currently displaying.

It will be close--despite any separation in WAR claimed by its proponents on the basis of other aspects of the game (aspects that have been dubiously quantified). Anyone who expects Mike Trout to win the MVP in a runaway from Cabrera is smoking something that has more than mere medicinal value.

We're not quite satisfied with the way the Ptolemaic MVP is deployed, and some amount of thought will be applied to it over the off-season. We'll post periodically on it during that time, and we'll follow through with the final totals early next month, along with a few other variations on it that may prove to be of some interest, particularly to those with a penchant for "shape" as well as "value."