Wednesday, October 5, 2011


Those of you familiar with Ptolemy, the Roman-Egyptian scientist of the second century AD, will already know that a Most Valuable Player selection method named after him will at the very least be overly complicated (if not also overly ingenious).

For this MVP method we are recasting Ptolemy's notion of epicycles not just to explain a mysterious and seemingly nonsensical physical phenomenon (the apparent retrograde motion of celestial objects in the nighttime sky), but to add contextual detail with the statistical data available to us within a single baseball season in order to more fully explain the level of achievement that the candidates have demonstrated.

Of course, given all the counting and rate stats that already exist and that are available for use in generating "scientific" results for baseball awards, we might already conclude that this data constitutes its own form of epicyclical misdirection. All manner of statistics are manufactured, manipulated, and championed above all others in the frenetic discourse that wells up during the two months after the regular baseball season concludes; in recent years, there has been an escalation of hostilities between the BB-WAA (the group officially sanctioned to actually confer post-season awards) and those armchair experts whose first words when confronted with the efforts of this "esteemed panel" of voters are the same as those uttered by John Wilkes Booth when he leapt to the stage at Ford's Theatre after fatally wounding Abraham Lincoln.

String theory? No, the encroaching epicycles in the
cross-hatched "discourse" during baseball's "award season"...
Such a chronically bloody state of affairs, and the intensification of posturing between the two camps, makes one think that the process is as hazy and inchoate as the epicycle itself (as revealed in the artist's rendering at right, depicting the true chaos we cannot see, a version of the "fog" that only Bill James and Errol Morris seem not to underestimate).

Even simplifying baseball's myriad stats into summarizing surrogates--on-base plus slugging (OPS), league-adjusted OPS (OPS+), and Wins Above Replacement (WAR, perhaps finally good for something...)--doesn't erase the waves of loud talk and hazy assertion, privileging and marginalizing, and just plain shaking and baking that insinuate themselves into the bloodstream and make the award season into a protracted St. Vitus Dance of popped eyes and pointed fingers.

We are here to transcend that. And, of course, the way to achieve such transcendence is to plow right into the sticky mire of the epicyclical method itself, refashioning it for what these backward, overlapped enclosures can reveal to us. Within a season there are components that simply don't get examined in the rush to use the final stats as the only legitimate form of evidence for assessing value.

In the first, more mercifully short section in this series (full disclosure: there will be one more, unless the planetary orbits as manifested in all those epicycles close in on each other and collide...), we showed you the data for hitters in August-September. That should have been a clue to how a Ptolemaic MVP method would work (and it should also have been a clue for you to head for the hills, but since you're presumably still here, hang on to your Shetland ponies and let's all aim for the windmill on the left).

The Ptolemaic MVP method looks at five "in-season" snapshots of hitting data, each comprising two months of the season (April-May; May-June; June-July; July-August; August-September). It has a kind of "retrograde" forward movement; this is to capture slices of data (each capturing about a third of the season) that can isolate peak performance and characterize consistency levels over the course of a season.
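For the code-minded, here is a minimal sketch of how those five overlapping windows fall out of the six months of the season. The names `MONTHS` and `epicycle_windows` are ours, invented for illustration. The only thing the sketch shows is that each window starts one month after the previous one, so adjacent windows share a month.

```python
# Months of the regular season, in order.
MONTHS = ["April", "May", "June", "July", "August", "September"]


def epicycle_windows(months=MONTHS):
    """Return overlapping two-month windows, each sliding forward by one month."""
    return [(months[i], months[i + 1]) for i in range(len(months) - 1)]


windows = epicycle_windows()
for first, second in windows:
    print(f"{first}-{second}")
```

Six months, sliding by one, gives five snapshots; the shared month is why a hitter's position can shift noticeably between consecutive windows even though half the underlying data is the same.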

Each two-month snapshot is evaluated and graded according to a point system, and the totals are accumulated over the five measurement points beginning at the end of May, with the points added up and combined with a select group of full-season stats to produce, via the magic of epicycles, a Ptolemaic MVP. (Which may or may not be the same player who's selected by the BB-WAA or its mostly disloyal opposition.)

We will start with the National League, since its MVP race seems to be less fraught with other complicating issues in 2011--such as whether a pitcher can be MVP, a topic that may continue to be in play if the Detroit Tigers can go deep into the post-season (even though the voting is already over).

The first NL 2011 epicycle (April-May):

We've added a few extra players here just to give you a sense of perspective. Normally we wouldn't go as low as the OPS values shown by Ryan Howard, Troy Tulowitzki, and Albert Pujols for April-May; but these are folks who've been MVPs or strong candidates in the past, or who will emerge into the picture more prominently in later epicycles (you can call 'em "snapshots" if you like).

The chart shows the hitters in descending order of OPS. We've color-coded OPS twice--orange for 1.000 and above; yellow for .950-.999. All OBPs over .400 are shown in yellow, as are all SLGs over .600. In 2011, at least, these are not plentiful even at the two-month snapshot level, so any ideas that this data would be egregiously distorted seem dismissible.

The points work like this. First column (Pts) gives a point for OBPs at .400 or more (to be charitable, flexible, and, yes, epicyclical, we round up from .395); a point for SLG at .600 or more (see previous parenthesis); and one point for OPS from .950-.999, two points for OPS at 1.000 or higher. That's a possible total of four points in this category, and the only NL hitter to do that in April-May 2011 was Lance Berkman.
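A rough sketch of the first column, under a couple of assumptions of ours: the function name `pts` is hypothetical, the "round up from .395" courtesy is carried over to SLG as .595 (per the parenthesis above), and no such rounding is applied to OPS, since none is described. Stats are converted to whole thousandths so that a .395 really does compare as .395. The sample stat lines at the bottom are made up, not taken from the charts.

```python
def pts(obp, slg, ops):
    """First-column points: OBP (max 1) + SLG (max 1) + OPS (max 2)."""
    # Work in integer thousandths to avoid floating-point edge cases.
    obp_k, slg_k, ops_k = (round(v * 1000) for v in (obp, slg, ops))

    points = 0
    if obp_k >= 395:      # .400 or more, rounding up from .395
        points += 1
    if slg_k >= 595:      # .600 or more, same rounding courtesy (assumed .595)
        points += 1
    if ops_k >= 1000:     # 1.000 or higher
        points += 2
    elif ops_k >= 950:    # .950-.999
        points += 1
    return points


# Made-up stat lines, for illustration only.
print(pts(0.410, 0.620, 1.030))
print(pts(0.395, 0.550, 0.945))
```

A hitter clearing all three bars maxes out at four points; one who only sneaks over the rounded OBP line gets one.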

Second column (Pts2) revisits this idea and gives a point for all OBPs at .400 or higher and a point for all OPS values at .900 or higher. In keeping with the idea that OBP is more valuable than SLG, we double-weight it in the point totals, and we give some props to people in the .900-.950 OPS range, which is sort of the minimum that a credible MVP candidate should be achieving consistently throughout the season.
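The second column might be sketched like so. The name `pts2` and its keyword parameters are ours; we assume the plain .400 and .900 cutoffs with no rounding courtesy, since the description doesn't say whether the .395 rule carries over. The floors are parameters so you can loosen them if you disagree.

```python
def pts2(obp, ops, obp_floor=0.400, ops_floor=0.900):
    """Second-column points: one for OBP at the floor or better, one for OPS."""
    # Compare in integer thousandths to sidestep floating-point surprises.
    obp_ok = round(obp * 1000) >= round(obp_floor * 1000)
    ops_ok = round(ops * 1000) >= round(ops_floor * 1000)
    return int(obp_ok) + int(ops_ok)


# Made-up stat lines, for illustration only.
print(pts2(0.410, 0.920))
print(pts2(0.380, 0.910))
```

Since a .400 OBP already earned a point in the first column, the second point here is what double-weights it.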

Third column (Pts3) is, yes, a sop to traditionalists: it awards points for high HR and high RBI totals, regardless of one's OBP, SLG, or OPS values. While RBIs are a tainted stat, someone having a great run in either of these two stats over a concentrated in-season period (let's call it an "epicycle," shall we?) is still a powerful indicator of value, and these stats deserve a tertiary role in the overall MVP selection scheme. (The highest possible total for this category in any "epicycle" is two, one for each stat.)
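A sketch of the third column, with a loud caveat: the text doesn't say what counts as a "high" HR or RBI total for a two-month window, so the cutoffs are required arguments here and not baked-in defaults. The function name `pts3` and the two `HYPOTHETICAL_` values in the usage lines are placeholders of ours, not the thresholds actually used for the charts.

```python
def pts3(hr, rbi, *, hr_cutoff, rbi_cutoff):
    """Third-column points: one for a high HR total, one for a high RBI total."""
    return int(hr >= hr_cutoff) + int(rbi >= rbi_cutoff)


# Placeholder cutoffs, purely hypothetical; the real ones are not stated.
HYPOTHETICAL_HR_CUTOFF = 12
HYPOTHETICAL_RBI_CUTOFF = 40

# Made-up two-month totals, for illustration only.
print(pts3(15, 45, hr_cutoff=HYPOTHETICAL_HR_CUTOFF, rbi_cutoff=HYPOTHETICAL_RBI_CUTOFF))
print(pts3(8, 42, hr_cutoff=HYPOTHETICAL_HR_CUTOFF, rbi_cutoff=HYPOTHETICAL_RBI_CUTOFF))
```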

Here is the data for the "epicycle" of May-June 2011:

Some interesting data in this snapshot--players you are probably not thinking of as having much business in the MVP discussion, such as Mike Morse and Brian McCann, both displaying some impressive peak hitting. Of course, to win the Ptolemaic MVP (or the real one, for that matter), they will have to keep this up, as you're probably already guessing will be the case for two of the other top-end hitters here, Matt Kemp and Prince Fielder.

Note the shift in position for Berkman--this despite the fact that the data sets have a common month (May).

Also note that amazing, singular triples total for Jose Reyes.

The point totals in each epicycle are collected, collated, combined and--actually, that's all that happens, we were going to say that they were combobulated, but that word doesn't actually exist without its prefix.

The data for the NL epicycle of June-July 2011:

Facial follicles = fired-up Freddie!!

Albert Pujols assumes what most would consider to be his customary position on the NL hitting charts in this snapshot. Note the big homer/isolated power performance from Lance Berkman in this time frame (and, further down the chart, a similar profile for Aramis Ramirez), and note the emergence of Justin Upton and Ryan Braun.

Check out the fine stretch of hitting by Braves' rookie Freddie Freeman.

But mostly note the "max-out" point value achieved for the second straight "epicycle" by Kemp.

The NL epicycle for July-August 2011:

Witness the ascendance of Troy Tulowitzki, Joey Votto and Ryan Braun. See Dan Uggla reverse what had, to this point, been an ugly season.

Note the remarkable similarity, in shape and value, displayed by the stat lines of Pujols and young Marlins slugger Mike Stanton.

Notice the presence of a third big hitter in the Brewers lineup in addition to Braun and Fielder--right fielder Corey Hart.

Kemp and Fielder sort of "lay out" in this epicycle, but they'll be back. Prince may have been sidetracked by all those donuts...

We displayed the August-September epicycle in Part One, but heaven forbid that you should have to do anything more than twitch involuntarily, so we will wantonly waste bandwidth and reprint it here:

We can see how the Brewers were led down the stretch by their two MVP candidates, with some solid assistance from Hart.

Sid Bream, checking out real estate listings on behalf
of James Loney and his impending move to Pittsburgh...
And, yes, the Dodgers' resurgence in September was not just due to the fine MVP stretch drive by Kemp...he got some help from a player whom most of us thought had been left for dead...which is why he's now known as James Freakin' Loney. The lanky first sacker was trying to save his job, and he just might have done that. Either that, or he's about to go the way of Sid Bream...

How does it all look when we add it up? Well, naturally, since this is the Ptolemaic MVP method, it can't possibly be that simple...we also add in two other full-season statistical measures (which were mentioned previously, in what must seem like the last century for you ADD champions out there). Those are (again...) OPS+ and Wins Above Replacement. These are awarded in 10-9-8-7-6-5-4-3-2-1 fashion in order of the players' rankings in each stat, and then added to the Ptolemaic totals.
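Here is a hedged sketch of that last step. The function names (`rank_points`, `final_totals`) and the three-player pool with its numbers are invented for illustration. We also assume that rankings are taken within whatever candidate pool you pass in, that anyone ranked below tenth gets zero, and that ties simply keep their input order; none of those details is spelled out above.

```python
def rank_points(values, top=10):
    """Map each player to 10-9-8-...-1 points by rank in one stat (0 beyond `top`)."""
    # sorted() is stable, so tied players keep their input order (an assumption).
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: max(top - i, 0) for i, name in enumerate(ordered)}


def final_totals(epicycle_pts, ops_plus, war):
    """Add OPS+ and WAR rank points to each player's accumulated epicycle points."""
    ops_pts = rank_points(ops_plus)
    war_pts = rank_points(war)
    return {
        name: total + ops_pts.get(name, 0) + war_pts.get(name, 0)
        for name, total in epicycle_pts.items()
    }


# Made-up players and numbers, for illustration only.
epicycle_pts = {"Player A": 20, "Player B": 18, "Player C": 9}
ops_plus = {"Player A": 160, "Player B": 170, "Player C": 140}
war = {"Player A": 8.0, "Player B": 7.5, "Player C": 5.0}

totals = final_totals(epicycle_pts, ops_plus, war)
for name in sorted(totals, key=totals.get, reverse=True):
    print(name, totals[name])
```

In this toy pool, Player B leads in OPS+ but Player A's edge in epicycle points and WAR keeps A on top, which is the kind of push and pull the full-season add-ons are meant to supply.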

When we do that, the final standings in the NL Ptolemaic MVP method are as follows:

Surprised? Disappointed? Relieved? Ready for the obligatory picture of the scantily clad babe? (Yes--well, we knew that you've been ready for that for the last two thousand words or so.)

The fearsomely hot Hindu goddess Kali
would be scantily clad, if she'd just doff
all of her "trophy dead." Hey, babe, let me
help you take that off...
Since Ptolemy's epicycles are very much aligned with OPS+ in the way that the points are assigned, it's probably not all that surprising that there's a reasonable correlation between the rankings in those two methods, while its relationship to WAR is less direct.

What the Ptolemaic MVP promises to do is to provide a framework for plotting and forecasting the MVP race as it unfolds during the season. Since its results are quite reasonable overall, and since it also identifies some players who have notable in-season peaks (such as Pujols, Morse, Upton, and--lower down in the voting but still notable--guys like Uggla, Loney, McCann and Stanton), we're pretty sanguine about using it as an ongoing projection in future seasons.

Coming soon, the acid test: the freakin' American League...