Monday, September 17, 2012

MAPPING THE RATE OF CHANGE IN BASEBALL

Typecasting and scapegoating go hand in hand in the post-postmodern age, thanks to the swiftness with which folks can take umbrage at virtually anything. The Internet sometimes seems like nothing more than an invitation to a hissy-fit, creating information in the place of knowledge and privileging the one-liner as the ultimate instrument of distortion.

Now we are not really complaining, merely observing--and we are trying hard not to cross the line that virtually everyone in the little world of baseball has crossed over the past couple of decades. We are content to be misunderstood, even maligned, for in that there are true seeds of freedom--the freedom to engage anyone, anything, everyone, everything. Of course our negativity is exaggerated; we don't mind that folks are more comfortable characterizing it that way.

And so of late we've been a bit more sympathetic to the situation that Bill James has found himself in--though apparently Bill isn't particularly perturbed about it. (There has been a lot of flak directed at him about Joe Paterno, but much of this is due to his loyalty to Joe Posnanski, the beleaguered author of Paterno's biography--a volume whose sycophancy was almost as bare-assed as a Playboy centerfold. Not a good plan in the midst of the Jerry Sandusky firestorm...)

Bill decided a few years ago to put on a public face (aka front himself at his own web site...) and risk these extremes of snark and sycophancy, and he's been pretty darned indefatigable about it, even if the results are predictably mixed. His web page is almost worth what it costs to join, and that's more than one can say for any other so-called "oasis of advanced baseball theory" that's out there trying to separate you from your money.

Best of all is that Bill does occasionally write some fascinating articles about baseball--most of them utilizing his new "Lego-land" approach to modeling topographical patterns that collide with his still-restless mind. He is deeply into the continuous sequence modeling approach, having created a useful (and entertaining) look at who the best starting pitcher in baseball is at any given point in time.

But there is more breadth to Bill than just numbers. (This is something that his hopelessly sycophantic friend Joe P. is actually right about.) And that more all-encompassing approach to history is evident in his essay about determining eras in baseball history (an essay written back in late July).

It's a fascinating approach, built around identifying key events, key players, key rule changes, key points of change in various subelements (such as the benchmark value for leading the league in various offensive categories), and assigning them a point value (similar, in fact, to how we do some of the work in our QMAX system for starting pitchers).

To give you a sense of what's involved in that, we've excerpted a piece of Bill's description of these "items of change."

Once he does that, Bill can then identify the years where the greatest amount of change occurred, and then massage that into a grouping of eras.

What we were struck by, however, was the fact that such a great idea was being put to the service of such a nebulous historical construct. Particularly when that idea could be used to characterize, if not exactly measure, the actual rate of change occurring in baseball from its infancy as a professional sport to the present day.

Now, it turns out that Bill's method of quantifying the elements of change can be displayed as a separate sum for each year. Those peak single-season sums are what Bill uses to determine eras. But if you plot them on a graph--and if you cluster that data into multi-year totals--you can see what the rate of change is in baseball across time.



We're looking at the rate of change by summing two years of Bill's data: this captures the ongoing effects of the changes as they overlap. This is a more dynamic approach to the data than what Bill did with it, and while it's still simply an abstraction, it's a powerful visualization.
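For anyone who wants to try the two-year summing at home, here is a minimal Python sketch. It assumes the yearly change points sit in a dict keyed by year. The numbers below are invented placeholders, not Bill's actual values, and the function name `rolling_sum` is ours. We also assume a trailing window (each year plus the year before it), which is only one reasonable way to line up the totals.

```python
def rolling_sum(points_by_year, window=2):
    """Sum each year's points with the preceding (window - 1) years.

    Years missing from the input are treated as zero.
    """
    years = range(min(points_by_year), max(points_by_year) + 1)
    return {
        year: sum(points_by_year.get(year - k, 0) for k in range(window))
        for year in years
    }


# Placeholder values for illustration only (NOT Bill James's data).
sample = {1919: 3, 1920: 12, 1921: 5, 1922: 2}

two_year = rolling_sum(sample, window=2)
for year, total in two_year.items():
    print(year, total)
```

In this made-up sample the two-year total for 1921 picks up the large 1920 value as well as its own, which is the "overlap" effect we're after.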

The big peaks (1880s, 1893-4, 1920, 1961-62, 1969, 1994) make sense in the context of what Bill has identified as the elements of change.

The slow period in the 1980s certainly reflects Brock Hanke's notion that the owners recoiled from the particular changes on and off the field in the 60s and 70s and tried to "freeze-dry" the game for a good long while.

We can smooth this data out a bit more by rendering it in four-year totals. This tends to put the "peak" of change a bit further back from the point in time where the individual yearly peaks occurred, but it might be a better representation of the "perceived rate of change," since people don't tend to catch up to change right when it happens.
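To see how a four-year total moves the peak relative to the single-year peak, here is a small Python sketch. As before, the yearly values are invented placeholders rather than Bill's numbers, the helper names are hypothetical, and the trailing window (a year plus the three before it) is our assumption about alignment. A centered window would place the peak differently.

```python
def trailing_total(points_by_year, window=4):
    """Total each year's points with the preceding (window - 1) years.

    Years missing from the input are treated as zero.
    """
    years = range(min(points_by_year), max(points_by_year) + 1)
    return {
        year: sum(points_by_year.get(year - k, 0) for k in range(window))
        for year in years
    }


def peak_year(totals):
    """Return the year with the largest total (earliest year wins ties)."""
    return max(totals, key=totals.get)


# Placeholder values for illustration only (NOT Bill James's data).
sample = {1960: 2, 1961: 9, 1962: 8, 1963: 3, 1964: 1, 1965: 1}

four_year = trailing_total(sample, window=4)
print("single-year peak:", peak_year(sample))
print("four-year peak:", peak_year(four_year))
```

With these placeholder values the single-year peak is 1961 while the four-year total tops out in 1963, a two-year gap between the event and the point where the accumulated change is greatest.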

Bill is still coming up with highly creative ways to think about baseball. While we're a bit surprised he didn't think of this application, we are grateful for the opportunity to demonstrate once again why he has been a singular mind in this strange little field for so long.