How small is the sample size that might tell us something about the quality of team performance? We know that the fixation on the first five to ten games of the baseball season has led to a good deal of silliness (though probably not all that much more than the subject of baseball in general...), but just what number of games played can provide a viable framework for some type of predictive vantage point?
We need to drive this idea beyond some of the usual "big number" perspectives. So, to get the ball rolling, let's use the 2011 Boston Red Sox as a jumping-off point. Some folks were doing some "jumping off" when the Sox began the year 0-5, 2-9--and finally 2-10 before reversing course and winning eight of their last nine games. They are now at 10-11 and lots of folks--even those who know all about small sample sizes--are breathing sighs of relief.
Can we get some kind of handle on the Red Sox now that they've righted themselves? Can these small units of the season--let's just bite the bullet and decide to use eleven games as the measure--provide us with a different kind of glimpse into team performance?
The answer: yes, to a much greater extent than is currently believed. The table at the right shows the Red Sox' basic data for the first two eleven-game units of the 2011 season (yes, I know we don't quite have a complete grouping in the "game 12-22" set: try not to let that bring your blood to a boil--at least not yet). What we see in the basic stats is that the Sox' reversal is predominantly in pitching: their hurlers got pummeled in the first 11 games (6.5 runs allowed per game) but have shut down the opposition in the ten games played so far in the second unit (2.3 runs allowed per game).
What we want to know about those two eleven-game performances is: how extreme are they in the context of the entire set of eleven-game units? Now we're going to "cheat" and use only sequential eleven-game units: 1-11, 12-22, 23-33, etc., with an overlay at the end of the year (152-162) that partially repeats the last sequential eleven-game unit (144-154). That gives each team fifteen units, or 450 across the thirty major league teams. Purists will want to argue that we should count all the intermediate (overlapping) eleven-game units, but having looked at this, it seems clear that we can get something useful from the 450 sequential measures.
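For anyone who wants to reproduce the grouping, here is a minimal sketch of the unit boundaries described above. The function name `eleven_game_units` and its parameters are made up for illustration; the only things taken from the text are the 11-game unit, the 162-game season and the end-of-year overlay.

```python
def eleven_game_units(season_length=162, unit=11):
    """Return (first_game, last_game) pairs, 1-indexed and inclusive."""
    units = []
    start = 1
    # Sequential, non-overlapping units: 1-11, 12-22, ..., 144-154.
    while start + unit - 1 <= season_length:
        units.append((start, start + unit - 1))
        start += unit
    # If games are left over, add one overlay unit covering the final
    # `unit` games of the season; it partially repeats the previous unit.
    if units and units[-1][1] < season_length:
        units.append((season_length - unit + 1, season_length))
    return units


units = eleven_game_units()
print(len(units))        # units per team
print(units[-2:])        # last sequential unit and the overlay
print(len(units) * 30)   # total across 30 MLB teams
```

Run as written, this yields fifteen units per team, ending with 144-154 and the 152-162 overlay, which is where the figure of 450 comes from.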
So--how extreme were these? Looking at the 2010 data, we can see that the Red Sox' runs allowed (RA) per game in games 1-11 (6.5) would have ranked 441st among the 450 eleven-game units in MLB last year. Stopping time a game early for the 12-22 unit, we can also see that the Sox' RA for this group (2.3) would have ranked seventh best. So--literally from the bottom ten to the top ten in adjacent 11-game units: that's one of the most extreme reversals you're ever likely to see.
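The ranking itself is simple to sketch: a unit's rank is one plus the number of units with a lower RA. The helper `ra_rank` is a hypothetical name, and the short list below is placeholder data purely to show the mechanics, not the actual 2010 figures.

```python
def ra_rank(ra, all_unit_ras):
    """1-based rank of `ra` among `all_unit_ras`; lower RA ranks better.

    Ties share the best available rank.
    """
    return 1 + sum(1 for other in all_unit_ras if other < ra)


# Placeholder values for illustration only (NOT real 2010 data).
sample_ras = [2.0, 2.5, 3.0, 4.0]
print(ra_rank(2.3, sample_ras))
print(ra_rank(6.5, sample_ras))
```

With the real list of 450 unit averages in place of `sample_ras`, the same call would place a 2011 unit within last year's distribution.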
OK, so far so good, but what does that mean in terms of overall performance within a year? Is a great eleven-game performance from a pitching staff indicative of a higher rate in reaching the post-season? Or is that simply random?
To test that, we gathered up all of those 450 sequential 11-game measures for each major league team in 2010 and broke them into four categories. (These are fixed RA bands, not quartiles in the strict sense: the groups are not equal in size.) The first category is 11-game units where the teams averaged 3 runs allowed (RA) or fewer per game. The second: 11-game units where the RA is 3.01-4. The third: 11-game units where the RA is 4.01-5. The fourth: where the RA is 5.01+.
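The four bands can be written down as a small lookup. This is only a sketch of the cutoffs given above; the function name `ra_category` is invented here, and the two sample inputs are the Red Sox figures quoted earlier.

```python
def ra_category(ra_per_game):
    """Map a unit's average runs allowed per game to categories 1-4."""
    if ra_per_game <= 3.0:
        return 1  # 3 runs or fewer per game
    if ra_per_game <= 4.0:
        return 2  # 3.01-4
    if ra_per_game <= 5.0:
        return 3  # 4.01-5
    return 4      # 5.01+


print(ra_category(6.5))  # Red Sox, games 1-11
print(ra_category(2.3))  # Red Sox, games 12-22 (so far)
```

By these cutoffs the Sox' first unit lands in the fourth category and their second in the first.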
We then sorted them by team and classified the teams into three groups: playoff teams, other teams at .500 or higher, and teams under .500. Each team has fifteen 11-game units (see above for the caveat), and as you can see, the World Champion Giants had the most 11-game units where they allowed 3 runs or fewer per game, with five.
All playoff teams had at least one such 11-game unit: as a whole, they accounted for more than 50% of those 11-game performances in 2010 (21 of 39). Such units represented 18% of their total number of 11-game groupings--a rate more than twice that of non-playoff .500+ teams, and more than four times that of teams with an overall losing record.
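The percentages are easy to check by hand. The sketch below assumes eight playoff teams in 2010 (four per league under the format then in use) and fifteen units per team; the counts 21 and 39 come from the text, and the variable names are just for illustration.

```python
PLAYOFF_TEAMS = 8      # assumption: 2010 playoff format, four teams per league
UNITS_PER_TEAM = 15    # fifteen 11-game units per team

dominant_playoff = 21  # units at 3 RA or fewer, playoff teams
dominant_all = 39      # units at 3 RA or fewer, all teams

playoff_units = PLAYOFF_TEAMS * UNITS_PER_TEAM
share_of_dominant = 100 * dominant_playoff / dominant_all
rate_for_playoff = 100 * dominant_playoff / playoff_units

print(playoff_units)                # total playoff-team units
print(round(share_of_dominant, 1))  # percent of all dominant units
print(round(rate_for_playoff, 1))   # percent of playoff-team units
```

That works out to 120 playoff-team units, about 53.8% of all the dominant units, and a rate of 17.5%, which rounds to the 18% quoted.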
In fact, what seems to distinguish playoff teams from their competitive also-rans is the ability of their pitchers to have at least some such high-performance streak during the year. Of the seven "also-ran" teams (.500+ but no playoffs), only two didn't have a single such high-performance unit in 2010: the Red Sox and the Blue Jays.
So the good news for the Sox is that, barring some kind of meltdown in game 22, their pitchers will have demonstrated an ability to string together a truly dominating performance cluster, something that seems to correlate well with reaching the post-season. Despite their agonizingly slow start, they've now done something that they weren't able to do all of last year.
There is a more comprehensive way to view all of the aspects of 11-game units (involving looking at the runs scored data), but that will have to wait just a bit. For now, let's just note that the Red Sox have actually done more than simply reverse course--they've signaled that they just might be ready to give all the other teams in the AL East a real run for their money in 2011.