Sunday, July 31, 2011


One of the grey areas still not well elaborated in baseball's statistical breakdowns is the issue of the quality of a team's part-timers and their contribution to offense. Forman et fils has a breakdown in their stat set for hitters that separates "Starters" from "Subs" (which, come to think of it, sounds more like a trendy post-modern menu than anything else), but it is actually breaking out the players who start the game vs. those who come in as replacements.

What we're looking for are the bench players who actually start some games and put up less than 50% of the plate appearances needed to qualify for the batting title.

How does this group hit as a whole? As "lesser players," they might (if we are lucky) give us some sense of where that elusive concept of "replacement value" actually resides.

To look at this in a quick and dirty way, we can query the Play Index to give us various lists of players with less than half of the current (2011) PA total necessary to qualify (not 502 right now, more like 350). Such a list is naturally imprecise and requires some scrubbing, because you have players like Ike Davis showing up on it because they were injured early enough in the season that they appear to be part-timers when they actually are not.
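The cutoff described here can be sketched as a simple filter. (A rough sketch only: the 3.1-PA-per-team-game qualifying rule is the standard batting-title threshold, and the game counts in the example are assumptions for illustration.)

```python
# Rough sketch of the part-timer filter described above.
# Assumption: the batting-title qualifying threshold is 3.1 PA
# per team game played (502 over a full 162-game season).

def qualifying_pa(team_games: int) -> float:
    """PA needed to date to qualify for the batting title."""
    return 3.1 * team_games

def is_part_timer(pa: int, team_games: int) -> bool:
    """A part-timer has less than 50% of the qualifying PA total."""
    return pa < 0.5 * qualifying_pa(team_games)

# Late July 2011: roughly 108 team games played, so the qualifying
# total is ~335 PA and the part-timer cutoff is ~167 PA.
print(is_part_timer(150, 108))  # a 150-PA player counts as a part-timer
```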

What we get, however, is still an interesting enough set of data for all 30 major league teams. Of course, there is a range of performance, some of which is due to the distortion mentioned above, but most of which is due to the actual distribution of part-time talent.

The percentage of plate appearances given to part-timers in 2011 is around 17%. This contrasts markedly with the percentage of plate appearances given to players who enter the game after it started, which is only about 5%.

We see a marked difference between teams. The range is from the Yankees at the lowest (just over 300 PA thus far) to the Padres at the highest (over 1100 PA).

One reason for measuring this right now is that the effects of mid-season "tactical trading" (just starting to hit as the deadline looms later today) will make accounting for all that much more problematic and time-consuming. (Hint to Forman: here's yet another type of stat breakout to put on the endless enhancement list.)

There's no sense that a team with bad part-time hitting is doomed to a losing season. The teams with the very worst part-time offense (Brewers and Angels) are still in the pennant race, though one would expect that having such poor bench performance isn't going to enhance their chances.

Ike Davis: not a part-timer, dammit!
Despite the potential glitches in the data (full disclosure: we didn't remove Ike Davis), it's clear that part-time hitters just ain't anywhere near as good as the regulars. That aggregate .629 OPS compares unfavorably to that achieved by the regulars, who are posting a .746 OPS. (Yes, the pitcher hitting data has been removed from that value: pitcher OPS in 2011 is .353.)

Using those two data points (.629 part time, .746 full time), we see that part-timers hit about 16% worse than full-timers. That's a value that ought to be able to enter into the ongoing discussion of the eternally murky concept of "replacement value." One question that needs to be examined before plugging this value into any discussion of that concept, however, is whether that value has differed in the past, when the offensive bench was larger and platooning strategies were more prevalent. It could be higher or lower as a result of how true platoon players get defined into the data set--if they have enough PAs to get over the "50% of 502 PA" cutoff, then they aren't technically "part-time," and the data set for the players we've examined here would be smaller (and presumably a good bit more offensively feeble).
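The 16% figure comes straight from the two aggregate OPS values quoted above; a quick check:

```python
# Part-time vs. full-time OPS gap, using the aggregate 2011 values above.
part_time_ops = 0.629
full_time_ops = 0.746  # regulars, with pitcher hitting removed

gap = (full_time_ops - part_time_ops) / full_time_ops
print(f"Part-timers hit {gap:.1%} worse than full-timers")  # ~15.7%
```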

We'll revisit this idea later on when time permits. The data exists, however, to measure this breakout over the long haul of baseball history, and there is some potential within it (if it is handled properly) to create a more comprehensive idea of what the basis for "replacement level" ought to be.

Tuesday, July 26, 2011


There is a lingering mysteriousness about losing streaks that transcends the simple, prosaic string of "L's" that accumulate ominously downward in the schedule column. This mystery, this strange allure, operates much in the same way that we are fascinated by those car wrecks that occasionally materialize on the side of the road. Despite the fact that the end result is plainly obvious, there is some odd fascination with imagining just what happened.

Baseball, with its longer season, has more opportunities for such strangely alluring pileups. And some seasons seem to have more of these odd clusters of futility than others (we looked at that a bit earlier this year, when the Florida Marlins took a precipitous tumble in June).

Now it's the Seattle Mariners. The M's have been the focus of much neo-sabe confuffulation over the past several seasons, beginning in 2009, when their purported "defensive strategy" led them to an 85-win season that was nearly ten games above what the ratio of their runs scored to runs allowed suggested.

In 2010, they fell apart and lost 101 games.

This year, with five starting pitchers performing well (Cy Young winner Felix Hernandez, the solid-but-often-injured Erik Bedard, Doug Fister, Jason Vargas, and rookie Michael Pineda), the M's edged into the pennant race in June. As of July 6, they were at .500 (43-43) and were only 2 1/2 games out of first place in the AL West.

While no one had really gotten on the M's bandwagon (their offense continued to be the most anemic in the AL), there were still some rumblings about the possible efficacy of General Manager Jack Zduriencik's "pitchin'n'defense" (not to be confused with "chicken'n'waffles") approach to "team building."

What happened on July 6? Well, the M's lost--and then proceeded to start a string where they played a series of good teams, including the four best teams in the AL East. Our chart, broken down by month and by record of opposition, provides a stark picture.

This is what's sometimes referred to as "finding one's level." The M's starting pitching had been good enough to keep the team in close games with good teams over the first three months of the season. They were within range of first place because the other two top contenders in the AL West (Rangers and Angels) were struggling.

Things sort out across a season, but they rarely do so with such a dramatic certainty as what's displayed above.

And what's the "why" for all this? It's pretty simple. The M's starters hit the wall. Over the past sixteen games, they have a 6.12 ERA. Erik Bedard got injured (again...someone won a Seattle-based lottery by picking the exact date that he landed on the disabled list) but his replacement, Blake (Bleak) Beavan, actually pitched better than any of his fellow M's starters over this stretch. Thanks (again) to David Pinto's Day-By-Day Database, we can look at a snapshot of the M's pitcher stats for this time frame.

Note that HR total. 23 HRs in 16 games. Doesn't seem like much, but if that pace were maintained for a season it would extrapolate to 233 for the year. That's a big part of the problem.
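The extrapolation is straightforward pace arithmetic:

```python
# Extrapolating the M's 16-game HR-allowed pace to a full 162-game season.
hr, games, season = 23, 16, 162
pace = hr / games * season
print(round(pace))  # 233
```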

What also gives pause is that the M's haven't really even been unlucky in the midst of this. Usually such losing streaks involve a cluster of one-run losses. The M's have only lost two one-run games in this stretch.

Now, with all that said, the fact still remains that the M's are still doing better than they did last year, and these pitchers are better than what they've shown in the past three weeks. It's not certain that Jack Z. has any magic formula, but he does understand that his ballpark rewards a good pitching staff if he puts an athletic defense behind them. The problem is that such an approach often leads to sub-par offense, and in such a scenario it only takes a short flame-out from the pitchers to create...well, just what you're seeing by the side of the road at this very moment.

Saturday, July 23, 2011


You will read that headline, and think that we have ingested some mind-altering (make that mind-destroying) substance.

Ron Swoboda, post-baseball, pulling a "Cal Worthington" at one of
Bud Selig's used car lots...
Ron Swoboda. A star? Clearly some kind of in-poor-taste joke served up by someone named Shirley. Ron Swoboda? The guy nicknamed "Rocky" (and this was well before Sly Stallone gave that name some cachet--"Rocky" was a nickname for the state of Ron's career). He was up there, shining in the firmament?

Clearly, booby-hatch time for Ye Olde Kinge of Vitriol.

But consider this chart of the best hitters in baseball (compiled via David Pinto's Day by Day Database). It captures about five-eighths of a season, across the time frame that spans from July 15, 1967 to May 10, 1968. It lists all the hitters in both leagues in descending order of OPS. We've color-coded the OPS ranges because...well, because that's what we do here.

There, on that list, sitting in the seventeenth position, is Ron Swoboda.

Yes, it's one of those small sample-size wonders. Yes, Rocky is the only real "mystery guest" (in retrospect) to be found in the top twenty hitters here.

But it was this period of time, and this performance, that caused a number of folks to think that the 23-year old Swoboda had a chance to be a solid major league slugger.

It's an interesting list for a few other reasons. We are dipping into the Valley of Death for hitters in the time frame represented here, so the stars in the game are putting up OPS values (and counting-stat totals) that look shockingly modest. Only one guy (Carl Yastrzemski) over 1.000 during the time frame; only four guys over .900; just four guys with more than 20 HR.

It was still possible to be a great hitter with an OPS driven by a high batting average (look at Curt Flood's performance: yes, it's a fluke for him, but the point is that someone could hit like that and be an elite hitter). That hasn't been possible in baseball for nearly twenty years.

There are also some guys here who are hitting a lot of triples. Roberto Clemente, Lou Brock, Vada Pinson. Of course they're not going to be putting up Chief Wilson-type triples totals, but the totals here indicate that the three-bagger was plentiful and possible enough to produce a solid swatch of hitters whose totals were at least in the teens.

So, now, you too can remember when Ron Swoboda was a star. 'Twas a Warholian moment, maybe an Andy-esque parsec, in fact. But there it was--a glimmer in the gloaming.

Wednesday, July 13, 2011


No doubt about it--five mini All-Star games in one night would have beaten the barely engaging game played last evening between a pair of teams who couldn't quite be bothered to put all of their best players on the field. Another manifestation of the cataclysmic moral vacuum that's existed in the Age of Budzilla...the escalating "you scratch my back" ethos that comes from selling one too many used cars.

But in the midst of it, we found ourselves gravitating over to the usual completist mania so often found at Forman et fils, where there is a page that lists the entire roster of All-Star batters and their stats. This list, as you'll see when you peruse it for yourself, contains 1577 players (though that should increase a bit sometime soon, as the 2011 All-Stars have yet to be added).

A nifty feature in that data is the number of All-Star games in which the player was selected to play (regardless of whether they started--there's a separate column for that, even niftier--or whether they even got into the game at all).

That got me wondering. All All-Stars are not created equal: from an off-and-on examination of the WAR tables in the Forman et fils listings, it's clear that some guys on the squad don't even come close to measuring up to the method's back-of-the-envelope rule of thumb that an All-Star should bring at least five wins above replacement to the park in order to be on the squad.

The only good jerk is a "knee jerk"...
But rather than list all of that discrepancy--though it's a tempting side-project, there's literally no time for such an effort in the foreseeable future--we thought it might be just as entertaining to look for what the title colorfully calls the "knee-jerk" All-Stars. (Insert your own "onanistic" reference here.)

These are the guys put on the All-Star team year after year, in a reflex action, kind of like a dog with fleas scratching himself.

It's doubtful that there are many "knee-jerk" All-Stars to be found in present-day baseball. There are way too many teams that need to be represented for this to happen--but back when there were only 16-20 teams, it's possible that a certain class of player (probably over on the left side of the defensive spectrum) who'd established a name for himself would get named as the "extra player" at a particular position simply due to name recognition.

How do we measure this? Simple. We take a player's number of All-Star games, and divide it into his career WAR figure. Given that few players are good enough to make the squad every year, the value that's created--career WAR per number of All-Star Game selections--is still generous enough to produce a robust number.
Mel Ott: finally, a "knee jerk" that actually works!!

Unless, of course, you are a "knee-jerk" All-Star.

The type of player we're looking for is someone who has very few actual starts in the ASG, but gets named to a lot of ASG squads. We're talking position players here, of course, as pitchers just don't get the nod to start the ASG all that often.

Looking at that data, though, it's kind of surprising to note that Mel Ott actually started only four ASG out of his total of twelve ASG appearances. In the case of Frank Robinson (just 6 starts out of 14 times on the squad) this is understandable--Willie Mays and Hank Aaron were dominating the starting lineup--but just who was keeping Ott on the bench? We'll have to look that up sometime.

Anyway--that formula for knee-jerkiness again is: (Career WAR/# of ASG squads). As a fun benchmark, Yogi Berra (61.8/18) grades out at 3.43--but the Yoge's total is distorted a bit by the fact that baseball had two All-Star games in the same year for a while there. (Proof that Budzilla does not have a stranglehold on all the questionable ideas....)
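The formula is simple enough to compute for any player. (A minimal sketch; the career WAR and ASG-selection figures below are the ones quoted in this post, not freshly looked up.)

```python
# "Knee-Jerk Average": career WAR divided by All-Star squad selections.
# WAR and ASG figures are the ones quoted in this post.

def kja(career_war: float, asg_squads: int) -> float:
    return career_war / asg_squads

players = {
    "Yogi Berra":    (61.8, 18),
    "George Kell":   (33.6, 10),
    "Nellie Fox":    (44.4, 15),
    "Don Kessinger": (5.0, 6),
}
for name, (war, asg) in players.items():
    print(f"{name}: {kja(war, asg):.2f}")
```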

Let's look at some of the players that we find below Yogi's KJA ("Knee-Jerk Average").

GEORGE KELL 33.6/10 = 3.36

One drawback at Forman et fils is that while they tell you how many games a player started in the ASG, there's no way to tell which games they are. Still, we'll report a few facts associated with this data as best we can.

Kell started six games. His seasonal WAR for the years he was selected to the ASG squad: 4.2, 0.6, 5.0, 4.4, 3.1, 2.4, 3.5, 0.3, 1.3, 1.3.

That's four out of ten years where his WAR was below 1.5, three in a row at the tail end of his career.

Can a Hall of Famer be a "knee-jerk king"?

Fear not, we are going much, much lower.

NELLIE FOX 44.4/15 = 2.96

To be fair, Nellie wouldn't be on this list if not for the ASG double-whammy in the late 50s/early 60s. But he racked up two ASG appearances in '61 with a -0.4 WAR for the season. So let's consider him an honorary member.

BILL MAZEROSKI 26.9/10 = 2.69
See what we meant about the left side of the defensive spectrum? Maz is another guy whose totals are inflated by those extra ASGs, but he did have an All-Star appearance in a year (1959) when his total WAR was -0.5.

GEORGE McQUINN 18.6/7 = 2.66

Our first first baseman. Good player, but got a couple of ASG nods as a token pick for the Browns (1940, 1942, and even 1944, the year the Browns won). His selection in 1948 (after his trade to the Yanks the year before) was odd: he was hot in May but began to fade sharply and by the ASG he was in free-fall: he was benched shortly after the ASG, never recovered his old form, and was released at the end of the year.

HARVEY KUENN 24.3/10 = 2.43

Two extra ASG here as well, but WAR really doesn't like Kuenn's defense. In his ROY season in '53, his defense at short is purportedly so poor it almost cancels out his hitting. In 1957 and 1958, however, Kuenn was in KJ territory, averaging 1.5 WAR over those two years and plunked on the squad anyway.

DEL CRANDALL 26.7/11 = 2.43

Again, Del's average is deflated for the reason already noted. In his case, though, it's clear that despite just middling offensive production in a number of his ASG years, he was simply the best hitter at his position anyway, as evidenced by the fact that he was the starter in eight of those games.

FRANKIE HAYES 14.2/6 = 2.37

Back on the ASG in 1944 and 1946, for no discernible reason...except for the Crandall scenario above. That would explain '44, but not '46.

ELSTON HOWARD 28.2/12 = 2.35

Ellie's first year on an ASG squad is inexplicable unless there was an injury--he's got a whopping -0.9 WAR to show for that year (1957). 1958 was a legit year, but then he was grandfathered in for the next two years despite mediocre totals. Things stay kosher from 1961-64, but in '65 he's on the squad despite hitting just .221 in the first half of the year and winding up at 0.7 WAR for the year.

"No, no, Thurman, I don't want to come fly with you..."
Interestingly, despite having a number of really good years, Howard only started once out of the twelve times he was named to the squad. Another research project to see who was keeping him on the bench in those years.

Fear not, we're not close to the bottom of the barrel...

SANDY ALOMAR 13.2/6 = 2.20

There's a noticeable pattern in catchers--they have a good hitting year or two and they seem to get lumped into the ASG pool automatically no matter what they hit. Oddly, though, Sandy's first ASG selection came in a year when he didn't hit all that well--it wasn't until the next year that he actually got his act together with the bat (though that didn't last, fitting in with the pattern).

Malzone: poser or poseur? Dust or lint? Wop or
spaghetti-bender?? Epithet or slur???
FRANK MALZONE 14.5/8 = 1.81

Malzone made his first ASG in '57 because of a hot first half (he hit .327), and he had a bit of a run as the top third baseman because Brooks Robinson was not quite ready for his close-up. Odd fact: in 1963, Malzone was the starter at third base, and he batted cleanup in the ASG. Some of these starting lineups for the ASG (check out the 1963 box score) will definitely give you pause.

Now to the really good stuff....

DON KESSINGER 5.0/6 = 0.83

"You mean I'm supposed to take the
donut off before I hit??"
Kessinger played in six All-Star games. Selected for the first time in 1968, he started for the NL despite winding up with a 67 OPS+. He actually had a good year in '69 and was the starter again. After that they just kept putting him on the team, despite several years with negative WAR totals. This is where a look at a player's WAR during the actual years he was on the ASG might be of some contextual value...but the key word there is "might," as Don's total WAR for those six years is only 8.2. Clearly he had some very bad years, including a few where he didn't make the All-Star team...

At least it's not on velvet...
And, finally, the "Knee-Jerk" king hisself....


Bobby is actually lowered by that dueling ASG thang, but we won't let that get in our way. Bobby had two passable hitting seasons which accounted for all of his career WAR--the rest of his career grades out almost exactly at what those of us with a callous heart and a penchant for arcane jargon like to call "replacement level." How Bobby got on the 1957 All-Star team is probably something for a mystery novelist to tackle--or a French farceur. But it all seems to be strangely related to Elston Howard...

Is a knee-jerk reaction
positive...or negative?
Meh, it was probably Dr. Norman Vincent Peale stuffing the ballot box...

...but the good news is that Bobby only started one ASG. It happened at the New York World's Fair--actually, it happened in Shea Stadium, on July 7, 1964. And, to show you how much positivity had worn off on Ralph Houk after so much exposure to the pious soon-to-be-Reverend Richardson, the Major actually batted Bobby seventh, ahead of Elston Howard in the AL batting order.

Hey, Bobby got a hit...and Elston didn't.

Sunday, July 10, 2011


Marc Carig, typing away for the Newark Star-Ledger, is trying to get down with the "sabermetric revolution." He really is.

But he's trying way too hard. His column this morning suggests that Moneyball (that goes-good-with-everything fashion accessory masquerading as deep thinking) has raised the level of "walk-consciousness" through the roof, and is encroaching on the possibility that anyone will ever reach the 3,000-hit milestone ever again (now that Nate Silver's backgammon partner, Derek Jeter, has crossed over).

What do they put in the water back on the East Coast, anyway? Or is it that Marc is just rushing to be "hep" (as they used to say before the invention of be-bop)?? A glance at the three-year averages for walks per game over the past hundred or so years will rapidly point out that walk totals were a good bit higher in the 1990s, well before Moneyball was a gleam in the eye of Michael Lewis's bank account.

(And, of course, if only Carig could have been there in 1949 to see what real walking was all about--especially in the American League. Bill James, showing admirable restraint when characterizing the style of play embodied by late 1940s baseball--"the baseball of the ticking time bomb"--didn't jump on the bandwagon and declare the short-lived walk mania in this time frame a "golden age" of baseball. No, a different set of New York-based sportswriters did that, mostly because the teams they wrote about were dominating the game to an extent that was unprecedented.)

Carig would have clearly seen the demise of the 3,000 hit player in 1949, when walks really were catching up to hits. Our pumpkin-colored area chart (not meant to be any kind of veiled reference to the consistency of the matter inside Carig's skull...) demonstrates that the hit-to-walk ratio has been a good bit more consistent over the past fifty years once the walk spike ran its course. And the chart also demonstrates that we are in no danger of walks encroaching on hits (even as hits become a bit more scarce overall, as batting averages experience a downturn over the past several years).

Of course, if we'd been looking at this trend in 1949, we might have jumped to the same conclusion that Carig has--and the direction of the chart at that time would certainly have seemed to support such an interpretation. Such a conclusion would have been proven wrong by subsequent events--strike zone adjustments, the rise of power pitching, etc. Baseball history suggests that an extreme condition will eventually relent and move back toward the center (though this may never happen with some features in the game, such as complete games).

We might have thought in 1949 that there could never be a 3,000-hit player again, what with the relative decline in hits and the ascending importance of walks. As our chart of 3,000 hit players shows (above), there was a curious lull in the lively ball era, a kind of paradox where rising batting averages produced fewer 3,000 hit careers. Despite a rise in the number of players with 1600+ hits in their career through the age of 31, no one was crashing through to 3,000 hits any more. Some of this was exacerbated by World War II, of course. But at the end of the 40s, only one of 34 such players (1600+ H at age 31) had reached the 3,000 figure--Paul Waner.

And though Stan Musial broke through in the 1950s, the incidence of 3,000-hit players remained scarce (only 5% made it out of what we might call the "eligible population").

Suddenly in the sixties, however, this all changed. Over the next four decades, eighteen players joined the 3,000 hit club. The percentage of the eligible population making it to 3,000 hits reached its all-time high at 28%.

Now, interestingly enough, we are at a point where a lot of eligible players made the list in the last decade (17 players had 1600+ hits by the age of 31), but their follow-through, like those players in the 1920s and 1930s, has not been good at all. Whatever the reason for this, one thing is clear: it has absolutely nothing to do with drawing too many walks.

The summary totals by number of hits by age 31 tell an interesting tale in themselves. Players with 1,900 or more hits by that age have three times the chance of reaching 3,000 hits (24%) as those with fewer than 1,900 hits (just 8%).

Some trivia questions--who is the player with 3000+ hits who had the fewest number of career hits at the age of 31? How many more hitters are out there who got 3000+ hits but who had less than 1600 hits by the age of 31?

Answer to question one: Cap Anson had only 1290 hits at the age of 31.

Four other players had less than 1600 hits at the age of 31 and made the 3000 club: Paul Molitor (1557), Dave Winfield (1568), Honus Wagner (1576), and Wade Boggs (1597). You could make a pretty good infield out of that squad (yes, Winfield played eight games at 1B!).

To get back to our indiscriminate bashing, let's conclude by noting that Marc Carig is no historian. If he were, he might understand that the adjustments made in the game of baseball in the 1960s helped a series of players with less clearly defined superstar skills (read: power) develop long, productive careers and add to the ranks of 3,000 hitters. Players such as Lou Brock, Pete Rose, Paul Molitor, Tony Gwynn, and Rickey Henderson.

While folks will inevitably carp about how a singles
hitter can be the "best" in the game,  it wouldn't
hurt to have a time frame where someone like
Rod Carew could legitimately be considered as
"the best"--baseball needs that variety...
There's a chance that this type of player will develop over the next twenty years. With the game moving toward a greater reliance on balanced hitting after a surfeit of home runs, "scientific singles hitters" may yet reemerge. Of course, they are not any more valuable than power hitters with 30-40 fewer hits per season (more walks and greater isolated power), but if they reemerge, they will start racking up 200+ hit seasons.

And that is the surest way to produce more 3,000 hit players.

Exciting "middle way" players have become exceedingly scarce: it's time they made a comeback. These guys have been more than just popular--they have been downright lovable (OK, not Pete--but he's the exception that proves the rule). A little of that quality can go a long way. Paging the next Rod Carew...

Thursday, July 7, 2011


'Twas exactly three years ago today that we tossed up the Five All-Star Games In One! idea over at the Hardball Times. It made a rather imperceptible "thud" on the canyon floor below when it landed, but it's just a fun idea that would make a lot more sense than the present approach (at least to this aberrant mind).
Beware, gentle reader: all this verbiage is yet another
malevolent Malcolm plot to get "Mighty Maicer" Izturis
on the All-Star team.

If you read the article, you'll know the concept, but let's do a quick run-through here for those who are link-averse. Basically, All-Star teams are voted on by division, so there are three squads for each league that play in three-inning games. The two divisions with the best overall won-loss records in their leagues get a bye in the first round: this year, that would be the two East divisions.

So we'd start AL West-NL Central, NL West-AL Central. The winners there would move on to play the East squads; the winners there would go to the championship round.

Yes, it's possible that two teams representing different divisions in the same league will face each other; it adds some spice. You can still keep Budzilla's inane "winning league gets home-field advantage in the World Series" rule.
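The bracket sketched above works out to exactly five three-inning games; a quick enumeration (the pairings are the 2011 ones proposed in this post):

```python
# The five-game divisional All-Star bracket described above.
round1 = [("AL West", "NL Central"), ("NL West", "AL Central")]
byes = ["AL East", "NL East"]  # best records in each league get a bye

# Round 1: 2 games; Round 2: the two winners face the two East
# squads (2 games); Final: 1 championship game. Five mini-games.
total_games = len(round1) + len(byes) + 1
print(total_games)  # 5
```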

THE two big reasons for this approach (and we still see it that way three years beyond the mast...) are 1) too many deserving folk get left behind under the current system and 2) this is a helluva lot more fun for the fans, who essentially get five mini-All Star games in one. We wrote about this in greater detail in the earlier essay--all of what was discussed there still holds.

So here is a quickly-eyeballed look at the Divisional-based squads. (The players in each league who were elected or named to the league-based All-Star teams are shown in red.) We wind up with twice as many All-Stars, and while lots of people will knee-jerk that this is "watering down" the idea of All-Star, I submit that the 25-man roster concept first applied to a game with 16 major league teams; to keep the proportions the same in the present day, there should be at least 50 on the team. A 50-player team is not feasible, but three divisionally-based squads of 18-19 players will work, and it's not that much of a stretch quality-wise.

Keep in mind that the operative phrase here is "quickly-eyeballed." There will unquestionably be quibbles available to those who wish to do so...I simply picked players based on various statistical shortcuts accessible at Forman et fils.

One objection to this approach has to do with the unequal number of teams in divisions (AL West, NL Central). We'll just have to work around that for now.

Sergio Romo--K'ing 13 per nine innings, and his
beard doesn't look like a dead squirrel painted black...
The AL Central poses an interesting question for the strategies possible in three-inning All-Star "mini-games." Due to the relative paucity of quality starting pitchers in the division, it's possible to load the team up with a group of relievers (as has been done above, before the "objection, your honor!" cadre rears its tousled head when seeing non-SPs in the SP columns...) and play more of a one-inning-at-a-time strategy. 

Over in the NL, it's clear that Bruce Bochy loaded up his squad with players from his own team (a time-honored but world-wearying tradition), but jeez, Bruce--Brian Wilson over Sergio Romo? Are you watching your own games?? And as much as we all love Tim Lincecum, he's just not having an All-Star season, so we need to make room for some of those Arizona pitchers, whether we want to or not.

It would be a hell of a lot of fun for the fans and the players if the All-Star game got turned into a five-ring circus.

Ye Olde "local/global" juxtaposition-continuum applies here: we don't have anything in the game that gives divisions a sense of shared identity outside of being "buckets" in which teams have been semi-arbitrarily placed in order to create a framework for deciding who goes into the post-season.

Why not create a niche in the game where divisions have at least a bit more meaning than that?

Do it my way, see? And you can have five pieces of cake and eat them all in one night. Maybe you'll have the grand-daddy of all stomach aches the next day--but O what a night it will have been. 

Sunday, July 3, 2011


Craig Robinson, the British graphic genius whose cheeky charts at Flip-Flop Fly Ball have won him deserved acclaim, hit the "big time" the other day as the New York Times welcomed him into their pages just a few days before the book version of his web site is published (officially available on Tuesday, July 5).

How disappointing, however, that one of his least useful diagrams in terms of actual analysis was chosen by the Times to represent his work.

Now don't start thinking that we are trying to downgrade the graphic talents of Mr. Robinson. His chart asks a specific question and answers it with a stylish, imaginative design.

The answer to his question "How often does the most expensive franchise win the World Series" can be gleaned with relative ease from the chart.

But there is so much more information available in the chart that it's a cryin' shame we have no way of accessing it for the rest of the insights that can be found in it.

Oddly, the very thing that makes this chart such a good design makes it virtually impossible to use it for purposes of examining the overall correlation of team payroll and appearance in the post-season.

Fortunately, Mr. Robinson provides us with the information so that we can do our own digging. Here are some of the questions that can be answered by "repurposing" the data he's seen fit to include in his diagram...

We already know from the chart that the most expensive franchise has reached the World Series six times in the last sixteen years (38%) and has won four times (25%).

But what about something like: 1) How often do teams who are in the top 50% of payroll reach the post-season? 2) How often have teams in the bottom 50% of payroll appeared in the World Series? 3) How many times have such teams won the World Series?

These questions probe below the "sound bite" that Mr. Robinson has focused upon. They might tell us a good bit more about the shape of "competitive balance."

The answers: 1) Of the teams that have reached the post-season from 1995-2010, 73% were in the top 50% of payroll (94 out of 128). 2) Of teams in the bottom 50% of payroll, four have reached the World Series (three of them in the past four years). 3) Only one "bottom 50%" team has won the World Series over this time frame: the 2003 Florida Marlins.

So, since 1995, 93% of all World Series have been won by teams in the top half of team payroll.
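For those who want to check the arithmetic, here's a quick sanity-check sketch. The counts are the ones quoted above (16 seasons from 1995-2010, 8 playoff berths per year, 94 top-half berths, and a single bottom-half champion); the `pct` helper is just for illustration.

```python
def pct(part, whole):
    """Share of `whole` represented by `part`, as a rounded percentage."""
    return round(100 * part / whole, 1)

playoff_slots = 16 * 8          # 128 post-season berths, 1995-2010

print(pct(94, playoff_slots))   # top-half-payroll share of berths -> 73.4
print(pct(6, 16))               # most expensive franchise reaches WS -> 37.5
print(pct(4, 16))               # most expensive franchise wins WS -> 25.0
print(pct(16 - 1, 16))          # top-half-payroll WS winners -> 93.8
```

The last figure (15 of 16 champions) is what gets rounded down to the "93%" quoted above.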

What if we break this down into thirds--convenient enough in this scenario since we get ten teams in each third--and look at post-season/payroll again?

Of the overall post-season population, 59% comes from teams in the top third (top ten) of team payroll, 31% from the second third (slots 11-20), and 10% from the bottom third (slots 21-30).

Another thing we can't glean from a glance at Robinson's graph is how that data may have shifted over time. (No way to eyeball all that in this design--you'd need to take a much different visual approach to do so.)

Turns out we can use a fairly straightforward bar chart to see how this data is changing over time...when we break this data into three time frames (1995-99, 2000-05, and 2006-10), we see that teams in the lower two-thirds of payroll have been making strides over the past ten years. The 70% stranglehold on the playoffs held by top-third payroll teams in 1995-99 was down to just over 50% in the past five years.

So here is some tangible evidence that something is in the air (whether it is the murky mythos of Moneyball or something else entirely) that is giving lower-payroll teams a way into the playoff picture.

Another question that we can't answer from Robinson's chart but that flows from the data he's provided is the percentage of times that the top payroll teams have actually made it into the World Series...though this question shades a bit more toward the "alluring trivia" that dominates Robinson's work, since it doesn't address "competitive balance shift" questions in a truly substantive way. It's still an interesting comparison, however, if only for how it stacks up against Billy Beane's famous mantra ("the playoffs are a crapshoot").

Teams in the top third of payroll have produced 67% of the World Series teams from 1995-2010. However, that figure was 90% from 1995-99, and was only 40% over the past five years.

That means that the playoffs are more of a crapshoot than ever, at least as related to payroll issues.

Has sabermetrics been part of this shift? Trying to measure that beyond a few sets of percentages is an extremely daunting proposition. The number of years being studied isn't sufficient to draw any conclusions. The fact that more low-payroll teams are making the post-season is probably a reflection of two factors: more objective application of statistical information (in whatever format) and greater uniformity in the population of high-payroll teams. The greater the concentration of payroll in specific divisions, the greater the chance for low-payroll breakthrough.

Final question--how often have the teams in the Top 10% of payroll (teams 1-2-3) made it to the post-season? The overall average from 1995-2010 is 56%, but again, that figure has dropped over time. In 1995-99 it was 75%; in 2000-05 it was 62%; in the last five years (despite the Yankees and the Red Sox) it was only 31%.