Monday, March 18, 2013


OK, really swamped with non-hardball projects of late, so March has done more than half its lionizing thing without us...thus we'll give you something big, bold, brawny and arduous (you can pick up the internal association therein without our usual application of the sledgehammer, n'est-ce pas?)

We were struck recently (just a glancing blow, fortunately) by an off-hand remark about Nolan Ryan. The utterer attributed it to Bill James, though we've not had time to verify the reference. (No matter: despite what might otherwise seem to be the case, there's no intent in this essay to "bash" anyone...just making that clear in case some erstwhile Jack Kruschen-type alerts Rob Neyer to the goings-on here.)

That remark, boiled down to its most prosaic formulation, suggested that when Ryan had his control he was unbeatable, and when he didn't he wasn't. Sounds like good advice (but that didn't stop Linda Ronstadt from loving some sweet-talking heartbreaker in what seemed like an endless series of lachrymose ballads...) as well as a large dollop of early sabermetric common sense.

There were numbers in the formulation that went something like this: 5+ walks in a game, struggle; 3 or fewer walks, unbeatable. But as with many of the binary formulations that informed early efforts (and that are still, how shall we say..."psychologically influential" even today) to deconstruct baseball statistics, this one doesn't really stand up to a full sniff test.

But it does lead us into some interesting areas that aren't quite as settled as the heterodox orthodoxy would have us believe. (Cue up the music, it's QMAX time again!)

It turns out that when we break out Ryan's career starts (all 773 of them, available one by one at Forman et fil), the idea that he was unbeatable when he had better control is roughly three-fourths true.

The data is all here, compiled into your basic counting stats, well-known rate stats (ERA, H/9), and then some variously recondite calculations (the basic QMAX "S" and "C" values; QMAX's ERA predictor--named QERA; and, finally, the fabled FIP value). A lot of intriguing stuff here, so let's get right to it.

First, there's the raw interest in knowing exactly how many games he had with various numbers of walks allowed--Ryan has been retired for twenty years now, but he's still one of the most indelible presences on the mound (even if he's nowhere near the level of the all-time greats).

It's amazing to find out that he had 232 starts in which he walked five or more batters; that he had only 27 starts (just 3%) where he didn't walk anyone. And it's very interesting to note that his ERA in games where he walked five batters isn't all that different from his ERA in games where he walked only three.

In fact, it's downright weird to discover that Ryan's ERA in games where he walked five or more batters is lower than it is in games where he walks three or four batters.
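For anyone who wants to run this sort of walks-per-start breakout on a game log of their own, here is a minimal Python sketch. The tuple layout, the function name `era_by_walk_bucket`, and the five sample starts are all made up for illustration; they are not Ryan's actual lines.

```python
from collections import defaultdict

def era_by_walk_bucket(starts):
    """Group starts by walks allowed and compute ERA for each group.

    `starts` is an iterable of (walks, earned_runs, outs_recorded) tuples,
    a hypothetical layout chosen for this sketch. Starts with five or more
    walks are pooled into a single "5+" bucket.
    """
    totals = defaultdict(lambda: [0, 0])  # bucket -> [earned runs, outs]
    for walks, earned_runs, outs in starts:
        bucket = "5+" if walks >= 5 else str(walks)
        totals[bucket][0] += earned_runs
        totals[bucket][1] += outs
    # ERA = 9 * ER / IP, and IP = outs / 3, so ERA = 27 * ER / outs.
    return {b: 27.0 * er / outs for b, (er, outs) in totals.items() if outs > 0}

# Invented sample starts: (walks, earned runs, outs recorded).
sample_starts = [(0, 1, 27), (3, 4, 21), (3, 2, 24), (6, 2, 27), (5, 1, 24)]
sample_eras = era_by_walk_bucket(sample_starts)
for bucket in sorted(sample_eras):
    print(bucket, round(sample_eras[bucket], 2))
```

Summing earned runs and outs before dividing gives the ERA of the whole bucket, which is not the same thing as averaging the ERA of each individual start.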

And it's very interesting to note the divergences in Ryan's QERA and FIP through this sequence of breakouts.

Now, we know (even before it's pointed out to us by our super-modeling brethren...) that QERA and FIP aren't attempting to measure the same thing. But the predictive qualities that are claimed for FIP (that its reliance only on the so-called "three true outcomes" to fashion a massaged model of ERA is a truer picture of future performance than anything else) run into a few thorny issues when we look at how it handles clusters of starts where the pitcher has high walks and low hits.
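To make the high-walks, low-hits issue concrete, here is a sketch of FIP in its commonly published form. The additive constant is recalibrated every season so that league FIP matches league ERA; the 3.10 default below is a placeholder assumption, not the value behind any number in this essay, and both sample lines are invented.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching, in its commonly published form.

    FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + constant

    `constant` is a league/season scaling term; 3.10 is only a
    placeholder assumption for this sketch.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Two invented nine-inning lines with identical HR, BB, HBP and K.
# Imagine one is a two-hitter and the other a ten-hitter: hits allowed
# never enter the formula, so FIP cannot tell them apart.
two_hitter_fip = fip(hr=1, bb=6, hbp=0, k=10, ip=9.0)
ten_hitter_fip = fip(hr=1, bb=6, hbp=0, k=10, ip=9.0)
print(round(two_hitter_fip, 2), round(ten_hitter_fip, 2))
```

Nothing in the function takes hits as an input, which is the design choice that comes under strain in the high-walk, low-hit starts.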

And there's no better place to examine that discrepancy between the predicted and the actual than in the upper right corner of the QMAX chart (as you'll see in the many matrix breakouts that show the shape of Ryan's start distributions by the number of walks/game).

That is what we've taken to calling the "power precipice": the area where pitchers give up a good bit fewer hits than the league average per nine innings, and a good bit more walks than the league average per nine innings. It is a region where the level of success is shockingly close to what pitchers achieve in the upper left corner of the QMAX chart, where they have similar success at hit prevention and have lower than average walks per nine.

Nolan Ryan may be the king of the power precipice: we'd be extremely surprised if there is any pitcher (other than possibly Bob Feller) who has more starts in that region. His total of 184 "power precipice" starts represents just under one-fourth of his career total (24%, to be exact). That number encompasses five seasons' worth of starts.

In those games, Ryan's won-loss record is 101-57. (OK, you don't like won-loss records.) His ERA is 1.87--and this is happening in games where he's walking an average of nearly six-and-a-half men per nine innings! He's allowing just over four hits per nine innings (which works out to about three-and-a-half hits per actual start, since his starts in these games last about seven-and-a-half innings).
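The per-nine to per-start conversion in that parenthetical is simple arithmetic, sketched here with rounded inputs. The 4.0 and 6.5 are stand-ins for "just over four" and "nearly six-and-a-half", so the results are ballpark figures rather than the exact values behind the chart.

```python
def per_start(rate_per_nine, innings_per_start):
    """Convert a per-nine-innings rate into an average per start."""
    return rate_per_nine * innings_per_start / 9.0

# Rounded stand-ins for the figures quoted above (assumptions, not exact data).
hits_per_start = per_start(4.0, 7.5)
walks_per_start = per_start(6.5, 7.5)
print(round(hits_per_start, 2), round(walks_per_start, 2))
```

With hits at exactly 4.0 per nine the result is a little under three-and-a-half per start; a rate "just over four" nudges it up toward that figure. The same conversion puts the walks at better than five per start.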

FIP's assumption that the variability in hits on balls in play is low enough to simply ignore the extremes in performance creates a situation where the method predicts that Ryan's ERA will be nearly 90% higher than what it actually is in these games (referring back to the big chart above: 3.55 vs. 1.87).

QERA suggests that Ryan has gotten some breaks in terms of what that ERA ought to be as well, but it's nowhere near that divergent. This is because QERA, using the QMAX "S" and "C" values to calibrate the relative importance of hit prevention and walk prevention, does not throw out ninety percent of the hits based on a modeling assumption or sixty percent of the outs that involve a fielding play.

FIP makes an assumption about how baseball works and applies it monolithically to a model that suggests that the weighted average of the "true outcome" events is sufficient to characterize quality. That is not without some value, but it's clear that certain combinations of those events produce serious discrepancies with the actual results in the games where those event combinations occur.

That doesn't completely invalidate it, but it points out that these mega-modeling methods are not nearly as robust or as granular as they have been claimed to be.

The chain of QMAX charts that have been running down the right side for a while now gives us a glimpse as to how the shape of performance is distributed across walks/game. Ryan proves a general rule that has been jettisoned in the FIP concept: the more walks a team draws, the fewer hits they will make. (Of course, there are clearly exceptions in individual games; but the available data for this is now vast and as you move rightward on the QMAX grid--even in those regions where the pitcher is being hit hard, in the 5, 6, 7 "S" areas--the hits/9 IP declines.)

Ryan has one anomaly in his data: the 7BB/G group, where his H/9 rises. But the rest of the progression, once you get past the very small number of starts where he allows no walks at all, is linear.

The QMAX range summaries tell us a bit more in this regard. Note, for example, how consistent Ryan's "top hit prevention" (the S12 rows from left to right across the QMAX chart) is all the way across the walks/game spectrum. The fact that he gets into that range--even if it's over on the right of the QMAX diagram (and above you can see the rightward drift as his walks/game rises)--is what allows him to remain a successful pitcher even when he is having major control problems.

Note that Ryan is hit hardest when he walks three men in a start (24%). Note that his Power Precipice percentage jumps sharply in the 3-5 walks/game range--it increases nearly fivefold.

This is why Ryan's ERA in games with five or more walks per start is not dramatically different from his ERA in games where he walks three or fewer per game (3.29 to 3.03). FIP predicts those ERAs to be a lot farther apart (4.15 to 2.85).
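The two FIP-versus-actual comparisons quoted in this essay can be put side by side with a few lines of arithmetic. The figures are the ones given in the text; the helper name `pct_higher` is just for this sketch.

```python
def pct_higher(predicted, actual):
    """How much higher `predicted` is than `actual`, as a percentage."""
    return 100.0 * (predicted - actual) / actual

# Power precipice starts: FIP 3.55 vs. actual ERA 1.87.
precipice_gap = pct_higher(3.55, 1.87)

# Five-or-more walks vs. three-or-fewer walks, actual ERA and FIP.
actual_spread = 3.29 - 3.03
fip_spread = 4.15 - 2.85

print(f"FIP overshoots power precipice ERA by {precipice_gap:.0f}%")
print(f"actual ERA spread: {actual_spread:.2f} runs")
print(f"FIP spread: {fip_spread:.2f} runs")
```

The actual spread is about a quarter of a run, while FIP's is 1.30 runs, roughly five times as wide.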

Finally, here are the ERA values for each cell on Ryan's QMAX chart. (This is for his entire career, all 773 starts.) You see how there's a strong tendency for him to sustain success in the upper right corner, while he struggles more in the middling regions of the chart. He is clearly a below-average pitcher when he pitches in the outer reaches of the success square (the 3,3-3,4-4,2-4,3 area): pitchers with less "stuff" and more "control" have less of a sharp break there. Also, he's only intermittently successful in the "Tommy John" region (the one at lower left, where control pitchers manage to thrive despite giving up more hits than innings pitched). When he's giving up hits, he's really in trouble, even in those areas of the chart where most other pitchers manage to be successful.

The other sharp break here is between "2S" and "3S." Ryan fades as badly in the rightward movement across the "3S" and "4S" zones as anyone we've seen.

So what's clear from all this? Pitchers like Ryan, who are hard to hit and who have an intermittent kind of wildness, can be just about as successful in the outer reaches of wildness as they are in their more conventionally "great" performances (the ones in the yellow four squares at top left, the region we call the "Elite Square"). Once Ryan moves into more conventional hit/game regions, he becomes much less effective when his control deserts him--much more like a normal pitcher. One of these days we'll put the FIP values up for a chart like this--and that might turn out to be the best way to spot-check the value of that highly-ballyhooed stat: we just might find that there's a pattern to what it predicts well in terms of actual results, and what it does not. Stay tuned.