Thursday, July 19, 2018

JUST WHEN YOU THOUGHT YOU WERE SAFE: Q&D QMAX IS COMIN' TO GETCHA!

QMAX and FIP are odd stepbrothers, working overlapping areas of analysis from diametrically opposed rationales. FIP's strength came from two things:

1) a huge push from the "fielding-independent" fad that took over sabermetrics, pushing it headlong into its first and most intense "neo" phase (prior to its series of calcifications that have left analysis marooned in a cul-de-sac);

2) the ability to be calculated quickly and easily without recourse to granular data.

QMAX is a better tool for starting pitchers (it was never meant to address relievers), but: it began with a counterintuitive perspective that grated on the 1990s community--it refused (still does, actually) to work with runs and run adjustment to generate its suite of stats, including the base measures (hit and walk prevention values, and a probabilistic QMAX winning percentage--abbreviated QWP, which if we say so ourselves is witty as hell--or at least Hieronymus Bosch's depiction of hell, a place where selected neo-sabes (they know who they are...) and a certain Orange Menace may yet spend eternity).

It was hampered by the fact that you had to calculate and adjust the grades by individual game, and while a math whiz like Sean Forman could automate that (and did, way back in the primordial ooze of what became Forman et fils--yeah, yeah: baseball-reference.com), it could not be done by the so-called "littery man" of baseball analysis.

That made QMAX difficult to transmit to the masses--not that said masses were exactly clamoring for it, mind you.

But now...that's changed. One evening after watching Fille du Diable, the French noir we'll be showing in San Francisco a week from tonight (featuring "Goth girl" Andrée Clément in a simply astonishing performance), we had something akin to an epileptic seizure that led to a sweaty breakthrough regarding how to calculate QMAX with just basic adjustments of mainstream rate stats. (Or did the sweaty breakthrough lead to the seizure? As the soothsayer sayeth: pick a card, any card...)

So here's how it works. (We provide some pitching leaders over the past two months, with their stats from the May 15-July 15 snapshot, again courtesy of David Pinto's Day-by-Day database.) Our first table (above) shows the basic data for the eleven pitchers with the best QWPs for the time frame in question. We see ERA, K/9, BB/9, and David's concoction, Cy Young Points (a fun measure, which we abbreviate CYP, which--of course--rhymes with QWP).

We added a rate stat version for David's stat just to "double down" on the fun (and won't we all be glad when we never hear that term again...) which shows that while Trevor Bauer had a great run over the past couple months according to CYP, he's not the leader in CYP/9 (contrary to popular belief, CYP/9 is not a sci-fi "binge" series on the Android Channel). That distinction belongs to the mysterious lefty Blake Snell, the last starter standing in Tampa Bay.

QMAX will show up in the next table, but if you can intuit that it will show us these same pitchers in the same order as presented above, then you'll know that Chris Sale has the best QWP over the past two months, which is why he's passed Justin Verlander as the #1 starter in the AL. So QMAX will pass the "smell test" of those stat-adjusting ideologues who trash ERA as almost as useless as batting average--though this is not quite a ringing endorsement of anything. What's clear is that ERA is transitory, particularly in smaller sample sizes (the ones that neo-sabes rail against until they start using them for their own purposes), and our hope is that a Quick & Dirty QMAX (...jeez, Malcolm, just now working in that acronym? talk about your buried ledes...) will provide a more robust way of looking at the performance value of starters, particularly in sub-season snippets.




So here's the rest of the stats involved in how we get to an "indicated QMAX" (best estimate without applying the grading method manually to each start) score. We need to adjust H/9 to the QMAX "S" value (hit prevention). Our "bathtub gin regression" shows that .44 of the H/9 is a good first cut at this base value. That value is shown in the Q/Si column. (Remember: the lower, the better).
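For those who'd rather see the first cut as arithmetic, here is a minimal sketch. The only number taken from the text is the .44 coefficient; the function name and the sample stat line are hypothetical, and innings are assumed to be entered as a true decimal (70.333 rather than box-score "70.1").

```python
# Assumed coefficient: the ".44 of the H/9" first cut quoted in the text.
S_COEFFICIENT = 0.44


def indicated_s(hits: float, innings: float) -> float:
    """First-cut indicated QMAX "S" (Q/Si) from hits allowed and innings pitched."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    h_per_9 = 9.0 * hits / innings
    return S_COEFFICIENT * h_per_9


# Hypothetical stat line (not taken from the tables): 45 hits in 70 innings.
print(round(indicated_s(45, 70), 2))  # 2.55
```

This is only the unadjusted Q/Si value; the homer adjustment described next is not included here.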

The adjustment to Q/Si that's needed is to account for extra-base hits and HRs. As was posited by the perpetrators of FIP, the distribution of doubles and triples is uniform enough to set aside, leaving a HR adjustment for QMAX in order to have the "S" value reflect as much of the "total base" effect in "hit prevention" as possible. We put that together from two relative measures of HRs allowed by pitchers: the first, the straight HR/9 rate; the second, HR TB as a function of overall TB allowed. These two modify each other and can be expressed in a formula with a multiplier that then adjusts upward for the "total base" effect, giving us a more realistic QMAX "S" score (the left-most column in green, the one called QSihr, or indicated QMAX adjusted for homers).

As you can see, this doesn't make much difference for some hurlers (Sale, Aaron Nola, Jacob deGrom, and Bauer), but it does alter the values for those with HR issues (in this two-month snapshot, that would be Justin Verlander, Mike Foltynewicz, Corey Kluber). Their adjusted "S" scores really do adjust upward.
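The two inputs to the homer adjustment are easy enough to sketch, even though the multiplier formula that combines them isn't spelled out here and so is deliberately left out. The class and function names below are hypothetical, as is the sample stat line; the only things assumed are the standard definitions of HR/9 and total bases.

```python
from dataclasses import dataclass


@dataclass
class HitsAllowed:
    """Hypothetical container for one pitcher's hits allowed over a snapshot."""
    innings: float
    singles: int
    doubles: int
    triples: int
    homers: int


def hr_per_9(line: HitsAllowed) -> float:
    """First input: the straight HR/9 rate."""
    return 9.0 * line.homers / line.innings


def hr_tb_share(line: HitsAllowed) -> float:
    """Second input: total bases on homers as a share of all total bases allowed."""
    total_bases = (line.singles + 2 * line.doubles
                   + 3 * line.triples + 4 * line.homers)
    if total_bases == 0:
        return 0.0
    return 4 * line.homers / total_bases


# Hypothetical stat line (not from the tables): 60 IP, 30 1B, 8 2B, 1 3B, 6 HR.
sample = HitsAllowed(innings=60.0, singles=30, doubles=8, triples=1, homers=6)
print(round(hr_per_9(sample), 2), round(hr_tb_share(sample), 3))
```

How these two numbers "modify each other" to produce QSihr is not reproduced in this sketch; any combining function plugged in on top of it would be a guess.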

With adjusted "S" in place, we can perform an analogous procedure for the "walk prevention" component (QMAX "C"), which adjusts for a couple of minor idiosyncrasies: the matrix approach to grading "walk prevention" and the range of BB/9 sometimes conspire to distort the relationship between the two...and when we do that, we get a good approximation of what QMAX "C" would look like if it were calculated by hand. (The process is similar to curve-fitting, except we're doing it--as usual--in the bathtub.)

Finally, we can generate QWPs for the separate "S" and "C" functions, and create a formula to blend them into a final aggregate, indicated, Q&D QWP (also sometimes referred to as the quicker picker-upper...though we think most of those so inclined will prefer the cotton-pickin' picker-uppers shown at right).

What we see above is that over the past two months, Chris Sale is sailing along, Ross Stripling has done a reasonably passable impersonation of Greg Maddux circa 1997, Blake Snell has been the "power precipice" starter par excellence, and Aaron Nola has become the ace of the Phillies. It's wonderful to be able to do Q&D QMAX under the supervision of a partially house-trained dachshund and a semi-monitored regimen of directed medication (see above), and--despite the levity--this is pretty damn cool after all these years of doing it by hand. Summertime, and the QMAX is easy: really, now, what more could one ask for--except for "regime change," that is?

POSTSCRIPT: A reader suggested that it might be useful to provide an example of how close the Q&D QMAX is to the official method. In other words, how close does quick-and-dirty do?

That answer, to be sufficient, would require a lot more work than it's possible to do (at least for a while). But we can at least look at an example that provided the most challenging amount of work to bring the QMAX matrix mechanism into sync with more conventional statistical distributions.

The challenge lies in the possibility of distortion between the "C" value (and its grading process within the QMAX matrix) and the more straightforward BB/9 stat. Adjusting BB/9 for the QMAX structure can get tricky in extreme cases (see, for example, the QMAX discussion of Tommy Byrne). Even someone whose walk totals exceed their innings pitched will never approach the QMAX maximum value, so we either need a separate "actuarial table" that cross-references these values or we have to develop a conversion formula that replaces the need for such a table.

Fortunately, we were able to do the latter, and here you can see the results of that when we apply it to Blake Snell, 2018's poster boy for high BB/9 averages. (It's actually not THAT high--4.2 over his eleven starts from May 15-July 12--but an ordinary conversion would produce a QMAX "C" value higher than the BB/9, which can only happen on the opposite end of the spectrum.)

Snell has become a "power precipice" pitcher in 2018, usually characterized by having a lower "S" score than "C" score.

The end goal, of course, is to get the QWP value (QMAX Winning Percentage) as calculated by the Q&D version as close as possible to the hand-calculated value in the start-by-start method. In Blake Snell's case, we find that his Q&D QWP is .640.

What's the hand-calculated QWP? It's .639.

Now, mind you, not every one of these is going to be that close--not hardly. But the overall deviation in the Q&D QWP values for the ninety-one pitchers with at least 50 IP during the May 15-July 15 time frame is 2.4%.
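For anyone who wants to run the same kind of check, here is a small sketch of one way to measure the gap. It assumes "deviation" means the average absolute percentage difference between the Q&D value and the hand-calculated one, which is an assumption on our part and not necessarily how the 2.4% figure was computed. The function names are hypothetical; the only real data point used is Snell's .640 versus .639.

```python
from typing import Iterable, Tuple


def pct_deviation(quick: float, by_hand: float) -> float:
    """Absolute difference between the two QWPs, as a percent of the hand value."""
    return abs(quick - by_hand) / by_hand * 100.0


def mean_pct_deviation(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average of pct_deviation over (Q&D QWP, hand-calculated QWP) pairs."""
    values = [pct_deviation(quick, by_hand) for quick, by_hand in pairs]
    if not values:
        raise ValueError("need at least one pair")
    return sum(values) / len(values)


# Blake Snell, from the text: Q&D QWP .640 vs. hand-calculated .639.
print(f"{pct_deviation(0.640, 0.639):.2f}%")  # 0.16%
```

By that yardstick Snell's gap is well under the overall figure, which squares with the warning that not every pitcher will come out this close.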

That's encouraging enough to be able to present more QMAX info using the Q&D shortcut...so expect to see more in-season data using it in the future.