Monday, July 23, 2018

RETURN OF THE SMALLFRY?

Ortiz: "You're batting leadoff? I thought you were the batboy!!"
Mookie Betts is having a great season, and that's not just great news for Red Sox fans. Mookie is not a "big man"--he's only 5'9", 180lbs. It has become extremely unusual in recent times for players of such (relatively) diminutive stature to have such a dominant offensive profile. So far, however, Mookie is flying high with a 193 OPS+ and is very much in the MVP discussion despite missing the better part of three weeks due to injury.

It turns out that Mookie is one of nine hitters who are 5'10" or shorter having at least robust offensive years (defined as a 120+ adjusted OPS) in 2018.

Now, nine may not sound like a lot--and it's not. But it's a helluva lot better than the numbers we see for "smallfry" in the recent past. Our table below shows that in recent years, hard-hitting little guys were a seriously endangered species.

Clearly two forces have been at work historically to chip away at the short-statured player. First, the general trend that people are getting taller. Second, the pervasive typecasting of small players as bereft of power, and a selection bias that favored middle infielders who had little or no chance of developing it.

What may be pushing things the other way is the perception that all players should possess a higher degree of power (measurable in ISO), opening up the search for short players capable of hitting for power. Of course, OPS+ is not driven only by ISO or SLG--but having that in addition to a solid OBP supported by a higher-than-average BA might just be the combination that has produced a sudden "bumper crop" of good-hitting smallfry.

A decade ago, neo-sabes had predicted extinction for such players--and the numbers you see in the 2000-09 time frame would certainly have made such a prediction seem plausible. But we see here at the least the possibility of a counter-trend. Remember these are hitters whose overall offensive profile (BA, OBP, and SLG) is lifting them to prominence--most of them are not hulking low-average power hitters relying on ISO to boost their SLG. There's a chance that some of these smallfry (read: big little men) will be Hall of Famers one day...to which we say, "Hallelujah!".

Some think Mookie is the second coming of Willie Mays. Time will tell if he really has enough power to make that comparison more plausible, but size-wise, personality-wise, speed-and-defense-wise he's the same breath of fresh air that we got when the Say Hey Kid was in the prime of his golden youth. We dug out our YEPS (Year-End Projection System) spreadsheet to see what it projected for Mookie at season's end: as one might expect, it projects a dropoff over the next two months, but the overall projection is still for a season with an OPS+ in the 170 range.

That is a good first step toward being a Mays-like player.

Here are the nine "smallfry" posting 120+ OPS+ seasons thus far in 2018--if you're a "smallfry" yourself, light a candle in the window for these guys. It would be great if all nine stayed in the 120+ zone...

Ozzie Albies, Jose Altuve, Andrew Benintendi, Betts, Khris Davis, Eddie Escobar, Scooter Gennett, Jose Ramirez, Jean Segura

Saturday, July 21, 2018

VITAL SIGNS JULY/2: JULY 20 UPDATE

Here's the latest: run scoring and hitting in general are up, but HRs are down--just the first piece of information refuting Joe P.'s recent "defense" of what a growing cadre of disillusioned statheads are calling the "take and rake" philosophy.

We'll get back to undressing Joe's argument at a later date, but suffice it to say that it's OBP that drives offense. And there are two ways to increase OBP--draw more walks or make more hits. That's what's happening in July. Run scoring is at its highest rate because BA and OBP have recovered, while HRs are still down.

It's a sign that a sizable number of hitters are setting aside the "all or nothing" approach after they watched their collective BA push itself into 1960s levels for nearly six weeks during May-June.

It's also a sign that there's something of a starting pitcher crisis occurring this month--but not in terms of HRs allowed. No, the symptom seems to be more basic--and sabermetrically inconvenient. It appears (as shown in the table at left) that many teams' starters have suddenly become more hittable. Batting average is up, and it is strongly correlated with the often sharp rise in starting pitcher ERA thus far in July.

Sixteen teams have their starters posting July ERAs at least ten percent higher than the overall team ERA. Nineteen teams have starting pitchers who are generally more hittable in July than they've been over the course of the season to date.
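For anyone who wants to run the same check at home, here is a minimal sketch of the ten-percent test described above. Only the 10% cutoff comes from the text; the function names and the sample ERAs are made up for illustration and are not real team data.

```python
# Flag a rotation whose July ERA runs at least 10% above the team's
# season-long ERA. The 10% cutoff is from the text; the function names
# and the sample ERAs below are hypothetical.

def pct_above(july_starter_era: float, season_team_era: float) -> float:
    """Percent by which the July starter ERA exceeds the season team ERA."""
    return 100.0 * (july_starter_era - season_team_era) / season_team_era


def july_era_spike(july_starter_era: float,
                   season_team_era: float,
                   threshold_pct: float = 10.0) -> bool:
    """True when the July starter ERA is at least threshold_pct higher."""
    return pct_above(july_starter_era, season_team_era) >= threshold_pct


# Hypothetical team: 4.95 starter ERA in July vs. 4.20 team ERA overall.
print(july_era_spike(4.95, 4.20))
```

Counting the teams that come back True would reproduce the "sixteen teams" style of tally, assuming the same two ERA columns shown in the table.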

(As is likely the case with most of you, we don't quite know what to do with the Tampa Bay data, since their starters are still "unto themselves." But what's clear is that the Rays are not giving up very many HRs in July, and that's how they've lowered their starters' ERA even though they are giving up more hits.) More evidence of a slow but steady adjustment from the single-minded "take and rake."

Thursday, July 19, 2018

JUST WHEN YOU THOUGHT YOU WERE SAFE: Q&D QMAX IS COMIN' TO GETCHA!

QMAX and FIP are odd stepbrothers, working overlapping areas of analysis from diametrically opposed rationales. FIP's strength came from two things:

1) a huge push from the "fielding-independent" fad that took over sabermetrics, pushing it headlong into its first and most intense "neo" phase (prior to its series of calcifications that have left analysis marooned in a cul-de-sac);

2) the ability to be calculated quickly and easily without recourse to granular data.

QMAX is a better tool for starting pitchers (it was never meant to address relievers), but it began with a counterintuitive perspective that grated on the 1990s community--it refused (still does, actually) to work with runs and run adjustment to generate its suite of stats, including the base measures: hit and walk prevention values, and a probabilistic QMAX winning percentage--abbreviated QWP, which if we say so ourselves is witty as hell--or at least Hieronymus Bosch's depiction of hell, a place where selected neo-sabes (they know who they are...) and a certain Orange Menace may yet spend eternity.

It was hampered by the fact that you had to calculate and adjust the grades by individual game, and while a math whiz like Sean Forman could automate that (and did, way back in the primordial ooze of what became Forman et fils--yeah, yeah: baseball-reference.com), it could not be done by the so-called "littery man" of baseball analysis.

That made QMAX difficult to transmit to the masses--not that said masses were exactly clamoring for it, mind you.

But now...that's changed. One evening after watching Fille du Diable, the French noir we'll be showing in San Francisco a week from tonight (featuring "Goth girl" Andrée Clément in a simply astonishing performance), we had something akin to an epileptic seizure that led to a sweaty breakthrough regarding how to calculate QMAX with just basic adjustments of mainstream rate stats. (Or did the sweaty breakthrough lead to the seizure? As the soothsayer sayeth: pick a card, any card...)

So here's how it works. (We provide some pitching leaders over the past two months, with their stats from the May 15-July 15 snapshot, again courtesy of David Pinto's Day-by-Day database.) Our first table (above) shows the basic data for the eleven pitchers with the best QWPs for the time frame in question. We see ERA, K/9, BB/9, and David's concoction, Cy Young Points (a fun measure, which we abbreviate CYP, which--of course--rhymes with QWP).

We added a rate stat version for David's stat just to "double down" on the fun (and won't we all be glad when we never hear that term again...) which shows that while Trevor Bauer had a great run over the past couple months according to CYP, he's not the leader in CYP/9 (contrary to popular belief, CYP/9 is not a sci-fi "binge" series on the Android Channel). That distinction belongs to the mysterious lefty Blake Snell, the last starter standing in Tampa Bay.

QMAX will show up in the next table, but if you can intuit that it will show us these same pitchers in the same order as presented above, then you'll know that Chris Sale has the best QWP over the past two months, which is why he's passed Justin Verlander as the #1 starter in the AL.  So QMAX will pass the "smell test" of those stat-adjusting ideologues who trash ERA as almost as useless as batting average--though this is not quite a ringing endorsement of anything. What's clear is that ERA is transitory, particularly in smaller sample sizes (the ones that neo-sabes rail against until they start using them for their own purposes), and our hope is that a Quick & Dirty QMAX (...jeez, Malcolm, just now working in that acronym? talk about your buried ledes...) will provide a more robust way of looking at the performance value of starters, particularly in sub-season snippets.




So here are the rest of the stats involved in how we get to an "indicated QMAX" (best estimate without applying the grading method manually to each start) score. We need to adjust H/9 to the QMAX "S" value (hit prevention). Our "bathtub gin regression" shows that .44 of the H/9 is a good first cut at this base value. That value is shown in the Q/Si column. (Remember: the lower, the better).
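That first cut is a single multiplication. A minimal sketch, assuming only the .44 coefficient quoted above; the function name and the sample H/9 are hypothetical, chosen purely for illustration:

```python
# First-cut "indicated" QMAX S value (Q/Si) from a pitcher's H/9.
# The 0.44 coefficient is the one quoted in the text; the names and
# the sample input are hypothetical.
H9_TO_QSI = 0.44


def indicated_s(h_per_9: float) -> float:
    """Return the unadjusted Q/Si estimate; lower is better."""
    if h_per_9 < 0:
        raise ValueError("H/9 cannot be negative")
    return H9_TO_QSI * h_per_9


# A made-up pitcher allowing 7.5 hits per nine innings:
print(round(indicated_s(7.5), 2))
```

Note that this is the base value only, before any adjustment for home runs.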

The adjustment to Q/Si that's needed is to account for extra-base hits and HRs. As was posited by the perpetrators of FIP, the distribution of doubles and triples is uniform enough to set aside, leaving a HR adjustment for QMAX in order to have the "S" value reflect as much of the "total base" effect in "hit prevention" as possible. We get there via two relative measures of HRs allowed by pitchers: the first, the straight HR/9 rate; the second, HR TB as a function of overall TB allowed. These two modify each other and can be expressed in a formula with a multiplier that then adjusts upward for the "total base" effect, giving us a more realistic QMAX "S" score (the left-most column in green, the one called QSihr: indicated QMAX adjusted for homers).
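The two inputs to that adjustment are easy to compute, even though the blending formula and its multiplier are not spelled out here. A minimal sketch of the inputs only, reading "HR TB as a function of overall TB" as the share of total bases allowed that came on home runs (that reading is an assumption); the function names and the sample line are hypothetical:

```python
# The two HR measures named in the text. How they are blended into the
# QSihr multiplier is not given, so this sketch stops at the inputs.

def hr_per_9(hr: int, innings: float) -> float:
    """Home runs allowed per nine innings.

    `innings` is assumed to be true decimal innings (50 1/3 -> 50.333),
    not the box-score shorthand 50.1.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9.0 * hr / innings


def hr_tb_share(hr: int, total_bases: int) -> float:
    """Share of total bases allowed that came on homers (4 bases each)."""
    if total_bases <= 0:
        return 0.0
    return 4.0 * hr / total_bases


# Made-up two-month line: 8 HR, 60 IP, 100 total bases allowed.
print(round(hr_per_9(8, 60.0), 2), round(hr_tb_share(8, 100), 2))
```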

As you can see, this doesn't make much difference for some hurlers (Sale, Aaron Nola, Jacob deGrom, and Bauer), but it does alter the values for those with HR issues (in this two-month snapshot, that'd be Justin Verlander, Mike Foltynewicz, Corey Kluber). Their adjusted "S" scores really do adjust upward.

With adjusted "S" in place, we can perform an analogous procedure for the "walk prevention" component (QMAX "C") which adjusts for a couple of minor idiosyncrasies that the relationship of the matrix approach to grading "walk prevention" and the range of BB/9 sometimes conspire to distort...and when we do that, we get a good approximation of what QMAX "C" would look like if it were calculated by hand. (The process is similar to curve-fitting, except we're doing it--as usual--in the bathtub.)

Finally, we can generate QWPs for the separate "S" and "C" functions, and create a formula to blend them into a final aggregate, indicated, Q&D QWP (also sometimes referred to as the quicker picker-upper...though we think most of those so inclined will prefer the cotton-pickin' picker-uppers shown at right).

What we see above is that over the past two months, Chris Sale is sailing along, Ross Stripling has done a reasonably passable impersonation of Greg Maddux circa 1997, Blake Snell has been the "power precipice" starter par excellence, and Aaron Nola has become the ace of the Phillies. It's wonderful to be able to do Q&D QMAX under the supervision of a partially house-trained dachshund and a semi-monitored regimen of directed medication (see above), and--despite the levity--this is pretty damn cool after all these years of doing it by hand. Summertime, and the QMAX is easy: really, now, what more could one ask for--except for "regime change," that is?

POSTSCRIPT: A reader suggested that it might be useful to provide an example of how close the Q&D QMAX is to the official method. In other words, how close does quick-and-dirty do?

That answer, to be sufficient, would require a lot more work than it's possible to do (at least for a while). But we can at least look at an example that provided the most challenging amount of work to bring the QMAX matrix mechanism into sync with more conventional statistical distributions.

The challenge lies in the possibility of distortion between the "C" value and its grading process within the QMAX matrix with the more straightforward BB/9 stat. Adjusting BB/9 for the QMAX structure can get tricky in extreme cases (see, for example, the QMAX discussion of Tommy Byrne). Even someone whose walk totals exceed their innings pitched will never approach the QMAX maximum value, so we either need a separate "actuarial table" that cross-references these values or we have to develop a conversion formula that replaces the need for such a table.

Fortunately, we were able to do the latter, and here you can see the results of that when we apply it to Blake Snell, 2018's poster boy for high BB/9 averages. (It's actually not THAT high--4.2 over his eleven starts from May 15-July 12--but an ordinary conversion would produce a QMAX "C" value higher than the BB/9, which can only happen on the opposite end of the spectrum.)

Snell has become a "power precipice" pitcher in 2018, usually characterized by having a lower "S" score than "C" score.
The end goal, of course, is to get the QWP value (QMAX Winning Percentage) as calculated by the Q&D version as close as possible to the hand-calculated value from the start-by-start method. In Blake Snell's case, we find that his Q&D QWP is .640.

What's the hand-calculated QWP? It's .639.

Now, mind you, not every one of these is going to be that close--not hardly. But the overall deviation in the Q&D QWP values for the ninety-one pitchers with at least 50 IP during the May 15-July 15 time frame is 2.4%.
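To put the Snell comparison on the same percentage footing, here is a minimal sketch. It assumes "overall deviation" means the average absolute percentage gap between the Q&D and hand-calculated QWPs; the text doesn't say exactly how the 2.4% was computed, so treat that reading (and the function names) as assumptions.

```python
# Percentage gap between a Q&D QWP and its hand-calculated counterpart.
# Treating "overall deviation" as a mean absolute percentage gap is an
# assumption; the text does not define the calculation.
from typing import Iterable, Tuple


def pct_deviation(quick: float, by_hand: float) -> float:
    """Absolute gap as a percentage of the hand-calculated value."""
    return 100.0 * abs(quick - by_hand) / by_hand


def mean_pct_deviation(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average of pct_deviation over (quick, by_hand) pairs."""
    gaps = [pct_deviation(q, h) for q, h in pairs]
    if not gaps:
        raise ValueError("need at least one pair")
    return sum(gaps) / len(gaps)


# Blake Snell, per the text: Q&D .640 vs. hand-calculated .639.
print(round(pct_deviation(0.640, 0.639), 2))
```

Feeding all ninety-one (Q&D, hand-calculated) pairs to `mean_pct_deviation` would give the aggregate figure under this reading.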

That's encouraging enough to be able to present more QMAX info using the Q&D shortcut...so expect to see more in-season data using it in the future.

Thursday, July 5, 2018

VITAL SIGNS JULY/1: THE FIRST FOUR DAYS

Offense and homers spiked over the first three days of July, but things cooled down somewhat on the Fourth...perhaps the hitters didn't want to upstage the fireworks.

After four days in July, runs and HR's are now behind last year's pace, while BB's are running ever so slightly ahead.

Of course, when we say behind last year's pace, we need to remember that the HR pace in July 2017 (1.25 per game) was the sixth-highest monthly rate in major league history. The current July 2018 pace for HR/G would rank tenth.

A stretch of especially warm weather is supposed to work its way across the country this week--we'll see how it affects offense. As a rule, the higher the temperature, the more runs and HRs. Stay tuned...

Wednesday, July 4, 2018

"TWO-WAY" NUÑO TORPEDOES HIS COMEBACK...

In the tiny-bubble world of baseball innovation, the Tampa Bay Rays continue to give us much. But even Kevin Cash may have gone too far last night when extra-inning desperation forced the Rays' manager to turn our favorite avoirdupois-challenged southpaw, Vidal Nuño, into a two-way player.

Nuño responded to this with the type of underdog intensity that one would expect from someone who is simultaneously marginal and overfed. With the Rays playing an interleague game in Miami, they were already letting pitchers bat; by the fourteenth inning, Cash was out of double switches--so Nuño found himself in the batter's box.

And before you could say boo, Nuño flared one down the left-field line that landed fair and rolled toward the stands in foul territory. Marlins LF Derek Dietrich got to the ball in a hurry and fired to second base--where Nuño, valiantly impersonating a baserunner, was trying to stretch his single into a double. (Vidal was 2-for-26 lifetime when the plate appearance had begun.) The result is telegraphed in our video capture...

So into the sixteenth inning we go, and the Rays have taken the lead again when it is once again Nuño's slot in the batting order. Hoping to save his bullpen, Cash lets him hit again. The pitch from Brett Graves is low and over the plate--right in Vidal's, er, "wheelhouse"--and the portly one uses his modified "sand wedge" swing to slap the ball into the right-field corner.

This one drives in a run, and looks certain to be Nuño's first major-league extra-base hit, but as he "motors" down the first base line, he comes up lame, clutching at his right hamstring. He's stopped at first by the injury and has to be removed from the game.

The Rays go on to win, and Nuño gets his third win of the year (remember, he was 5-21 lifetime when recalled from the minors late in May)--but earlier this morning he was placed on the disabled list.

It's definitely a "story of my life" scenario for Vidal, who'd frankly been startlingly stellar (3-0, 1.50 ERA) as a cog in the Rays' fog-shrouded machine since his return from oblivion. Ironic, of course, that he'd get injured as a hitter, as if some kind of cosmic compensation was needed for having had the audacity to double his lifetime hit total in the space of two plate appearances...

So the sun has set on Nuño's empire, at least for the foreseeable future.

Sunday, July 1, 2018

VITAL SIGNS REDUX 3: JUNE IN THE BOOKS

June's games and their associated data are complete: the early swoon abated in the second half of the month, keeping it essentially on track with the R/G, HR/G and BB/G produced in May. The comparison of June 2018 with June 2017 still remains stark, however, as our differential percentage chart indicates.

Last June, homers were hit at the most frequent rate of any month in baseball history (1.35 per team per game) and run scoring shot up to 4.91 per game. The downturn in June 2018 finished in double-digit percentages (4.33 R/G, 1.16 HR/G).
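The size of that year-over-year drop can be worked out directly from the four figures above. A minimal sketch (the function name is ours; the inputs are the R/G and HR/G values quoted in the text):

```python
# Year-over-year percentage change, June 2017 -> June 2018,
# using the R/G and HR/G figures quoted in the text.

def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means a decline)."""
    return 100.0 * (new - old) / old


runs_change = pct_change(4.91, 4.33)  # R/G
hr_change = pct_change(1.35, 1.16)    # HR/G

print(round(runs_change, 1), round(hr_change, 1))
```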

Another way to track this data is to pick a "base month" and measure the monthly deviations from it that occur over time. There are actually two ways to do this--one where you track the changes month-by-month, using the prior month as the basis for the calculation, and another where you measure every month against the "base month" and get a cumulative rate of change.

Both are of sufficient interest to display here. First, the month-by-month changes [at right]. The "base month" we're using here is September 2017, the "cool down" month in last year's long homer siege (4.58 R/G, 1.19 HR/G, 3.25 BB/G). We can then see that April 2018 saw fewer HRs but more walks: the net result was a slight downturn in run scoring. May gained in HR/G over April, but pitchers were much stingier in terms of walks, which caused another downturn in R/G. And our comments comparing May 2018 to its successor month can now be seen in percentage terms, where runs went up slightly despite small declines in HR/G and BB/G.

We get a different view of this when we redirect the comparison to show us the differential of each month from the September 2017 "base month." As you'd expect, the April 2018 data is the same (April is compared to September 2017 in each method).

But in May we see the "cumulative" effect kicking in. We can see that relative to September 2017, May 2018 shows a larger cumulative drop in R/G, brought on by the flip-flops in HR/G and BB/G that show a cumulative decline in each of these measures. The June data shows how this data starts to converge, as the combined cumulative downturn in HR/G and BB/G is now about 80% of the decline in R/G.
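The two ways of measuring change can be sketched side by side. This is a minimal illustration with function names of our own choosing; since the April and May values live in the charts rather than the text, the baseball example uses only the September 2017 and June 2018 R/G figures quoted in these posts, and the three-point series is an abstract index (not baseball data) included just to show how the two methods diverge.

```python
# Two ways to express change across a run of months:
#   month_over_month: each month vs. the month before it
#   versus_base:      each month vs. the first ("base") month
from typing import List, Sequence


def month_over_month(values: Sequence[float]) -> List[float]:
    """Percent change of each value from its immediate predecessor."""
    return [100.0 * (cur - prev) / prev
            for prev, cur in zip(values, values[1:])]


def versus_base(values: Sequence[float]) -> List[float]:
    """Percent change of each later value from the first value."""
    base = values[0]
    return [100.0 * (v - base) / base for v in values[1:]]


# Abstract index, not baseball data: up 10%, then down 10%.
index = [100.0, 110.0, 99.0]
print(month_over_month(index))  # step-by-step changes
print(versus_base(index))       # cumulative changes from the base

# R/G from the text: September 2017 (base) and June 2018.
print(round(versus_base([4.58, 4.33])[0], 2))
```

The first month after the base always matches in both methods, which is why the April 2018 row is identical in the two charts.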

What can we expect in July? There's often a decline in HR/G and R/G from the peaks achieved in the previous month; last year was no exception. It's possible that the protracted batting average swoon that occurred in the first half of June might have righted itself, however, and we may see some modest gains in July. Stay tuned...