Throwing a plank in the framing-WAR gap

Thoughts on a tricky frontier of player value.

By Ivan the Great Nov 23, 2018, 10:00am EST

As the pre-2019 offseason gears up, the Braves find themselves with only one major league catcher on their 40-man roster: Tyler Flowers, who agreed to a $4 million salary for 2019 with a team option for $6 million for the 2020 campaign. Kurt Suzuki, his catching tandem partner for the last two seasons, will be playing for a divisional foe next year, signing his own two-year, $10 million pact. Meanwhile, rumors swirl about other major league catching options acquirable in the offseason, including free agents Yasmani Grandal and Wilson Ramos and the Marlins’ J.T. Realmuto.

Over these past two years, the Braves have enjoyed a bonanza of catching production for a bargain. Across the majors, Atlanta’s catchers have been the third-most productive pairing by fWAR in 2017-2018 (combined), behind only the Grandal-laden Dodgers and the Realmuto-wielding Marlins. If you restrict the analysis to “value accrued while playing as a catcher,” the Braves are again third, marginally ahead of the Marlins but falling behind the Giants. (The flip in rankings is largely driven by the Giants’ Aramis Garcia, a 2018 rookie who raked in about 30 PAs as a catcher while being awful in about 30 PAs as a first baseman. I bring this up only to demonstrate that there’s really a pretty tight cluster including all of these teams as well as the Cubs, with the Dodgers generally coming out ahead by most measures.)

Here’s the thing, though, and this is what this article is about. The previous paragraph used fWAR. fWAR includes most components of what we understand and can measure as player value these days: for position players, this includes hitting, baserunning, and fielding. What it doesn’t include (yet?), though, is something that baseball analysts suspect generally carries a substantial chunk of value: catcher framing.

(None of this diminishes the value the Braves have gotten from their catchers. According to Baseball Prospectus, whose WARP measure does incorporate catcher framing, the Braves had the second-best aggregate pitch framing in 2017, and were above average (though 12th of 30 teams) this past season. This framing boon has largely been due to Flowers, and in spite of the recently-departed Suzuki: Flowers finished first and third in the “framing runs” metric in 2017 and 2018, respectively, despite serving as more of a half-time catcher. Suzuki, meanwhile, finished 75th and 108th out of around 120 catchers, giving up runs on the framing end relative to an average catcher.)

Naturally, when someone mentions that a component of player value isn’t currently being included in the stat meant to capture all aspects of player value, the immediate follow-up question is, “Okay, so how big is the effect that’s not being captured?” For pitch framing, the effect appears to be pretty large. Back in the early part of this decade, recent Braves hire Mike Fast published on the Baseball Prospectus website a summary of his work on pitch framing (which built on and summarized work by other analysts on the topic). I’m not going to re-summarize the whole article (and seriously, if you’re interested, just go read it, though if you’ve read other articles on this topic you may have already absorbed some of these findings via general internet osmosis), but I do want to throw out a key quote:

If these numbers accurately represent catcher performance at getting extra strike calls, catchers have the ability to gain or cost their team a win or two over the course of the season in this department. That would make this skill as important as all the other facets of catcher defense that we are currently able to measure (preventing stolen bases, blocking pitches, and fielding balls) put together.

This finding is consistent with numbers we’re still seeing from catcher framing assessment methodologies in 2018: according to Statcorner, the range of runs gained/lost by framing was between +14 (Jeff Mathis) and -23 (Salvador Perez); according to Baseball Prospectus, this range was between +16 (Yasmani Grandal) and -18 (Willson Contreras). If you divide these run totals by roughly 10, under the easy rule-of-thumb assumption that ten runs equal a win, you pretty much see that “win or two” range Fast described back in 2011. Catcher framing has changed through the years, namely in that the gap between good framers and bad framers, as well as good-framing teams and bad-framing teams, has narrowed notably. (See here and here, both highly recommended just as further information on the topic.) Yet, even with the narrowing, the value of framing does seem to be a big deal.

As an illustrative example, consider Tyler Flowers. In 2017, Flowers received 370 PAs and put up 2.4 fWAR, good for 102nd out of the 349 players with 200 or more PAs. While it’s not quite apples-to-apples, by Baseball Prospectus’ WARP stat, which incorporates catcher framing, Flowers was the eighth-most-valuable position player in baseball with 6.6 WARP. That was within spitting distance of Kris Bryant and (yes, I know) Mike Trout, both of whom had many more PAs than he managed.

You might be reading this and going, “Okay, but this just suggests pitch framing is whack, not that Tyler Flowers was really that good.” But set that aside for a minute; we’ll come back to it, I promise. The point of this illustrative example is just to show that catcher framing can be a big deal. Even if you don’t think it should be as big of a deal as Flowers’ skyrocketing up the 2017 player value rankings, one or two wins is a sizable margin. Transforming an average player into a borderline All-Star, not by doing anything but just by capturing and accounting for the value he already provides, seems pretty important.
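The divide-by-ten rule of thumb applied to the 2018 framing ranges can be written out as a short sketch. The run totals are the ones quoted above; the flat 10 runs per win is the article's own approximation, and the function name is just illustrative.

```python
# Rule of thumb from the text: roughly ten runs equal one win.
# This is an approximation, not the exact conversion any WAR system uses.
RUNS_PER_WIN = 10.0


def runs_to_wins(runs, runs_per_win=RUNS_PER_WIN):
    """Convert a run total into wins under a flat runs-per-win assumption."""
    return runs / runs_per_win


# 2018 best and worst framing run totals, as quoted in the article.
ranges_2018 = {
    "Statcorner": (14, -23),           # Jeff Mathis, Salvador Perez
    "Baseball Prospectus": (16, -18),  # Yasmani Grandal, Willson Contreras
}

for source, (best, worst) in ranges_2018.items():
    print(f"{source}: {runs_to_wins(best):+.1f} to {runs_to_wins(worst):+.1f} wins")
```

Under that assumption, both sources put the best and worst framers somewhere between one and two-plus wins away from average.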

So, what’s the issue? Statcorner has catcher framing value in terms of runs. Baseball Prospectus has catcher framing value in terms of runs. Why doesn’t fWAR? Why doesn’t Baseball Reference? Let’s get those babies loaded up with framing and start accounting for player value more holistically, pronto, right? Well, it may not be quite so easy...

Before we talk about the challenges in doing so, we first just need to talk about WAR (and specifically fWAR) more generally.

Some things about (f)WAR

(Note: All of the below, not just in this section, but in the article, is my interpretation and understanding of WAR, its intent, and how it works. I did not create WAR, I don’t have access to the underlying formulae and databases, and everything I learned came from reading things on the internet. So, I offer you no guarantees that the below is fully accurate and correct, that it’s the best possible approach, or that I have any idea what I’m talking about.)

There’s 1,000 fWAR available in a year.
For the most part, WAR cares about results. This probably needs a little more elaboration. The way I think about WAR is that everything that happens on the field is credited. A batter strikes out, which means the batter loses some value, and the pitcher that struck him out gains some value. A batter hits a single, which means the batter gains some value, and some fielder loses some value for not being able to convert the single into an out. A batter hits a homer or draws a walk, and he gets value, while the pitcher that allowed that result loses value. In that way, WAR is kind of a ledger or a balance sheet, and because on every play someone succeeds and someone fails, WAR captures that. (Incidentally, this is why I’m not a fan of bWAR, which debits both pitchers and fielders when a ball in play is not an out.)

In the context of catcher framing, both of these things cause a potential issue.

First, you couldn’t simply say, “Hey, this catcher is worth +10 framing runs, let’s give him an extra win!” If you did, you’d now have 1,001 WAR, not 1,000. Do this for all of the catchers and you will probably have way more than 1,000 WAR. If you want to keep the 1,000 WAR mark and preserve the old order that we’re all familiar with (e.g., a 2 WAR player is average, etc.), that value has to come out of someone else. It doesn’t really make sense to take it out of hitting or fielding, and it should probably be pitchers to the extent that framing actually helps pitchers retire batters. So, to integrate framing into WAR, you may need to remove value from pitchers.

(As a side note, I can also see an argument made for a framework where catching runs above/below average always sum to zero. In this case, catcher value would be redistributed across catchers, but wouldn’t need to eat into anyone else’s WAR. This might be a pretty good way to integrate framing that’s separate from the rest of this article, and it has decent precedent because right now, UZR and DRS for a position generally sum to zero across all players in a year. But, as teams use better and better framers, the average keeps needing to be updated; in 2018, Baseball Prospectus data indicated a net 33.8 runs above average for catchers while Statcorner had a mind-boggling 2,426 “lost” calls across all catchers, leading to a net 322.3 runs below average. Basically, there’s some work that needs to be done either way.)

Even aside from the “how do we fit framing into 1,000 WAR per season” issue, there’s a bigger conceptual issue in play. Remember above, where we talked about WAR being based on results? Well, pitch framing isn’t necessarily about a result, but about setting up the situation to get a better result later. In 2008, Dan Turkenkopf, writing at Beyond the Box Score, found that each switched call was worth .133 runs. (This parameter was used by Fast in the article linked previously.) But, there’s no analog in WAR currently for “did a thing that helped get a better result later.” To be perfectly explicit: assigning value to a catcher converting a borderline 1-1 pitch into a 1-2 count versus a 2-1 count has some broad, “long-term” value in terms of creating a better outcome later in the plate appearance. If the hitter strikes out later, though, the pitcher will get the credit. If the hitter homers, then the fact that the catcher stole a strike is kind of irrelevant, and the pitcher will get the debit. It’s not that the inherent finding that stealing strikes (or giving them away) is worth runs isn’t meaningful, it’s that it doesn’t jibe with the ledger idea of WAR.

This also gets more complicated in terms of sharing value or credit. Let’s say that we do want to assign some value to pre-results by stealing strikes. If so, how do we apportion that between the pitcher and the catcher? And what would we do in an edge case where a catcher keeps stealing strike two but the pitcher can never get strike three? And if we do start giving credit for moving from one count to another, shouldn’t we start considering credit for hitters for swinging at strikes and not swinging at balls, since they’re also putting themselves into better positions to succeed by doing so?
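To make the per-call run value concrete, here is a minimal sketch that multiplies a net count of switched calls by Turkenkopf's .133 runs. Applying one flat value to every call regardless of count is an assumption of this sketch, and the function name is illustrative. The only input is the Statcorner total of 2,426 "lost" calls mentioned earlier.

```python
# Turkenkopf's 2008 estimate, as quoted in the text: runs per switched call.
RUNS_PER_SWITCHED_CALL = 0.133


def framing_runs(net_calls, runs_per_call=RUNS_PER_SWITCHED_CALL):
    """Run value of a net number of switched calls (positive = extra strikes)."""
    return net_calls * runs_per_call


# Statcorner's 2018 league-wide total of "lost" calls, from the article.
statcorner_lost_calls = -2426
print(round(framing_runs(statcorner_lost_calls), 1))
```

The result lands close to the net 322.3 runs below average quoted for Statcorner, though not exactly on it; nothing here explains the small difference.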

If the answers to these questions were straightforward, then integrating framing into WAR would be as well. But, so long as framing is a skill that isn’t directly evident in outcomes, it may have an awkward, incongruous relationship with a value system that uses outcomes as a foundation. It’s this thinking about outcomes that brings me to my offering about how to bridge the current gap between framing and WAR. In short, I’m not proposing an overall solution, there’s no fully-constructed bridge here. Rather, it’s pretty much me throwing a single plank out there, mostly as a thought exercise.

The plank: framing in full counts

Earlier this month, I wrote about how much I loved full counts. It’s not because of that love that I’m focusing on them here, but it’s not like it hurts. Full counts fit very well for approaching this problem, though: when a pitch in a full count is taken, catcher framing has the ability to transform a walk into a strikeout, or vice versa. There’s no need for further set-up or tweaking the odds in your pitcher’s favor. If you steal a strike, you get an out instead of a baserunner. If you give away a strike, your pitcher’s hard work goes for naught.

Luckily for everyone, the data for this phenomenon are easily obtainable. Using a basic query on Baseball Savant, we find the following:

9,748 pitches in 2018 were taken in full counts.
Of these, 1,921 were in the zone. These resulted in 381 walks. (Yes, this is a blown call rate, or a framing fail rate, of about 20 percent. Yikes. Going back to the full count article for a second: look, hitters, even if the pitcher throws a full-count pitch in the zone and you don’t swing, you still have a one-in-five chance of reaching base. Seriously, rethink how much you swing in full counts.)
Of the taken full-count pitches, 7,811 were outside of the zone. These resulted in 416 strikeouts. (This is a much less egregious but still notable blown call/successful frame job rate of 5 percent or so.)
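The two rates quoted in the list above follow directly from the counts. A quick sketch of the arithmetic, using only the 2018 numbers given in the text (the variable names are illustrative):

```python
# 2018 full-count takes, as reported in the article.
taken_in_zone = 1921        # taken pitches inside the zone
walks_in_zone = 381         # ...that were called ball four anyway
taken_out_of_zone = 7811    # taken pitches outside the zone
strikeouts_out_of_zone = 416  # ...that were called strike three anyway

# Share of in-zone takes that became walks (blown call / framing failure).
in_zone_walk_rate = walks_in_zone / taken_in_zone

# Share of out-of-zone takes that became strikeouts (blown call / stolen strike).
out_of_zone_strikeout_rate = strikeouts_out_of_zone / taken_out_of_zone

# League-wide net outcomes moved toward the defense.
net_switched_calls = strikeouts_out_of_zone - walks_in_zone

print(f"in-zone takes called balls:       {in_zone_walk_rate:.1%}")
print(f"out-of-zone takes called strikes: {out_of_zone_strikeout_rate:.1%}")
print(f"net calls in the defense's favor: {net_switched_calls}")
```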

We can then easily see which pitchers threw these pitches to which catchers, and what results they got. We can find, for example, that Wade LeBlanc got seven strikeouts in 2018 in full counts on called strikes outside of the zone, while both John Gant and Jose Quintana allowed five walks in full counts on called balls that landed within the zone. We can also do the same for catchers: unsurprisingly, we find that Yasmani Grandal ranks at the top of this leaderboard with 14 strikeouts “stolen” for his pitchers compared to only four shouldn’t-have-been walks allowed, while Salvador Perez was nearly dead last in this measure by “stealing” seven strikeouts but allowing a whopping 16 misbegotten walks. (Nick Hundley was slightly worse than Perez in this regard, getting only two extra punchouts for his pitchers while allowing 12 walks on pitches in the zone.)

All of this is well and good, but in order to go from these factoids to WAR, we need to do some conversions. Two methods that ended up being very similar are described below.

Method 1: The FIP-Based One

For pitchers, fWAR is based on FIP (and infield pops). The FIP formula is pretty simple: take homers allowed and multiply them by 13, take walks plus hit by pitches and multiply them by 3, take strikeouts and multiply them by negative 2, sum this together, divide it all by innings pitched, and then add the FIP constant that puts the number on a scale that resembles ERA. Because of this, it’s easy to calculate both a pitcher’s current FIP, as well as what his FIP should have been if not for framing (either good or bad) on his full count pitches. Again, to be clear, this ignores any effects of framing in any other count, because that framing didn’t directly cause any particular outcome.
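As a minimal sketch of that calculation, the FIP formula and a framing-adjusted variant might look like the following. The pitcher line and the FIP constant of 3.10 are made-up placeholders (the real constant varies by season), and the function names are illustrative rather than anything from FanGraphs.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """FIP as described in the text: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def framing_adjusted_fip(hr, bb, hbp, k, ip, constant, stolen_ks, gifted_bbs):
    """FIP as if every taken full-count pitch had been called by the zone.

    stolen_ks:  called third strikes on pitches outside the zone (should be walks)
    gifted_bbs: called ball fours on pitches inside the zone (should be strikeouts)
    """
    adjusted_bb = bb + stolen_ks - gifted_bbs
    adjusted_k = k - stolen_ks + gifted_bbs
    return fip(hr, adjusted_bb, hbp, adjusted_k, ip, constant)


# Hypothetical pitcher line, for illustration only.
line = dict(hr=20, bb=50, hbp=5, k=150, ip=180.0, constant=3.10)

actual = fip(**line)
# Suppose seven of his strikeouts were stolen in full counts, with no gifted walks.
adjusted = framing_adjusted_fip(**line, stolen_ks=7, gifted_bbs=0)

print(round(actual, 2), round(adjusted, 2), round(adjusted - actual, 3))
```

Each switched call moves the numerator by five (three for the walk, two for the strikeout), so the seven calls here raise this hypothetical pitcher's FIP by 35 divided by his innings.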

The end result of a series of mathematical manipulations (described below) yields an effect of about 0.05 fWAR for each switched call. fWAR is gained when a walk should have been a strikeout, and lost when a strikeout should have been a walk. In other words, Wade LeBlanc, who was mentioned above as having received seven strikeouts on pitches outside the zone in full counts, would lose about 0.35 fWAR (call it 0.4) if the value for those were reapportioned to his catchers instead.

Step by step, here was the process to derive this figure.

1. Convert each pitcher’s fWAR (even relievers) into fWAR/200.
2. Derive an average relationship between FIP and fWAR/200. This average relationship doesn’t reflect most individual pitchers’ fWAR/200s (or downscaled fWARs) very well, but that’s okay due to steps below.
3. Calculate the pitcher’s “adjusted for framing” FIP, and convert that to fWAR/200 as well.
4. Subtract the value from Step 2 from the value in Step 3 to see the delta in fWAR/200 as a result of framing effects in full counts.
5. Downscale the delta in fWAR/200 back to the pitcher’s actual innings pitched. When downscaling, the variable effect of FIP varying with innings pitched is removed. Observation or calculation of the difference in net Ks/BBs changed against the downscaled delta reveals the factor of approximately 0.05 fWAR per switched call on a full count.
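A back-of-envelope cross-check on that 0.05 figure, which is not the procedure in the steps above: a switched call moves the FIP numerator by 3 + 2 = 5, FIP is on a runs-per-nine-innings scale, and so the innings cancel when converting back to total runs. The flat 10 runs per win is the article's rule of thumb; fWAR's real runs-to-wins conversion is more involved, so treat this only as a sanity check.

```python
WALK_WEIGHT = 3        # FIP weight on walks
STRIKEOUT_WEIGHT = 2   # FIP weight on strikeouts (subtracted in the formula)
INNINGS_PER_GAME = 9   # FIP is scaled like ERA: runs per nine innings
RUNS_PER_WIN = 10.0    # rule-of-thumb assumption, not fWAR's exact conversion


def fip_change_per_call(innings_pitched):
    """Size of the FIP change when one walk becomes a strikeout (or vice versa)."""
    return (WALK_WEIGHT + STRIKEOUT_WEIGHT) / innings_pitched


def runs_per_switched_call(innings_pitched):
    """Total runs implied by that FIP change over the pitcher's innings."""
    return fip_change_per_call(innings_pitched) * innings_pitched / INNINGS_PER_GAME


def wins_per_switched_call(innings_pitched, runs_per_win=RUNS_PER_WIN):
    return runs_per_switched_call(innings_pitched) / runs_per_win


# The answer is the same whatever the workload, because innings cancel out.
for ip in (60, 120, 200):
    print(ip, round(runs_per_switched_call(ip), 3), round(wins_per_switched_call(ip), 3))
```

This comes out to roughly 0.56 runs, or a bit over 0.05 wins, per switched call for any innings total, which is in the same neighborhood as the factor the step-by-step process produced.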

Perhaps not surprisingly, the net effect of this change to any individual pitcher was rather small. The biggest beneficiaries were the quartet of John Gant, Jose Quintana, Jhoulys Chacin, and German Marquez. Three of these guys received five walks as a result of poor framing and gained zero extra strikeouts; Chacin received eight extra walks but also gained three strikeouts for the same net change of five outcomes. The biggest loser was the aforementioned LeBlanc, and there were a handful of pitchers that had gained four net calls due to framing as well. Overall, though, this is not a very large effect. Of the 152 pitchers with the most innings pitched, 86 (so, over half) had zero or one net calls switched; only 11 had net four or more calls switched in either direction.

With this per-call number, it’s very easy to move from the pitchers to the catchers. In all, 94 catchers were responsible for (or at least present for) at least one of these erroneous full count calls. Grandal, for example, had 14 extra strikeouts against four extra walks, for a net of 10 extra calls that, when multiplied by the 0.05 factor, yields an extra 0.5 fWAR for framing. The corresponding laggard in the dataset, Nick Hundley, lost a net 10 calls, so he loses 0.5 fWAR in framing. The net effect is a transfer of about two wins from pitchers to catchers this way.
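The catcher-side bookkeeping is just net calls times the per-call factor. A short sketch using the three catchers whose full-count tallies appear in the article, plus the league-wide totals (the function and variable names are illustrative):

```python
WAR_PER_SWITCHED_CALL = 0.05  # approximate factor derived in the article

# (stolen strikeouts, walks allowed on in-zone pitches), from the article.
full_count_calls = {
    "Yasmani Grandal": (14, 4),
    "Salvador Perez": (7, 16),
    "Nick Hundley": (2, 12),
}


def framing_war(stolen_strikeouts, gifted_walks, factor=WAR_PER_SWITCHED_CALL):
    """fWAR credited to a catcher for net switched calls in full counts."""
    return (stolen_strikeouts - gifted_walks) * factor


for name, (stolen, gifted) in full_count_calls.items():
    print(f"{name}: net {stolen - gifted:+d} calls, {framing_war(stolen, gifted):+.2f} fWAR")

# League-wide 2018 totals from the article: 416 stolen strikeouts, 381 gifted walks.
league_transfer = framing_war(416, 381)
print(f"league-wide transfer to catchers: {league_transfer:.2f} wins")
```

The league-wide line works out to a little under two wins moving from pitchers to catchers.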

Method 2: The Linear Weights-Based One

Turns out, the above was perhaps too complicated. We already know from linear weights used in calculations like wOBA that a walk is worth around 0.5 to 0.6 runs relative to an out. Since a strikeout is indeed an out, this creates a really easy way to apportion value, which is entirely consistent with the other method — it just requires less finagling with numbers. But, it’s always great when you’re able to confirm the parameters of one approach through another, and that’s pretty much what happened here.

The 2018 linear weights for wOBA had a walk be something like 0.56 runs relative to an out; the first method had a figure closer to 0.52. As a result, there are some fringe differences when moving WAR around between pitchers and catchers across the two methods, but they’re all on the order of 0.1 WAR or less, and not material (though really, little of this is material!).
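To see why the two methods differ by so little, here is a small sketch comparing the WAR they assign for a given number of net switched calls. The 0.52 and 0.56 run values are the ones quoted above; the flat 10 runs per win is the same rule of thumb used earlier in the article, and the names are illustrative.

```python
RUNS_PER_WIN = 10.0  # rule-of-thumb assumption

# Runs per switched call implied by each method, as quoted in the text.
RUNS_PER_CALL = {
    "FIP-based": 0.52,
    "linear weights": 0.56,
}


def war_from_net_calls(net_calls, runs_per_call, runs_per_win=RUNS_PER_WIN):
    """WAR moved between pitcher and catcher for a net number of switched calls."""
    return net_calls * runs_per_call / runs_per_win


def method_gap(net_calls):
    """Absolute WAR difference between the two methods for the same net calls."""
    values = [war_from_net_calls(net_calls, r) for r in RUNS_PER_CALL.values()]
    return max(values) - min(values)


for net_calls in (1, 5, 10):
    print(net_calls, round(method_gap(net_calls), 3))
```

Even at a net of 10 calls, the largest swing seen among catchers, the gap under these assumptions is about 0.04 WAR.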

Results of the Plank

Below, I present some tables of potential interest with respect to these value rejiggerings.

Unfortunately, not even this limited adjustment could make Steven Brault escape the sub-replacement level cellar. Sorry, Steven.

Again, the catcher variations aren’t very large, on the order of half a win. For comparison, I provided the “framing runs for everything (not just full counts)” values from Baseball Prospectus and Statcorner as the rightmost column. You can generally see that the value I’m attributing to catchers here tends to be a fraction, though often a sizable one, of their overall framing value. There are, however, a few points of weirdness, such as Tucker Barnhart being a poor framer who somehow managed to frame well on full counts, poor full count framing jobs done by Austin Romine, Yadier Molina, and Kevan Smith relative to their overall framing numbers, and some pretty surprising full count framing results from J.T. Realmuto, who had the same net switched calls as Tyler Flowers in 2018, albeit in close to double the playing time.

To wrap up, here’s a table of all Atlanta pitchers in 2018 with 10 or more innings, and their full count calls gained/lost as a result of framing.

Mike Foltynewicz was the biggest beneficiary of framing among the Braves’ hurlers, with five extra strikeouts to only two extra walks as a result of blown full count calls. A.J. Minter and Julio Teheran suffered the most from bad framing in those situations, losing a net two calls each.

In terms of specific pairings, Kurt Suzuki transformed four Teheran in-the-zone full count pitches into walks, while only getting him one extra strikeout. Tyler Flowers was the opposite for Sean Newcomb, getting him three extra strikeouts to just one additional walk.

By the way, five each of German Marquez’s and Jose Quintana’s walks came as a result of bad full count framing by Tony Wolters and Willson Contreras, respectively. Marquez only walked 57 batters all season; Quintana only walked 68. That’s a high price to pay in terms of walks directly resulting from one key blown call. Contreras at least did redeem himself somewhat, as six of Jon Lester’s 149 strikeouts on the year came as a result of good framing on full count pitches that landed outside the zone.

Conclusion

As described above, pitch framing is generally considered to be a “big deal” in terms of run value. However, squaring it with current player value measurements can be tricky, because the principles of current WAR value (especially with regard to fWAR) are somewhat at odds with the principles of pitch framing value, which seeks to credit catchers for setting up a result, rather than the result itself. While there are probably myriad smart and elegant ways to resolve this tension, I chose to focus specifically on a place where there doesn’t need to be any tension: the results of framing in full counts.

I found that a pretty substantial chunk of framing value can be attributed just to these instances, which we know are definitely results and not just build-up or probability adjustments. The aggregate shifts aren’t dramatic, in part because most full-count pitches are swung at, thus precluding the relevance of framing in determining the results of those PAs. But, we can still see that even when we consider a small subset of framed pitches, there’s still some value there. While an individual pitcher won’t tend to throw enough taken full count pitches over the course of the year for his value to depend much on the results of those pitches, this isn’t quite the case for catchers. The best framing catchers can get an extra half-win of value by turning walks into strikeouts; the worst ones can lose that much by saddling their teams with free passes on pitches in the zone.

(As usual, all data available on request.)
