In the next week or so, I plan to complete what has become one of my favorite parts of the off-season: finalizing another set of my own projections for the players who will make up the 2017 Atlanta Braves. I’ve done similar exercises in 2015 and 2016, and have learned a lot about baseball and forecasting in the process.
Before I get into that, though, I wanted to take some time to do something that I think isn’t done nearly often enough, even in analytically-inclined circles: looking back and evaluating past projections. That’s not to say that there are never projection system reviews and evaluations (see Henry Druschel, being awesome), but rather that there’s a bit of a disconnect between the amount of effort spent talking about projections, or using projections as the basis for other analysis, and the amount of effort spent looking back to determine whether they’re doing their jobs and can be improved.
I think part of the challenge regarding the projection narrative is pretty much this: there are lots of projections out there, and some people generally think (or know) that these systems are pretty good at giving a broad overview of players’ expected future production. But, because that latter point isn’t demonstrated very often, it opens up a ton of room for a counter-narrative that they’re just guesses, as good as any other. So, in this post, devoted to position players, and in the next, which will cover pitchers, my goal is pretty much to summarize, side-by-side, the two most commonly used 2016 projection systems (Steamer, ZiPS) and the actual results. And, of course, I’ve thrown the outputs of my own rudimentary projection system (which I’ve dubbed IWAG) into the mix as well. You can draw your own conclusions from the below, and I’m curious whether showing this sort of information does anything to change anyone’s mind about the effectiveness, or lack thereof, of projection systems.
My own takeaway, though, is that in general, projection systems remain remarkably solid at giving a high-level, on-average overview of future player production. You shouldn’t take them to be hyper-precise, but I think they get the job done. My second takeaway, mostly from my experiments in trying to create a system of my own, is that it’s really not that hard to look at nothing but a player’s past stats and figure out with reasonable accuracy what he’s going to do next. This is why simpler projection systems, like Marcel, can hang tough in many cases with more complex systems that take an array of different player factors, skills, outcomes, and histories into consideration.
The table below provides a basic overview of results. Two players, Ryan Lavarnway and Nick Swisher, did not play major league ball in 2016, and therefore aren’t included. Michael Bourn was a Brave in March, but was released and managed to play close to a full season across the Diamondbacks and Orioles, so his projections and results are included. The big omissions, meanwhile, are Dansby Swanson, Matt Kemp, and Anthony Recker; the 15 players listed below constitute about 80 percent of the total PAs the Braves accumulated in 2016, with Swanson, Kemp, and Recker accounting for most of the remaining 20 percent, aside from a handful of PAs by guys like Blake Lalli, Brandon Snyder, and Reid Brignac.
The Def and WAR values (fWAR, here) are pro-rated to 600 PAs (450 PAs for catchers), so that each row can be assessed on an apples-to-apples basis. However, the right table also includes PAs, so you can see how much playing time each player actually got.
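For clarity, here’s a minimal sketch of that pro-rating arithmetic in Python; the function name and example numbers are purely illustrative, not values pulled from the table.

```python
def prorate(value, pa, is_catcher=False):
    """Scale a counting stat (Def or fWAR) to 600 PA, or 450 PA for catchers."""
    target_pa = 450 if is_catcher else 600
    return value * target_pa / pa

# e.g., a hypothetical 1.5 fWAR over 400 PA pro-rates to 2.25 fWAR per 600 PA
print(prorate(1.5, 400))  # 2.25
```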
Below, I provide a quick overview of the different projection systems and actual results across the three categories shown (essentially hitting, fielding, and total player value).
Hitting: wRC+ and Batting Runs
A primary challenge with assessing projections versus actual performance is that the playing time a player actually gets, whether due to injury, ineffectiveness, or competition, is not necessarily the full-season value, or the specific projected value. Projecting measures like wRC+, which are essentially rate stats, sidesteps this, but then simply measuring the wRC+ error assigns equal weight to every hitter, regardless of how many PAs they actually got in 2016. To work around this, I converted wRC+ to “Batting Runs,” which is pretty much the same as Def, but for hitting: it’s just how many runs above or below average the hitter was, done on a cumulative basis.
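To make that conversion concrete, here’s a rough sketch. It ignores the park and league adjustments baked into the official FanGraphs numbers, and both the league-average runs-per-PA constant (roughly 0.118 for 2016) and the example plate appearance total are my own ballpark assumptions, so treat it as an approximation rather than the exact calculation behind the table.

```python
# Rough conversion from wRC+ (a rate stat) to batting runs (a counting stat).
# LEAGUE_RUNS_PER_PA is an assumed 2016-ish league average; the official
# FanGraphs calculation also folds in park and league adjustments.
LEAGUE_RUNS_PER_PA = 0.118

def batting_runs_from_wrc_plus(wrc_plus, pa, lg_r_per_pa=LEAGUE_RUNS_PER_PA):
    """Approximate runs above/below average implied by a wRC+ over a number of PAs."""
    return (wrc_plus / 100 - 1) * lg_r_per_pa * pa

# A 41 wRC+ over an assumed ~250 PA works out to roughly -17 runs, which is in
# the neighborhood of the Pierzynski figure discussed below.
print(round(batting_runs_from_wrc_plus(41, 250), 1))  # -17.4
```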
Below are charts showing the projections and actual performance for each of the 15 players assessed, for both wRC+ and Batting Runs.
Although you probably already knew this, much of the cast of characters for the 2016 Braves was below average or abysmal on the hitting side, hence the bars all extending downward on the Batting Runs chart. Tyler Flowers’ and Freddie Freeman’s superlative performances were the good surprises, while AJ Pierzynski, Erick Aybar, and Daniel Castro were much worse than projected with the bat.
Some fun summary statistics:
Average projection difference (positive numbers mean the projection system overestimated, negative numbers mean it underestimated):
- Steamer — +9 points of wRC+; -0.7 batting runs
- ZiPS — +9 points of wRC+; -0.5 batting runs
- IWAG — +15 points of wRC+; +2.0 batting runs
On average, after accounting for the number of PAs each player got, the projection systems were collectively off by a run or two. Essentially, they had offensive performance pretty well nailed.
Root Mean Square Error (this is just another measure of how well the projections worked, generally considered better than a straight average because it penalizes large projection errors; a quick sketch of how both of these summary figures are computed follows the list below):
- Steamer — 29 points of wRC+; 8.0 batting runs
- ZiPS — 29 points of wRC+; 8.0 batting runs
- IWAG — 31 points of wRC+; 7.8 batting runs
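For anyone curious about the mechanics, here’s a minimal sketch of how the average difference and RMSE figures are computed; the numbers in the example are made up for illustration, not the actual 2016 values.

```python
import math

def average_error(projected, actual):
    """Mean of (projected - actual): positive means the system overestimated."""
    return sum(p - a for p, a in zip(projected, actual)) / len(projected)

def rmse(projected, actual):
    """Root mean square error: penalizes big misses more than a straight average."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(projected))

# Hypothetical projected vs. actual wRC+ for three players (not real 2016 data)
projected = [95, 105, 88]
actual = [98, 80, 90]
print(round(average_error(projected, actual), 1))  # 6.7: overestimated on average
print(round(rmse(projected, actual), 1))           # 14.6: the one big miss dominates
```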
Closest Projections (ties count as a “point” for both systems):
- Steamer — 7 players
- ZiPS — 6 players
- IWAG — 4 players
Best Projections:
- Steamer — Adonis Garcia and Gordon Beckham, each nailed perfectly
- Special shoutout to Nick Markakis, who basically ended up smack dab in the middle of the three projection systems, with a 98 wRC+ versus projections of 95, 97, and 102.
Worst Projections:
- IWAG — AJ Pierzynski (predicted about two batting runs below average, ended up about 18 runs below average with his terrible 41 wRC+)
- Pierzynski was the worst-predicted player collectively among the group, too, as everyone bought into his 2015 success a bit too heavily, and then he completely collapsed
Fielding: FanGraphs Def (which is fielding runs + positional adjustment)
Fielding is notoriously hard to predict/project, especially when compared to offensive performance. That difficulty mostly applies at the individual level, though: on average, an array of players is probably going to look like its collective projected performance, defense included. The results-versus-projections for Def reflect this pattern. Some players fielded way worse than projected (e.g., Jace Peterson), and some outperformed fairly rosy defensive projections (e.g., Ender Inciarte), but in general, the errors on both sides tended to cancel each other out.
Summary statistics:
Average projection difference (positive numbers mean the projection system overestimated, negative numbers mean it underestimated):
- Steamer — +0.3 defensive runs
- ZiPS — +0.8 defensive runs
- IWAG — +1.1 defensive runs
On average, after accounting for the number of innings each player got, the projection systems were collectively off by about a run. Essentially, they had defensive performance pretty well nailed. (And yes, if it feels like you read this before... it’s because you just did, for offense, up above.)
Root Mean Square Error (this is just another measure of how well the projections worked, which is generally considered better than using a straight average because of how it penalizes large projection errors):
- Steamer — 5.0 defensive runs
- ZiPS — 5.5 defensive runs
- IWAG — 4.5 defensive runs
Closest Projections (ties count as a “point” for both systems):
- Steamer — 7 players
- ZiPS — 5 players
- IWAG — 4 players
Best Projections:
- Steamer — Michael Bourn, nailed perfectly
- Special shoutout to Adonis Garcia, who, despite a lot of on-field adventures and an ill-fated position switch, still ended up pretty much where he was projected
Worst Projections:
- ZiPS — Jace Peterson (expected to be a pretty good fielder, had an awful fielding season)
- The Jace Peterson issue was a common trend across all three systems, and probably one of the most baffling things about the 2016 Braves season, more generally
Overall Player Value: FanGraphs WAR (fWAR)
In the end, we mostly care about projections because they provide a useful, quantified baseline measure of expectations for the season going forward. In terms of WAR, the player projections for the 2016 Braves were pretty much right on the money. This is a useful thing to consider when thinking about other teams, like the 2017 Braves, and their ability to outperform projections. While a clump of players can all outperform their projections simultaneously, that doesn’t seem particularly likely, and the 2016 Braves are illustrative of that.
The 2016 Braves had such a dearth of position player talent on their roster that perhaps the results aren’t super-generalizable. The two good players, Freeman and Inciarte, exceeded their projections. A lot of bad players fell well short and accumulated substantial negative value. Erick Aybar, well, that was just awful. And some other players who weren’t necessarily expected to put up solid performances didn’t, like Adonis Garcia, Tyler Flowers, Jace Peterson, Nick Markakis, and Mallex Smith.
Summary statistics:
Average projection difference (positive numbers mean the projection system overestimated, negative numbers mean it underestimated):
- Steamer — +0.0 fWAR
- ZiPS — +0.1 fWAR
- IWAG — +0.3 fWAR
These results mostly speak for themselves. The aggregate projected production for these 15 players, when scaled to the number of PAs they actually got, matches their actual aggregate production nearly perfectly.
Root Mean Square Error (this is just another measure of how well the projections worked, which is generally considered better than using a straight average because of how it penalizes large projection errors):
- Steamer — 1.1 fWAR
- ZiPS — 1.0 fWAR
- IWAG — 1.0 fWAR
Closest Projections (ties count as a “point” for both systems):
- Steamer — 5 players
- ZiPS — 8 players
- IWAG — 9 players
Best Projections:
- ZiPS and IWAG — Michael Bourn, nailed perfectly
- Special shoutout to Adonis Garcia and Mallex Smith, who lacked much (or any) of a track record to go on, but pretty much did what they were expected to. Sometimes, even the uncertainty around young/minor league players doesn’t mean they pull a rabbit (or whatever the negative equivalent of a rabbit is) out of their hat
Worst Projections:
- IWAG — Erick Aybar (I projected him to be just about average. Instead, he was awful)
- Not surprisingly, none of these systems projected Aybar’s decline, which is a cautionary tale that even for veterans with lengthy track records, sometimes stuff happens.