The calendar has now flipped to March, and Spring Training action is well underway. That means it’s time for essentially my favorite offseason thing each year: player projections! Things are going to be handled a little differently this year, but I hope this will still be worth the read. And, just to prime you, be aware that at some point, this is just going to be a bunch of images, which I hope speak with more clarity and nuance than any additional words I could write. Before that, a brief FAQ of sorts.
Why do we care about projections, anyway?
Well, for the most part, they’re just really fun. If you think about it, statistics in baseball really have two purposes. One purpose is to allow us to use logic and reasoning to put into words and stories things that have already happened. Another is to give us the opportunity to build speculative stories about what might happen, and then compare that to what actually does happen. Pretty much any discussion of baseball statistics is going to be used for one of these two purposes (or both), and projections simply lean hard on the latter.
In addition, it’s important to remember that projections generally tend to be pretty accurate, especially in aggregate. I have been doing retrospective projection reviews for a while now; you can find the 2017 retrospectives here (position players) and here (pitchers). The general idea is that some players will outperform their projections, and others will underperform, but everything might more or less come together in the end. If it doesn’t, that just adds further intrigue to the season as it develops, and something to work on next year. Plus, it’s not like the projection systems always agree, either. That adds even more to think about and discuss.
What projection systems are being looked at?
As I’ve done for the past few years, the only systems being looked at right now are Steamer, ZiPS, and IWAG. Steamer and ZiPS are both featured on Fangraphs, both on their projections hub and also on individual player pages. IWAG is a rudimentary projection system that I’ve built over the past few years, mostly as part of an exercise to better understand, at a basic level, how existing projection systems work, and their strengths and weaknesses.
Why not any other systems?
In short, this is less of an endorsement or rejection of any specific system (though personally, I find ZiPS the most interesting to think about), and more just the case of convenience of interpretation. There are all sorts of other projection systems out there: CAIRO, Oliver, Clay Davenport’s system, CHONE (by Sean Smith), PECOTA (Baseball Prospectus), and so on. However, without integration into the stats conventions used by Fangraphs, it’s hard to make apples-to-apples comparisons across them. Sometimes, these issues are addressed by looking at specific counting stats (see here or here), but (and call this operator error if you’d like), I’m just not at all interested in judging a projection system by how many homers it calls for versus how many homers a player actually ends up hitting. In other words, it’s because Steamer and ZiPS are already translated into high-level aggregate measures like wRC+, FIP/FIP-, and WAR (as used by Fangraphs) that I prefer to examine the outputs of these systems and not others.
Another thing to know is that while all the projection systems differ to some extent, you will generally read something like “the formula for this system uses X years (usually three) of player data with the most recent years weighted more heavily, while also factoring in age and regression to the mean.” While the exact specifications of each system may be more complex than that (and I know ZiPS and Steamer are indeed far more complex), that’s a useful shorthand for how many other systems work. Indeed, that is also an apt description of how IWAG works.
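To make that shorthand concrete, here is a minimal sketch of a "weighted recent seasons, plus regression to the mean, plus age" projection. Everything in it is a placeholder assumption for illustration: the function name `project_rate`, the 5/4/3 weights, the 1,200 PA of league-average ballast, and the half-point-per-year aging step are not the actual parameters of IWAG, Steamer, or ZiPS.

```python
def project_rate(seasons, weights=(5, 4, 3), league_rate=100.0,
                 regression_pa=1200, age=27, peak_age=27, age_step=0.5):
    """Project a rate stat (e.g. wRC+) from recent seasons.

    seasons: list of (rate, plate_appearances), most recent season first.
    All default parameters are hypothetical, chosen only for illustration.
    """
    weighted_sum = 0.0
    weighted_pa = 0.0
    for (rate, pa), weight in zip(seasons, weights):
        # More recent seasons get a larger weight; PA scales the sample size.
        weighted_sum += weight * pa * rate
        weighted_pa += weight * pa

    # Regress to the mean by blending in a fixed amount of league-average PA.
    regressed = (weighted_sum + regression_pa * league_rate) / (
        weighted_pa + regression_pa)

    # Crude linear aging: subtract a bit for each year past the assumed peak.
    return regressed - age_step * (age - peak_age)


# Hypothetical player: 120, 110, 100 wRC+ over the last three 600-PA seasons.
example_seasons = [(120, 600), (110, 600), (100, 600)]
projection_at_27 = project_rate(example_seasons, age=27)
projection_at_31 = project_rate(example_seasons, age=31)
```

Note how a player with less playing time gets pulled harder toward league average, since the fixed regression ballast makes up a larger share of his total sample.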
Why should I care about IWAG?
Personally, I care because I’ve spent a lot of time on it, and had a lot of fun while doing it. You should not care. In fact, if you are of that predilection, I highly encourage you to just ignore anything IWAG-related below, and use only the Steamer and ZiPS projections to guide your understanding of the expected performance of Atlanta Braves players in 2018.
Okay, but assuming I do care, what’s different in IWAG this year?
I actually rebuilt IWAG mostly from the ground up this year. While IWAG didn’t really perform worse, there were a few niggling things that I didn’t like about the prior version aesthetically, so I’ve resolved them in this version. Whether or not that leads to better or worse performance, I’m not sure — we’ll find out! But the changes have mostly been to give me peace of mind with regard to certain aspects of the system’s function; I don’t think they’ve had a general effect on how IWAG works, and haven’t changed its basic principles. I still expect it to be a slight overestimate (consistent with past comparisons of IWAG to actual results).
The main things I changed were:
- Lower-minors performance is discounted more heavily than previously. This is generally not noticeable, except for players who shoot through the minors. This sort of behavior wasn’t a huge concern when I started working on IWAG in 2014-2015, but with a few players like Luiz Gohara, Ronald Acuña, and Dansby Swanson jumping multiple minor league levels in a single bound, this became somewhat warranted just for internal consistency.
- Player performance is now modeled holistically, rather than independent modeling of hitting and baserunning/defensive data. Probably my biggest personal thing to get over in prior versions was that I was really using only major league defensive and baserunning data, while allowing minor league hitting data to influence those results. As a result, the hitting estimates and defensive/WAR estimates being presented didn’t always align, especially when the player had limited major league exposure. This has been addressed, and all presented estimates are now fully internally consistent. If IWAG performs notably worse this year, this may be a reason why, although that would raise the question of why the clunky prior system led to better outcomes. (And yes, reading between the lines indicates that some minor league defensive data is now incorporated into IWAG.)
- For pitchers, changes in the run environment are now accounted for. In other words, rather than using only FIP, the focus has shifted to use of FIP- (which is a measure of FIP indexed to 100). This doesn’t really change the system relative to prior years, but it does hopefully make the 2018 projections more accurate, because without this adjustment, pitchers were being credited far too much for mediocre FIPs in prior years that would have looked really good in the 2017 run environment. In case the above was not comprehensible, consider that in 2015, Julio Teheran had a 4.40 FIP, and in 2017, he had a 4.95 FIP. Yet, relative to league average, those were close-to-identical lines (116 FIP- in 2015, 117 FIP- in 2017). Essentially, the adjustment just makes it so Teheran’s would-be 4.40 FIP isn’t interpreted as better than it is. This sort of adjustment was not as necessary when I first started working on IWAG, and arguably should have been integrated last year as the run environment had already begun to evolve, but I only got around to it now. (This might explain IWAG’s worse performance, retrospectively, for the 2017 rotation, though honestly that had more to do with Teheran and Bartolo Colon imploding again than other parameters.)
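The Teheran example in the last bullet can be worked through with a little arithmetic. This is a rough sketch and not how IWAG actually does it: it treats FIP- as simply FIP divided by a baseline, times 100, and ignores that the real FIP- is also park-adjusted, so the "implied baseline" here blends league and park effects. The function names are made up for illustration.

```python
def implied_baseline_fip(fip, fip_minus):
    """Back out the baseline FIP that a given FIP and FIP- imply.

    Simplifying assumption: FIP- = 100 * FIP / baseline.
    """
    return fip / (fip_minus / 100.0)


def translate_fip(fip_minus, target_baseline_fip):
    """Express a FIP- as a raw FIP in a different run environment."""
    return fip_minus / 100.0 * target_baseline_fip


# Julio Teheran's lines as quoted above.
baseline_2015 = implied_baseline_fip(4.40, 116)  # roughly 3.79
baseline_2017 = implied_baseline_fip(4.95, 117)  # roughly 4.23

# What his 2015 performance (116 FIP-) looks like in the 2017 environment.
teheran_2015_in_2017 = translate_fip(116, baseline_2017)  # roughly 4.91
```

In other words, a raw 4.40 FIP from 2015 is about as far from average as a 4.91 FIP in 2017, which is why leaning on unadjusted FIP made older seasons look better than they were.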
Can you refresh me on the stats being analyzed?
Sure. Since I’m only really interested in aggregate performance, I only look at the following for position players.
- wRC+ is an aggregate hitting measure, indexed to 100. It is based on results, and not on game state, so a double is always worth the same, regardless of whether it’s a walkoff double, a garbage time double, a double with the bases loaded and two outs, or a double with the bases empty. The easy way to think about wRC+ is that each number above or below 100 is how much better/worse the player is, relative to league average. A 105 wRC+ means the player produces offensive outcomes at a rate five percent better than league average; a 90 wRC+ means the player produces offensive outcomes at a rate ten percent worse than league average. An easy shorthand is that a point of wRC+ is worth about 0.75 runs, meaning that a 110 wRC+ is about +7.5 runs relative to league average, and an 80 wRC+ is about -15 runs relative to league average.
- Def and Def/600 are aggregate defensive measures, with the latter being scaled over 600 PAs. Def includes the positional adjustment, so a corner outfielder with +7 Def/600 means something very different than a shortstop with +7 Def/600 in terms of how well that player plays his position. A corner outfielder with +7 Def/600 is well above average relative to other corner outfielders; a shortstop with +7 Def/600 is basically average relative to other shortstops.
- WAR and WAR/600 are aggregate player production measures. Chances are you’ve run across these before. In general, 2 WAR/600 is basically an average MLB regular (starter), while 5ish WAR/600 is All-Star caliber. 1 WAR/600 is more indicative of a generic bench player, and getting to or below zero suggests that the team would have been better off, or at least not hurt, by jettisoning the player from the roster altogether and using a generic replacement in his stead. Note that in addition to hitting (as represented by wRC+) and Def, WAR for position players also includes baserunning, which is not separately shown here, but may lead to slight differences in WAR totals beyond what you can see below.
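The wRC+ shorthand from the first bullet is easy to turn into a quick calculation. This is only a sketch of that rule of thumb: the function name is made up, and the assumption that the 0.75 runs per point applies to a roughly 600 PA season (and scales linearly with playing time) is mine, not something spelled out above.

```python
RUNS_PER_WRC_POINT = 0.75  # rule of thumb quoted above


def runs_vs_average(wrc_plus, pa=600):
    """Approximate batting runs above/below league average.

    Assumes the 0.75 runs-per-point shorthand describes a 600 PA season
    and scales linearly with plate appearances (an illustrative assumption).
    """
    return (wrc_plus - 100) * RUNS_PER_WRC_POINT * pa / 600


good_hitter = runs_vs_average(110)           # +7.5 runs
poor_hitter = runs_vs_average(80)            # -15.0 runs
part_timer = runs_vs_average(110, pa=300)    # +3.75 runs
```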
Any other questions will happily be answered on demand.
Below, I’m going to do something a little different than what I’ve done in past years. In the past, I’ve basically just provided a table of performance across all three projection systems, by player, and provided discussion. The table is still there, but for each player, I’m also going to provide a few distribution curves.
The relationship between distribution curves and projection point estimates is something I could talk about at length. Player projection is, in some ways, really an odds-making exercise, yet pretty much all projections that get published are just the estimates. I get why that is: saying “Player X has a 30% chance of being replacement level, a 10% chance of being a 1 WAR player, a 30% chance of being a 2 WAR player, and then a bunch of outcomes of 3 WAR or above with low probabilities” is a mouthful, and can be really hard to wrap one’s head around. The expected value of any distribution can be thought of as its point estimate, and point estimates are easy to digest: “this system says Player X is a 2.7 WAR player who will hit for a 125 wRC+!” However, I think a lot of nuance is lost while clarity (perhaps spurious clarity?) is gained. So, I’m presenting all of the IWAG distribution curves for wRC+ and WAR below.
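As a small illustration of how a point estimate can hide the shape of a distribution, here is a sketch using the "Player X" example. The first three probabilities come from the sentence above; the split of the remaining 30% across 3, 4, and 5 WAR is invented purely so the numbers add up to 100%.

```python
# Outcome (WAR) -> probability. The 3/4/5 WAR split is a made-up assumption.
player_x = {0: 0.30, 1: 0.10, 2: 0.30, 3: 0.15, 4: 0.10, 5: 0.05}


def expected_value(distribution):
    """Probability-weighted average outcome, i.e. the point estimate."""
    return sum(outcome * prob for outcome, prob in distribution.items())


def prob_within(distribution, center, half_width):
    """Chance that the outcome lands within half_width of center."""
    return sum(prob for outcome, prob in distribution.items()
               if abs(outcome - center) <= half_width)


point_estimate = expected_value(player_x)                # 1.8 WAR
near_estimate = prob_within(player_x, point_estimate, 0.5)  # 0.30
```

Under these made-up numbers, the point estimate is 1.8 WAR, yet there is only about a 30% chance of the player landing within half a win of it, and an equal chance of him being replacement level.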
In addition to modeling player performance, IWAG also models health (likelihood of injury and missing time), as well as playing time related to performance. Therefore, on the WAR distribution, you will also see a golden line, which traces a player’s likely production based on playing time. Note that the relationship between these, as modeled, can be complex. IWAG figures that if a player is being awful, he may lose his lineup spot, and thus accumulate fewer PAs. And, if a player is rocking it and gets even more playing time, he may regress further with that extra playing time. As a result, the tails are shortened on the WAR distributions as opposed to the WAR/600 distributions.
For each player, IWAG also provides a “confidence” measure, which I’ve summarized into four buckets (very low, low, moderate, and solid). The lower the confidence measure, the more splayed out the distribution tends to be, as IWAG has less certainty that the player will have an outcome in any particular range. However, the charts are drawn to avoid showing the distribution across very fringe probabilities. Therefore, the lower the confidence, the more you should keep in the back of your mind that there’s a greater relative chance that the outcome could occur somewhere to the left or the right of the pictured distribution. Where the confidence is solid, the chance of this happening is consequently fairly low.
Also, as a final note, which I cannot emphasize enough: the dots for the Steamer/ZiPS projections on all charts below are placed there simply for illustrative convenience. I am not making any representation whatsoever that I have any knowledge of the Steamer/ZiPS outcome distributions, or that they look anything like the IWAG distributions being presented. Their placement is solely so you can compare all projections for a player on each chart.
Because projections are provided below for each player as a set of three images (IWAG point estimate table, IWAG wRC+ distribution with point estimates, IWAG WAR and WAR/600 distribution with point estimates), I’m only adding a tiny bit of commentary for each of the 13 players examined. You can mostly use the distributions presented to see where IWAG is coming from and how that compares to Steamer/ZiPS, but of course I would be overjoyed to discuss any of this in the comments or via another medium.
Flowers’ projections are basically about whether you buy into his mid-career renaissance with the bat. He’s not going to be a good defender (framing is not accounted for here), but the extent to which you figure 2016-2017 are the new Flowers versus something he won’t sustain (as ZiPS does) really drives his value.
Suzuki basically has two outcomes as shown on the wRC+ chart: either the remarkable nonsense of his 2017 season, axe bat and all, is real, or it isn’t and he’s going to pumpkin to not hitting all too well. As you’ll see repeatedly, this type of bimodal distribution is very common in IWAG output, and the point estimate ends up in the middle.
I’ve also basically accepted that IWAG regresses players less heavily than Steamer or ZiPS, hence its tendency to overestimate while these other systems tend to underestimate. IWAG giving more credit to Suzuki’s 2017 is why you see it figuring he can be an average starting catcher, while Steamer/ZiPS see him as much more of a backup type.
What’s exciting about Albies is that despite Steamer and ZiPS having his bat as below average, everyone has him as an average player already. Don’t be fooled by the wRC+ distribution; the x-axis is actually fairly narrow so the point estimates aren’t too far apart. While IWAG’s confidence in Albies’ outcomes isn’t too high given his lack of major league track record, it’s kind of exciting that his “one big distribution hump” ranges between 2 WAR and 4 WAR-ish. The Braves need that to reflect reality, and some more of that at other positions, to vault back into the playoffs.
Camargo’s got a classic bimodal IWAG thing going on, though ZiPS is especially not a believer. If Camargo’s weird outcomes from last season (xwOBA outperformance, crazy high doubles rate) are even moderately real, he could be quite good, as shown by that second distribution hump. But, the more likely outcome is probably something much more demonstrative of a bench player, as shown by the first distribution hump. The reality might be somewhere in the middle, though one or the other is probably more likely.
I think we should all collectively hope that Alex Anthopoulos or someone saw something in Culberson that the projection systems aren’t, because otherwise... yeesh.
The Freeman question for the immediate future is going to continue to be whether he’s the 150 wRC+ monster of 2016-2017, or something more akin to the 130-140 wRC+ lineup stalwart he was before then. That’s the question captured by the distributions, too. You can see which way IWAG leans, versus how Steamer and ZiPS see it.
Rio Ruiz is funny. Not “ha-ha” funny, though. He had all sorts of launch angle issues last year, yet those issues don’t really seem to come up in AAA. If it’s some kind of majors-specific adjustment issue, that’s not something IWAG will be able to effectively pick up on with such a pittance of major league data on him at the moment. So, IWAG will keep thinking he can be at least decent, commensurate with the fact that he’s posted two above-average batting lines at AAA. If there is something unique about his inability to elevate the ball consistently against major league pitching, however, then the IWAG WAR curves are too far over to the right. I guess we’ll find out, assuming he gets any playing time with Johan Camargo potentially securing the hot corner for himself. A platoon could still make sense, if only as a fact-finding measure about Ruiz’ potential going forward.
This is probably the projection that amuses me the most among all the position player ones. Basically, either Swanson is going to figure out his baseball life, or he won’t. If he doesn’t, we’re looking at more barely-above-replacement performance. If he does, he could be a solid 3-win player, because his defense can really get him that extra win if he hits half-decently. Because the spread seems to be concentrated in either “can’t make adjustments, terrible hitter” or “will make adjustments, average hitter,” you see the relatively low probability of something in between that happening. Yet, that’s pretty much in the middle of those two outcomes, and where IWAG’s point estimate lies. Go figure. It’s not really a great place for Swanson and the Braves to be, but I guess it’s better than a distribution that only features the left hump. (Also kind of funny how close the point estimates for all three systems are clustered given Swanson’s youth and relative inexperience.)
Acuña probably has the greatest intrigue among all Braves position players, and he’s also the only likely difference-maker for the 2018 Braves on the position player end with zero major league experience. The questions are really whether he’ll adjust seamlessly to the majors once called up, or whether there will be problems in that regard. Given his power-speed combo, even the adjustment problems might only feature a low-90s wRC+ rather than something even worse. I’m not sure exactly how Steamer and ZiPS see his distribution of outcomes, but if all of his skills are for real, a 3 to 4-win outcome seems to be very much in play; to IWAG, the only thing tempering that kind of potential result is the potential for average to below average performance while he adjusts.
This is probably my least favorite position player projection that IWAG has produced, and something I will seek to address next year unless it turns out that these types of (speculative) Four-A players like Adams really do play like good bench pieces when given a chance. The main issue is that you see Adams’ generic Four-A skillset reflected around the left hump/plateau, but a lot of stock is also being put into him hitting okay as he’s bounced around and repeated AA and AAA, which is probably less relevant for him than a real prospect given how often he’s repeated those levels, and his relatively advanced age. He does have a bit of a speed-and-defense backstop to his value, but I’m pretty prepared for the IWAG outputs to be strong or severe overestimates in this case. If Adams were a younger player, and therefore a notable prospect, putting up his type of AA and AAA stats, this would make more sense; as is, I think it’s a modeling adjustment waiting to happen.
(As a side note, Adams had a lot of red flags last year that suggest his performance isn’t repeatable, including a sky-high BABIP and HR/FB, and worrying contact rates. Those aren’t really driving his projection as much as the general good AA and AAA performances he’s had, so that’s why the adjustment I’m speaking about has to do with controlling for those performances for older players, rather than doing anything to avoid reading into some unsustainable peripherals.)
Ender Inciarte’s weird defensive year in 2017 really tanked the IWAG projections in terms of total value, but IWAG is banking on a return to defensive form for him in 2018, though Steamer and ZiPS are also more positive about his future defensive rating than what he actually produced in 2017. The difference between the projections probably comes down to the fact that IWAG at least somewhat buys into Inciarte’s “ability” to outperform his xwOBA, plus some extra baserunning value as well. In any case, I do find this set of curves amusing as well: they’re very uniform, basically saying that there’s a high chance of a wRC+ between 95 and 103ish, and a pretty uniform chance of a WAR/600 between 2.4ish and 3.8ish, dependent entirely on the vagaries of BABIP and defensive variation.
The funny thing about Nick Markakis is that Steamer and ZiPS keep projecting he’s going to collapse (2015: 1.5 and 0.9; 2016: 0.5 and 0.7; 2017: 0.5 and 0.5), but instead he manages to stay a bit ahead of the curve (1.3, 1.0, 0.8). This is pretty much more of the same, with the projections getting even worse this time around. Will this be the year Markakis finally collapses into replacement-level performance? Maybe, but he’s avoided it so far, and that’s what IWAG is going with. From a methodological perspective, IWAG doesn’t tend to assume collapse unless the warning signs or red flags are there. Markakis being 34 years of age isn’t enough for IWAG to assume that he’s suddenly done after same-y performance the last three years. That said, given the way he’s slightly declined over that period, ending up somewhere between the Steamer 0.3 WAR/600 and the IWAG 1.0 WAR/600 may be another reasonable guesstimate of his future production.
Preston Tucker can possibly hit a bit. Unfortunately, he can’t really do anything else. That’s kind of an issue. When you’re getting worse defensive projections in a corner than Nick Markakis, and not really projected to hit better, well... hey, at least Ronald Acuña should be coming up at some point this season, right?
So, here’s the main summary, position-by-position:
- Catcher: something between 1-3 WAR, probably 2 WARish on average;
- First base: between 4-5 WAR;
- Second base: between 2-3 WAR, average is probably closer to 2 (?);
- Shortstop: between 1.5-2 WAR;
- Third base: 0.5-2 WAR, average is probably around 1;
- Left field: 1.5-2.5 WAR when Acuña is on the scene, and taking a hit of replacement level performance before he’s called up;
- Center field: between 2-3 WAR;
- Right field: between 0-1 WAR.
That’s a wide range, putting the Braves somewhere between the bottom of the league (low end) and towards the top half (but not much further than that, high end). IWAG leans towards the higher end (more on that in another post), so you could say that it’s pretty bullish on how this group is going to perform this season. It may not be a top 10 position player group, but the hope is that it can finish above the 25th, 29th, and 16th that it has so far during this rebuild.
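For anyone who wants to see the totals behind the position-by-position list, here is a quick tally of the low and high ends. It is just a sketch: the dictionary simply transcribes the ranges above, left field uses the with-Acuña range only (ignoring the replacement-level stretch before his call-up), and adding up every low end or every high end overstates the realistic spread, since not every position will hit its extreme at the same time.

```python
# (low, high) WAR ranges transcribed from the summary list above.
position_ranges = {
    "C": (1.0, 3.0),
    "1B": (4.0, 5.0),
    "2B": (2.0, 3.0),
    "SS": (1.5, 2.0),
    "3B": (0.5, 2.0),
    "LF": (1.5, 2.5),  # with Acuña on the scene
    "CF": (2.0, 3.0),
    "RF": (0.0, 1.0),
}


def total_range(ranges):
    """Sum the low ends and high ends across all positions."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


low_total, high_total = total_range(position_ranges)  # 12.5 and 21.5 WAR
```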
Just for fun, the below is another summary table illustrating projections for some other organizational players that may make an impact at some point in 2018. A note of caution here: Austin Riley’s IWAG projection is for his skillset, assuming he continues his existing performance levels either in AA or AAA, as opposed to suffering some kind of adjustment-related slump. As such, don’t read his projection as “he’s ready to put up a 2 WAR point estimate right now,” but rather, “If he keeps going the way he has, he can be a 2 WAR player when he’s ready for the majors.”
Stay tuned for starting pitchers next, and then the bullpen (sort of).