Analyzing Dan Uggla's Defense, Part 2: Advanced Metrics

Last week, I looked at how Dan Uggla stacks up to other second basemen using traditional defensive statistics. While these stats were able to give some hints about Uggla's skillset, they couldn't really tell us much about how good of a defensive player he is overall. In this post, I'll examine what the three major "advanced" defensive metrics have to say about Uggla's fielding and a few related issues.

The consensus of these comprehensive metrics is that Uggla is a poor fielder. On average, he has cost his team about 4 to 8 runs per season, depending on which metric you use. This agrees with the data from the traditional stats as well as the scouting reports, both of which generally regard him as below average but not catastrophic at second base.

There are, however, some questions regarding the advanced metrics. The first is that Uggla's year-to-year ratings have vacillated quite a bit, ranging from terrible to average. The second is that, by UZR anyway, his home/road splits are huge, which has led to much discussion about whether Dolphins Stadium in some way harmed Uggla's fielding numbers. Many Braves fans want to believe this theory, since it would mean that Uggla's defense would look better now that he calls Turner Field home.

After the jump, I try to get a handle on these questions, with help from some graphs.

A bit about the metrics

For this post, I'll be using the three most commonly cited defensive metrics: Ultimate Zone Rating (UZR), Defensive Runs Saved (DRS), and Total Zone (TZ). For more on these stats, I've included a brief primer at the end of this post; just scroll down.

These systems are broadly similar, so don't get too hung up on the nitty gritty details (unless you want to). The scales are generally equivalent. Zero is average for all three systems. A very good total for a season is in the +10 or higher range. Similarly, a per-season score of -10 or lower is fairly bad.

The important thing to remember for all of these stats is that they are more volatile than hitting statistics. Ideally, you want 2.5 to 3 seasons' worth of data before drawing conclusions about a player's true defensive skill level. If you have less data than that, you have to A) take it with a huge grain of salt, and B) regress it to the mean (i.e., bring the number closer to zero).
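To make the "regress it to the mean" step concrete, here is a minimal Python sketch. It uses a generic shrinkage approach (blending the observed sample with a fixed amount of league-average play), which is not something UZR, DRS, or TZ prescribe. The function name and the `ballast_seasons` value of 2.5 are assumptions chosen purely for illustration.

```python
def regress_to_mean(observed_runs_per_season, seasons_observed,
                    ballast_seasons=2.5, league_mean=0.0):
    """Shrink an observed per-season rating toward the league mean.

    ballast_seasons is a hypothetical tuning value: the number of seasons
    of league-average play blended in with the observed sample.
    """
    if seasons_observed < 0 or ballast_seasons <= 0:
        raise ValueError("seasons_observed must be >= 0 and ballast_seasons > 0")
    # The more seasons we have, the more weight the observed rating gets.
    weight = seasons_observed / (seasons_observed + ballast_seasons)
    return league_mean + weight * (observed_runs_per_season - league_mean)
```

Under these assumptions, a single season at -10 would be treated as roughly -2.9, while 2.5 seasons at -10 would be treated as -5. The less data you have, the closer the estimate sits to zero.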

How Uggla stacks up

If we're judging Dan Uggla's overall defensive performance, we have plenty of data to do so: five full seasons. Over this time, he has posted a total UZR of -22 (about -4.5 runs per season). His DRS in that time frame is -29, which would put his yearly value at around -6 runs. Finally, TZ gives him a whopping -39 total, which is equivalent to about -8 runs per year.
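For anyone who wants to check the arithmetic, here is a short Python sketch that converts the career totals quoted above into per-season rates. The variable names are mine, and the simple average of the three systems is an illustrative choice, not an official composite.

```python
SEASONS = 5  # five full seasons, per the totals quoted above

career_totals = {"UZR": -22, "DRS": -29, "TZ": -39}

# Per-season rate for each system: career total divided by seasons played.
per_season = {metric: total / SEASONS for metric, total in career_totals.items()}

# Unweighted average of the three systems (an illustrative choice).
average_of_metrics = sum(per_season.values()) / len(per_season)

for metric, runs in per_season.items():
    print(f"{metric}: {runs:+.1f} runs per season")
print(f"Average of the three: {average_of_metrics:+.1f} runs per season")
```

The unweighted average of the three systems works out to about -6 runs per season.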

If we break this down into yearly totals, we can see that the systems generally have a remarkable amount of consensus about Uggla's defense:

[Graph: Uggla's advanced defensive stats by year]

For as much as we've heard about the wildly diverging opinions of the defensive metrics, they sure don't diverge much here. Aside from Uggla's 2006 UZR mark, all of the values for a given season are within 4 runs of each other. All the systems agree that he was just fine defensively in 2006 and 2008; they all agree that he was terrible in 2007 and very bad the past two years, too.

Do those year-to-year fluctuations mean anything? In other words, was Uggla actually a better fielder in 2006 and 2008 than he was in the other years? My gut says no--after all, with only a year's worth of data, there will be some large random shifts even if the underlying talent remains the same. The agreement between the three systems is not necessarily meaningful. The systems are not independent, after all--they are all based on actual events from games. If Uggla had a disproportionate number of tricky grounders in 2007, for instance, that would affect all 3 ratings.

For now, I think the best conclusion to draw is that Uggla's natural skill is in the -6 runs/year range, and that it fluctuates around that number based mostly on luck. If Uggla's skills have not yet started declining due to age, we should expect his defensive ratings to be between -1 and -11 runs this year, with the possibility of even greater divergence. Since he is into his 30s now, I would perhaps subtract a further run or two based on the likelihood that age will hurt his fielding a bit.

But what about those home/road splits?

Now we get to the burning question. If Dolphins Stadium is somehow making fielders look worse (like, say, through a poorly maintained surface) or if the data for that park is somehow skewed, the above conclusions must be tossed out the window.

As you've probably seen on TC somewhere, Uggla's overall home UZR/150 is -10.5, but his road UZR/150 is much higher: +1.4. Given that this is across a reasonable time frame (2.5 seasons for each half of the split), it has led many, many posters to conclude that something is, well, fishy at the Marlins' ballpark. It certainly does support the theory that Uggla will have better defensive numbers with the Braves.

I'm going to show you a graph of his home/road UZR splits for each season. Yes, it seems like the home/road difference is persistent from season to season. Don't draw any conclusions yet, though.

[Graph: Uggla's home/road UZR by year]

Do not judge Uggla by any of the individual numbers in this graph. These are tiny sample sizes and they are highly volatile. I made this graph to make a couple points. First off, note how Uggla still rates below average even on the road in three of the road samples--it's not like he's consistently been above-average there.

Even more importantly, I wanted to show just how much variation there is in these individual data points. When you look at Uggla's total home and road UZRs, the tendency is to think that each year's numbers are close to the overall average. The common-sense idea is that, even if there are large variations, they would "even out" over the total period. While this is generally the case, five half-seasons is simply not enough time to even out all the extreme variations. Inevitably, there will be cases when several data points are anomalous in the same direction and thus do not balance out.
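A quick simulation illustrates why five half-seasons may not be enough for the noise to even out. This is only a sketch under stated assumptions: the fielder's true talent is identical at home and on the road (about -3 runs per half-season), and each half-season rating is that talent plus normally distributed noise. The `noise_sd` of 6 runs is a hypothetical value, not a measured property of UZR.

```python
import random

def simulate_split_gap(true_talent=-3.0, noise_sd=6.0, seasons=5, rng=None):
    """Average (road - home) gap, in runs per half-season, for a fielder
    whose true talent is the same in both settings (no park effect)."""
    if rng is None:
        rng = random.Random()
    home = [rng.gauss(true_talent, noise_sd) for _ in range(seasons)]
    road = [rng.gauss(true_talent, noise_sd) for _ in range(seasons)]
    return sum(road) / seasons - sum(home) / seasons

def share_of_large_gaps(threshold=5.0, trials=10_000, seed=0, **kwargs):
    """Fraction of simulated careers whose average gap is at least
    `threshold` runs in either direction, purely by chance."""
    rng = random.Random(seed)
    gaps = [simulate_split_gap(rng=rng, **kwargs) for _ in range(trials)]
    return sum(abs(gap) >= threshold for gap in gaps) / trials
```

Calling `share_of_large_gaps()` reports how often a sizable five-year home/road gap appears even though the simulation contains no park effect at all. The exact fraction depends entirely on the assumed `noise_sd`, so treat it as an illustration rather than an estimate.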

The point is that those huge gaps in the 2006, 2007, and 2008 numbers should not be taken at face value. It is certainly possible that they describe a real home/road disparity (and that the 2010 figures mask that disparity). It is also very possible, however, that they are actually exaggerating the home/road differences. We just don't know.

I'd still say that we can't dismiss Uggla's home/road UZR splits altogether. They are interesting and merit further study, but we certainly cannot use them to quantify any effects caused by Dolphins Stadium.

Let's suppose that the home/road differences are real, though. Shouldn't we expect other metrics to show similar results? While DRS does not have home/road splits available, Total Zone does. TZ, however, does not show a drastic split: Uggla's home TZ is -21 and his road TZ is -18. On a per-season basis, that puts Uggla's home TZ/150 at -8 and his road TZ/150 at -7. Not exactly a big difference.

For comparison's sake, here is the year-by-year breakdown of his Total Zone splits:

[Graph: Uggla's home/road Total Zone by year]

In 2006 and 2007, Uggla's TZ numbers were better on the road, but in the past two seasons, Uggla's TZ ratings were actually worse on the road. This is exactly the kind of result we would expect if the home/road differences were due to random fluctuation rather than any meaningful difference. That's not to say that this kills the "Uggla sucked because of his home park" theory. Most observers prefer UZR to TZ, and UZR undeniably uses more detailed information, so TZ might have missed something. It sure doesn't help, though.

Hanley Ramirez has weird splits, too

If there is really something wacky going on with either UZR's numbers or the Marlins' ballpark, you'd expect it to show up with other players as well. And indeed, Uggla's longtime double-play partner, Hanley Ramirez, has similar splits. In basically the same time frame as Uggla, Ramirez has posted a -15.3 UZR/150 at home and a -3.0 UZR/150 on the road.

Let's look at Hanley's splits by year in both UZR and TZ.

[Graph: Ramirez's home/road UZR by year]

As you can see, Ramirez actually had virtually no home/road split in three of his five seasons. That 2007 home UZR is the cause of nearly all of his overall home/road gap, and is almost certainly a statistical anomaly. If we eliminate the 2007 splits from the data set, Ramirez's UZR/150 splits are: -6.5 home, -3.0 road. In other words, if we get rid of this one anomalous data point, Ramirez's home-road split nearly disappears. This does not necessarily mean that the gap is an illusion, but I find it hard to believe that Dolphins Stadium is to blame for Ramirez's woes based on this information.

According to Total Zone, Ramirez has a home TZ/150 of -8 and a road TZ/150 of -1. That is a noticeable gap, but not a huge one. Let's see how it breaks down by year:

[Graph: Ramirez's home/road Total Zone by year]

There's a word for this: randomness. What effect that randomness has on Ramirez's numbers, we can't really say. It could be masking an even larger home/road differential, but it seems much more likely to me that the random variations are making the split seem larger than it actually is. At any rate, I don't think we can draw any conclusions.

At best, Hanley's home-road splits can be characterized as "suspicious." At worst, they could be called "meaningless."

The Debunking by MGL

At this point, it would behoove us to consult a higher authority. Fortunately, TC member PWHjort (of Capitol Avenue Club fame) emailed Mitchel Lichtman, the creator of UZR. This is how Lichtman responded (thanks to PWHjort for posting this):

Peter,

Thanks for the observation and question regarding UZR and Uggla and other Florida players. I have not looked at the data, but off the top of my head I would think that this is random fluctuation, for two reasons: One, it is unlikely that infield UZR’s would have any severe park biases or effects (as opposed to outfield UZR’s). Two, all of the UZR’s you see on Fangraphs are already park adjusted so that any large differences in home/road splits due to quirks in a player’s home park would be mostly "neutralized" by the park factor adjustments, such that any residual large splits would likely be random, by definition. IOW, if you look at all infielder (or outfielder) H/R splits for KC (or any other team) players, the average H/R split should be around league average, although that is not necessarily the case for any one (or other small samples) year.

There is always a possibility of a coding or other error of course, but I wouldn’t put too much (or any, for that matter) stock in home/road splits for any one player or team. Keep in mind that if you look at all players (or all teams) in any one year, you are likely to find some random, large H/R splits by chance alone.

We actually hesitated to post the H/R splits on Fangraphs for precisely this reason – that some people might make more of the splits than was appropriate. As I said, the appropriate amount of importance to attach to these splits is near zero, in my opinion.

Mitchel (MGL)

First off, we should note that MGL admits that he did not look closely at the data; he may not be aware of just how large or long-lasting the home/road gap is in Uggla's numbers, and knowing that might have changed his opinion. If his comments all refer to one-season UZR splits, then we can safely dismiss his assessment of "near zero" importance. Let's assume that he knows the splits are across a fairly large sample, though.

Next, he states that he finds it "unlikely that infield UZR’s would have any severe park biases." I would agree that this is much less likely than a park having major outfield differences, since all infields are supposed to have the same dimensions. Still, there remains the possibility that a very poorly maintained park could be tougher on fielders, or on the other hand, that a park specifically groomed to be better for sinkerballers might lead to better UZR numbers. Unlikely is a long way from impossible, and given 30 parks, it seems almost inevitable that there'd be an outlier or two.

His next point is that the UZR numbers are "already park adjusted"--a fact that he believes makes any leftover home/road variations "random." This, I think, overstates the case for park adjustments quite a bit. The mere presence of a park adjustment does not eliminate other non-random factors. Groundball pitchers do better at Coors Field than fly ball pitchers; is that "random"? You can park-adjust all you want, but as long as you're applying the same adjustment to all players, you'll be overrating some and underrating others. Players may wear uniforms, but they are not uniform; they do not respond in the same way to the same environments.

Let's say, for the sake of argument, that Dolphins Stadium has a particularly fast infield up the middle (I have no idea if this is true). Let's also imagine that Uggla has relatively poor reflexes in that direction (again, I'm just speculating). Wouldn't this park effect hurt him more than a fielder who has a quicker reaction time to his right? If the infield there is more forgiving of some fielding flaws than others, then it's quite plausible that Uggla would indeed perform better on a more neutral infield, or one that actually mitigates his particular flaws.

Of course, I have no evidence to support this whatsoever. There simply isn't enough (free) data out there for me to really analyze Dolphins Stadium's (or Uggla's) particular quirks to that degree. So for now we're just going to have to say that the idea that Dolphins Stadium is harsher on Uggla than on others is a plausible but completely unproven hypothesis. Don't get your hopes up, in other words.

Next, MGL brings up the remote possibility that there is a "coding or other error" in UZR's park adjustment for Dolphins Stadium. This would seemingly explain a lot, but there's no way for me to test it. Besides, it seems like someone would have spotted and fixed any such bug years ago.

All in all, while I don't wholly agree with MGL's assessment, he did write the book on UZR, so I feel that we have to defer to him unless there is some real evidence to the contrary. Which there isn't, to my knowledge.

Sure, it seems highly improbable that five seasons of home/road UZR splits could have such a large gap merely because of randomness. But without additional evidence to back up the UZR data, randomness is the most likely explanation. We know there's a lot of random variation in the data. Given enough players, UZR will produce a few crazy splits over even fairly long time periods.
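The "given enough players" point can be put in rough numbers. The sketch below assumes that a player's multi-year home/road gap is pure noise following a normal distribution, and that players are independent of one another. Both are simplifying assumptions, and the `gap_sd` argument is a hypothetical input, not a published figure for UZR.

```python
from statistics import NormalDist

def prob_single_player_gap(threshold, gap_sd):
    """Two-sided chance that noise alone gives one player a home/road gap
    of at least `threshold` runs (normal approximation, an assumption)."""
    return 2 * (1 - NormalDist(0, gap_sd).cdf(threshold))

def expected_and_any(n_players, threshold, gap_sd):
    """Expected number of players showing such a gap, and the chance that
    at least one does, assuming players are independent."""
    p = prob_single_player_gap(threshold, gap_sd)
    expected = n_players * p
    at_least_one = 1 - (1 - p) ** n_players
    return expected, at_least_one
```

For example, if a "crazy" split is defined as one about two standard deviations from zero, noise alone produces it for roughly 5% of players. Across something like 120 regular infielders (30 teams times four positions, a rough count), that works out to about six such players, with no park effect required.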

In conclusion, I think it is fair to state that:

  • All the advanced metrics agree that Uggla has been a poor fielder.
  • They also agree about how bad he's been in each individual season (though I'm not sure this means much).
  • If you buy into the fielding stats, Uggla's natural ability is probably around -6 defensive runs per season, give or take.
  • Uggla's home UZRs have been quite a bit worse than his road UZRs, BUT
  • Total Zone shows no such disparity, AND
  • The creator of UZR has said not to put much stock in such splits.
  • Hanley Ramirez's home-road splits seem large, too, but similar problems apply.
  • We need more evidence before we can say that there is any sort of home field disadvantage in Florida.

Overall, Uggla's (and Ramirez's) home/road splits indicate that there might be something amiss at Dolphins Stadium. Given the tremendous amount of randomness involved in the individual data points, however, we should not infer too much. We certainly should not say that "Dolphins Stadium costs Uggla ___ runs per year on defense" or anything like that. The strongest statement I'd be comfortable with is something like, "There may be some home-field UZR disadvantage for Marlins fielders, but if there is, random variation prevents us from determining how large of an effect it is."

There are a couple other ways to try to find some evidence that Dolphins Stadium disproportionately hurt Uggla. The easiest is to simply wait a few years so that we can judge Uggla's performance in a different home park. That will tell us a great deal about whether Uggla has really been bad all along, or whether he was simply an ugly duckling who had to come to Atlanta to become an adequate-fielding swan.

Since I'm sure "wait 3 years" is not a satisfying conclusion to any of you, I'll try in the next post to look at park effects in more detail. In particular, I'll break down not just Uggla's numbers, but the numbers of opposing second basemen playing against the Marlins. If both sets of data indicate that it is more difficult to field at Dolphins Stadium than elsewhere, then we would have some real evidence of a park effect.

============================

Here's an overview of the three stats used in this post. Click the name of each stat if you want to read more about it.

  • Ultimate Zone Rating (UZR): This statistic takes location-based batted-ball data from Baseball Info Solutions (BIS) and assigns a positive or negative run value to every defensive play (or failed play). It uses past data from similar plays, and incorporates adjustments for where the ball is hit, how hard it is hit, how fast the batter is, and numerous other factors. It is also park-adjusted (as discussed above). UZR is used as the defensive component in FanGraphs' version of WAR for years since 2002.
  • Defensive Runs Saved (DRS): Part of John Dewan's +/- system, DRS also uses location-specific data from BIS. Where it differs from UZR is in its relative simplicity. In the +/- system, each time a fielder makes a play (or fails to make it), he is credited with a certain fraction of a point based on how many other fielders at his position would make that play. So if a shortstop makes a play that 30% of other shortstops would make, he gets +0.7; if he misses that play, he gets -0.3. These values are converted to runs and summed to make DRS.
  • Total Zone (TZ): Total Zone, unlike the other systems, relies on play-by-play data, which lacks specific locations. Thus, it is generally regarded as being less precise than UZR or DRS. Its main advantage is that it can be applied to players from any era, whereas UZR and DRS go only as far back as 2002. TZ looks at how many plays a player makes compared to league averages at his position. It is used in Baseball-Reference's WAR and for FanGraphs' WAR for years before 2002.
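
The +/- bookkeeping described in the DRS entry above can be sketched in a few lines of Python. This is only an illustration of the crediting rule as stated there; the function names and the sample plays are hypothetical, and the conversion from points to runs is left out because the details of that step aren't covered here.

```python
def plus_minus_credit(play_made, share_who_make_it):
    """Credit for one fielding chance under the +/- rule described above.

    share_who_make_it is the fraction of fielders at the position who
    would convert the play (0.0 to 1.0).
    """
    if not 0.0 <= share_who_make_it <= 1.0:
        raise ValueError("share_who_make_it must be between 0 and 1")
    # Making a hard play earns a lot; missing an easy one costs a lot.
    return 1.0 - share_who_make_it if play_made else -share_who_make_it

def total_plus_minus(chances):
    """Sum the credits over (play_made, share_who_make_it) pairs."""
    return sum(plus_minus_credit(made, share) for made, share in chances)

# Hypothetical chances: one tough play made, one routine play missed.
sample_chances = [(True, 0.30), (False, 0.90)]
sample_total = total_plus_minus(sample_chances)
```

In the hypothetical sample, the tough play is worth +0.7 and the missed routine play costs -0.9, so the fielder ends up at -0.2 points before any conversion to runs.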