Last week at Baseball Digest Daily, we interviewed John Dewan about his new book, The Fielding Bible. That discussion prompted us to begin a short series on defensive metrics and the various men who have developed such fascinating models. We think the topic is important enough that the entire baseball community should be able to hear what these men have to say.
This week, we turn to Baseball Musings' David Pinto to explain the ins and outs of his Probabilistic Model of Range (PMR)...
Joe Hamrahi (JH): Generally speaking, please describe the Probabilistic Model of Range.
David Pinto (DP): PMR attempts to measure range based on the ease or difficulty in fielding a specific ball in play. Easy plays made shouldn't count much toward determining a fielder's range. Difficult plays made should. What the system does is use examples of balls in play to determine what plays are difficult and what plays are easy.
JH: What factors do you consider when determining the probability of a ball being turned into an out?
DP: PMR uses six factors: the batted ball type (fly, ground, liner, etc.), how hard the ball was hit, the direction the ball was hit, the handedness of the batter, the handedness of the pitcher, and the park.
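To make the idea concrete, here is a minimal sketch of how out probabilities could be estimated from examples of balls in play, bucketed by the six factors Pinto lists. The record layout, the category labels (such as "SS-hole"), and the function name are assumptions for illustration only; the interview does not describe the actual data format or implementation.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not the real data format):
# (ball_type, hardness, direction, batter_hand, pitcher_hand, park, was_out)
def out_probabilities(balls_in_play):
    """Estimate P(out) for each combination of the six factors."""
    counts = defaultdict(lambda: [0, 0])  # key -> [outs, total balls]
    for *factors, was_out in balls_in_play:
        key = tuple(factors)
        counts[key][0] += int(was_out)
        counts[key][1] += 1
    return {key: outs / total for key, (outs, total) in counts.items()}

# Made-up sample data, purely illustrative.
sample = [
    ("ground", "medium", "SS-hole", "R", "L", "Fenway", True),
    ("ground", "medium", "SS-hole", "R", "L", "Fenway", True),
    ("ground", "medium", "SS-hole", "R", "L", "Fenway", False),
    ("fly", "hard", "LF-line", "L", "R", "Coors", False),
]
probs = out_probabilities(sample)
```

In this sketch, a bucket where most balls become outs marks an easy play, and a bucket where few do marks a difficult one. Because the park is part of the key, any park effect shows up in the bucket's out rate without being modeled separately.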
JH: How important are ballpark factors in determining the probability? And what factors are included in evaluating ballparks? (i.e., grass vs. turf, foul territory, environmental, etc.)
DP: Parks can be very important. A lot of balls that might be caught or are home runs at other parks are considered in play in left field in Fenway. Some surfaces might slow ground balls down more than others. I don't include any specific factors in evaluating a park. If a park makes a ball harder to turn into an out, it falls out in the analysis.
JH: Are there any parks that have a much greater influence on the probability of a ball being turned into an out?
DP: It depends on the position you're examining. But if you look at left fielders, you can see the influence of Fenway and Coors very easily.
JH: Are there any additional factors you plan to incorporate into your model in the future?
DP: A number of people suggested I incorporate distance directly, rather than through how hard the ball was hit. I'm thinking of how to do that without having a different model for grounders and air balls.
JH: What is defensive efficiency rating (DER)?
DP: DER is a measure of how often a ball in play is turned into an out. These are fieldable balls, so out-of-the-park home runs don't count. When Bill James first proposed this, we didn't know how many balls were in play and how many were turned into outs. This had to be estimated. We now have play-by-play data, so we can determine this exactly.
It can be thought of as the opposite of batting average. A team with a .700 DER allows opposing batters to reach base 30% of the time on balls in play (not including out-of-the-park home runs).
JH: How do you calculate expected DER and actual DER?
DP: Actual DER is (Outs on Balls In Play)/Balls in Play. Expected DER is (Expected Outs on Balls in Play)/Balls in Play. The expectation for each ball in play comes from the model. If a ball with a particular set of parameters is turned into an out 60% of the time, then 10 such balls in play would have six expected outs.
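The two formulas above can be sketched in a few lines. The function names are assumptions chosen for illustration; the arithmetic follows Pinto's definitions, with one model probability per ball in play.

```python
def actual_der(outs_on_balls_in_play, balls_in_play):
    """Actual DER = (Outs on Balls In Play) / Balls in Play."""
    return outs_on_balls_in_play / balls_in_play

def expected_der(out_probs):
    """Expected DER = (Expected Outs on Balls in Play) / Balls in Play.

    out_probs holds the model's out probability for each ball in play,
    so the expected outs are simply their sum.
    """
    expected_outs = sum(out_probs)
    return expected_outs / len(out_probs)

# Pinto's example: ten balls that are each turned into an out 60% of the time.
ten_balls = [0.6] * 10
expected_outs = sum(ten_balls)       # about six expected outs
team_expected = expected_der(ten_balls)

# A team that converts 700 of 1,000 balls in play has a .700 DER.
team_actual = actual_der(700, 1000)
```

Comparing the two numbers is the point of the exercise: a team whose actual DER exceeds its expected DER turned more balls into outs than the model predicted for that mix of balls in play.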
JH: Do the calculations of team PMR differ from the calculations for individual PMR?
DP: Only that it's narrowed down to a fielder. Individual PMR is the player contribution to team PMR.
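The interview does not spell out the per-fielder calculation, so the following is only a rough sketch of one way to read "narrowed down to a fielder". It assumes, hypothetically, that each play record already carries the fielder responsible and the model's out probability for that fielder; the record layout and names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption): (fielder, p_out, made_out)
def fielder_totals(plays):
    """Accumulate expected and actual outs separately for each fielder."""
    totals = defaultdict(lambda: {"expected": 0.0, "actual": 0})
    for fielder, p_out, made_out in plays:
        totals[fielder]["expected"] += p_out
        totals[fielder]["actual"] += int(made_out)
    return dict(totals)

# Made-up sample data, purely illustrative.
plays = [
    ("SS-A", 0.9, True),
    ("SS-A", 0.3, True),
    ("SS-A", 0.5, False),
    ("LF-B", 0.8, False),
]
by_fielder = fielder_totals(plays)

# Under this assumption, the team's expected outs are the sum of the
# individual contributions.
team_expected_outs = sum(t["expected"] for t in by_fielder.values())
```

Under this reading, the team figures are just the individual figures added together, which matches the description of individual PMR as the player's contribution to team PMR.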
JH: Were there any team PMR results from 2005 that really surprised you?
DP: I was surprised the Phillies were so good. I didn't expect a great offensive team like that to have great gloves.
JH: Were there any individual player PMR results from 2005 that really surprised you?
DP: Not really.
JH: Does your system translate well to older seasons (provided the data is available) in analyzing defense? For example, can you go back to 1998 and perform the same range calculations for teams and individuals as you would today?
DP: I believe it should work. I don't have the data going back that far. Maybe in ten years we can better answer that question.
JH: How important do you feel positioning is in determining a range rating?
DP: It's very important. One thing I wish the scoring systems kept was the direction the outfielder moved to catch the ball. Think of a three-by-three grid with the fielder at the center. Which way did he move, and possibly how far, to get to the ball? It's a simple thing to track and would tell us a lot.
Range, in some ways, is a misnomer. We think of range as the ability to cover a lot of distance. But what we're really measuring is the ability to convert a batted ball to an out. Proper positioning may make up for a lack of ability to cover ground. That's why Ripken was underrated as a shortstop. He couldn't cover as much ground as Ozzie Smith, but his positioning was outstanding, so he didn't need to move that much. Also, he had a strong arm, so he could play deeper, which gave him more time to get to ground balls.
JH: Do you have any plans to expand your model to evaluate total defensive ability (including turning double plays, fielding bunts, arm strength) rather than just range?
DP: No, not in the near future.
JH: What is your source of data for PMR?
DP: The original work was done with data from STATS, Inc., but lately I've been using data from Baseball Info Solutions.
JH: What prompted you to devise such a defensive rating system? Did anyone in particular inspire you?
DP: I was working at the Center for Intelligent Information Retrieval at the University of Massachusetts. That group used probabilistic language models to compare and retrieve documents. It struck me that balls in play formed a language model for range, and I could apply some of the same techniques.
JH: How would you say your system compares to the plus/minus system in The Fielding Bible?
DP: It's very close. That system uses distance as a parameter, so some of our results are different.
JH: John Dewan mentioned he had several conversations with you about fielding models. Can you talk a little bit about what you discussed and the input you may have had on Dewan's book?
DP: I mostly explained my system to John. He decided that they were very similar, and produced similar results. It provided a good check for his work.
JH: How does PMR compare to Mitchel Lichtman's UZR?
DP: They are very similar, although I believe Mitchel calculates park effects differently.
JH: Have you discussed your system with Lichtman at all?
DP: After the 2003 season, when I first published the results, I had not been aware of Mitchel's system. We did have a discussion at that time, and basically decided the two were very similar. In fact, I was going to drop the work since Mitchel created it first, but then he was hired by the Cardinals, so I filled that niche.
JH: Any other comments or points you'd like to make that I may not have touched on?
DP: No, that's it. Thanks for the opportunity.
Next Up: Mitchel Lichtman and UZR!