Using Rookie League Stats to Predict Future Performance

Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes, such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there. In the future, I plan to engineer an alternative methodology to go along with this one that takes into account how a player performs in the majors, rather than just whether he gets there.
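For readers who want to see the mechanics, here is a minimal sketch of how a probit model turns inputs into a probability: it computes a weighted sum of the inputs and passes it through the standard normal CDF. The intercept, the coefficients, and the feature names below are made up for illustration; they are not the fitted KATOH values.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, the link function behind a probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(intercept: float, coefs: dict, features: dict) -> float:
    """Return P(outcome = 1) = Phi(intercept + sum(coef * feature))."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return normal_cdf(z)


# Hypothetical values for illustration only; NOT the fitted KATOH model.
EXAMPLE_INTERCEPT = 4.0
EXAMPLE_COEFS = {"age": -0.25, "k_rate": -2.0, "iso": 3.0}

if __name__ == "__main__":
    younger = {"age": 17, "k_rate": 0.20, "iso": 0.150}
    older = {"age": 20, "k_rate": 0.20, "iso": 0.150}
    print(probit_probability(EXAMPLE_INTERCEPT, EXAMPLE_COEFS, younger))
    print(probit_probability(EXAMPLE_INTERCEPT, EXAMPLE_COEFS, older))
```

With a negative coefficient on age, the younger of two otherwise identical hitters comes out with the higher probability, which is the kind of relationship the regression is estimating from the historical data.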

For hitters in Low-A and High-A, age, strikeout rate, ISO, BABIP, and whether or not a player was deemed a top 100 prospect by Baseball America all played a role in forecasting future success. Walk rate, while not predictive for players in A-ball, added a little bit to the model for Double-A and Triple-A hitters. Today, I’ll look into what KATOH has to say about players in Rookie leagues. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted relative to his league’s average for that year. For those interested, here’s the R output based on all players with at least 200 plate appearances in a season in Rookie ball from 1995-2007.
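The league adjustment can be sketched roughly as follows. This is one plausible way to do it, not necessarily the exact adjustment used here: each rate stat is divided by a plate-appearance-weighted average for the same league and year, so 1.0 means league average. The field names (`league`, `year`, `pa`) and the `_vs_league` suffix are assumptions made for the example.

```python
from collections import defaultdict


def league_adjust(rows, stat):
    """Express each player's rate stat as a ratio to his league-year average.

    A value of 1.0 means league average. Weighting the league average by
    plate appearances is an assumption of this sketch.
    """
    weighted_sum = defaultdict(float)
    total_pa = defaultdict(float)
    for row in rows:
        key = (row["league"], row["year"])
        weighted_sum[key] += row[stat] * row["pa"]
        total_pa[key] += row["pa"]

    adjusted = []
    for row in rows:
        key = (row["league"], row["year"])
        league_avg = weighted_sum[key] / total_pa[key] if total_pa[key] else 0.0
        # Guard against an empty or all-zero league to avoid dividing by zero.
        ratio = row[stat] / league_avg if league_avg else float("nan")
        adjusted.append({**row, stat + "_vs_league": ratio})
    return adjusted
```

A hitter with a 10% strikeout rate in a league that averages 20% would come out at 0.5, which puts players from high- and low-strikeout environments on the same scale before they go into the regression.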

Rookie Output

Just like we saw with hitters in the A-ball leagues, a player’s walk rate is not at all predictive of whether or not he’ll crack the majors. Unlike at all of the other levels I’ve looked at so far, a player’s Baseball America prospect status couldn’t tell us anything about his future as a big leaguer. This was entirely due to the scarcity of top 100 prospects in the sample, as only a handful of players spent the year in Rookie ball after making BA’s top 100 list.

The season is less than 40 games old for most Rookie league teams, which makes it a little premature to start analyzing players’ stats. But just for kicks, here’s a look at what KATOH says about this year’s crop of Rookie-ballers with at least 80 plate appearances through July 28th. This only considers players in the American Rookie leagues (the Appalachian, Arizona, Gulf Coast, and Pioneer Leagues), meaning it excludes the Dominican and Venezuelan Summer Leagues. The full list of players can be found here, and you’ll find an excerpt of those who broke the 40% barrier below:

| Player | Organization | Age | MLB Probability |
|---|---|---|---|
| Kevin Padlo | COL | 17 | 73% |
| Bobby Bradley | CLE | 18 | 67% |
| Alex Verdugo | LAD | 18 | 65% |
| Luke Dykstra | ATL | 18 | 64% |
| Yu-Cheng Chang | CLE | 18 | 59% |
| Magneuris Sierra | STL | 18 | 56% |
| Juan Santana | HOU | 19 | 54% |
| Joshua Morgan | TEX | 18 | 50% |
| Jason Martin | HOU | 18 | 49% |
| Edmundo Sosa | STL | 18 | 48% |
| Oliver Caraballo | TEX | 19 | 46% |
| Sthervin Matos | MIL | 20 | 46% |
| Alexander Palma | NYY | 18 | 45% |
| Eloy Jimenez | CHC | 17 | 45% |
| Javier Guerra | BOS | 18 | 44% |
| Zach Shepherd | DET | 18 | 44% |
| Tito Polo | PIT | 19 | 44% |
| Jose Godoy | STL | 19 | 43% |
| Henry Castillo | ARI | 19 | 42% |
| David Gonzalez | DET | 20 | 42% |
| Dan Jansen | TOR | 19 | 42% |
| Max George | COL | 18 | 42% |
| Gleyber Torres | CHC | 17 | 42% |
| Luis Guzman | WSN | 18 | 41% |
| Jose Martinez | KCR | 17 | 41% |
| Alex Jackson | SEA | 18 | 40% |
| Emmanuel Tapia | CLE | 18 | 40% |
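For anyone working from the full list, the excerpt above amounts to a filter and a sort on the model’s output. A small sketch, using three rows from the table plus one clearly hypothetical below-threshold row; the function name `excerpt` and the `mlb_prob` field are illustrative choices, and the probabilities are the rounded values shown above.

```python
def excerpt(players, threshold=0.40):
    """Keep players at or above the threshold, highest probability first."""
    kept = [p for p in players if p["mlb_prob"] >= threshold]
    return sorted(kept, key=lambda p: p["mlb_prob"], reverse=True)


players = [
    {"name": "Alex Jackson", "org": "SEA", "age": 18, "mlb_prob": 0.40},
    {"name": "Kevin Padlo", "org": "COL", "age": 17, "mlb_prob": 0.73},
    {"name": "Bobby Bradley", "org": "CLE", "age": 18, "mlb_prob": 0.67},
    # Hypothetical below-threshold row, included only to show the filter working.
    {"name": "Example Player", "org": "XXX", "age": 19, "mlb_prob": 0.25},
]

if __name__ == "__main__":
    for p in excerpt(players):
        print(p["name"], p["org"], p["age"], f"{p['mlb_prob']:.0%}")
```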

What stands out most is that KATOH doesn’t think any of these players are shoo-ins to make it to the majors. Even those who are hitting the snot out of the ball get probabilities that fall short of what we saw for unremarkable performances in Double-A. Kevin Padlo, for example, gets just a 73%, despite hitting a ridiculous .317/.463/.619 as a 17-year-old. It’s hard to do much better than that. I think this really speaks to how little Rookie ball stats matter in the grand scheme of things. A good offensive showing is obviously better than a poor one, but numbers from this level need to be taken with a huge grain of salt. A hitter’s performance against pitchers who are fresh out of high school just can’t tell us much about how he’ll fare when matched up against more advanced pitching at the higher levels.

Next up, I’ll complete the series by looking at stats from Short-Season A-ball. Teams at that level are also only a few weeks into their season, but at the very least, it will be interesting to see how KATOH feels about SS A-ballers in general. Next week, I’ll apply the KATOH model to historical prospects and highlight some of its biggest “hits” and “misses” from the past.

Statistics courtesy of Fangraphs, Baseball-Reference, and The Baseball Cube; pre-season prospect lists courtesy of Baseball America.

This article originally appeared on Fangraphs.

About Chris

Chris works in economic development by day, but spends most of his nights thinking about baseball. He writes for Pinstripe Pundits, and is an occasional user of the twitter machine: @_chris_mitchell