Introducing: A Composite Ranking of Prospects
I had some free time recently and decided to take on a really simple project: combining six major top prospect lists into one composite ranking. I’ll go through the methodology, including its shortcomings, future goals, and why I think this provides little value.
This idea really just sprang from 247sports.com and their composite high school recruiting rankings. They provide their own 247 ranking but also a composite rank, using a Gaussian distribution formula to weight rankings used at other major outlets to create this ranking. They explain it more in-depth here.
My methodology is simple for now. I take each prospect’s ranking at every outlet and average them, without any weights, to create an overall ranking. The six outlets are: Baseball America, Baseball Prospectus, Keith Law of The Athletic, Kiley McDaniel at ESPN, MLB.com and Eric Longenhagen of FanGraphs. FanGraphs has not released its top prospects list yet, so there are currently only five rankings to average.
A quick example: Gunnar Henderson is the #1 ranked prospect at four outlets and #2 at one outlet. So I add those rankings (1+1+1+1+2) and divide by 5, which gives a composite ranking of 1.20. I should note that composite rankings rarely land on whole numbers, because the outlets almost never agree on exactly where a prospect belongs, with Gunnar Henderson being one of the few who comes close to consensus.
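For anyone who wants to reproduce the math, here is a minimal Python sketch of the unweighted average. The function name `composite_rank` is made up for illustration; the actual spreadsheet was built by hand.

```python
from statistics import mean


def composite_rank(ranks):
    """Return the unweighted average of a prospect's rankings, rounded to 2 places."""
    if not ranks:
        raise ValueError("need at least one ranking to average")
    return round(mean(ranks), 2)


# Gunnar Henderson: #1 at four outlets, #2 at one outlet
henderson = [1, 1, 1, 1, 2]
print(f"{composite_rank(henderson):.2f}")  # prints 1.20
```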
There are downsides to this method. Certain prospects have a wide range of rankings, and some of those rankings could be labeled outliers. To give another example: Colson Montgomery. His rankings were as follows: #32, #25, #38, #15, #39. The one that stands out is #15, from ESPN. Since every ranking is equally weighted, his composite ranking is 29.80, which seems too favorable considering three of the five rankings are in the 30s and two are in the high 30s. It’s not the most egregious case, but it’s an easy example to point out.
I thought about removing the highest and lowest ranking for prospects out of the calculation and only averaging four of their rankings to create the composite. This leads to another pitfall: these rankings are subjective and in some cases prospect writers have different criteria for ranking prospects. Gabriel Moreno is an example of this, as he’s ranked on two lists but not ranked in three due to exhausting his rookie eligibility.
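To see what dropping the highest and lowest ranking would do, here is a rough sketch using Colson Montgomery’s five rankings from above. With only five lists available, the trim leaves three rankings to average. The function names are hypothetical, and the fallback for prospects with fewer than three rankings is my own assumption, not something the spreadsheet does.

```python
from statistics import mean


def plain_composite(ranks):
    """Unweighted average of every available ranking."""
    return round(mean(ranks), 2)


def trimmed_composite(ranks):
    """Drop the single best and single worst ranking, then average the rest.

    Assumption: with fewer than three rankings there is nothing left to
    trim, so fall back to the plain average.
    """
    if len(ranks) < 3:
        return plain_composite(ranks)
    kept = sorted(ranks)[1:-1]  # remove one ranking from each end
    return round(mean(kept), 2)


# Colson Montgomery's rankings across the five available lists
montgomery = [32, 25, 38, 15, 39]
print(plain_composite(montgomery))    # 29.8
print(trimmed_composite(montgomery))  # 31.67
```

Trimming moves him from 29.80 to 31.67, which sits closer to where most of the lists have him. It does nothing for a prospect who only appears on one or two lists, though.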
Another note, which could lead to future changes, is that outside the top 40 or top 50 prospects, the consensus goes away. Some prospects, like Jeferson Quero, are ranked as high as 45 on one list but not ranked on the other four. This makes their composite ranking simply 45.00, since there’s nothing else to average with it. I briefly considered cutting the list to the top 50 prospects, since those were the ones consistently ranked. But I think it’s worth showing the back end of the lists for future promotions. That’s not to say that one outlet is more accurate than the others, but to see who may have rated someone highly earlier than the rest. If Quero has a successful career, then we can look back and say “Hey, cool, Keith Law had him in the top 50 back in 2023.”
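One way to keep single-list prospects from looking like consensus picks is to show how many lists each composite is based on. The sketch below is only an illustration: `summarize` is a made-up helper, `None` stands for "not ranked by that outlet", and the order of the entries inside each list is arbitrary.

```python
def summarize(board):
    """Return (name, composite, lists_ranked_on) tuples sorted by composite."""
    rows = []
    for name, ranks in board.items():
        present = [r for r in ranks if r is not None]
        if not present:
            continue  # not ranked anywhere, so nothing to average
        composite = round(sum(present) / len(present), 2)
        rows.append((name, composite, len(present)))
    return sorted(rows, key=lambda row: row[1])


# Rankings taken from the examples in this post; None = unranked on that list
board = {
    "Jeferson Quero": [None, None, 45, None, None],
    "Colson Montgomery": [32, 25, 38, 15, 39],
    "Gunnar Henderson": [1, 1, 1, 1, 2],
}

for name, composite, count in summarize(board):
    print(f"{name:<18} {composite:>6.2f}  ({count} of 5 lists)")
```

The count column makes it obvious that Quero’s 45.00 comes from a single list while Montgomery’s 29.80 comes from all five.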
I think it’s worthwhile because it can highlight what certain outlets value over others. Keith Law explains his ranking methodology, looking for a mix of high-ceiling prospects and likely contributors. Kiley McDaniel loved left-handed hitters in his list. The majority of lists also prioritize up-the-middle position players (catchers, shortstops, and center fielders), since those are the toughest and most valuable positions in the majors. But at some point, usually in the 40s and beyond, the lists diverge and the back end contains less consensus.
I’m not sure this composite list has any predictive value at all, but it may have descriptive value, which is to say it could be used as evidence when a player performs at a high level in the majors and was a highly ranked prospect at five outlets. But no prospect is a sure thing, and it’s entirely possible that one of these prospects ranked in the top 30 doesn’t live up to their ranking.
Honestly, this was for fun and was initially meant to be a coding project so I could work on web scraping with Python, but it turned into a manual exercise once I learned that web scraping paywalled articles doesn’t work. I’m interested to see who breaks out in the majors and whether one of these outlets’ methodologies might prove to be more accurate. I’m specifically looking at Baseball Prospectus because of their willingness to factor a prospect’s organization into their ranking. Meaning: the Cleveland Guardians and New York Yankees are historically very good at finding pitching prospects in later rounds and turning them into major league contributors as either starters or relievers. Thus, BP takes that observation and factors it in when deciding who breaks into their top 101. It’s not the only factor, but it is a good insight.
This may lead to future exercises, like taking prospects from past lists and seeing how their WAR totals compare to their rankings, but that’s for another day. For now, this is for fun. Sorry to keep you waiting for the link: https://docs.google.com/spreadsheets/d/1cXGYlWKt4SP0QPViH4ZH-xwVjKlaG5lzT7BHZxA4GMM/edit#gid=0
I can be reached at @MattchewGregory on Twitter for any questions or concerns.