Ranking Players

Ranking Players in Repeated Plays of a Game without Bias

I’ve always found the idea of valuing players on a team interesting. How do you value people who perform different functions on the team on the same scale? Sports are the classic example. An attacking player might be valued by the points they score and a defending player by the points they prevent. However, this data is not always readily available, and it does not capture the higher-order effects players have on one another. For example, one player might draw the attention of multiple opponents, allowing a teammate to score. How can we account for all these effects without introducing bias?

Fundamentally, the closest thing we have to an unbiased metric is whether the match resulted in a win or a loss. Any other metric has inbuilt bias. Take the match score, which one might expect to be unbiased. In a blowout, however, the players on the losing team might have given up once they realized the result was decided; weighting by match score would unnecessarily punish these players’ ratings, especially when most sports leagues are primarily concerned with who won rather than the margin of victory. One may argue that the win metric also contains bias. This may be true, but wins are the metric most leagues use, so we must accept the bias built into it, which I detail later. We must also acknowledge the trade-off between bias and data quantity: we could use far more data if we started from biased metrics and attempted to remove the bias by cross-referencing.

Methodology

We can reframe the problem as a learning-with-noise problem. We label each game a player wins as positive and each game they lose as negative, then optimize player values over this space. There are quite a few details I abstract away in this setup. For example, simply subbing in more players would push each player’s value down. Furthermore, a player who performs better under pressure (i.e. during the postseason or championships) is not weighted any higher than a counterpart who consistently performs at that player’s average. These are examples of structural problems. There are a host of theoretical problems as well, but the structural ones must be solved first. Poor model setup can easily lead to malformed outputs, where the noise level is unintentionally too high, or the space of player values too large, for good inference on player value.

I omit the solutions to these problems, but we use this structure as the base idea. Many of the solutions are non-trivial augmentations, and one may be tempted to abandon this approach altogether, but I detail the results of this type of approach below.
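
As a minimal sketch of the base idea only (not the full method, whose augmentations are omitted above), one could assume that a team's strength is the sum of its players' values and that the probability of winning is a logistic function of the difference in strength. The function name `fit_player_values`, the additive model, the L2 penalty, and all hyperparameters are illustrative assumptions, and the sketch does nothing about the structural problems mentioned above, such as substitutions or high-pressure games.

```python
import math
from collections import defaultdict


def fit_player_values(games, epochs=500, lr=0.1, l2=0.01):
    """Fit one value per player from win/loss outcomes alone.

    games: list of (winners, losers) pairs, each a list of player names.
    Assumed model (illustrative): P(win) = sigmoid(sum(winners) - sum(losers)).
    """
    values = defaultdict(float)  # every player starts at 0.0
    for _ in range(epochs):
        grad = defaultdict(float)
        for winners, losers in games:
            margin = (sum(values[p] for p in winners)
                      - sum(values[p] for p in losers))
            # Probability the current values assign to the observed outcome.
            p_observed = 1.0 / (1.0 + math.exp(-margin))
            residual = 1.0 - p_observed
            # Gradient of the log-likelihood: winners up, losers down.
            for p in winners:
                grad[p] += residual
            for p in losers:
                grad[p] -= residual
        # Gradient ascent step with an L2 penalty that keeps values finite
        # even for players who never lose (or never win).
        for p in grad:
            values[p] += lr * (grad[p] - l2 * values[p])
    return dict(values)


if __name__ == "__main__":
    # Hypothetical toy data: the names and results are made up.
    example_games = [
        (["ann", "cat"], ["bob", "dan"]),
        (["ann", "dan"], ["bob", "cat"]),
    ]
    ranking = fit_player_values(example_games)
    for name, value in sorted(ranking.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {value:+.3f}")
```

The only input is who was on the winning side and who was on the losing side, which matches the premise that win or loss is the metric to trust. The penalty term is one simple way to shrink the space of player values; with few games per player, the fitted values remain very noisy.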

Biases

While the aim of this methodology is to reduce bias, it does not result in an omniscient program. We still face some biases that are impossible to remove. Inherently, one of the aims of any player valuation program such as this is to predict future player value based on previous performance, and here we find some problems. For example, if a team is built around a certain player, that player will be rated more highly than they otherwise would be, because the team’s strategy makes that player’s presence contribute directly to more wins. One might argue that some other player in the same position would do better, and this argument may have some merit. However, we have no way to tell whether this is the case unless that other player actually plays some games under that system.

Uses

This methodology extends beyond sports, providing a framework for evaluating individuals in various team settings.

We often see relatively arbitrary metrics for coaches (win count is likely the best of them). With minor adjustments, this method offers empirically driven coach ratings as well.

Outside sports, many companies attempt to compensate employees based on individual contribution. However, contribution is difficult to measure, so pay often ends up tied to hierarchical position instead. While initially crafted for player valuation, this methodology can be fine-tuned to assess individual contribution more precisely. That would support a shift towards remuneration based on performance and reduce reliance on promises of future promotions.

Results

I post preliminary results here, with more detailed notes and explanations on each results page.

NBA 2017-2021
Formula 1 1981-2023