The fatal flaw in Glicko-2


Norbe.7630

Originally replied as a comment on a deleted thread here so...

At a broad glance, the rating system for GBL behaves just like the well-known Elo rating system and we have generally assumed that it was indeed simply Elo, a guess that was necessary as Niantic, for reasons I don’t understand, is not transparent about their GBL ratings. It turns out that GBL ratings don’t use Elo itself, but a generalization (a more sophisticated version) of it called Glicko-2. In all normal cases, for active and established players Elo and Glicko-2 behave very similarly and can hardly be distinguished from each other.

The Glicko-2 system calculates for each player not only a (visible) rating, but also two hidden variables called deviation and volatility. Whenever you finish a set of games, your rating, deviation and volatility are all updated to new values. I have drawn a diagram showing how these three variables interact with each other and with game results.

Your rating goes up or down depending on your performance: if you score better than your old rating (relative to that of your opponents) suggests, your rating goes up, and if you score worse, your rating goes down. Deviation acts as a multiplier on your rating change; a high deviation means your rating gains and losses are amplified. Your deviation changes after each set too, and this change is driven by your volatility: if your deviation is high compared to your volatility it goes down; if it’s low compared to your volatility it goes up. Finally, your volatility itself is updated by the results of your games. An extreme score such as 5-0 or 1-7 makes it go up, while a score of 3-2 or 2-3 makes it go down.
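To make the "deviation acts as a multiplier" point concrete, here is a minimal Python sketch of a single-game rating change using the published Glicko-2 formulas. It is a simplification: the volatility is held fixed at an assumed 0.06 instead of being re-estimated, and the function name and all example numbers are mine, not anything Niantic has published.

```python
import math

SCALE = 173.7178  # Glicko-2 conversion between rating points and internal units


def g(phi):
    """Glicko-2 weighting that discounts opponents with uncertain ratings."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def rating_change(rating, rd, opp_rating, opp_rd, score, sigma=0.06):
    """Rating points gained (or lost) from one game.

    Simplification (assumption): sigma is kept fixed instead of being
    updated by the iterative volatility step of full Glicko-2.
    """
    phi, phi_j = rd / SCALE, opp_rd / SCALE
    mu, mu_j = (rating - 1500.0) / SCALE, (opp_rating - 1500.0) / SCALE
    expected = 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))
    v = 1.0 / (g(phi_j) ** 2 * expected * (1.0 - expected))
    phi_star = math.sqrt(phi ** 2 + sigma ** 2)        # volatility inflates deviation
    phi_new = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    # The squared new deviation multiplies the "surprise" (score - expected).
    return SCALE * phi_new ** 2 * g(phi_j) * (score - expected)


# Same win against the same opponent, once with a low and once with a high deviation.
settled = rating_change(1500, 60, 1500, 60, score=1.0)
inflated = rating_change(1500, 200, 1500, 60, score=1.0)
print(f"win with RD 60: {settled:+.1f}, win with RD 200: {inflated:+.1f}")
```

The only difference between the two calls is the player's own deviation, which is exactly the variable the exploit below tries to inflate.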

The Glicko-2 system turns out to contain a massive flaw when using it to create a leaderboard. This flaw was not known until now; it has been (accidentally) discovered by GBL players. The rating system can be exploited to temporarily reach a very high rating, as follows:

  1. By losing on purpose, the player lowers his rating to far below his real skill level

  2. The player plays many sets against opponents of equally low rating. Playing against opponents far weaker than him, the player can choose to win or lose “on demand”. Doing this, he forces extreme sets; he either wins all games or loses all games in a set. The player’s volatility will increase steadily, and his deviation follows.

  3. By alternating winning and losing sets as needed, the player can keep his rating relatively stable, allowing him to continue this process for as long as he wants.

  4. After volatility and deviation have been “farmed” sufficiently high, the player starts to play normally, regaining rating back to his true skill level.

  5. Games change your rating much faster than they change your volatility, so even if volatility and deviation go down in the process of regaining rating, they will still be very high.

  6. The player is now at his proper rating, but with gains and losses in his games heavily amplified. Now he plays normally, until getting a good streak bringing him to a peak in rating.

  7. Because of the player’s very high deviation, this peak in rating is much higher than it should be under normal circumstances.
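Steps 2 and 3 can be sketched in Python with the published Glicko-2 update. Everything specific here is an assumption for illustration: the system constant `TAU = 0.5`, the starting rating, deviation and volatility, five games per set, and opponents who always sit at the player's current rating with a deviation of 60. None of these are known GBL or GW2 settings, and `farm` is a hypothetical helper.

```python
import math

SCALE = 173.7178   # rating points per internal Glicko-2 unit
TAU = 0.5          # system constant (assumed; real games do not publish theirs)


def g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi * phi / math.pi ** 2)


def new_volatility(phi, sigma, v, delta, tau=TAU, eps=1e-6):
    """Iterative volatility step of Glicko-2 (Illinois root finding)."""
    a = math.log(sigma ** 2)

    def f(x):
        ex = math.exp(x)
        return (ex * (delta ** 2 - phi ** 2 - v - ex)
                / (2.0 * (phi ** 2 + v + ex) ** 2) - (x - a) / tau ** 2)

    A = a
    if delta ** 2 > phi ** 2 + v:
        B = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2.0
        B, fB = C, fC
    return math.exp(A / 2.0)


def update(rating, rd, sigma, games, tau=TAU):
    """One rating period. games is a list of (opp_rating, opp_rd, score)."""
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    v_inv, total = 0.0, 0.0
    for opp_rating, opp_rd, score in games:
        mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
        e = 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))
        v_inv += g(phi_j) ** 2 * e * (1.0 - e)
        total += g(phi_j) * (score - e)
    v = 1.0 / v_inv
    sigma_new = new_volatility(phi, sigma, v, v * total, tau)
    phi_star = math.sqrt(phi ** 2 + sigma_new ** 2)
    phi_new = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    return 1500.0 + SCALE * (mu + phi_new ** 2 * total), SCALE * phi_new, sigma_new


def farm(sets, rating=1000.0, rd=60.0, sigma=0.06, games_per_set=5):
    """Alternate all-win and all-loss sets against equally rated opponents."""
    for i in range(sets):
        score = 1.0 if i % 2 == 0 else 0.0
        games = [(rating, 60.0, score)] * games_per_set
        rating, rd, sigma = update(rating, rd, sigma, games)
    return rating, rd, sigma


for n in (0, 100, 500):
    r, d, s = farm(n)
    print(f"after {n:3d} sets: rating {r:7.1f}, deviation {d:6.1f}, volatility {s:.4f}")
```

Under these assumptions the volatility rises with every extreme set while the rating hovers around its starting point. How fast it rises depends heavily on the assumed `TAU`, which is one reason the number of sets needed in a real game cannot be read off this sketch.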

Source: https://www.reddit.com/r/TheSilphRoad/comments/hwff2d/farming_volatility_how_a_major_flaw_in_a/

Guild Wars 2 is using Glicko 2 right?

I see you sitting at 800 games, so I guess you tested it yourself.

A couple of seasons ago we had a bot in the top 100 with 2.5k games and a win rate of around 47%, which made me think the MMR system actually pushes you toward a 50% win rate. At some point it had lost many more games than it had won, and the system started to give it more points for winning and take fewer for losing: something like 17 for a win and 9 for a loss, instead of the regular 13 and 14.

As it was climbing this pattern kept going, and instead of the roughly 60% normally required to be in the top 100, it could get there with a win rate below 50%.

With your post it all makes sense now. I'm not sure this is exploitable, as you need a really high number of games.
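As a rough check on those numbers, here is a small Python sketch of the break-even win rate for fixed point gains and losses. The function name is made up for illustration, and real gains vary from match to match, so treat it as back-of-the-envelope arithmetic only.

```python
def break_even_win_rate(points_per_win, points_per_loss):
    """Win rate at which the expected rating change per game is zero,
    assuming every win and every loss is worth a fixed number of points."""
    return points_per_loss / (points_per_win + points_per_loss)


print(f"+13 / -14: {break_even_win_rate(13, 14):.1%}")
print(f"+17 / -9:  {break_even_win_rate(17, 9):.1%}")
```

With +13 and -14 you need a bit over 50% just to hold your rating, while with +17 and -9 anything above roughly 35% climbs, so a 47% win rate would be enough to keep gaining.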

Seems as if GW2 punishes you more for losing even when your volatility has settled. But purposely losing and then trying hard in order to gain a higher volatility, and thus bigger gains, seems counterintuitive in GW2.

Regardless I’d love to see this put to the test in GW2 to see what results we get.

@phokus.8934 said:
Seems as if GW2 punishes you more for losing even when your volatility has settled. But purposely losing and then trying hard in order to gain a higher volatility, and thus bigger gains, seems counterintuitive in GW2.

Regardless I’d love to see this put to the test in GW2 to see what results we get.

As I said in my post, we had a "guy" in the top 100 with 2.5k games and a win rate below 50%, when the requirement to reach the top 100 is normally around 60%, so OP's conclusions make sense.

First, yes, Guild Wars 2 uses Glicko-2, but slightly modified: it uses a minimum and maximum volatility and a minimum and maximum deviation, which means this little trick has its limits. Besides that, there is the problem that not many players are able to control a match well enough to effectively "farm" deviation, for the simple reason that you are not alone in a match. And the truth is that there is nothing better; many ranking systems have similar or even the same issues.

Here is the wiki link for the GW2 MMR system: https://wiki.guildwars2.com/wiki/PvP_Matchmaking_Algorithm

@"Stand The Wall.6987" said:
Do pro sports players get rated based on volatility or deviation?

In pro esports I think it's the "win rate" of the "team", like in Dota 2 etc. But to determine the "individual" skill level of each player inside those teams, that's where Glicko, Elo, etc. come in.

Not sure if I understand this correctly, but I believe the rating deviation in Glicko-2 cannot increase. I didn't study this too deeply though.

I wish they had numbered their formulas, but check the RD' calculation here on page 3: http://www.glicko.net/glicko/glicko.pdf

RD' - the "new" rating deviation after a game/some games - is always lower than RD - the rating deviation before a game/some games.
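For comparison, the linked PDF describes the original Glicko, while the Glicko-2 description first inflates the deviation by the volatility and only then shrinks it with the game results (steps 6 and 7, as I read it). A minimal Python sketch of just that part follows, in internal Glicko-2 units; the function names and the values 0.35 and 0.83 are illustrative assumptions, not numbers from any game.

```python
import math


def next_phi(phi, sigma, v):
    """Glicko-2 deviation update: inflate by volatility, then shrink by game evidence.

    phi   : current deviation (internal scale, RD / 173.7178)
    sigma : volatility after the volatility step
    v     : estimated variance of the rating based on the games played
    """
    phi_star = math.sqrt(phi ** 2 + sigma ** 2)
    return 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)


def sigma_threshold(phi, v):
    """Volatility above which the deviation grows instead of shrinking.

    Follows from setting next_phi(phi, sigma, v) == phi; needs v > phi**2.
    """
    return phi ** 2 / math.sqrt(v - phi ** 2)


phi, v = 0.35, 0.83  # roughly RD 60 and a five-game set against even opponents
print("low volatility :", next_phi(phi, 0.06, v))
print("high volatility:", next_phi(phi, 0.20, v))
print("threshold sigma:", sigma_threshold(phi, v))
```

So under the Glicko-2 formulas the deviation can end a rating period higher than it started, provided the volatility is large enough relative to the current deviation.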

@Khalisto.5780 said:
As I said in my post, we had a "guy" in the top 100 with 2.5k games and a win rate below 50%, when the requirement to reach the top 100 is normally around 60%, so OP's conclusions make sense.

That tells us absolutely nothing. You need a controlled test that tracks games played, win streaks, loss streaks, the rating you start at, and your positive and negative gains per match.

@Norbe.7630 Thanks for posting this btw. It explains many things.

I can see how this would normally work, but I absolutely 100% guarantee you this wouldn't work in GW2 due to the win trading gates. You'd get around 1600-1650 before you were noticed, and if you weren't match manipulating yourself or in with the cool crowd to be ignored, you wouldn't make it any higher than that.

@phokus.8934 said:
That tells us absolutely nothing. You need a controlled test that tracks games played, win streaks, loss streaks, the rating you start at, and your positive and negative gains per match.

It does tell you the matchmaking is somehow trying to push you toward a 50% win rate; otherwise the bot would have been stuck forever at winning 13 and losing 14, and it would be somewhere in silver, not high plat 1.

Even more so when you consider that people around plat 2 are in a situation where they have to win 3 games to make up for 1 loss.

And somehow the bot ignored that, which makes me think that even in plat it was getting something like 17 on wins and losing 9-10.

@Khalisto.5780 said:
It does tell you the matchmaking is somehow trying to push you toward a 50% win rate; otherwise the bot would have been stuck forever at winning 13 and losing 14, and it would be somewhere in silver, not high plat 1.

Even more so when you consider that people around plat 2 are in a situation where they have to win 3 games to make up for 1 loss.

And somehow the bot ignored that, which makes me think that even in plat it was getting something like 17 on wins and losing 9-10.

That is on the matchmaker though. GLICKO does not care at all about your winrate. :wink:
