UPDATED 24 Oct, 11:19 PM (GMT+3)
A few weeks ago there was a thread where some dude **hypothesised** that "MatchMaking algorithm forces you to lose games in streaks after you had a win streak." Here it is.
The thread was met with healthy criticism, and one dude, @Megametzler.5729, even linked a pdf where MM was described step by step.
I read it thoroughly (at least I think I did), and after that I had a feeling, that the situation, which OP described, is kinda-sorta possible-ish. Except MM didn't force anyone, ofc. But more on that later.
So, as you can check yourself, the math which describes the algorithm is quite trivial. _(Well, the math that describes it is indeed trivial, but the math it took the author of the paper to prove it actually works is slightly more complicated. Sadly, the full process is not described in the paper.)_
Despite that fact, the formulas look quite clunky and not very fit for visual comprehension.
That's why I thought it would be a fun thing to do if I put them into code. So, yesterday I was really bored and gave it a try: link to Python 2.7 Jupyter Notebook (updated). The code itself is a little bit trashy, but should be easy enough to read.
The main goal of the code was to simulate the game history of some player in a 1v1 scenario (although in GW2 sPvP happens in the form of 5v5, in the context of our hypothesis it doesn't really matter).
In order to simulate something, you have to provide a model with some level of adequacy.
In the case of this code, there are supposed to be 2 models (the 2nd I'll add later).
Although, he's initially Unranked, and has to play 10 games for seeding against various opponents of some skill level.
And, to his misfortune, he does it, while being STONED AS kitten, which in terms of math means, his winrate against 800-1200 scrubs is precisely 50% (for those 10 games only)
Then he finishes with some result and feels like "kitten, man, that won't do, I must tryhard." And he starts doing exactly that, playing with his full potential.
Important, MatchMaker Algorithm: the matchmaker now assumes that all players in the game have their ratings distributed according to a Gauss distribution, with a mean of 1000 rating and a standard deviation of 266 rating (see image 1).
(1200 was taken from here, as well as the other constants, and the standard deviation I took assuming, that 1800 is 3 sigma level, where only 0.3% of the elitist dudes play. 30 ppl above 1800, makes the playerbase something like 10000 - that makes sense, I guess.)
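For anyone who wants to poke at those numbers, here's a minimal sketch of how the sigma falls out of the "1800 is the 3-sigma level" assumption, plus a sampled population of 10000 players. This is not the notebook code; the names `MEAN`, `THREE_SIGMA`, `SIGMA` and `sample_population` are made up for illustration, and the seed is arbitrary.

```python
import random

MEAN = 1000.0         # population mean rating, as stated in the post
THREE_SIGMA = 1800.0  # rating assumed to sit at the 3-sigma level
SIGMA = (THREE_SIGMA - MEAN) / 3.0  # about 266.7


def sample_population(n_players=10000, seed=0):
    """Draw n_players 'true skill' values from N(MEAN, SIGMA)."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    return [rng.gauss(MEAN, SIGMA) for _ in range(n_players)]


population = sample_population()
above_1800 = sum(1 for skill in population if skill > THREE_SIGMA)

print("sigma = %.1f" % SIGMA)
print("players above 1800: %d of %d" % (above_1800, len(population)))
```

One thing worth keeping in mind when reading the count: for a normal distribution only about 0.13% of the mass sits above +3 sigma (the roughly 0.3% figure counts both tails), so a sample of 10000 will usually show somewhere around a dozen players above 1800 rather than 30.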
It works like this: the matchmaker rolls a normally distributed number (with mu and sigma of 1000 and 266, respectively).
If it does NOT BELONG in a range of +/-10 rating from our dude's current rating, then we increase this range by +10 (making it +/-20, then +/-30, and so on), and re-iterate the process.
If it DOES, however, we take this number value as our opponent rating for the current game. The higher our dude's rating, the tougher it is for him to find decent opponents (see image 2)
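The opponent search described above can be sketched roughly like this. It's my reading of the description rather than the notebook code: I'm assuming a fresh number is rolled on every iteration, and the function name `find_opponent` and its parameters are hypothetical. The defaults use the 1000/266 distribution from image 1.

```python
import random


def find_opponent(player_rating, rng, mu=1000.0, sigma=266.0, step=10.0):
    """Roll opponents from N(mu, sigma) until one lands inside the window.

    The window starts at +/-step around player_rating and widens by step
    after every miss. Returns (opponent_rating, final_window).
    """
    window = step
    while True:
        candidate = rng.gauss(mu, sigma)  # assumption: re-roll on each try
        if abs(candidate - player_rating) <= window:
            return candidate, window
        window += step  # nobody found: widen the search range


rng = random.Random(42)  # arbitrary seed
for rating in (1000, 1900):
    opponents = [find_opponent(rating, rng)[0] for _ in range(1000)]
    mean_opponent = sum(opponents) / len(opponents)
    print("player %d: mean opponent rating %.0f" % (rating, mean_opponent))
```

Because the window keeps growing, the loop always terminates. For the 1900 player the window usually has to get quite wide before anything lands in it, and most of what lands there comes from the denser, lower side of the distribution, which is the skew shown in image 2.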
Then we calculate the winrate against this opponent, according to the Glicko manual (see the formula for E in glicko-2.pdf, Step 3). Then we record a win or a loss for this game, in accordance with that winrate.
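Here's a small sketch of that step, using the E and g functions from the Glicko-2 paper (ratings and RD are converted to the Glicko-2 scale by the paper's 173.7178 factor). The function names and the opponent RD of 30 in the usage lines are my own placeholders; the post doesn't say which RD the simulated opponents get.

```python
import math
import random

GLICKO2_SCALE = 173.7178  # conversion factor from the Glicko-2 paper


def g(phi):
    """Glicko-2 weighting that shrinks the effect of an uncertain opponent."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected score E of `rating` against an opponent, per Glicko-2 Step 3."""
    mu = (rating - 1500.0) / GLICKO2_SCALE
    mu_j = (opp_rating - 1500.0) / GLICKO2_SCALE
    phi_j = opp_rd / GLICKO2_SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))


def play_game(rating, opp_rating, opp_rd, rng):
    """Return True for a win, drawn with probability E."""
    return rng.random() < expected_score(rating, opp_rating, opp_rd)


rng = random.Random(1)  # arbitrary seed
print("E vs equal opponent: %.3f" % expected_score(1900, 1900, 30))
print("E vs 1500 opponent:  %.3f" % expected_score(1900, 1500, 30))
print("won this game:", play_game(1900, 1500, 30, rng))
```

Only the rating difference matters here, so the 1500 offset from the paper cancels out and the result doesn't depend on where the scale is centred.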
UPDATE from 24 Oct:
1) removed RD decay over time, introduced a hard cap for RD=30,
2) updated mean and standard deviation for gaussian of skill,
3) took system constant and other parameters from wiki page
And, finally, we can see how his rating changes with time, by the end of a season. See image 3.
(from top to bottom):
IMAGE-1: Gauss distribution of TRUE SKILL levels of players in GW2 leaderboard. Approximately, of course. 10000 players, mean is 1000, sigma is 266.
These numbers are derived from the assumption that the 3-sigma level is 1800 rating, and that there are 30 players above 1800.
IMAGE-2: Matchmaking representative samples for players with 1000 TRUE SKILL level (Blue) and 1900 TRUE SKILL level (Green). As you can see, a 1000 rating player will almost always be playing with similar level opponents, while a 1900 rating player will not only be playing against a _much wider_ range of opponents, but will also be forced to play against lower skill players most of the time.
IMAGE-3 The game history of our 1900 rated player. Rating is displayed at the left scale (Red) and the Rating Deviation on the right scale (Blue). Note how quickly it converges to 1800-2000 range and stays there throughout the whole season. Even though winstreaks occur, they won't bring the rating too high.
I've come up with an idea of an interesting (at least to me) experiment.
The data of interest:
1. I'm curious to see the distribution of the overall number of people depending on their rating (like a histogram of the number of players to divisions), as well as "bad" and "good" ones independently.
2. Using the same data as in point 1, calculate the sum of all players' ratings. What I mean is, how does the sum of all players' ratings after 100,000 games compare to the initial sum of their ratings? (The initial ratings' sum would be, for example, 1,500x5,000 = 7,500,000.)
This would require a slightly different model, and... It's coming soon
Well, you've got the idea - although graph 3 clearly indicates that **winstreaks (and lose streaks) MAY EXIST to a certain level, they shouldn't take you much farther than +/-50 rating below or above your TRUE SKILL LEVEL.** Especially at the end of the season.
As always, take it with a grain of salt. Because the model is STILL quite simplified and there are STILL a lot of uncertainties and unknowns.
Constructive criticism is welcome. Please check the code yourself, if you're interested.