Baddies -> deadweight
Normal folk -> not a total liability
Really good -> asset
Susp good -> carry
Hacker -> star player
This is exactly the kind of thing that does piss me off about the hive skill system. The fact that I automatically get branded as "dead weight" and get deprived of comm support and generally get ignored in favor of people with higher skill rankings does affect my ability to enjoy the game. I get that a part of that might be because most NS2 players know and recognize other regulars and would prefer to support them instead of a "wildcard", but I can't help but feel that the hive score is affecting their decision on some level.
It doesn't really help that the hive score seems to change extremely slowly. The servers that I join seem to have an average of 1200~1600 skill rank (and often up to 1800 during certain times of the day), but my hive skill still wallows around 1000~1100. Since the average skill of the server is significantly higher than mine, shouldn't it raise my skill ranking significantly for each win and lower the penalty for my losses until my hive skill score is in line with the server average (if I'm pulling an even or positive W/L ratio)?
Someone's inevitably going to say "well maybe they're not giving you support because you're bad", so I'll just drop this here so that you can judge my skill level yourself. Thanks to Nordic for introducing me to that server, btw.
The irony of a post like that and calling me the one with a "shitty attitude" is palpable. The last two posts you quoted were not directed at you, but at someone to whom multiple people have explained very simple ideas and who still continues with denial and trash-talking.
You don't like being held responsible for other players' actions? Good, because, in the long run, your skill is not affected by other people's actions. Glad we agree and you understand the statistics behind it.
Regardless of whether or not the statistics can support this, there is plenty of discussion to be had around other areas where skill tracking can improve.
Your shitty behavior is still something I have to read and influences the culture of the game I play. Please do not contribute to the vast amount of shit talk the members of this community generate. There was no need for you to be patronizing and accuse that man of embracing ignorance over rational understanding.
Can we all move on to discussing how skill tracking can be improved?
Skill tracking depends on you winning for your personal improvement to register in the system. There are plenty of times where you can make a critical choice or improve in an important fashion and lose the game. This defeats the concept of the hive skill being a tracker of individual skill. There may be an implied response from Hive skill over enough time, as the statistics state, but I do not feel this works to provide any sense of recognition, progress in growing skill or short/medium term representation of skill.
If any part of the traditional FPS / RTS metrics were included in the hive skill, we would at least have a strong basis for the claim that it represents individual skill, and it would improve the gameplay experience. Factoring in KDR, average accuracy, and credit for responding to commander requests over a rolling 30-day period could bolster some parts of the game that are lacking and improve the feeling of upward progression that a skill tracker usually provides in other games.
For commanding we can use traditional RTS metrics like APM and NS2-specific ones like accuracy when delivering med packs. I haven't presented these as coherent math yet; I will come back later and do that.
There are plenty of times where you can make a critical choice or improve in an important fashion and lose the game. This defeats the concept of the hive skill being a tracker of individual skill.
It does not. The argument that the skill system is voided because you can play well individually but lose because of your team has been debunked hundreds of times both in this thread and a dozen threads before this one. I know it feels bad intuitively, but you have to understand that for every round lost because of your team, there is a round you won undeservedly by being carried. These all even out in the long run, leaving behind your average contribution to the team's chances of winning.
You don't like being held responsible for other players' actions? Good, because, in the long run, your skill is not affected by other people's actions.
I think you mean in the looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooonnnnngggggg run.
As in, to the point of almost not being useful at all for the average player.
Have you got absolutely anything to back this up with other than your gut feeling?
Since the average skill of the server is significantly higher than mine, shouldn't it raise my skill ranking significantly for each win and lower the penalty for my losses until my hive skill score is in line with the server average (if I'm pulling an even or positive W/L ratio)?
No, it does not raise your skill if the enemy's skill is higher. Hive gives or takes skill only if there was a surprise.
As I understand it, it goes like this.
Team A has an average skill of 1000.
Team B has an average skill of 1500.
1500>1000. Hive expects team B to win.
If team A wins, that would be a surprise.
Team A wins.
Each player on team A would have their skill value go up by the same amount, based on how unexpected that was (a 500-point gap).
The value that skill goes up would be a lot, because that was a big surprise.
What if team B won?
1500>1000. Hive expects team B to win.
If team B wins, that would not be a surprise.
Team B wins
Each player on team B would have their skill value go up by the same amount.
The value that skill goes up would be very little if any at all, because the win was not a surprise.
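To put rough numbers on the "surprise" idea above, here is a minimal sketch using a plain Elo-style update. The logistic curve, the `scale` of 400 and the `k` of 32 are assumptions borrowed from chess Elo for illustration; the thread does not state Hive's actual formula or constants.

```python
def expected_win_probability(team_avg: float, enemy_avg: float,
                             scale: float = 400.0) -> float:
    """Elo-style chance that the first team wins (assumed logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** ((enemy_avg - team_avg) / scale))


def skill_change(team_avg: float, enemy_avg: float, won: bool,
                 k: float = 32.0) -> float:
    """Points each player on the team gains (or loses) after one round.

    The change is proportional to the surprise: actual result minus
    the result that was expected before the round.
    """
    actual = 1.0 if won else 0.0
    surprise = actual - expected_win_probability(team_avg, enemy_avg)
    return k * surprise


# Team A averages 1000, team B averages 1500 (the example above).
underdog_win = skill_change(1000, 1500, won=True)     # big surprise
favourite_win = skill_change(1500, 1000, won=True)    # no surprise
favourite_loss = skill_change(1500, 1000, won=False)  # big surprise, negative
```

With these assumed constants the underdog win is worth about 30 points per player, while the expected win is worth under 2.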
You have an extremely high K/D ratio. That K/D is in the top 5% of all ns2. Your kill rate is 73%. That is extremely high.
You get 9.3 Score/minute. This is above average. The average score/minute is ~7.
Your average marine accuracy is ~25%. This is really good.
Your W/L is 1.09. Your win rate is 52%. This is about as middle ground as it gets.
From this information I can tell you get a lot of kills. Your aim is good. You get a lot of score/minute so you have an active playstyle. This suggests to me that you likely have a high amount of individual skill.
I don't know why but your hive skill does appear to be lower than I would expect. But then again, I don't know.
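For anyone wondering how a W/L of 1.09 becomes a 52% win rate, the conversion can be sketched as below. It assumes every round is either a win or a loss (no draws), and the function name is just for illustration.

```python
def win_rate_from_wl(wl_ratio: float) -> float:
    """Convert a wins/losses ratio into a win rate, assuming no draws."""
    return wl_ratio / (1.0 + wl_ratio)


win_rate = win_rate_from_wl(1.09)  # roughly 0.52
```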
Note that if you tend to join the stronger team skill-wise, your wins award less and your defeats strip you of more points. You might feel like you are performing well, but that might just be due to imbalanced teams. The skill system is not at fault there if your skill does not rise.
Don't take it too seriously, I quoted the previous one for the lols. Apparently div 1 (and some of div 2) is filled with 5v5s (excluding comms) going at each other with hacks. The hilarity is that the people claiming this are the guys who are so clueless that they couldn't judge the skill between a banana and a cucumber.
If you are good it will show with time and people will start to recognize you. The guys with a high hive score or strong skill have been around long enough for people to know this.
If it is true that you are rated individually in the long run, because on average your win/loss ratio reflects how much you have contributed to your team across every possible combination of players facing each other, should competitive servers count toward the hive skill at all?
Doesn't the possibility of choosing your teammates and your opponents create deviations between Elo and reality, in the form of unbalanced matches, that are averaged out on public servers?
If we were to PCW another team that is equal in (individual) Elo hive skill but clearly better as a team, making it highly unlikely for us to win, wouldn't some players win and some players lose undeservedly?
Yes, yes it does. Any Elo rating system is only a representation of your skill in relation to the players you play with. People playing in distinct groups and rarely interacting (pubs vs. pcws, NA vs. EU) might have wildly different skill ratings even if they'd be equally capable.
Is this a problem in NS2? Hard to say. My gut feeling says that most, if not all competitive players still play a lot on pub servers, since there are only so many opportunities to play organised scrims. Likewise, I feel like EU and NA players mingle often as well. I might be wrong.
I don't know why but your hive skill does appear to be lower than I would expect. But then again, I don't know.
There are a lot of very short games in that Hive history. The playerbase there is probably hard to balance, so the algorithm generates unbalanced games and then multiplies the outcome by a near-zero.
I think commanders should need to have a hive skill that is close to their team's average, to avoid unbalancing teams.
Also, when shuffling teams, Steam playtime should be taken into consideration in the algorithm, because otherwise a rookie with 1500 Elo will be regarded as equal to a vet with 1500 Elo, which is not going to give you balanced games.
And if games are unbalanced at equal elo, hive is broken.
Instead of just excluding the ensl match servers from the hive, could we get the same elo rating system for teams, instead of individuals, on competitive servers?
The gather servers could remain in the individual skill rating system though, since it is similar to pub in terms of mixing up teams.
Have you got absolutely anything to back this up with other than your gut feeling?
Sure, but it's in the form of a question :
How many rounds does a player have to play in order for their hive score to accurately reflect their individual skill?
When new players started at 0, I think the answer was "0 rounds", since most new players are so far below the average regular that they are basically just taking up a spawning slot.
I was curious, so I wrote a simulation. This simulates what happens to your skill if you start at 1000, your real skill should be 1600, the server has 24 players, and the people you are playing with all have skill of 1600.
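The spreadsheet itself is not reproduced in the thread, so the following is only a rough sketch of the same scenario, not the actual simulation. It assumes an Elo-style logistic expectation, a constant step `k` of 32, 12v12 teams, and that everyone else's rating stays fixed at 1600; all of these are illustrative guesses.

```python
import random


def win_probability(team_avg, enemy_avg, scale=400.0):
    """Assumed Elo-style chance that the first team wins."""
    return 1.0 / (1.0 + 10.0 ** ((enemy_avg - team_avg) / scale))


def simulate(rounds=1000, start=1000.0, true_skill=1600.0, others=1600.0,
             players=24, k=32.0, seed=0):
    """Track one player's rating while they play with 1600-skill players."""
    rng = random.Random(seed)
    team_size = players // 2
    estimate = start
    history = [estimate]
    # The real outcome depends on true skills, not on the ratings.
    true_team_avg = (true_skill + others * (team_size - 1)) / team_size
    p_real = win_probability(true_team_avg, others)
    for _ in range(rounds):
        # What the rating system expects, given the player's current rating.
        rated_team_avg = (estimate + others * (team_size - 1)) / team_size
        p_expected = win_probability(rated_team_avg, others)
        won = 1.0 if rng.random() < p_real else 0.0
        estimate += k * (won - p_expected)  # constant learning rate
        history.append(estimate)
    return history


def mean_estimate(at_round, runs=200):
    """Average rating after `at_round` rounds over many simulated players."""
    finals = [simulate(rounds=at_round, seed=s)[-1] for s in range(runs)]
    return sum(finals) / runs
```

In this sketch the player's rating is only one twelfth of the team average, so each round nudges it very little on average while the random win/loss noise stays large. That is one plausible reason for slow and jittery convergence, under the assumptions above.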
IronHorse, Developer, QA Manager, Technical Support & contributor. Join Date: 2010-05-08. Member: 71669.
@moultano
So correct me if I am wrong but it looks like average convergence or the time it takes to be statistically significant on average is >900 rounds? ( I was getting ~969 rounds with your simulation)
With a sample size of 20,000 rounds, the average round time is 15 minutes.
So that means it is taking over 242 hours on average for a player's individual skill to be accurately represented.
7% of players have 100+ hours of recorded play.
3% of players have 200+ hours...
No offense at all to moultano's great work, but I do stand by my original statement @Therius .. it takes too long to accurately represent the average player.
Trueskill might be better according to my napkin math calculations (that aren't accounting for segregating skill between the two teams just like Hive) as it would take 28.5 hours or less (assuming 24 player games) to converge.
It is important to note however, that according to Nordic's observations, while Trueskill is faster at convergence the Hive is more accurate. Maybe utilize Trueskill to obtain convergence and then rely on the Hive to further hone long term?
Iron is trying to compare TrueSkill to Hive skill based on how long it takes to get a semi-accurate skill rating. TrueSkill gets close after 10 rounds, but it takes ~90 rounds, as TacticalFreedom has implemented it, to get a statistically significant skill value.
Looking at the simulation last night, the expected skill met up with the underlying skill asymptote at 969 rounds. The average round of NS2 is 15 minutes. So 969 rounds * 15 minutes = 14,535 round.min. That translates into ~240 hours before the expected skill reaches the underlying skill.
To me it looks like it takes 300 rounds, or ~75 hours before the simulated skill gets close to the underlying skill based on those five simulated players. But then again, the time it takes is variable.
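The rounds-to-hours arithmetic used in these posts is easy to check. This small helper assumes the 15-minute average round quoted above; the function name is made up for the example.

```python
AVERAGE_ROUND_MINUTES = 15  # average round length quoted in the thread


def rounds_to_hours(rounds, minutes_per_round=AVERAGE_ROUND_MINUTES):
    """Playtime in hours needed to complete the given number of rounds."""
    return rounds * minutes_per_round / 60


hours_969 = rounds_to_hours(969)  # 242.25 hours
hours_300 = rounds_to_hours(300)  # 75.0 hours
```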
The simulation appears to have players ending pretty far from their underlying skill. It looks like three of the simulated players were close to their underlying skill at 300 rounds but then overshot their underlying skill. Then three were close to their underlying skill at 600 rounds.
At the bottom you will see how his hive skill and his MMR have fluctuated over time. This player has over 90 games recorded, so his rating is statistically significant. Also note that the hive skill is updated every game, while the TrueSkill MMR is updated daily. This will cause hive to oscillate more while MMR looks like a straight line.
Hive starts current rookies out at 0 skill. This guy has a hive skill of ~2800. Assuming he was a rookie who started at 0, which he is not, it would have taken him a long time to get to 2800. Hive only goes up in small increments.
He was given an MMR of ~2800 after 10 games. Over the next >80 games it has come to an MMR of ~3000. That is a difference of only 200.
Taking that player as an example, I put 2800 as the underlying skill parameter with a starting skill of 0. Background rate remained 1600.
Appears my picture isn't showing. If it isn't for you, here is a direct link. i.imgur.com/4xib8K5.png
With those parameters, it appears that 4/5 of the simulated players were close to the underlying skill after 300 rounds. Taking this at face value, it appears that it takes ~300 rounds for hive to get close to the intrinsic skill value while TrueSkill only takes 10 rounds.
moultano, Creator of ns_shiva. Join Date: 2002-12-14. Member: 10806.
There are two slightly different issues here:
1. How fast does a new player reach something close to an accurate skill value?
2. How fast does an existing player reach an accurate skill value when they get substantially better or worse?
I suspect that TrueSkill is much much better for 1. but about the same for 2. Thankfully, new players with a high skill level are pretty rare, so I don't think this is a huge issue, but I'd like to fix it anyways if we ever are able to make changes to hive again.
My preferred method for this would be to use AdaGrad. http://www.magicbroom.info/Papers/DuchiHaSi10.pdf It's relatively simple to implement. If I get a chance, I'll implement my ideal algorithm in another spreadsheet, and we can compare.
moultano (edited May 2016)
Adagrad works much better! This is comparing the current hive algorithm to one that uses a modified Adagrad. (Modified because adagrad assumes that the underlying parameter value is stationary so the learning rate should eventually go to 0.)
Implementing Adagrad correctly would require adding one more field to the database to store the sum of squared gradients over the player's history. The learning rate for Adagrad uses 1/sqrt of this sum. If we don't want to do that, a close approximation would be to use 1/sqrt(playtime) as the learning rate parameter.
What I see is that the underlying skill value is found much much quicker. I read the abstract of the adagrad paper you linked. Could you explain to me what you changed by using adagrad like I am five?
moultano
We're optimizing the values using stochastic gradient descent. The best ELI5 for stochastic gradient descent is "figure out which direction you are most wrong, and walk the opposite way."
Gradient descent doesn't tell you how far to walk in that direction, so you have to choose a "learning rate." For the current hive implementation, the learning rate is constant.
With adagrad, the learning rate is variable. To continue the analogy, adagrad says "when you walk in a direction, walk 1/sqrt(the total distance you've walked so far)."
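The difference between the two step rules can be sketched like this. It is not the spreadsheet's code: the gradient is taken to be "actual result minus expected result", and `base_rate`, `min_rate` and the constant `rate` of 32 are made-up values. The `min_rate` floor is one possible way to do the "modified" part, so the learning rate never reaches 0.

```python
import math
from dataclasses import dataclass


@dataclass
class PlayerRating:
    skill: float
    sum_sq_gradients: float = 0.0  # the one extra field per player


def constant_step(player, gradient, rate=32.0):
    """Constant learning rate: always walk the same fraction of the error."""
    player.skill += rate * gradient


def adagrad_step(player, gradient, base_rate=100.0, min_rate=8.0, eps=1e-8):
    """AdaGrad-style step: the more you have walked, the shorter the step."""
    player.sum_sq_gradients += gradient * gradient
    rate = base_rate / math.sqrt(player.sum_sq_gradients + eps)
    # Assumed modification: never let the rate drop below min_rate.
    player.skill += max(rate, min_rate) * gradient


def playtime_rate(rounds_played, base_rate=100.0, min_rate=8.0):
    """Approximation without the extra field: 1/sqrt(playtime)."""
    return max(base_rate / math.sqrt(max(rounds_played, 1)), min_rate)
```

A new player takes large steps at first and smaller ones as history accumulates, which is why the estimate can reach the right neighbourhood in far fewer rounds than with a constant rate.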
Implement paying $1 for 500 decrease in hive score -> endless revenue stream for NS2.
The few things I have to say about hive are:
- Good players staying at 1300-1500 hive round after round when they should be 2K+ .
- Bad næbs staying at ~1K round after round when they should be 300-, zero or even negative. Someone mentioned if someone's good or bad they will "quickly" descend or ascend to their eLOL level. I don't know about the patience levels of anyone else, but after 3 rounds of ridiculously stupid waste of time stax, I lose the desire to play the game for the day really.
- Shuffle putting 2 good players with useless overeLOLd næbs against undereLOLd players with nifty icons, round after round. Bonus eLOLs if one of the 2 good players goes com. Double the stax, double the fun.
Is there a post somewhere explaining how the skill ranking works? No matter how well or how poorly I perform in game, I always seem to get a skill ranking between 950-1050, and I'm rather confused as to whether the skill ranking was ever implemented properly. It generally doesn't affect my enjoyment of the game, of course, but I like being able to visually see that I'm improving at the game. As it is now, I get basically the same skill score at 20/20 KDR and 40/10 KDR.
It isn't a "skill" ranking. It is basically just a tally of your wins. The more your team happens to win, the more points you get. Broadly, having a high "skill" score doesn't necessarily mean you are "skilled" at the game. It just means that you happened to be on winning teams a lot of the time that you played. I personally know several people with high scores who aren't that good individually.
However, most of the players with very high scores (2000+) tend to be good at the game. They tend to be experienced, know how to use the flaws of the game to get kills, they play gathers with others of high game ability, or tend to stack on public servers, and bring about wins. For this tiny subset of the NS2 population, the "skill" score does reflect their actual ability.
If you are a casual-ish player you may do well in-game, but a foolish team, or poorly trained commander, will bring you loss and cause a reduction in your "skill" score. For the majority of NS2's surviving population, the "skill" score doesn't mean anything.
It is important not to compare the "skill" score of NS2 with other ranking systems such as DOTA2, SC2 or CSGO. Those games are far better at measuring your ability than NS2 is, mainly because those games have a far larger population on which to base "skill" measurements, and because the developers of those games actually make an effort to improve the ranking system to make it actually... useful.
The best policy is to ignore the "skill" number and just play the best you can, and try to have fun too. If it is causing you stress or frustration, it's not worth spending any time on it.
I would also like to add that these other games have a much less complex set of mechanics to measure in order to win. The strategy and skill involved in winning DOTA are less dependent on your team (note I said "less", not "not at all"). NS is a lot more complex and chaotic.
As long as you only take the win rate into account to get the player's skill, it will take too much time to converge.
I would like a system that calculates the player's skill from a combination of K/D, score/minute and win rate.
The greater the playtime, the more weight is given to the win rate.
Y and Z are constants chosen to obtain values comparable with skill point values.
X is a value that must reflect how fast hive skill becomes accurate.
with X*playtime < 1
else:
skill = hiveskill
If you make it recursive and set:
hiveskill = old_skill
then you can drastically decrease the value of X and obtain a good estimate of the player's skill in a few rounds.
However, one problem remains if the player's real skill increases faster than the hive skill is able to follow once X*playtime exceeds 1.
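The formula for the X*playtime < 1 branch is not shown in the post, so the following is only one possible reading of the proposal. It guesses that Y scales K/D, Z scales score/minute, and that X*playtime acts as a blending weight between the performance estimate and the hive skill. The default values of `x`, `y` and `z` are invented for the example.

```python
def blended_skill(hiveskill, kd_ratio, score_per_minute, playtime_hours,
                  x=0.1, y=400.0, z=100.0):
    """Hypothetical blend of performance stats and win-based hive skill."""
    weight = x * playtime_hours  # grows with playtime
    if weight >= 1.0:
        # Enough playtime: trust the win-based hive skill alone.
        return hiveskill
    # Assumed form: Y and Z bring K/D and score/minute onto the skill scale.
    performance = y * kd_ratio + z * score_per_minute
    return (1.0 - weight) * performance + weight * hiveskill


fresh = blended_skill(1000.0, 2.0, 9.0, playtime_hours=0)     # stats only
halfway = blended_skill(1000.0, 2.0, 9.0, playtime_hours=5)   # even blend
veteran = blended_skill(1000.0, 2.0, 9.0, playtime_hours=10)  # hive skill only
```

Under this reading a brand-new player is rated purely on K/D and score/minute, and the win-based rating takes over gradually as playtime accumulates.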
As long as you only take the win rate into account to get the player's skill, it will take too much time to converge.
Not true with adagrad as moultano has shown.
Hive skill does not directly take KDR into account. It does take it into account indirectly, though. Kill rate correlates with win rate with a coefficient of 0.65. A player who kills more is more likely to win. A player who wins more will have their hive skill go up faster. The same is true of score/minute, but to a lesser extent: score/minute correlates with win rate with a coefficient of 0.58. The problem is that hive does not update very fast.
When moultano added Adagrad to the hive algorithm for the simulation, hive skill updated quickly. If hive skill updated quickly, then there would be no point in using KDR or SPM for hive skill as you have described.
I don't understand how Adagrad works, and I don't have the time to dig into the PDF. So I will not try to demonstrate its limits or to compare it with another algorithm like the hive.
However, running the Excel sheet without changing any parameters doesn't show what I was expecting to see, and that is true for the hive skill simulation too. The simulated values vary a lot, take a long time to stabilize, and in some cases still have not found the player's skill value after roughly 1000 rounds.
So either I misunderstood the simulation, or there is something wrong in the simulation, or there is something wrong with both algorithms.
(Are POW / POWER equivalent in Excel? Formula translation issues...)
My remark points to the fact that the duration of a round is more or less 15 minutes.
If you shuffle every round, if many players such as rookies don't have a stabilized hive skill, if the deltas between the hive skill and the real skill of the players on each team compensate for each other, and if "luck" has a significant impact,
then for all those reasons I don't think that you can find an accurate value of someone's skill (let's say between 90% and 110% of their real skill)
in less than 100 rounds with any algorithm that is based only on win rate.
100 * 15 min = 25 hours.
So the big question is rather: what is your goal in terms of convergence speed?
What I would like to have, in terms of convergence, is something within 90-110% in under 10 hours, and between 0 and 10 hours the skill should grow like the expected skill line in one of the graphs above, that is, with a horizontal asymptote.
Hive skill does not directly take KDR into account. It does take it into account indirectly, though. Kill rate correlates with win rate with a coefficient of 0.65. A player who kills more is more likely to win. A player who wins more will have their hive skill go up faster. The same is true of score/minute, but to a lesser extent: score/minute correlates with win rate with a coefficient of 0.58. The problem is that hive does not update very fast.
Hive does not take KDR into account, directly or not (causality).
What you should correlate is the hive skill with the win rate. Then you will know the precision of your system.
However, to be able to do that you need historical data with the teams' average skills, the winning team and the player's skill.
My point in the post above about using K/D is only a quick fix that doesn't require a lot of work and allows faster convergence for rookies.
A system based only on K/D, as I explained to you on Discord, would require a lot more information about each kill
(the skill of the two players involved, their lifeforms, their upgrades, and it would only consider some of the cases to extrapolate the skill value).
When Moultano added adagrad to the hive algorithm, hive skill updated quickly. If hive skill updated quickly, then there would be no point in using KDR or SPM for hive skill as you have described.
Added?
I thought it was something for the future. Am I wrong?
I was saying he added it for the simulation. I updated my post to make that clear. I don't understand how adagrad works myself, and I did not even attempt to read the white paper on it. That is why I asked moultano to explain it like I was 5. He did that just a few posts above.
We're optimizing the values using stochastic gradient descent. The best ELI5 for stochastic gradient descent is "figure out which direction you are most wrong, and walk the opposite way."
Gradient descent doesn't tell you how far to walk in that direction, so you have to choose a "learning rate." For the current hive implementation, the learning rate is constant.
With adagrad, the learning rate is variable. To continue the analogy, adagrad says "when you walk in a direction, walk 1/sqrt(the total distance you've walked so far)."
I have calculated the correlation coefficient between hive skill and winrate before. Just recently I broke it down over time. This also shows that hive as it is currently takes a long time to find the intrinsic skill.
Comments
This is exactly the kind of thing that does piss me off about the hive skill system. The fact that I automatically get branded as "dead weight" and get deprived of comm support and generally get ignored in favor of people with higher skill rankings does affect my ability to enjoy the game. I get that a part of that might be because most NS2 players know and recognize other regulars and would prefer to support them instead of a "wildcard", but I can't help but feel that the hive score is affecting their decision on some level.
It doesn't really help that the hive score seems to change extremely slowly. The servers that I join seems to have an average of 1200~1600 skill rank (and often up to 1800 during certain times of the day), but my hive skill still wallows around 1000~1100. Since the average skill of the server is significantly higher than mine, shouldn't it raise my skill ranking significantly for each win and lower the penalty for my losses until my hive skill score is in line with the server average (if I'm pulling an even or positive W/L ratio)?
Someone's inevitably going to say "well maybe they're not giving you support because you're bad", so I'll just drop this here so that you can judge my skill level yourself. Thanks to Nordic for introducing me to that server, btw.
Regardless of whether or not the statistics can support this, there is plenty of discussion to be had around other areas where skill tracking can improve.
Your shitty behavior is still something I have to read and influences the culture of the game I play. Please do not contribute to the vast amount of shit talk the members of this community generate. There was no need for you to be patronizing and accuse that man of embracing ignorance over rational understanding.
Can we all move on to discussing on how skill tracking can be improved?
Skill tracking depends on you winning for your personal improvement to register in the system. There are plenty of times where you can make a critical choice or improve in an important fashion and lose the game. This defeats the concept of the hive skill being a tracker of individual skill. There may be an implied response from Hive skill over enough time, as the statistics state, but I do not feel this works to provide any sense of recognition, progress in growing skill or short/medium term representation of skill.
If any part of the traditional fps / rts metrics were included in the hive skill, we could have at least a strong base for the claim it represents individual skill and improve the gameplay experience. Factoring in kdr, avg accuracy, give hive score for responding to commander requests, over a rolling 30day period can help bolster some parts of the game that are lacking and improve the feeling of upwards progression a skill tracker usually provides in other games.
For commanding we can access the traditional rts metrics like apm and ns2 specifics ones like accuracy delivering med packs. I haven't presented these in a coherent mathematical logic, which I will come back later and do.
It does not. The argument that the skill system is voided because you can play well individually but lose because of your team has been debunked hundreds of times both in this thread and a dozen threads before this one. I know it feels bad intuitively, but you have to understand that for every round lost because of your team, there is a round you won undeservingly by being carried. These all even out in the long run, leaving behind your average contribution to the team's chances of winning.
Have you got absolutely anything to back this up with other than your gut feeling?
How many rounds does a player have to play in order for their hive score to accurately reflect their individual skill?
As I understand it, it goes like this.
What if team B won?
Is this your hive profile? I will assume it is.
http://hive.naturalselection2.com/profile/84956848
http://www.tacticalfreedom.com/stats/84956848
You have an extremely high K/D ratio. That K/D is in the top 5% of all ns2. Your kill rate is 73%. That is extremely high.
You get 9.3 Score/minute. This is above average. The average score/minute is ~7.
Your average marine accuracy is ~25%. This is really good.
Your W/L is 1.09. Your win rate is 52%. This is about as middle ground as it gets.
From this information I can tell you get a lot of kills. Your aim is good. You get a lot of score/minute so you have an active playstyle. This suggests to me that you likely have a high amount of individual skill.
I don't know why but your hive skill does appear to be lower than I would expect. But then again, I don't know.
Don't take it too seriously, I quoted the previous one for the lols. Apparently div 1 (and some of div 2) is filled with 5v5 (excluding coms) going at each other with hacks. The hilarity is that the people claiming this are the guys who are so clueless that they couldn't judge the skill between a banana and a cucumber.
If you are good it will show with time and people will start to recognize you. The guys with a high hive score or strong skill have been around long enough for people to know this.
Doesn't the possibility of choosing your team mates and your opponents create deviations between elo and reality, that are averaged out on public servers, in the form of unbalanced matches?
If we were to pcw another team, which is equal in (individual) elo hive skill, but clearly better as a team, making it highly unlikely for us to win, wouldn't some players win and some players lose undeserved?
Yes, yes it does. Any Elo rating system is only a representation of your skill in relation to the players you play with. People playing in distinct groups and rarely interacting (pubs vs. pcws, NA vs. EU) might have wildly different skill ratings even if they'd be equally capable.
Is this a problem in NS2? Hard to say. My gut feeling says that most, if not all competitive players still play a lot on pub servers, since there are only so many opportunities to play organised scrims. Likewise, I feel like EU and NA players mingle often as well. I might be wrong.
There are a lot of very short games in that Hive history. The playerbase there is probably hard to balance, so the algorithm generates unbalanced games and then multiplies the outcome by a near-zero.
Also, when shuffling teams, Steam playtime should be taken into consideration in the algorithm, because otherwise a rookie with 1500 elo will be regarded as equal to a vet with 1500 elo, which is not going to give you balanced games.
And if games are unbalanced at equal elo, hive is broken.
The gather servers could remain in the individual skill rating system though, since it is similar to pub in terms of mixing up teams.
When new players started at 0, I think the answer is "0 rounds", since most new players are so far below the average regular that they are basically just taking up a spawning slot.
I was curious, so I wrote a simulation. This simulates what happens to your skill if you start at 1000, your real skill should be 1600, the server has 24 players, and the people you are playing with all have skill of 1600.
You can see the simulation here and play with the params if you make a copy of the spreadsheet.
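For anyone who would rather read code than a spreadsheet, here is a rough Python sketch of the same kind of experiment. It is not the spreadsheet's actual formulas and not the real hive algorithm: the Elo-style logistic curve, the 400-point scale, the K value of 16 and the function names are all assumptions picked for illustration, and the other 23 players are simply held fixed at 1600.

```python
import random


def win_probability(avg_a, avg_b, scale=400.0):
    """Elo-style chance that team A beats team B, given average skills.

    The logistic curve and the 400-point scale are assumptions,
    not confirmed details of the hive system.
    """
    return 1.0 / (1.0 + 10 ** ((avg_b - avg_a) / scale))


def simulate(rounds=1000, start=1000.0, true_skill=1600.0,
             others=1600.0, team_size=12, k=16.0, seed=0):
    """Track one player's rating over many rounds on a 24-slot server.

    Everyone else is rated (and really is) `others`. `k` is a
    hypothetical constant learning rate.
    """
    rng = random.Random(seed)
    rating = start
    history = []
    for _ in range(rounds):
        # Team average as the system sees it, using the player's current rating.
        rated_avg = (rating + others * (team_size - 1)) / team_size
        # Team average as it really is, using the player's underlying skill.
        true_avg = (true_skill + others * (team_size - 1)) / team_size
        expected = win_probability(rated_avg, others)  # system's prediction
        actual = win_probability(true_avg, others)     # drives the real outcome
        won = 1.0 if rng.random() < actual else 0.0
        rating += k * (won - expected)
        history.append(rating)
    return history


if __name__ == "__main__":
    ratings = simulate()
    for n in (100, 300, 600, 1000):
        print(n, round(ratings[n - 1]))
```

Under these assumed values the player is only one of twelve on the team, so a 600-point rating error shifts the team average by just 50 points, and the expected gain in the first round is only about 1 point. Different seeds give noticeably different curves, so it is worth running several before reading anything into a single line.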
So correct me if I am wrong but it looks like average convergence or the time it takes to be statistically significant on average is >900 rounds? ( I was getting ~969 rounds with your simulation)
With a sample size of 20,000 rounds, the average round time is 15 minutes.
So that means it is taking over 242 hours on average for a player's individual skill to be accurately represented.
7% of players have 100+ hours of recorded play.
3% of players have 200+ hours...
No offense at all to moultano's great work, but I do stand by my original statement @Therius .. it takes too long to accurately represent the average player.
Trueskill might be better according to my napkin math calculations (that aren't accounting for segregating skill between the two teams just like Hive) as it would take 28.5 hours or less (assuming 24 player games) to converge.
It is important to note however, that according to Nordic's observations, while Trueskill is faster at convergence the Hive is more accurate. Maybe utilize Trueskill to obtain convergence and then rely on the Hive to further hone long term?
Looking at the simulation last night, the expected skill met up with the underlying skill asymptote at 969 rounds. The average round of NS2 is 15 minutes. So 969 rounds * 15 minutes = 14,535 minutes. That translates into ~240 hours before the expected skill reaches the underlying skill.
To me it looks like it takes 300 rounds, or ~75 hours before the simulated skill gets close to the underlying skill based on those five simulated players. But then again, the time it takes is variable.
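The rounds-to-hours conversions quoted in this thread are easy to check. A tiny sketch, assuming the 15 minute average round length given above (the function name is made up here):

```python
def hours_to_converge(rounds, minutes_per_round=15):
    """Convert a number of rounds into hours of play."""
    return rounds * minutes_per_round / 60


if __name__ == "__main__":
    print(hours_to_converge(969))  # 242.25
    print(hours_to_converge(300))  # 75.0
    print(hours_to_converge(100))  # 25.0
```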
The simulation appears to have players ending pretty far from their underlying skill. It looks like three of the simulated players were close to their underlying skill at 300 rounds but then overshot their underlying skill. Then three were close to their underlying skill at 600 rounds.
Iron got those values from hive data. That is over all players in hive.
Take a look at this profile. http://www.tacticalfreedom.com/stats/52881962
At the bottom you will see how his hive skill has fluctuated over time and how his MMR has fluctuated over time. This player has over 90 games recorded and is statistically significant. Also note that the hive skill is updated every game, while the TrueSkill MMR is updated daily. This will cause hive to oscillate more while MMR looks like a straight line.
Hive starts you out at 0 skill for current rookies. This guy has about a ~2800 hive skill. Assuming he was a rookie who started at 0, which he is not, then it took him a long time to get to 2800. Hive only goes up in small increments.
He was given an MMR of ~2800 after 10 games. Over the next >80 games it has come to an MMR of ~3000. That is only a difference of 200.
Taking that player as an example, I put 2800 as the underlying skill parameter with a starting skill of 0. Background rate remained 1600.
Appears my picture isn't showing. If it isn't for you, here is a direct link. i.imgur.com/4xib8K5.png
With those parameters, it appears that 4/5 of the simulated players were close to the underlying skill after 300 rounds. Taking this at face value, it appears that it takes ~300 rounds for hive to get close to the intrinsic skill value while TrueSkill only takes 10 rounds.
1. How fast does a new player reach something close to an accurate skill value?
2. How fast does an existing player reach an accurate skill value when they get substantially better or worse?
I suspect that TrueSkill is much much better for 1. but about the same for 2. Thankfully, new players with a high skill level are pretty rare, so I don't think this is a huge issue, but I'd like to fix it anyways if we ever are able to make changes to hive again.
My preferred method for this would be to use AdaGrad. http://www.magicbroom.info/Papers/DuchiHaSi10.pdf It's relatively simple to implement. If I get a chance, I'll implement my ideal algorithm in another spreadsheet, and we can compare.
As before, make a copy of this if you would like to play with the parameters.
https://docs.google.com/spreadsheets/d/17-Ubrqk76dpx21YsMjICnzGXhAuddSrtMXbyS5RvBZ8/edit?usp=sharing
Implementing adagrad correctly would require adding one more field to the database to store the sum of squared gradients over the player's history. The learning rate for adagrad is 1/sqrt of this sum. If we don't want to do that, a close approximation would be to use 1/sqrt(playtime) as the learning rate parameter.
What I see is that the underlying skill value is found much much quicker. I read the abstract of the adagrad paper you linked. Could you explain to me what you changed by using adagrad like I am five?
Gradient descent doesn't tell you how far to walk in that direction, so you have to choose a "learning rate." For the current hive implementation, the learning rate is constant.
With adagrad, the learning rate is variable. To continue the analogy, adagrad says "when you walk in a direction, walk 1/sqrt(the total distance you've walked so far)."
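To make the adagrad idea above a bit more concrete, here is a minimal Python sketch of a per-player update that stores the sum of squared gradients and scales each step by 1/sqrt of that sum. This is only an illustration, not the spreadsheet's or hive's implementation: the class name, the `base_rate` of 400, the `eps` guard and the use of (won - expected) as the gradient are all assumptions.

```python
import math


class AdaGradSkill:
    """One player's skill with an adagrad-style shrinking step size."""

    def __init__(self, skill=0.0, base_rate=400.0, eps=1e-8):
        self.skill = skill
        self.base_rate = base_rate   # hypothetical scale for the steps
        self.sum_sq_grad = 0.0       # the extra field the database would need
        self.eps = eps               # avoids dividing by zero

    def update(self, won, expected):
        """won is 1.0 or 0.0; expected is the predicted win probability."""
        grad = won - expected
        self.sum_sq_grad += grad * grad
        step = self.base_rate / (math.sqrt(self.sum_sq_grad) + self.eps)
        self.skill += step * grad
        return self.skill


def playtime_learning_rate(base_rate, playtime):
    """The cheaper approximation: base_rate / sqrt(playtime).

    Clamping playtime to at least 1 is an added assumption so that
    brand new players do not cause a division by zero.
    """
    return base_rate / math.sqrt(max(playtime, 1.0))


if __name__ == "__main__":
    player = AdaGradSkill()
    for _ in range(5):
        # Five upset wins in a row: each step is smaller than the last.
        print(round(player.update(won=1.0, expected=0.5), 1))
```

The point of the design is that a new player takes big steps while the system knows nothing about them, and the steps shrink automatically as their history accumulates.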
The few things I have to say about hive are:
- Good players staying at 1300-1500 hive round after round when they should be 2K+ .
- Bad næbs staying at ~1K round after round when they should be 300-, zero or even negative. Someone mentioned if someone's good or bad they will "quickly" descend or ascend to their eLOL level. I don't know about the patience levels of anyone else, but after 3 rounds of ridiculously stupid waste of time stax, I lose the desire to play the game for the day really.
- Shuffle putting 2 good players with useless overeLOLd næbs against undereLOLd players with nifty icons, round after round. Bonus eLOLs if one of the 2 good players goes com. Double the stax, double the fun.
May Arpad save us all.
I would also like to add that these other games have a much less complex set of mechanics to measure for winning. The strategy and skill involved in winning DOTA are less dependent on your team (note I said less dependent, not independent). NS is a lot more complex and chaotic.
I would like a system that calculates the player skill based on a combination of K/D, score/minute and winrate.
The greater the playtime, the more weight is given to the winrate.
If X*playtime < 1:
skill = (K/D * 0.5*Z + scorePerMin * 0.5*Y) * (1 - X*playtime) + hiveskill * (X*playtime)
else:
skill = hiveskill
Y and Z are constants chosen to obtain values comparable with skill point values.
X is a value that must reflect how fast hiveskill becomes accurate.
If you make it recursive and set:
hiveskill = old_skill
then you can drastically decrease the value of X and obtain a good estimate of the player's skill in a few rounds.
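Written out as a small Python function, the blend proposed above could look like the sketch below. The function and parameter names are made up for illustration, and no values for X, Y and Z are given in the post, so whatever you pass in is a placeholder.

```python
def blended_skill(kd, score_per_min, hive_skill, playtime, x, y, z):
    """Blend performance stats with hive skill, weighted by playtime.

    x controls how quickly the weight shifts to hive skill;
    y and z scale score/minute and K/D into skill points.
    """
    weight = x * playtime
    if weight >= 1:
        # Enough playtime: trust the hive skill entirely.
        return hive_skill
    performance = kd * 0.5 * z + score_per_min * 0.5 * y
    return performance * (1 - weight) + hive_skill * weight


if __name__ == "__main__":
    # Placeholder constants, purely to show the shape of the blend.
    for hours in (0, 50, 200):
        print(hours, blended_skill(kd=2.0, score_per_min=8.0,
                                   hive_skill=1000.0, playtime=hours,
                                   x=0.01, y=100.0, z=500.0))
```

With no playtime the result is pure K/D and score/minute; as X*playtime approaches 1 it slides over to the hive skill.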
However, a problem arises if the player's real skill increases faster than the hiveskill is able to follow once X*playtime reaches 1.
Not true with adagrad as moultano has shown.
Hive skill does not directly take KDR into account. It does take it indirectly though. Killrate correlates to winrate with a coefficient of .65. A player who kills more is more likely to win. A player who wins more will have their hive skill go up faster. The same is true with Score/minute but to a lesser effect. Score/minute correlates to winrate with a coefficient of 0.58. The problem is that hive does not update very fast.
When Moultano added adagrad to the hive algorithm for the simulation, hive skill updated quickly. If hive skill updated quickly, then there would be no point in using KDR or SPM for hive skill as you have described.
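For reference, the correlation coefficients quoted above are Pearson coefficients. A minimal Python sketch of how one would compute such a value from two per-player columns (say kill rate and win rate) is below. The function name is made up, and the numbers in the usage example are invented toy values, not hive data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either list is constant.
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented toy values, only to show the call.
    kill_rate = [0.35, 0.48, 0.52, 0.61, 0.73]
    win_rate = [0.44, 0.47, 0.55, 0.51, 0.58]
    print(pearson(kill_rate, win_rate))
```

A value near 1 means the two columns rise together, near 0 means no linear relationship, and near -1 means one falls as the other rises.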
I don't understand how adagrad works, and I don't have the time to dig into the PDF. So I will not try to demonstrate its limits or compare it with another algorithm like hive.
However, running the spreadsheet without changing any parameters doesn't show what I was expecting to see, and that holds for the hive skill simulation too. The simulated values vary a lot, take a long time to stabilize, and in some cases still haven't found the player's skill value after roughly 1000 rounds.
So either I misunderstood the simulation, or there is something wrong in the simulation, or there is something wrong with both algorithms.
(Is POW / POWER equivalent in excel? formula translation issues...)
My remark points to the fact that a round lasts roughly 15 minutes.
If every round is shuffled, if many players such as rookies don't have a stabilized hive skill, if the deltas between hive skill and real skill of the players on each team compensate each other, and if "luck" has a significant impact,
then for all those reasons I don't think you can find an accurate value of someone's skill (let's say between 90% and 110% of their real skill)
in less than 100 rounds with any algorithm that is based only on winrate.
100 * 15 min = 25 hours.
So the big question is rather: what is your goal in terms of convergence speed?
My objective, what I would like to have in terms of convergence, is to be within 90-110% in under 10 hours, and between 0 and 10 hours the skill should grow like the expected skill line in one of the graphs above, so with a horizontal asymptote.
Hive does not take KDR into account, directly or not (causality).
What you should correlate is the hiveskill with the winrate. And you will know the precision of your system.
However, to be able to do that you need historical data with the team average skills, the winning team and the player skill.
My point in the post above is that using K/D is only a quick fix that doesn't require a lot of work and allows faster convergence for rookies.
A system based only on K/D, as I explained to you on Discord, would require a lot more information about each kill
(the skill of the 2 players involved, their lifeforms, their upgrades, and it would only take some of the cases into account to extrapolate the skill value).
added?
I thought it was something for the future. Am I wrong?
I have calculated the correlation coefficient between hive skill and winrate before. Just recently I broke it down over time. This also shows that hive as it is currently takes a long time to find the intrinsic skill.
I saw that graph before; however, I thought that it was the correlation and not the coefficient of correlation, my bad.