And how exactly does that change the Elo rating? They will predominantly be on the winning team and their rating will go up. On average, other players will not always be on their team and over time their Elo rating will normalise.
I'd need to sit down with a pen and notebook, but my initial hypothesis is that in a multiplayer game with 8+ concurrent players per team and spread over different countries and servers, it will take less time to average out than with fewer people involved in each game. Maybe only 100 games each to get an accurate rating, but that's just conjecture without working it out.
Yeah, the thing people don't seem to understand about Elo is that it only works after you've played a lot of games. It WILL work, but any early figures are probably inaccurate. After 50 rounds or so your Elo is probably correct. Maybe 100.
I was just talking to the people who are making it sound like K:D doesn't affect balance in a match...
It doesn't. Especially the deaths. I can go 20/20 and contribute more than someone on the other team going 20/3, and meanwhile the guy with 30 assists or the gorge who has been keeping marines out of an important room for the last 10 minutes might be doing even more. Objectives matter more, and who you kill and where and when are all more important. Sometimes it's even better NOT to kill someone.
It should all be based purely on wins. Score is interesting, and a useful metric when you have absolutely no long term basis for comparison, but it doesn't always account for better tactics. Winning DOES account for tactics. If a player is actually good at helping his teams win (no matter HOW he does it) he will win more often and against better opponents, ultimately improving his rating. Meanwhile the people who win with him will fail and drop back down when they aren't matched with him if he was carrying them
It depends. Statistically, if you win more engagements, you normally win. But that's the beauty of this game. Like you said, you could be a capper and shoot only when needed. Both aspects are important, but IMO neither is more important than the other. You need teamwork for expanding/defending and you need individual skill to kill any opposition.
Strongly disagree. Winning a game tells much less about your personal competence than K/D or score. A marine with 80-2 stats and triple the score of the second-highest marine, thanks to harassed extractors and chased Onoses, is most probably the best player on the server, but if the rest of the marine team or the commander is lacking in skill, he might not get any recognition for it. The same applies in the reverse situation.
One of the biggest problems with the current hive scoring system is that it gives victories too much weight. Not only does it lead to a rating not representing the actual skill of a player, but it will also encourage stacking, because you cannot get a good rating if you're on the losing team, no matter how well and valiantly you fight.
@Therius the whole point is that it's an average. That 80-2 player will on average carry more games to victory because they are a good player. You shouldn't take 1 game in isolation.
The most statistically robust system is one based on a median polish model over as many different data points as possible, not an over-complicated equation that tries to capture everything from one game (the term for this is overfitting, and it is bad practice).
It still encourages stacking. Stackers win much more on average than non-stackers or counter-stackers, so it's not a measure of skill; it's a measure of picking the right team.
It seems the less people know about data analysis, the more conclusive their positions seem to be.
The current system is exploitable by team stacking, as there is a human designed +/- score adjustment based on win/loss (for lack of any other apparent heuristics/processes).
This can be improved though, compare populations grouped by win rate category, so if someone is middle of the pack score wise with a 5:1 win rate, his skill can be balanced comparably to someone middle of the pack at a 3:1 win rate, 1.5:1 win rate, or a .5:1 win rate. This should account somewhat for the inflated skill rating of team stackers. This simple routine will improve Hive greatly. (Once again, Elo, and user mandated scoring systems are inherently the wrong choice for gauging player ability in this game)
@roobubba I think tough love is better in this game. It's not like WoW where you can be patient and teach casuals how to be useful. In this game they need awareness and focus akin to an RTS like StarCraft.
Getting stomped in SC is the #1 motivation to improve!
@therius the trouble is that stacking is promoted by k:d based scoring just as much: the best way to ensure a high k:d is to be on the same team as that other good player.
I suppose you could scale the overall scores according to how 'balanced' the teams are. One way to do that would be to have a score assigned for the round and multiply that by a weighting factor (0-1) where if the teams are perfectly balanced, everyone gets 1 times their score and if one team has cumulative skill that outweighs the other team, their scores are weighted down accordingly.
I still don't think k:d should feature though. The goal of the game is to kill the hive/cc. If anything other than achieving the winning condition is used in calculating rank, there will be unwanted effects as people change their goal from winning the game to getting the highest k:d possible. That's a large part of the problem with the current system that's already resulted in rookie farming.
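The balance weighting suggested above could be sketched roughly like this. It is only an illustration, not how Hive works: the function names are made up, and using the ratio of the weaker team's total skill to the stronger team's total as the 0-1 factor is an assumption.

```python
def balance_weights(team_a_skills, team_b_skills):
    """Return (weight_a, weight_b), each in the range 0-1.

    Assumption: the stronger team is scaled down by the ratio
    weaker_total / stronger_total; the weaker team keeps a weight of 1.
    """
    total_a = sum(team_a_skills)
    total_b = sum(team_b_skills)
    if total_a <= 0 or total_b <= 0:
        # No usable skill data for one side: leave scores untouched.
        return 1.0, 1.0
    if total_a > total_b:
        return total_b / total_a, 1.0
    return 1.0, total_a / total_b


def weighted_round_score(raw_score, weight):
    """Scale a player's round score by their team's balance weight."""
    return raw_score * weight
```

With perfectly balanced teams both weights are 1, so everyone keeps their full score. If one team has twice the cumulative skill of the other, its players only get half of what they scored that round.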
I'd like to see a simple system I'll dub "Commander Selection":
At the start of the round, 2 players go commander.
Then, they elect their teams. Just like in sports class (if you like to remember your school days) - except this time the nerds get picked first for once.
This way, huge stacks of all top-scorers vs. all-greens won't happen anymore, because every commander would make sure to pick at least 50% of last round's top-scorers.
True, but I didn't say the current system is good either.
However, try as much as you like to undermine the value of K/D in determining skill, but the truth is that if you look at the scoreboard of any given match, chances are that the people who have been most useful towards making the team successful are the ones with the highest score and the most kills. Have you ever seen a truly bad player dominating the scoreboard simply by farming something, or a truly good player (who's also playing well) dangling near the bottom while still being extremely useful to the team? Doesn't really happen. The only role I can think of that has a huge discrepancy in the benefit/score ratio is the gorge. Another one is the recapper marine/RT harasser skulk combo, but it's much more situational and actually raises your score.
Not only does a high K/D help the team by simply keeping the enemy in the spawn queue, but it also indicates that the player has been doing useful things and racked up the kills in the meantime. It also gives you a chance to prove that you are good even if you're on the losing team.
Both winning and K/D-based ratings encourage stacking, but a K/D (or, rather, score) based one doesn't punish anti-stacking as much.
Nope because your Hive skill is what you did in your last 20 matches. You could have a skill of 100 over the course of 1 billion matches but if you then play excellently in the next 20 you will have a gajillion.
Agree with everything you said but I've been the lone pushing guy (for RTs) with a commander who drops a medpack every couple of builds and I am taken out time and again while all my teammates are in, say, departures defending an RT blueprint and racking up kills such that I am bottom of the scoreboard for a good 5 minutes.
Edit: I can't remember which server I was on (dumbass brits I think?) but we did a random shuffle for the kdr and I ended up on the same team as the guy who was second to me. Then in the next round I went 20;52;0 and that round wasn't even recorded (because when I hit escape to leave the game I noticed the score hadn't changed).
No one is arguing that the sample size is currently too limited... But to imply that a guy staring at an RT is demonstrating the same level of skill as the guy who is 20-1 is ridiculous. If the person building that RT is as skillful, he will go 20-1 in future games when he is in the front line, and that will even out.
But a person who dedicates himself to building is not as skillful. He is helpful and does contribute, but he is not demonstrating skill.
As everyone has said, NS2 is a team game! But our scores are not. Our scores are individualized and should reflect individual skill, not the skill of those you play with. If a player goes 0-20 against me he should not be considered equal to the guy on his team who went 5-5 against me.
To base individual score on team W/L is to suggest that the individual won or lost the game... In some instances an individual may well have won or lost the round, but this approach assumes it for every player in every round.
Edit: I had this big long reply but decided to simply ask 1 question. Could you rank every player in the NFL/NBA/NHL... based solely on career win/loss? If you answer yes to this I'm sorry but your opinion is wrong.
It seems a great point, but in those leagues you play on the same team all the time, so if your team is terrible and you're brilliant, win/loss should carry a lower weighting. If you're playing on different teams all the time then it, hypothetically, averages out. So the analogy has a flaw.
HOWEVER, I still think that win/loss should mean nothing. And you might as well turn the hive thing off in competitive servers because it's for matchmaking (which is public?) so it shouldn't need to apply.
My skill level is now 115. I keep starting servers at home for my mod pg, and run around killing myself through rocket jumps, seems it is damaging my skill ability.
Funny to see it change and I haven't played vanilla NS2 yet
How could that happen? It should not be possible, only vanilla gameplay NS2 servers are included in the stats.
The hbz private server is counted on hive stats, which is ridiculous! Nsl mod servers should not be counted.
Mad Max, of course the answer to your question is no, but the question is completely wrong in this context. Even on pub community servers where you see many of the same people frequently, the teams are still mixed up. That is the crucial point that would make win/loss work: the assumption is that the team makeup keeps changing, which is over time a very likely scenario.
The analogy may not be perfect, so I challenge someone to find a better one... The supposition that the mixing of teams is what makes this work is pure unsubstantiated guesswork. The fact is that it would not work in any established team-based game I can think of, so guessing that this is the only team game it would work for is questionable...
Loading mod via -game => modlist 0 => hive gets enabled for server
If @AceDude detects modded server files (-game would prob cause this as it's not a 'mounted' mod) the server will get permanently blacklisted. Don't do that.
I never intended to do that. It's just the reason my test server appeared at hive.
BTW there is no way for AceDude to see this if you don't run an obvious game-changing mod. That's the reason there is a report function on the hive whitelist page.
If the team is stacked you are facing a team with a weaker rating. If you win against a weaker team you gain next to no rating yourself and they lose nothing either. On the off chance they win though...
Anyway, pure wins/losses will work in literally any game where winning matters. The only problem with the system is Elo hell, or the question of to what degree a skilled player can offset an unskilled team. When a player is better than his team but not enough so to carry the team to victory, his rating will never increase and he will be stuck matched with players far beneath him. There is a certain minimum threshold required to overcome the enemy team, and if you can't ever pass the threshold you can't be rated based on how close to it you came. The hope though is that over a large number of games you will eventually influence some wins, more so than the players who have been holding you back, and it is therefore just a matter of time before the problem corrects itself.
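To put rough numbers on the stacked-team point above, here is a minimal sketch of a win/loss-only update using the standard Elo expected-score formula. Treating a team as the average of its members' ratings, giving every member the same adjustment, and the K-factor of 32 are all assumptions for illustration; this is not Hive's algorithm.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def team_delta(team_a, team_b, a_won, k=32.0):
    """Rating change for each player on team A (team B gets the negative).

    Assumption: a team's strength is the plain average of its members'
    ratings, and every member receives the same adjustment.
    """
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    actual = 1.0 if a_won else 0.0
    return k * (actual - expected_score(avg_a, avg_b))


# A stack averaging 400 points above its opponents:
stack = [1900, 1900, 1900]
others = [1500, 1500, 1500]
gain_if_stack_wins = team_delta(stack, others, a_won=True)
loss_if_stack_loses = team_delta(stack, others, a_won=False)
```

Under these assumptions the stack gains only about 3 points per player for the expected win, but drops about 29 each on the off chance it loses.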
Both examples have a K:D of 1:1 or higher. You can contribute by not killing, I know; this isn't combat. But if you're matchmaking pubs... well, would you want a 5:8:10 on your team or a 23:10:3? Which team do you think would win in a pub? Add a couple more high K:D ratios to that winning team... you'd need a lot more coordination, which, sadly, isn't always there on pubs. So yes, K:D ratio plays a part.
AceDude (Join Date: 2007-08-26, Member: 61994, NS2 Developer):
Well, looks like showing the stats in main menu was a bad idea after all. Sooo much confusion...
There is nothing named ELO when we're talking about ranks. There is Mr. Elo, who invented the chess rating system; you can read about it here: http://en.wikipedia.org/wiki/Elo_rating_system. It's the best rating system around, but the trouble is: it was designed for a 1v1, 100% symmetric game. Yeah, we could adapt it for a multiplayer game, but it's hard to do so. And what about asymmetrical games? Let's say: marines vs aliens? Sounds damn hard.
Many games tried to implement Elo rating (CoH2, BF4 etc.) and they never did it properly. It's impossible to do it perfectly; you can only keep adjusting your system. Most games keep the ranking hidden. Why is that? It's because it is IMPOSSIBLE to measure the skill of a player in a multi-layered, complicated multiplayer game. You simply can't do that. You can try, you can improve your algorithms, but you will never get rid of people saying it's broken. There are really, really many situations that you can't predict without an AI (we don't have AIs around, in case you didn't know). Take a minute and try to imagine a few examples on your own. Gorge rush? Shotgun jetpacker killing a wounded fade? Fade picking off 5 highly ranked marines that lost their armory?
Remember what the job of a skill rating is. It must tell you how good a given player is. And damn, if a guy has a KDR of 6 in the Hive, it DOES mean something. If he spends his time "farming" the skill, he IS getting better! Look, if I removed the numbers and replaced them with letters based on percentiles...
Anyway, I just adjusted the skill algorithm. Surprise, KDR is (and always has been!) the LEAST important factor. I also completely removed the multiplier for now.
Ah, and I'm going to whitelist new servers every Monday evening CEST.
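For what it's worth, the "letters based on percentiles" idea mentioned above is easy to sketch. This is purely hypothetical: the cutoffs, the letters, and the function names are invented for illustration and are not anything Hive actually uses.

```python
from bisect import bisect_left


def percentile_rank(skill, population):
    """Percentage of the population with a strictly lower skill value."""
    ordered = sorted(population)
    below = bisect_left(ordered, skill)
    return 100.0 * below / len(ordered)


def letter_for(skill, population):
    """Map a skill value to a letter; the cutoffs here are assumptions."""
    pct = percentile_rank(skill, population)
    for cutoff, letter in ((90, "A"), (70, "B"), (40, "C"), (10, "D")):
        if pct >= cutoff:
            return letter
    return "E"
```

A player would then see only a letter, so small swings in the underlying number would not show up in the main menu at all.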
The most statistically robust system is one based on a median polish model over as many different data points as possible, not an over-complicated equation that tries to capture everything from one game (the term for this is overfitting, and it is bad practice).
This. 100x this.
Nope because your Hive skill is what you did in your last 20 matches. You could have a skill of 100 over the course of 1 billion matches but if you then play excellently in the next 20 you will have a gajillion.
Also I've gotten a few 1000+ when losing:/