Forum Game: Who can hand balance better than shuffle?
Nordic
Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
A while back I was trying to beat the shuffle algorithm with hand-crafted teams based on hive skill only. I was unable to do so; shuffle still made better teams than mine. So I thought, why not make a little game out of this? Who can beat the shuffle?
How does this game work?
I will provide a list of 16 players. You will try to balance the teams by hand to the best of your ability. In about a week, more or less depending on interest, I will post the teams that shuffle would make. Shuffle tries to balance by minimizing the difference between the two teams' average skill and standard deviation, so I recommend you try to do the same. If you balance without using average skill or standard deviation, please state why you think the teams should be balanced that way.
If you have the ability to use a computer algorithm to balance these teams, please refrain from using it. This is a contest of hand crafted teams.
What do you win? Internet cookies and maybe an awesome.
For this exercise we are assuming that hive skill values are fairly accurate. In reality that is not always the case, but for the intents and purposes of this forum game we will assume so.
Try to balance these 16 players.
3094 2715 1887 1730 1691 1622 1432 1420 1281 1276 1202 1191 1130 1061 902 520
To help you, I made a little Google sheet you can test your teams with. Please do not troll it, and delete your work afterwards so others can't use it.
https://docs.google.com/spreadsheets/d/1xaTg5QVkxVej3Dg_VwkL5EIb6jBEnRjqJUWIyDjZ0D0/edit?usp=sharing
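For anyone who wants to check a hand-made split without the sheet, here is a minimal Python sketch of the two numbers mentioned above. It assumes population standard deviation (`pstdev`); whether shuffle or the sheet uses the population or the sample form is not stated in this post, and `score` is just an illustrative name.

```python
from statistics import mean, pstdev

# The 16 hive skill values from the post above.
PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def score(team_a, team_b):
    """Return (average skill difference, standard deviation difference)."""
    avg_diff = abs(mean(team_a) - mean(team_b))
    # Assumption: population standard deviation; swap in stdev() for the sample form.
    sd_diff = abs(pstdev(team_a) - pstdev(team_b))
    return avg_diff, sd_diff

# A deliberately naive split (every other player) just to show the call.
team_a = PLAYERS[0::2]
team_b = PLAYERS[1::2]
avg_diff, sd_diff = score(team_a, team_b)
print(f"average diff: {avg_diff:.2f}, std dev diff: {sd_diff:.2f}")
```

Lower is better on both numbers; replace `team_a` and `team_b` with your own hand-picked lists.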
Comments
Edit - This is basically a 1-2-2-2 placing. With the exception of the first and last couple, the skills are virtually identical. It's really not that hard to balance. It gets harder with a greater variance in skill.
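A minimal sketch of the 1-2-2-2 placing, assuming it means a snake draft over the skill-sorted list: one team takes the top player, then the teams alternate taking two at a time. `snake_draft` is a hypothetical helper name, not anything from shuffle itself.

```python
# The 16 hive skill values from the opening post.
PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def snake_draft(players):
    """Split players into two teams using a 1-2-2-2-... pick order."""
    ordered = sorted(players, reverse=True)
    teams = ([], [])
    for i, skill in enumerate(ordered):
        # Index 0 goes to team 0, indices 1-2 to team 1,
        # indices 3-4 to team 0, and so on.
        teams[((i + 1) // 2) % 2].append(skill)
    return teams

team_1, team_2 = snake_draft(PLAYERS)
print(team_1, sum(team_1) / len(team_1))
print(team_2, sum(team_2) / len(team_2))
```

On these 16 values the team averages come out at 1507.125 and 1512.125, and the team that picks first ends up with both the 3094 and the 520 player.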
The problem is still the same as it is now: what if one guy is 3k elo and the rest are 900-1300 elo? The team with the 3k elo will have a significant advantage; however, the pressure to perform isn't there. Which leaves the question: does the 3k guy wanna be a pub-stomper? That's a question the elo can have no say in.
Hmm.... If it needs to be more numerically satisfying, the aforementioned method might work for the first 3-5 "picks", to distribute the pressure of performing, after which the rest are either put in random teams or distributed using the current system. A variable number of "picks" could depend on the remaining players' standard deviation going below a set threshold (could be a server-decided number?). This would only work if it's true that higher elo means higher pressure of performing, and that the lower part actually has only low impact.
NB: By pressure of performing this is meant: You can not expect greenies to have +25% LMG accuracy marine side, whereas you can more reasonably expect it from high skilled competitive players. Therefore the pressure of performing has nothing to do with your skill level in relation to other players, but to do with you performing to your own skill level, which elo in an ideal world would be an indicator of.
Is a 1900 player automatically better than a 1400?
The system says yes, reality says no.
A 2000 wooza player is far away from the 2000 gather player for example.
THAT'S why shuffle can't end in perfectly balanced rounds.
Shuffle is good to break obvious stacks, that's all.
You can play around with the numbers in whatever way, but you can't predict the human factor.
That doesn't mean I claim there's a "real" balance from the teams it produces, but you can't expect a computer to be able to do more than number crunching.
I am really glad you balanced with 1-2-2-2 because we can then compare shine's results. These are relatively easy teams to balance, which is why I chose this set of players. Hand balancing as you have done produces pretty good teams, but shine does give an even better team composition.
Shuffle balance will be statistically better, but for the second part of this thread I want to compare them qualitatively. @Ixian, when I do share the team composition that shine gives I expect you to give an explanation of why Aeglos are better. You seem very adamant without even knowing the shuffle composition.
Oh my god, shocking surprise.
Here's a game that people might actually be interested in:
Write an algorithm that shuffles better than shuffle..
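As a baseline for that kind of game, 16 players is small enough to try every 8v8 split. The sketch below is not shuffle's actual algorithm; it assumes a made-up cost (average difference plus standard deviation difference, equally weighted, population standard deviation), and `cost` and `best_split` are hypothetical names.

```python
from itertools import combinations
from statistics import mean, pstdev

# The 16 hive skill values from the opening post.
PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def cost(team_a, team_b):
    # Assumed objective: equal weight on both differences.
    return (abs(mean(team_a) - mean(team_b))
            + abs(pstdev(team_a) - pstdev(team_b)))

def best_split(players):
    """Exhaustively search all even splits and return (cost, team_a, team_b)."""
    first, rest = players[0], players[1:]
    half = len(players) // 2
    best = None
    # Pin the first player to team A so mirrored splits are not counted twice.
    for idx in combinations(range(len(rest)), half - 1):
        chosen = set(idx)
        team_a = [first] + [rest[i] for i in idx]
        team_b = [rest[i] for i in range(len(rest)) if i not in chosen]
        c = cost(team_a, team_b)
        if best is None or c < best[0]:
            best = (c, team_a, team_b)
    return best

best_cost, best_a, best_b = best_split(PLAYERS)
print(best_cost, best_a, best_b)
```

That is 6435 splits for 16 players, so it finishes quickly, but the count grows fast with more players and the result depends entirely on how the two differences are weighted.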
As expected, the average skill values are almost exactly the same. However, the standard deviations are wildly different. This is due to the team picking first getting both the best individual player as well as the worst individual player. Thus, Team 1 has both extremes within its ranks, inevitably leading to larger standard deviation.
I then hypothesised another way to create the teams. To avoid one team having both extremes, I started with Team 1 having both the best and the second-worst player, while the Team 2 got the second-best and the worst player. Now, obviously Team 2 is worse off at this point, since both of its players are worse than the corresponding players in Team 1. To compensate for this, the rest of the player pool is divided into pairs starting from the top, and for every pair, the better player goes to Team 2. The teams become the following:
The standard deviations are much closer to each other than in the first example. This is done at the expense of a larger difference in average skill, but the difference isn't huge.
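The second draft described above can be sketched as follows, assuming it was applied to the 16 skill values from the opening post and that population standard deviation is meant. `offset_draft` is a hypothetical name.

```python
from statistics import mean, pstdev

# The 16 hive skill values from the opening post.
PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def offset_draft(players):
    """Team 1 gets best + second-worst, Team 2 gets second-best + worst,
    then each remaining pair (from the top) sends its better player to Team 2.
    Assumes an even number of players."""
    ordered = sorted(players, reverse=True)
    team_1 = [ordered[0], ordered[-2]]
    team_2 = [ordered[1], ordered[-1]]
    middle = ordered[2:-2]
    for i in range(0, len(middle), 2):
        team_2.append(middle[i])      # better player of the pair
        team_1.append(middle[i + 1])  # other player of the pair
    return team_1, team_2

team_1, team_2 = offset_draft(PLAYERS)
for name, team in (("Team 1", team_1), ("Team 2", team_2)):
    print(name, sorted(team, reverse=True),
          f"mean={mean(team):.2f}", f"stdev={pstdev(team):.2f}")
```

On these values the averages come out at 1537 and 1482.25, a gap of 54.75, which is the price paid for the closer standard deviations.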
I have absolutely no theory behind any of these calculations, just two draft processes that struck my mind first. Out of these two options, I'd choose the latter just out of pure gut feeling, since I have no way to judge the trade-off between mean and stdev meaningfully. How shuffle optimises this trade-off, I don't know, but the trade-off is always there.
The reason my score is crashing so hard is that I was a 1000 score player, as were most others. We are now playing against experienced and even better players with lower scores than us, which is causing this rebalancing to happen. The trouble is that most people at 1000-1800 should in reality be at 0-800; obviously that is not entirely accurate, since they must be roughly average to still be around the thousand mark.
Realistically, 0-800 is not likely, and the range would probably be 400-1200 or so. The problem is, with all the new players scoring points from 0, older players losing are going to suffer huge points losses as the system tries to converge on itself. But with the small playerbase, and the separatism in play between pub and comp, essentially, this convergence is never going to happen.
The only way to start getting meaningful information from data is to have accurate data. What we have from the current hive system is anything but that. That is without even mentioning the whitelisting issue.
Essentially, all the hive analysis stuff is just practice in analysis, because the data you work with is an inaccurate representation. A total hive reset is the only way, with non-rookies' info somehow able to pass over to mark them as non-rookies so they don't need to do the tuts etc.
I created this thread separately from the others to focus on shuffle itself, apart from the skill system. For the purpose of this thread I would like us to assume hive skill is fairly accurate.
So let's keep the hive skill discussion in the other threads, at least for now.
The problem is not the shuffle or the Hive.
It is the total lack of players.
People don't like being told what to do, so they'll either leave, or switch teams > wrecking balance. Sure it takes into account where you are now for most of the players, but that one it switched to make the teams balanced might just decide that "fk it. I don't really want to play aliens, and I really should be doing <suchforth>." So they leave. There's literally nothing anything can do about this. In my experience most games turn into stomps only when people start leaving.
Now, many people leave when the game's a stomp. So there's that too.
Shuffle's not ideal, but that's not its fault, and it's better than an intentional stack, always has been, always will be.
I couldn't talk to him for very long.
Team1:
1000
1000
1000
3000
Team2:
1500
1500
1500
1500
There's no real "good" way to balance those teams anyway, all the solutions end up sacrificing something, be it average difference or standard deviation. That's probably the only real issue in team sorting and it's not something that anything can solve without literally excluding the outliers from the game entirely.
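To make that trade-off concrete, here is a small sketch that walks through the four distinct 4v4 splits of these eight players. It assumes population standard deviation, and `split_with_k_low` is a hypothetical helper.

```python
from statistics import mean, pstdev

def split_with_k_low(k):
    """Put the 3000 player with k of the 1000 players; fill up with 1500s."""
    team_1 = [3000] + [1000] * k + [1500] * (3 - k)
    team_2 = [1000] * (3 - k) + [1500] * (1 + k)
    return team_1, team_2

results = {}
for k in range(4):
    t1, t2 = split_with_k_low(k)
    avg_diff = abs(mean(t1) - mean(t2))
    sd_diff = abs(pstdev(t1) - pstdev(t2))
    results[k] = (avg_diff, sd_diff)
    print(t1, t2, f"avg diff={avg_diff:.0f}", f"std dev diff={sd_diff:.1f}")
```

With the 3000 player grouped with all three 1000s the averages match but the spread differs the most; every step toward a more equal spread pushes the averages further apart.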
Now I ask, is shuffle's team composition better or worse than the 1-2-2-2 split? Could shuffle's team composition be improved?
Even the example of "balanced" teams that I give I don't find that balanced. It is just the best team composition one can make based on hive skill alone.
Why not?
PS: No "Hive 2.0 will address this" answers, please. As long as that's not released, it might as well not exist for these types of discussions.
I get that you think that people new to the system should start at the same skill rating, however I don't understand how some people starting at different points absolutely breaks all of the skill values.
The whole premise of this system is that the more you play, the more your skill rating will go towards where it should be, hence new people starting at 0 vs 1000 shouldn't cause such a huge problem assuming they play enough games to get where they should be. It all balances out in the end...
Team one:
3094
2715
1887
1730
1691
1622
1432
1420
Team two:
1281
1276
1202
1191
1130
1061
902
520
At the end of round (should take about 5 minutes), swap teams.
Marine wins: 1
Alien wins: 1
50% win percentage for both sides.
Perfect balance.
It's the best anyone or anything could do, but the 3000 is going to mop the floor with the 1500s.
1500 1500
1000 1500
1000 1500
1000
The server should have skill caps to prevent 3k players from joining in the first place. Skill segregation is the only way to achieve proper team balance.
Second shuffle is more traditional, trading each tail.
I bet an Alien win on both by a thin margin.
If the 520/902 on the 2nd shuffle were swapped, I would bet Marine win.