Enhanced Server Statistics for Server Filtering
Sarisel
.::' ( O ) ';:-. .-.:;' ( O ) '::. Join Date: 2003-07-30 Member: 18557 Members, Constellation
<div class="IPBDescription">from "Towards a Cohesive Matchmaking System"</div>The idea is to use stats in a way that enables the player to find servers that fit his criteria. With a short tutorial and maybe a "server filtering wizard", it would be possible to make the filtering process fairly straightforward or even automatic. Some general ideas presented are as follows:
<!--quoteo(post=1675669:date=Apr 12 2008, 10:25 PM:name=Sarisel)--><div class='quotetop'>QUOTE(Sarisel @ Apr 12 2008, 10:25 PM) <a href="index.php?act=findpost&pid=1675669"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec--><i>Server Stats</i>
Already, it is possible to filter servers based on ping, map, and number of players. Perhaps it would be possible to determine some useful indicators from other server stats.
1. Skill biases
Take the last 3-5 rounds of P:K:D stats on a particular server. (These include overall, team, and individual stats.)
Determine which individual players are outliers (abnormally high or abnormally low) in the P:K:D stats, how many of them exist, and for which teams.
Now we have something that we can potentially play with:
A) How many outliers consistently play on the same team? This might give an idea of stacking (more applicable for point 2 further down). Server filters can be set up to filter out games where a certain % of total players are outliers (and play on the same team) over the last 3-5 rounds.
B) To what extent are the outliers different compared to the average players? If someone's mean P:K:D is several standard deviations away from the server mean over the last 3-5 rounds, then that could be an indication of skill bias. Players in the server could be made aware of this and the outlier could be voted off if he is undesirable (the voting process could take place after a round is complete or initiated during the round). Also, in the server browser, the presence of outliers could be indicated to the player and maybe a threshold value set to filter out undesirable servers.
2. Stacking
What is the server's win:lose distribution for alien and marine teams over the last 3-5 rounds?
Of the players that are present within the server, what are the individual marine:alien roles chosen over the last 3-5 rounds?
Over the last 3-5 rounds, are there a certain number of players who consistently play on marines and aliens when the respective teams win?
If so, then there is a good possibility that stacking is occurring. The detection of stacking could be made a little more robust if team P:K:D distributions are also factored in here. In the end, the result is a value that can be used in the browser window to filter out servers that are showing evidence of stacking. If done properly, this will encourage admins to monitor fair play on their servers.
3. Competitive, casual, and open servers
Is there a way to determine at a particular time via recent server stats if a server's player base is more competitive, casual, or perhaps inexperienced? I think there may be a way. For example, in NS1, competitive play usually follows a very rigid format in terms of time spent per round - usually under 15 minutes. I think there may be other <!--coloro:#008080--><span style="color:#008080"><!--/coloro-->statistical signatures<!--colorc--></span><!--/colorc--> that can distinguish different server types if the developers ran some stats in beta testing on games between experienced FPS gamers versus games between casual gamers. The idea is to find an expression for server "skill" based on as few statistical variables as possible. Personally, I think it may be a combination of average time-per-round, mean total kills per minute per round, and mean points:kills ratios (if assists are accounted for, which will give an indication of where teamwork is taking place). <i>Note that these signatures would be based on server stats over a particular range of time, not individual player stats that are monitored across different servers.</i> Once these signatures are established, the player can adjust his server browser to filter out undesirable server skill levels.<!--QuoteEnd--></div><!--QuoteEEnd-->
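To make point 2 (stacking) from the quote above a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `RoundRecord` layout, the function names, and the 80% / 3-round thresholds are made up, and nothing here reflects how NS2 actually stores round results. It only counts how many of the players currently on the server were on the winning side in nearly all of their recent rounds.

```python
from dataclasses import dataclass


@dataclass
class RoundRecord:
    winner: str   # "marines" or "aliens"
    teams: dict   # player name -> team that player was on this round


def winning_side_rate(rounds, player):
    """Fraction of the player's recorded rounds spent on the winning team."""
    played = [r for r in rounds if player in r.teams]
    if not played:
        return 0.0
    return sum(r.teams[player] == r.winner for r in played) / len(played)


def stacking_score(rounds, current_players, min_rate=0.8, min_rounds=3):
    """Share of current players who were almost always on the winning side.

    min_rate and min_rounds are hypothetical thresholds, not tuned values.
    """
    if not current_players:
        return 0.0
    suspects = [
        p for p in current_players
        if sum(p in r.teams for r in rounds) >= min_rounds
        and winning_side_rate(rounds, p) >= min_rate
    ]
    return len(suspects) / len(current_players)


# Made-up history: "a" and "b" end up on the winning team every round.
history = [
    RoundRecord("marines", {"a": "marines", "b": "marines", "c": "aliens", "d": "aliens"}),
    RoundRecord("aliens", {"a": "aliens", "b": "aliens", "c": "marines", "d": "marines"}),
    RoundRecord("marines", {"a": "marines", "b": "marines", "c": "aliens", "d": "marines"}),
]
print(stacking_score(history, ["a", "b", "c", "d"]))
```

The browser filter would then just compare this score against whatever threshold the player sets. Factoring in team P:K:D, as suggested in the quote, would be a separate step on top of this.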
So far, the following comment was made by Radix:
<!--quoteo(post=1675704:date=Apr 13 2008, 11:12 AM:name=Radix)--><div class='quotetop'>QUOTE(Radix @ Apr 13 2008, 11:12 AM) <a href="index.php?act=findpost&pid=1675704"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->This seems much deeper than simply using K:D and environmental factors to determine skill. If you weren't already, I'd probably still look at using a system of "advantages" to determine how much weight a given kill would garner. The advantages would be small-footprint boolean or integer numbers in server memory that were calculated when the upgrade was added or the gamestate changed (so that cpu cycles weren't wasted). It might seem daunting, but I don't think it would actually be that taxing.<!--QuoteEnd--></div><!--QuoteEEnd-->
It might be interesting to consider advantages for upgrades and tech in an attempt to give P:K:D more meaning, but I'm not sure if that will help to distinguish between competitive and casual servers. Could you perhaps suggest how it would be helpful?
I can see how it might be useful for decreasing false positives for filtering of servers based on skill bias (point 1 in the main idea), since players who get high frags under high advantage conditions would not get the same treatment as those who get high frags under equal or negative advantage conditions.
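Here is a minimal Python sketch of how that might look, combining the outlier check from point 1 with advantage weighting. The weighting formula, the integer "advantage level" per kill, and the 2-standard-deviation threshold are all my own assumptions for illustration; Radix did not specify any of them.

```python
from statistics import mean, pstdev


def weighted_frags(kill_advantages):
    """Sum kill weights; each entry is the killer's advantage level for one kill.

    0 means even tech, positive means the killer was ahead, negative behind.
    The weighting itself is a hypothetical choice.
    """
    total = 0.0
    for adv in kill_advantages:
        if adv >= 0:
            total += 1.0 / (1.0 + adv)   # kills made with an advantage count for less
        else:
            total += 1.0 - adv           # kills made at a disadvantage count for more
    return total


def find_outliers(scores, threshold=2.0):
    """Map each outlier to its z-score relative to the server mean."""
    values = list(scores.values())
    if len(values) < 2:
        return {}
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {}
    z = {player: (s - mu) / sigma for player, s in scores.items()}
    return {player: v for player, v in z.items() if abs(v) >= threshold}


# Made-up data: nine players with 10 even-tech kills each, one with 40 kills.
scores = {f"p{i}": weighted_frags([0] * 10) for i in range(9)}

scores["p9"] = weighted_frags([0] * 40)   # 40 kills at even tech
print(find_outliers(scores))              # p9 is flagged

scores["p9"] = weighted_frags([3] * 40)   # same 40 kills, all at a large advantage
print(find_outliers(scores))              # p9 is no longer flagged
```

With the weighting, the player who racked up frags only while far ahead in tech stops looking like an outlier, which is the false-positive reduction described above.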
Comments
Instead, to enrich the stats section more:
<i>Detecting teamwork and distinguishing it from structure spamming</i>
This could be possible if a few features were added to how the points are calculated for each player.
Marine and alien teamwork: indicated by the number of assist points (the more people that are working together to kill aliens, the more assist points overall). I think the alien assist points will be a more robust indicator, since aliens (at least in NS1) struggle to work together a lot more than marines (read: walker skulks).
Structure spamming: this is separated from teamwork by counting assist points and structure-building points separately.
Servers that excel in teamwork will have the most assist points in the last 3-5 rounds.
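A quick Python sketch of that ranking, assuming each round's points can be split into assist, structure, and kill totals. The `RoundPoints` fields and function names are hypothetical; the actual point categories in NS2 may differ.

```python
from dataclasses import dataclass


@dataclass
class RoundPoints:
    assist: int      # points earned from assists
    structure: int   # points earned from building structures
    kill: int        # points earned from kills


def summarize(rounds, window=5):
    """Totals and shares of assist and structure points over the last `window` rounds."""
    recent = rounds[-window:]
    total = sum(r.assist + r.structure + r.kill for r in recent)
    assist = sum(r.assist for r in recent)
    structure = sum(r.structure for r in recent)
    return {
        "assist": assist,
        "structure": structure,
        "assist_share": assist / total if total else 0.0,
        "structure_share": structure / total if total else 0.0,
    }


def rank_by_teamwork(servers, window=5):
    """Server names ordered by total assist points, highest first."""
    return sorted(
        servers,
        key=lambda name: summarize(servers[name], window)["assist"],
        reverse=True,
    )


# Made-up numbers: server A is mostly structure points, server B mostly assists.
servers = {
    "A": [RoundPoints(assist=10, structure=50, kill=20)],
    "B": [RoundPoints(assist=30, structure=5, kill=25)],
}
print(rank_by_teamwork(servers))
print(summarize(servers["A"]))
```

Keeping the structure share next to the assist share is what lets a filter tell a structure-spamming server apart from one where people actually play together.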
<i>Long versus short games</i>
Instead of using these stats to find the overall level of play in a server, this could just be used to see what kind of round times are common in particular servers. I still maintain that lower-skill servers generally take longer to finish their rounds, because both sides make many more mistakes and take much longer to close out a game (e.g. turret farming, or teching to heavy armor for 10 mins when you can just rush a hive, in NS1; I'm not sure about NS2, but it might be similar).
However, for the purposes of the player choosing servers, the mean round time could be a good indicator of the type of games that he might want to play.
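As a sketch of what such a filter could look like in Python, assuming the browser has a list of recent round durations in seconds for each server (the function names and the 5-round window are my own choices):

```python
from statistics import mean


def mean_round_minutes(durations_seconds, window=5):
    """Mean length in minutes of the last `window` rounds, or None with no data."""
    recent = durations_seconds[-window:]
    return mean(recent) / 60 if recent else None


def filter_by_round_time(servers, min_minutes=0.0, max_minutes=float("inf"), window=5):
    """Names of servers whose mean round time falls inside the chosen range."""
    result = []
    for name, durations in servers.items():
        avg = mean_round_minutes(durations, window)
        if avg is not None and min_minutes <= avg <= max_minutes:
            result.append(name)
    return result


# Made-up durations in seconds.
servers = {
    "quick": [600, 720, 840],
    "slow": [2400, 3000],
    "empty": [],
}
print(filter_by_round_time(servers, max_minutes=15))
```

Servers with no recorded rounds are left out rather than guessed at.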
<i>Rate of technological development</i>
I'm not sure how exactly everything is going to work in NS2, so it is difficult to make a detailed argument for this point. The general idea would be to measure how quickly the alien and marine teams attain technological progress. Does it take very long for a particular marine team to get upgrades, get an upgraded armory, prototype lab? Likewise, does the alien team take a long time to get chambers and hives?
If there were some kind of standard that could be used to compare tech development (in competitive NS we had a pretty good idea of when structures should show up, which is why I talked about statistical signatures before), a server filter could be used to find servers that show an average rate greater than or equal to a particular set level.
I know this last idea is pretty crude - much more information would be needed to show that it could work.
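Still, here is one way the comparison could be expressed in Python, purely as a sketch. The milestone names are taken loosely from the examples above, and the reference times are placeholders I made up, not real NS1 or NS2 timings. The rate is simply reference time divided by actual time, so 1.0 means "on pace".

```python
from statistics import mean

# Placeholder reference times in seconds; NOT real NS1 or NS2 numbers.
REFERENCE_SECONDS = {
    "upgraded_armory": 300,
    "prototype_lab": 600,
    "second_hive": 480,
}


def tech_rate(milestone_seconds, reference=REFERENCE_SECONDS):
    """Average of reference_time / actual_time; above 1.0 is faster than reference."""
    ratios = [
        reference[name] / t
        for name, t in milestone_seconds.items()
        if name in reference and t > 0
    ]
    return mean(ratios) if ratios else None


def servers_at_or_above(server_rounds, level, reference=REFERENCE_SECONDS):
    """Names of servers whose mean tech rate over their recorded rounds is >= level."""
    keep = []
    for name, rounds in server_rounds.items():
        rates = [r for r in (tech_rate(m, reference) for m in rounds) if r is not None]
        if rates and mean(rates) >= level:
            keep.append(name)
    return keep


# Made-up milestone times (seconds into the round) for two servers.
server_rounds = {
    "fast": [{"upgraded_armory": 300, "prototype_lab": 500}],
    "slow": [{"upgraded_armory": 900}],
}
print(servers_at_or_above(server_rounds, level=1.0))
```

A real version would need per-team rates and proper reference data from testing, which is exactly the information that is missing right now.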
That way absolutely no calculations would need to be done until the game finishes.
I like the idea of listing who won which round, and perhaps a round time wouldn't be bad either, but other than that I think it would be an information overload for the player. NS2 wants to cater to the new player, but having a super complex system for choosing servers will just act as a turn-off, rather than an enhancement.
My ideal server list would include the following.
*Type of server
4 different icons for the following
Competitive, Casual, Open, and Locked
*Server name
*Occupied slots/Maximum slots (reserve slots, with a marker for a kicking slot), e.g. 10/24(4): 10 players in a server that holds 24, with 4 reserve slots. A server with a kicking reserve slot would be 10/24(1k) or something along those lines.
*Map
*Ping
Right click info tab would bring up the following
*above server info
*Player List, with score AND Frags
*Last 5 games won/lost, maps that were played, and the times it took.
*Any custom Lua scripts that deviate from the standard game.
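The slot notation from the first list is simple enough to sketch in a few lines of Python. The function name and parameters are made up for illustration, and the "k" suffix just follows the "or something along those lines" example above.

```python
def format_slots(players, max_slots, reserved=0, kicking=False):
    """Build a slot string such as "10/24(4)" or "10/24(1k)"."""
    text = f"{players}/{max_slots}"
    if reserved:
        # A trailing "k" marks a reserve slot that kicks a player to make room.
        text += f"({reserved}{'k' if kicking else ''})"
    return text


print(format_slots(10, 24, reserved=4))
print(format_slots(10, 24, reserved=1, kicking=True))
print(format_slots(10, 24))
```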
Personally I don't know how useful most of these stats would be and you could very well be right about them being unnecessary. It's quite possible that the differences in stats won't be sensitive enough to give any useful information at all. I think it should be tried though if it isn't very much work to code in a stat collection and monitoring system.