The only thing I can say with certainty is that they are not publicly available. I don't have access to them, but I have never asked.
I don't feel comfortable trying to describe their purpose when I would not be able to do it well. Maybe @McGlaspie or @GhoulofGSG9 would be willing to chime in.
GhoulofGSG9 (Super Administrator, NS2 Developer), edited February 2017:
We use the end-game rating as a general quick-and-dirty user experience survey tool. We tend to compare user ratings between builds, maps, servers and mods used.
For those interested, here are two graphs based on the rating data (on average there are 58,777 completed surveys, i.e. user-selected rating and reason, for each build):
As you can see, there were some builds which didn't perform well, but the user experience improved slightly over time. Also, since uneven teams, bad teamwork and inexperienced commanders are continuously the most selected reasons, we still need to improve the way teams are formed.
Edit: As requested, here's a graph with the ratings by map (excluding any map with fewer than 200 completed surveys):
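For anyone poking at an export like this themselves, here is a rough sketch of the kind of per-build / per-map comparison described above, assuming the data comes as one row per completed survey with build, map, rating and reason columns (the file name and all column names are assumptions, not the actual export format):

```python
import pandas as pd

# Hypothetical export: one row per completed end-game survey.
# File name and column names are assumptions, not the real schema.
df = pd.read_csv("endgame_surveys.csv")  # columns: build, map, server, rating, reason

# Average rating and sample size per build, in build order.
by_build = df.groupby("build")["rating"].agg(["mean", "count"]).sort_index()
print(by_build)

# Rating per map, dropping maps with fewer than 200 completed surveys.
by_map = df.groupby("map")["rating"].agg(["mean", "count"])
print(by_map[by_map["count"] >= 200].sort_values("mean", ascending=False))

# Distribution of the selected reason for rounds rated poorly
# (the <= 2 threshold is an assumption about what counts as "poor").
print(df.loc[df["rating"] <= 2, "reason"].value_counts(normalize=True))
```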
Don't suppose you can provide me with the primary reason distribution for the NS2 Combi map? Curious whether map balance has been cited as an issue.
While I have publicly stopped updating the map, I have been making significant changes to some rooms within it (though I don't see a roll-out anytime soon; months at my current rate of work). It would be nice to get an idea of what the feedback says and whether it's something that can be worked on.
I am willing to hand out a limited anonymised survey data export to interested mappers. Just send me a private message with your map's name and its workshop ID.
However, I doubt that data is as helpful as quality feedback. For that I recommend organising a group like the Spark Crafter Collective, who frequently play and test given maps and collect direct user feedback.
Not anymore, sadly. SCC stopped happening regularly about three months ago due to lack of interest. I do think there was one other group that played custom maps, but I don't quite recall who. I think they had a server and focused more on comp.
Mephilles (NS2 Map Tester, NS2 Community Developer):
@F0rdPrefect TGNS does play custom maps regularly (I think). Apart from that, I am currently trying to revive the NSL map tests, but it is hard to get people for that.
TGNS has a few custom maps they like. Sometimes they try new ones, but they remove them if they are not played.
For anybody who takes Ghoul up on his offer, I would take the results for your server/map with a grain of salt. Not all statistics are significant, and by that I mean significance in the mathematical sense.
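To illustrate the grain-of-salt point, here is a minimal, self-contained sketch (simulated ratings only, no real survey data) showing how the uncertainty around a mean rating shrinks as the number of completed surveys grows, which is why a niche custom map with a few dozen responses tells you much less than an official map with thousands:

```python
import math
import random

random.seed(0)

def mean_ci(ratings, z=1.96):
    """Mean rating with an approximate 95% confidence interval."""
    n = len(ratings)
    mean = sum(ratings) / n
    var = sum((r - mean) ** 2 for r in ratings) / (n - 1)
    half = z * math.sqrt(var / n)
    return mean, mean - half, mean + half

# Simulated 1-5 ratings; the underlying "true" quality is identical throughout.
population = [random.choice([1, 2, 3, 3, 4, 4, 4, 5]) for _ in range(100_000)]

# e.g. a niche custom map vs. a moderately and a heavily played official map
for n in (30, 300, 3000):
    sample = random.sample(population, n)
    mean, lo, hi = mean_ci(sample)
    print(f"n={n:5d}  mean={mean:.2f}  95% CI [{lo:.2f}, {hi:.2f}]")
```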
@Nordic is it fair to say that these statistics are basically useless for determining player satisfaction with the state of the game itself?
Just yesterday we got egg-locked on TTS but still managed to win the round, so I rated it a 4 because it was exciting. I don't know about others, but my ratings always have to do with the relative quality of the round...
I'm asking because I have the feeling that the devs would like to point to that graph and conclude that everything is fine and dandy, when in reality they have just recently introduced another bug (egg-lock) which can ruin whole rounds for everyone...
No offense intended, it's just that false feedback is the worst kind.
They are not useless at all. I was trying to say they might be significantly less useful for a smaller sample of just one server or just one map. This is more true for custom maps because they have so few games. Popular servers might have a sizeable sample size, but the stats would probably tell more about the server community than the server experience. I could be wrong too.
Ghoul's graph shows that the rating is going up. Although everyone probably has different criteria for their rating, we can assume they keep those criteria fairly consistent over time. Overall, people are saying they are having better games, even if they don't agree on what "better" is. It does not mean the game is perfect, but it does mean people are enjoying themselves.
It is also interesting to see the overall rating improve greatly with the introduction of hive 2. I wonder if that is a statistically significant change. It probably is.
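Checking whether such a jump is statistically significant is straightforward once the per-round ratings are split into before/after groups; a sketch using SciPy's Welch t-test (the two lists below are placeholders, not real data):

```python
from scipy import stats

# Placeholder rating lists; in practice these would be the per-round ratings
# collected before and after the build in question.
before = [3, 4, 3, 2, 4, 3, 5, 3, 4, 2]
after = [4, 4, 3, 5, 4, 3, 5, 4, 4, 3]

# Welch's t-test (does not assume equal variance between the two groups).
t_stat, p_value = stats.ttest_ind(after, before, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# With tens of thousands of surveys per build, even a 0.1 shift in the mean
# will usually come out "significant"; whether it is meaningful is another matter.
```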
But that's my point: everyone rates differently, so what is the rating for in the minds of the devs? (Honest question.)
To track how well they are doing with the updates? To track how the community playstyle changes? Or something else?
It just doesn't seem to me that it fills a concrete purpose.
And what you've said about it increasing is exactly what I'm afraid of: the devs pointing to the graph and sitting back because "hey, they like it!"
But WHY does the rating increase? Has the game gotten better with the updates? Or are there simply more fun rounds, which has nothing to do with the quality of the game itself?
What is the metric by which game quality is judged? Of course, something does not have to be fun to be worthwhile, but surely how much fun you have in the game is a large factor reflective of the game's quality?
At first glance, I can see in the ratings-by-map graph that the higher ratings seem to favour maps that tend to have longer-lasting rounds (e.g. caged and mineshaft stick out among the official maps, and most of the siege maps have highish ratings).
I would hypothesise that longer rounds tend to be indicative of more even games, skewing the ratings up, and that the use of (fun) high tech also lends itself towards this. Of course this is nothing new, but it should probably be accounted for if you're going to get anything meaningful from these stats.
Are you able to look at rating vs round length? Is there a significant effect?
Are you able to subtract the effects of round length on the ratings from the ratings vs map?
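If the raw rows were available, both questions could be answered roughly like this; a sketch assuming each survey row carries the map, the round length and the rating (column names are guesses): measure the correlation first, then regress rating on round length and compare maps on the residuals, i.e. the part of the rating that round length does not explain.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Assumed export format: one row per completed survey.
df = pd.read_csv("endgame_surveys.csv")  # columns: map, round_length, rating

# 1) Is there a significant rating vs. round-length correlation?
r, p = stats.pearsonr(df["round_length"], df["rating"])
print(f"correlation r = {r:.2f}, p = {p:.3g}")

# 2) Fit rating ~ round_length and keep the residuals.
slope, intercept = np.polyfit(df["round_length"], df["rating"], 1)
df["adjusted"] = df["rating"] - (slope * df["round_length"] + intercept)

# 3) Compare maps on the length-adjusted ratings instead of the raw ones.
print(df.groupby("map")["adjusted"].mean().sort_values(ascending=False))
```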
It is, don't get me wrong. I would also love to see that the graph reflects reality; it's just that I have my doubts about the methodology :] We are creatures of habit: we get used to whatever change is made fairly easily, and have to make a conscious effort to compare an older state to the current one. I see it with myself: I loathe HP bars, but I got used to them. Would it be better without them? For me, definitely. But I rate the rounds themselves, not the current state of the game, so that aspect of my opinion is never reflected when I vote. Otherwise, I'd never give a rating above 3, which is unfair after an awesome back-and-forth round.
We use the end-game rating as a general quick-and-dirty user experience survey tool. We tend to compare user ratings between builds, maps, servers and mods used.
That is not the purpose of this survey. It does not ask that question. It does ask what made a round a bad experience if the rating was poor. If the purpose were to better understand what they do well, they would have asked.
A basic user experience survey like this is small in scope. It asks basic questions and provides basic answers. Simple surveys like this are useful because they get a high response rate from a wide range of demographics. It is easy to answer and quick to complete. You question the methodology, but you do not seem to understand the scope.
This survey does not assess the state of the game. It is a simple survey that asks simple questions. UWE must know this. They are not stupid. Ghoul did call it a "quick and dirty" survey. Please don't make a mountain out of a molehill.
And what you've said about it increasing is exactly what I'm afraid of: the devs pointing to the graph and sitting back because "hey, they like it!"
This is making a mountain out of a molehill. The increase was from 3.5 to 3.6. It is nice that the reported user experience is increasing, but it is still only a 3.6. In terms of an academic grade, that would be a C average. Look, Mom, I increased my grade from a 70% to a 72%.
The survey might validate that their efforts are improving the user experience, but it does not mean much. Even if the rating was a (5.0), it would not mean the game was perfect. It would just mean users are enjoying themselves. That is great, but it only means so much. Again, this is only a quick and dirty survey.
Thx x) You're right, I looked at it from the other end; but if the purpose is to pinpoint what went wrong (negative feedback), then yeah, I do see its usefulness.
At first glance, I can see in the ratings-by-map graph that the higher ratings seem to favour maps that tend to have longer-lasting rounds (e.g. caged and mineshaft stick out among the official maps, and most of the siege maps have highish ratings).
I would hypothesise that longer rounds tend to be indicative of more even games, skewing the ratings up, and that the use of (fun) high tech also lends itself towards this. Of course this is nothing new, but it should probably be accounted for if you're going to get anything meaningful from these stats.
Are you able to look at rating vs round length? Is there a significant effect?
Are you able to subtract the effects of round length on the ratings from the ratings vs map?
There is a much less significant correlation between round length and round rating than there is between a subject's rating and the given server environment. So yes, you would have to subtract that effect from the map's rating. BUT the rating lacks the dimensions to really allow these kinds of interpretations.
All the survey was designed for is to be used as a trend indicator. A significant within-subject change in values over time can certainly indicate the effect of an environmental change. For further details about that change you would have to do more detailed interviews. So for us the survey is mostly about detecting trends and allowing us to react to them in time.
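For that kind of trend watching, a per-build series of mean ratings with a short rolling baseline is usually enough to flag a build that lands noticeably below its predecessors; a minimal pandas sketch (same assumed column names as the earlier sketches):

```python
import pandas as pd

# Same assumed schema as before: one row per completed survey.
df = pd.read_csv("endgame_surveys.csv")  # columns: build, rating

# Mean rating per build, in release order (assumes build ids sort chronologically).
trend = df.groupby("build")["rating"].mean().sort_index()

# Smooth with a 3-build rolling mean and flag builds that land noticeably
# below the recent baseline (0.15 is an arbitrary alert threshold).
baseline = trend.rolling(window=3, min_periods=1).mean()
flagged = trend[trend < baseline - 0.15]
print(flagged)
```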
There is a much less significant correlation between round length and round rating than there is between a subject's rating and the given server environment
@GhoulofGSG9 That's pretty interesting; when you say 'given server environment' what exactly are we talking about here? There are a bunch of different things that could mean in my head, at least.
I recommend checking @Nordic 's post history and keeping an eye out for charts.
Oh, I guess that makes sense.