Performance benchmarks for NS2 alpha build 152
Avalon
Join Date: 2007-03-04 Member: 60224Members
I decided that "laggy" wasn't a good enough metric for me, so I wanted to benchmark exactly how fast or slow NS2 was running on my machine. On top of that, I wanted to see how various things affected fps, such as resolution, in-game quality settings, GPU overclocks, and CPU overclocks. Here's what I found...
*System specs*
Intel E8400 Core2Duo
4GB PC2-6400 DDR2 RAM
Gigabyte EP-35 DS3L
Visiontek Radeon HD4850 512MB
ATI Catalyst 9.11 driver set (fall 2009 :D)
Western Digital 1TB Black Edition 7200RPM HDD
Windows 7 Ultimate 64bit
The benchmark...
Spawn as a marine, run to the tram doorway, and start recording fps with Fraps. Walk out into the long tram hall and take a left toward the alien hive. Walk into the hive room and start shooting the hive with the rifle: unload two clips, pull out the handgun and unload two clips, then reload the rifle and unload two more clips. The benchmark ends at 60 seconds, at the end of the last pair of rifle clips.
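For anyone repeating this, the min/avg/max numbers can be computed from the per-second log Fraps writes rather than read off by eye. A minimal sketch, assuming a one-column log with a header line (the exact CSV layout may vary between Fraps versions):

```python
# Compute min/avg/max FPS from a Fraps-style per-second log.
# Assumes one sample per line after a "FPS" header line; adjust the
# parsing if your Fraps version writes a different layout.

def fps_stats(lines):
    samples = [float(s) for s in lines if s.strip() and not s.strip().isalpha()]
    return min(samples), sum(samples) / len(samples), max(samples)

if __name__ == "__main__":
    log = ["FPS", "13", "25", "31", "27"]   # made-up samples, not real data
    lo, avg, hi = fps_stats(log)
    print("min/avg/max: %g/%.1f/%g fps" % (lo, avg, hi))
```

In practice you would read the lines from the CSV file Fraps drops in its output folder.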
1) My first run, at overclocked CPU settings, default GPU settings
1680x1050, high settings, e8400 @ 3.78ghz
min/avg/max
13/25.2/31 fps
2) My second run, I wanted to see what CPU speeds would do here, so I ran my CPU at stock
1680x1050, high settings, e8400 @ 3ghz stock
min/avg/max
9/22/27 fps
3) My third run, to see what effect overclocking my GPU would have
1680x1050, high settings, e8400 @ 3ghz stock, 4850 @ 700mhz (11% overclock)
min/avg/max
11/22.5/29 fps
4) My fourth run, to see what having everything overclocked would do
1680x1050, high settings, e8400 @ 3.78ghz, 4850 @ 700mhz (11% overclock)
min/avg/max
13/25.3/31 fps
5) My fifth run, to see what the best fps I could get at my native resolution was by reducing in game quality
1680x1050, awful settings, e8400 @ 3.78ghz, 4850 @ 700mhz
min/avg/max
22/27.4/33 fps
6) My sixth run, to see what the best fps I could possibly get at all while sticking with my monitor's native aspect ratio
1280x800, awful settings, e8400 @ 3.78ghz, 4850 @ 700mhz
min/avg/max
21/27/32 fps
From this, I gather that the GPU is not being worked hard enough to matter. Increasing the resolution is supposed to shift the load toward the GPU; reducing it shifts the bottleneck back toward the CPU. The fact that I did not get a perceptible fps increase from overclocking my GPU or from reducing my resolution strongly suggests the game is CPU-bound on my system. I know a few of you have documented that your GPUs weren't even being used anywhere near 100%.
I did note that reducing the quality while leaving everything else the same resulted in a noticeable increase in fps. The visible changes to the game seemed to lie with the shadows. Since we've determined the GPU isn't having much impact on fps, could we conclude that the engine is relying too heavily on the CPU for rendering shadows?
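One way to picture the conclusion above is to model each frame as CPU work and GPU work running in parallel, with the frame rate limited by whichever side takes longer. The numbers here are purely illustrative, not measurements from NS2:

```python
# Toy bottleneck model: CPU and GPU prepare/draw frames in parallel,
# so frame time is set by the slower of the two. All numbers invented.

def fps(cpu_ms, gpu_ms):
    return 1000.0 / max(cpu_ms, gpu_ms)

cpu_ms = 40.0                       # hypothetical per-frame CPU cost
print(fps(cpu_ms, gpu_ms=30.0))     # GPU busy at high res -> 25.0 fps
print(fps(cpu_ms, gpu_ms=15.0))     # halve the GPU load -> still 25.0 fps
print(fps(30.0, gpu_ms=15.0))       # speed up the CPU instead -> ~33.3 fps
```

This matches the runs above: GPU overclocks and lower resolutions barely moved the needle, while the CPU overclock did.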
Comments
With an X6 1090T, a 5870, and 8GB 1333 (all stock), my results were:
1) 1920x1080 Maxed
Min/Avg/Max
13/23.25/26
2) 1280x720 Maxed
Min/Avg/Max
19/23.65/26
CPU bound indeed.
It's interesting to me that I get a solid 60+ fps at the main menu, but when I am in-game my fps drops to what people have posted here: ~30 fps.
Both have similar numbers of faces/triangles etc. being rendered, according to r_stats.
I wonder if it's something like the 'game logic loop' (which does not exist on the main menu) that is responsible for the poor FPS.
CODE
function OnCommandVersion()
    Shared.Message("Jit is " .. tostring(jit.status()))
    jit.off()
end
When executing this command twice, I see the following:
CODE
Jit is true
Jit is false
which is expected. What I do not see is a noticeable FPS drop. Based on the graphs at http://luajit.org/performance.html , we should see a significant performance drop when the JIT is disabled.
Really, I would have expected that Lua would have been the bottleneck here, but apparently it's not even close. I wonder wtf else it's doing that could be sucking up a ton of cpu.
Edit: Ran NS2 through a profiler; it seems the time is mostly divided between:
ntoskrnl (windows kernel)
Engine.dll
PhysXCooking.dll (physx cooking me dinner!)
For the kernel, it seems to be calling thread synchronization functions a lot, which isn't something we can do much about. The engine is spending a lot of its time in M4::Model::GetBoneCoords. I wonder if we can find a way to play with significantly reduced models, to see if that helps any. PhysX is spending a lot of its time in NxCookConvexMesh. I don't really know enough about PhysX (and the Nvidia forums are down) to say what that is doing.
Why would I run it at 640x480 when dropping from 1680x1050 to 1280x800 did nothing? I'm already CPU bound, no use in running an ugly resolution that I don't even think my monitor can display :D
Anyway, it's not like it will be this way forever. I was just trying to explore what different things did to the engine in its current state. I also found it very strange when I was sitting in the ready room, getting 45fps on awful settings, then turned straight into a wall and continued to get 45fps. Staring at a wall should make fps skyrocket, since there is less to draw.
I am going to benchmark every build from here on out to see what improvements are realized, if any. Despite how it looks, I am actually not most concerned about fps at this point, but it is the only metric I have control over measuring.
I've experienced something similar related to the models. I didn't do as structured a benchmark as you did. I started a LAN server, joined aliens, and enabled cheats, then went to marine base. Staying there alone, with no one else on the server, I had around 26-30 fps. Then I spawned eight extra skulks by typing "give skulk" in the console; they appeared next to my position in base. Doing that dropped my fps to 8-9.
This is the same experience people have talked about:
server with low player count => more or less playable
server with higher player count => unplayable
I don't know if this information helps anyone, but maybe you can set up a structured benchmark with it to prove the point, if it's relevant to tracking down the problems.
cheers
I'd be interested in removing all models etc from the game and other things like that to see if FPS is boosted or not.
This is not helpful. In this alpha, they should be in the same sentence. Benchmarking is providing data that will help solve an engine problem. No one is trying to max FPS numbers here.
Your test got me thinking, Avalon. I jumped onto a server online (NOT a LAN) with one person sitting in the RR and assets spawned all over the map. My FPS was a solid 50, looking in any direction. Using r_stats, it's obvious that FPS will not increase or decrease no matter the draw calls, primitives, lights, models, or anything else.
I think that once this issue is sorted out, and I am guessing (as a very uninformed punter) that it is a combination of game logic and occlusion problems, FPS in NS2 is going to be very, very good.
Well, given the item in the progress tracker, I suspect they are using PhysX for all their collisions, which is probably quite expensive; hence separating them out and having a more simplified collision detection system for general player movement?
Perhaps they are also not doing the character skinning on the GPU; doing the skinning on the CPU is very slow compared to the GPU...
Max is a smart dude though, so I'm sure he has thought of these things :)
[Screenshot: http://img820.imageshack.us/img820/4639/map2yp.png]
FPS is still 30, as you can see, exactly like the ns_tram etc. maps that are included with the game, which obviously contain a lot more detail.
This makes me think the rendering engine is fine. As mentioned earlier, the main spawn menu renders at 60fps for me too.
I then actually spawned in the menu.level map (I added a ready room spawn marker using the editor) and in-game I got ~30fps too (instead of 60fps like at the game menu).
As mentioned earlier, I agree with others that there must be some game logic code slowing down performance.
For example, perhaps it's Lua, or the networking code, or the 'is the player pressing a keyboard button yet?' code that is responsible, given that a basic map renders as fast as a complex map.
Once that bottleneck is sorted out, I guess it'll be easier to identify further bottlenecks.
http://www.assembla.com/code/pxcode/subversion/nodes/PhysX/SDKs/Cooking/include/NxCooking.h
QUOTE
To create a triangle mesh object (unlike previous versions) it is necessary to first 'cook' the mesh data into a form which allows the SDK to perform efficient collision detection.
NxCookTriangleMesh() and NxCookConvexMesh() allow a mesh description to be cooked into a binary stream suitable for loading and performing collision detection at runtime.
NxCookConvex requires the input mesh to form a closed convex volume. This allows more efficient and robust collision detection. The input mesh is not validated to make sure that the mesh is convex.
CODE
virtual bool NxCookConvexMesh(const NxConvexMeshDesc& desc, NxStream& stream) = 0;
/**
\brief Cooks a triangle mesh to a ClothMesh.
\param desc The cloth mesh descriptor on which the generation of the cooked mesh depends.
\param stream The stream the cooked mesh is written to.
\return True if cooking was successful
*/
I'm not really experienced in programming, so I'm not qualified to speak on this.
On another note, I decided to run out and do a little testing.
I noticed something interesting in the ready room of tram: when I look at the floor, I get ~22 fps. When I look at the ceiling, even though r_stats says I see more of everything (more lights, more geometry, more meshes, more primitives and draw calls, more face sets and occlusion queries), I get 38-40 fps. The only thing that seemed different was that the ceiling didn't seem to reflect much light, while a lot of lighting was being reflected on the floor.
I ran out into the skybox in spectator mode. Max fps achieved was ~65, with 0 of everything except draw calls, primitives (~3500), and occlusion queries (1). I tried to vary the amount of stuff seen by changing perspective. Looking at the map, the minimum framerate was about 6-10 fps until I ran far enough away that the engine distance-clipped portions of the map (it was pretty far before this happened). I'm guessing scaling isn't optimized yet.
The HUD also makes a huge difference, as does the muzzle flash: looking at a wall, 25 fps; looking at a wall while firing, 20-22 fps. *edit* Maybe this drop in fps was due to Flash updating the ammo count. Not really sure.
system specs: windows xp sp3
amd phenom ii x4 @ ~3.4 ghz (stock)
HIS ATi radeon HD 4850 1GB (stock)
3GB (4GB, but windows xp can only use 3) ddr3 1333
ati catalyst driver 8.780.0.0 (sept '10)
1680x1050 res, medium quality
Having a Dx9 and Dx11 render path wouldn't be unreasonable.
There's no reason for that statement.
Keep the discussion going, guys; there's some good stuff in here. I personally enjoy seeing where the engine is at right now, and honestly I'm impressed at how well it's handling at such an early stage. It seems like there is a lot that can be done to improve performance, so I'm pretty excited to see how further refinements pan out, and like I said, I will document any changes between builds.
QUOTE
CODE
function OnCommandVersion()
    Shared.Message("Jit is " .. tostring(jit.status()))
    jit.off()
end

When executing this command twice, I see the following:

CODE
Jit is true
Jit is false

which is expected. What I do not see is a noticeable FPS drop. Based on the graphs at http://luajit.org/performance.html , we should see a significant performance drop when the JIT is disabled.
That really is bizarre.
Lots of interesting stuff in this thread. I'd like to see what framerates someone with serious beefy hardware would get. I mean Crysis isn't the real test. The real test is if you can play NS2.
I got an i7 950 and a 5850; not the best of the best, but pretty damn fast...
At the menu screen I get 100 fps.
Connected to a server I get 30 fps.
Playing LAN (by myself) I get 40-50 fps.
1: Is there a console command or a .lua file you can edit to remove the Flash-based HUD for FPS testing? (I think it's r_drawhud 0 in Source games, etc.)
2: If enough of us use @NS2 on Twitter to point Flayra in the direction of this thread, we might A: share useful info, and B: get some info from UWE's perspective.
It's not the graphics card because I've run awful/med/high and that does not seem to affect performance at all.
RAM isn't an issue because loading times are fine.
It's almost like internet lag during gameplay: when it's very laggy and the connection is bad, you start jumping around and almost lose control of your character. This is what generally happens to me.
I know from my experience with 3D animation that skeletons are usually only retained during the preliminary phases: once you have completed your model and rendered it, you're left with just several different animation sequences.
For Example
You shoot your gun - Game plays the fire animation
You walk - Game plays the walking animation
etc.
Unless the bones/skeleton is somehow associated with ragdoll physics for things that happen on death, I really don't think they should still be present; that would account for a huge drop in FPS, because the model is still completely malleable in the final stage.
The only thing I could think of is maybe they have unique damage responses, à la Killzone 2? So the model needs to know how to coordinate the movements, and the skeletons are still associated.
Anyway I may just be rambling, wanted to throw my two cents in.
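To make the cost being discussed concrete, here is a toy sketch of linear-blend skinning done on the CPU: every vertex is transformed by a weighted blend of its bones' matrices, every frame, which is why engines usually push this work to the GPU. All data below is invented for illustration and has nothing to do with NS2's actual models:

```python
# Toy CPU linear-blend skinning: each vertex is blended between its
# influencing bones' transforms. Real models repeat this for thousands
# of vertices per frame; the matrices and weights here are made up.

def transform(m, v):
    # Apply a row-major 3x3 matrix to a 3-vector.
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

def skin_vertex(v, influences):
    # influences: list of (weight, bone_matrix); weights should sum to 1.
    out = [0.0, 0.0, 0.0]
    for w, m in influences:
        p = transform(m, v)
        for i in range(3):
            out[i] += w * p[i]
    return tuple(out)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
SCALE2 = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]

# A vertex weighted half to a static bone, half to a stretching one:
print(skin_vertex((1.0, 0.0, 0.0), [(0.5, IDENTITY), (0.5, SCALE2)]))  # (1.5, 0.0, 0.0)
```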
AMD Phenom II x3 720BE @ 3.6Ghz
4GB DDR3
GTX 275
Windows 7 64-bit
In-game the FPS seems to be around 25-40; the graphical settings do have a massive effect on the amount of vRAM being used, though.
QUOTE
I know from my experience with 3D animation that skeletons are usually only retained during the preliminary phases: once you have completed your model and rendered it, you're left with just several different animation sequences.
Unless the bones/skeleton is somehow associated with ragdoll physics for things that happen on death, I really don't think they should still be present; that would account for a huge drop in FPS, because the model is still completely malleable in the final stage.
That's interesting. On the main menu there is a skulk running around so there would have to be a difference between how things are handled there versus in game. So maybe there is no ragdoll physics overhead on the main menu? Is there a way to set a console variable to not do ragdolls in game?
QUOTE
It's interesting to me that I get a solid 60+ fps at the main menu, but when I am in-game my fps drops to what people have posted here: ~30 fps. Both have similar numbers of faces/triangles etc. being rendered, according to r_stats. I wonder if it's something like the 'game logic loop' (which does not exist on the main menu) that is responsible for the poor FPS.
As I expected: if you open the file \natural selection 2\ns2\maps\menu.level with the map editor, you will see that the main-menu map is no bigger than one room. Obviously the problem is maps designed with more than one room :D
Read Plasma's post on page 1 and you'll see that it is not the number of rooms.
However, maybe "Separate collision detection from physics (server CPU usage)" has something to do with it? It's keeping the bones around for physics, but they're slowing down regular collision detection. Or something. Even though I thought they just used rectangular bounding boxes for that. In any case, the fact that you're running a server is the difference between the main menu and single-room situations; is that right?
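For contrast, the "rectangular bounding boxes" mentioned above are about the cheapest collision test there is. A sketch (not NS2's actual code) of an axis-aligned bounding-box overlap check, the kind of test engines commonly use for routine player movement while reserving cooked physics meshes for where they matter:

```python
# Axis-aligned bounding box (AABB) overlap: two boxes intersect iff their
# extents overlap on every axis. A handful of comparisons per pair, versus
# the much heavier per-triangle tests against a cooked convex mesh.

def aabb_overlap(a_min, a_max, b_min, b_max):
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

print(aabb_overlap((0, 0, 0), (1, 1, 1), (0.5, 0.5, 0.5), (2, 2, 2)))  # True
print(aabb_overlap((0, 0, 0), (1, 1, 1), (3, 0, 0), (4, 1, 1)))        # False
```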
It's a minimal, basic render.
They do indeed use LuaJIT, and it's functioning normally. You won't get a slowdown if it's not being hammered to death anyway?
Like what was said, the Flash HUDs do take up a lot of render time and create an excessive number of draw calls and primitives. The Send Feedback text is Flash.
There is no need to spend any time trying to disable Flash; I already ran some tests with it and presented the results using the engine's core renderer, which runs @ 1400 fps.
I tried many things to speed it up but was only able to squeeze another 10 fps out of it. Besides that, the other reason was that it's such a ballache to make the interfaces in Flash.
As for the CPU core spread problems, I found this to actually be caused by the Flash: when I replaced the GUI, the CPU usage on core 0 dropped and rose on cores 1, 2 & 3, giving an overall lower usage. Gameplay felt smoother too.
Other than that, without the Flash there are still other issues they are aware of that are being dealt with; it's going to take some time to optimize all the features that have been piled on.
I am reading all your posts, questioning things I don't already know about, and passing on anything new that is found. So keep it coming & good job, all.