Ns 64 Bit .so
mcgoo
Join Date: 2003-04-24 Member: 15794
Running NS in 64-bit mode
Hey guys,
I'm looking for a way to run NS effectively on our dual Opteron 64-bit systems. At the moment we need to run it as a 32-bit application on a 64-bit platform because there is no ns_amd64.so library in the install.
Do any devs have suggestions on how we can get NS to run in 64-bit mode? Has this file been developed and just not publicly released?
Any help would be appreciated!
Comments
Thanks for the reply.
I second that!
I don't believe an actual 64 bit machine is needed to make a 64 bit build though. They'd just need it to test if it works.
As always.. pretty much hit the nail on the head Cheezy :)
Michael, thanks for the offer. I got your email, but since then I've been in discussions with Flayra and other people about it, trying to figure out the right way to go. None of us have Opteron-based systems yet, and we simply cannot and will not put source on an "external" machine.
(Aside: Now *here's* an opportunity for some AMD Marketing bod to gain some serious kudos by offering to send us a few uber leet Opteron machines for free :D)
What we *can* do is cross-compile to AMD_64 and have you guys who have Opteron-based servers test in some closed tests with us?
This will probably come in the week *after* beta 4 is released... as you can imagine, for B4 we need to focus on getting the thing running first (you guys can run 32-bit emulation for a few more days, surely? :) ) before looking at 64-bit builds.
That sound fair?
joev
We will put our stamp of love on it, and it will be solid.
Now, debugging symbols... You might want to let those get onto other people's boxes, but definitely strip that from the release.
QUOTE (joev): That sound fair?
What sounds fair is you using a really new gcc build, say 3.3.2 with needed-patches and linked to the newest glibc. That would make me uber-happy. -m64 pretty much takes care of the rest.
We will have the 64-bit hardware deployed to the datacenter within a week or two for use in closed testing. Hopefully it'll be ready before you have the AMD64 .so done.
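For testers who receive a build and want to confirm it really came out of a `-m64` compile, one quick check is the ELF header of the shared object. The sketch below is only an illustration, not anything the NS team shipped: the function names `describe_elf` and `describe_file` are made up for this example. It relies on two fields defined by the ELF format itself, the class byte (`EI_CLASS`, 1 for 32-bit and 2 for 64-bit) and the `e_machine` field (3 for i386, 62 for x86_64).

```python
import struct

ELF_MAGIC = b"\x7fELF"
MACHINES = {3: "i386", 62: "x86_64"}  # EM_386 and EM_X86_64 from the ELF format


def describe_elf(header):
    """Classify an ELF file from its first 20 bytes: returns (bits, machine)."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])      # EI_CLASS byte
    endian = {1: "<", 2: ">"}.get(header[5])  # EI_DATA byte (byte order)
    if bits is None or endian is None:
        raise ValueError("unrecognised ELF class or byte order")
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return bits, MACHINES.get(machine, "other (%d)" % machine)


def describe_file(path):
    """Hypothetical helper: read just enough of a file on disk to classify it."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))
```

Calling `describe_file("ns_amd64.so")` on a correctly built library should report 64 bits and `x86_64`; a 32-bit result would mean the toolchain silently fell back to its default target.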
Sounds excellent. More than happy to participate in the testing. The best way to contact me is #mcgoohq @ irc.enterthegame.com (I'm on there 19-ish hours per day, 7 days a week).
Otherwise there is always email (you already have my address, seeing as you got my email).
Please contact me as soon as you have compiled so we can start testing. It should be very interesting to see how we can tweak NS in 64-bit mode :)
```
amd64-dev 32bit-land # cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 4
model name : AMD Athlon(tm) 64 Processor 3400+
stepping : 8
cpu MHz : 2202.872
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips : 4308.99
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
amd64-dev 32bit-land # uname -a
Linux amd64-dev 2.6.5-gentoo-r1 #4 Mon Apr 26 05:20:29 CDT 2004 x86_64 4 GNU/Linux
amd64-dev 32bit-land # cat /proc/version
Linux version 2.6.5-gentoo-r1 (root@amd64-dev) (gcc version 3.3.3 20040217 (Gentoo Linux 3.3.3, propolice-3.3-7)) #4 Mon Apr 26 05:20:29 CDT 2004
```
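The line that matters in that dump for this thread is `flags`: the `lm` ("long mode") entry is what marks the CPU as able to run x86-64 code. A minimal sketch of checking for it, assuming you pass in text in the `/proc/cpuinfo` format shown above; the function names `cpu_flags` and `supports_64bit` are hypothetical, chosen just for this example.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no flags line found


def supports_64bit(cpuinfo_text):
    """True if the CPU advertises x86-64 long mode (the 'lm' flag)."""
    return "lm" in cpu_flags(cpuinfo_text)
```

On a Linux box you would feed it the contents of `/proc/cpuinfo`. Note that this only says the hardware is capable; whether the kernel is actually running in 64-bit mode is what the `x86_64` field in `uname -a` shows.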
:O i would like to say:
w00t w00t
It's for me. Hurry up!!!
:D
1. Get NS 3.0 ready.
2. BUS
3. Improve Linux server performance (there are many things that I wish to do to see if I can eke out better performance in general)
I do not (yet) have a 64bit machine to build this stuff on. That means I have to get together a complete cross compiling toolchain for amd64. This takes time (more time than I expected).
So, in short, not for a while yet. It is on my list of things to do but the performance gains you might get are minimal versus the effort I have to make and as such it's far down my priority list.
joev.
Like what? What compiler and options have you been using? Have you tried icc? As someone that has developed on Linux since the mid-90s and on Unix longer than that, I'm curious. Have you considered releasing different -march builds?
QUOTE (joev): It is on my list of things to do but the performance gains you might get are minimal versus the effort I have to make and as such it's far down my priority list.
The effort it would require of you must be monumental to make a 30% performance gain seem minimal. From benchmarking software in 32-bit and 64-bit mode, I can say the gain is definitely significant. This single A64 3400+ under 64-bit Linux is roughly equal to dual 2.4 GHz Xeons in multi-threaded performance.
If you do any kind of serious work in Linux like I do, you ought to do yourself a favor and build a 64-bit Linux box to work with. ATI doesn't have 64-bit OpenGL under Linux yet, but nVidia's driver works flawlessly. The performance I've seen is higher than what I expected; I've had nothing but pleasant surprises.
QUOTE: Like what? What compiler and options have you been using? Have you tried icc? As someone that has developed on Linux since the mid-90s and on Unix longer than that, I'm curious. Have you considered releasing different -march builds?
As someone who has been developing on BSD, UNIX, VAX VMS, PC DOS, x86 and Z80 ASM and CPM since the mid 80's and on Linux since the <b>early</b> 90's (Kernel 0.98) I can tell you that the world is never as simple as it may seem.
The main problem I have with the linux build is that HLDS_L is compiled against glibc 2.1 using gcc 2.95.2. Because of this and many complications that we have come across if we don't, we match this build system.
Quite simply, there are so many problems with this that they are too numerous to mention (if you know your GCC's and glibc's you'll know what I'm talking about though) but a simple example is the woeful performance of the libstdc++ library that is statically linked to NS.
We're moving to STLPort, which is much more performant. It should also hopefully enable us to move to a newer compiler/glibc setup (since the main reason we match VALVe's build system is the libstdc++/glibc combo clobbering stuff), but that's not without its own problems, which we are currently hammering out.
I have thought about releasing builds for different architectures, of course I have, and I think that, given we need a 64-bit .so, this is the time to do it. But again, anything we do must fit into our priority list.
QUOTE (joev): It is on my list of things to do but the performance gains you might get are minimal versus the effort I have to make and as such it's far down my priority list.
QUOTE (HK - Heretic @ May 8 2004, 10:18 AM): The effort it would require of you must be monumental to make a 30% performance gain seem minimal. From benchmarking software in 32-bit and 64-bit mode, I can say the gain is definitely significant. This single A64 3400+ under 64-bit Linux is roughly equal to dual 2.4 GHz Xeons in multi-threaded performance.
Strange, any benchmarks I've seen quote 64bit performance as about 10-15% better than the equivalent 32bit processor. I don't see that software, designed for a 32bit address space, is going to leap in performance like that for no reason. Mind you, HLDS_L itself for 64bit is (finally) compiled against glibc 2.3 using gcc 3.3.2 so that's gonna help in any case.
For this reason, I have been trying to get the ns .so to compile against at least glibc 2.2 using gcc 3.3. I'm not even positive at this stage if that will offer anything more than a token percentage or two performance benefit since the HL executable itself will still be compiled against the older versions but we live in hope.
In any case, once that's working then all I need to do is get my 64bit toolchain together and hit the "compile now" button.
I probably should have made it clearer that one thing on my list (performance improvements) begets the other (64bit). In this context, put simply, focusing on 64bit versions is putting the cart before the horse.
Even if there is a huge hike in performance (30% as you say) I still have to weigh the effort involved in harnessing that improvement against the number of people who would actually benefit. I see it as a higher priority to investigate ways to make gains for *all* server operators rather than a select few.
QUOTE (HK - Heretic @ May 8 2004, 10:18 AM): If you do any kind of serious work in Linux like I do, you ought to do yourself a favor and build a 64-bit Linux box to work with. ATI doesn't have 64-bit OpenGL under Linux yet, but nVidia's driver works flawlessly. The performance I've seen is higher than what I expected; I've had nothing but pleasant surprises.
Rest assured, as soon as I can *afford* it, I'll be getting my hands on a x86_64bit machine. However keep in mind that this is a volunteer effort on our part and most of us are not made of money.
All in good time.
joev.
Your statements basically agree with and back up all my earlier statements on the matter back in February (http://www.unknownworlds.com/forums/index.php?showtopic=60836&st=0&#entry940926). As I mention in that post, there are intrinsic benefits to the 64-bit x86 extensions that eliminate or greatly reduce restrictions in the x86 ISA. You gain performance purely on architecture tweaks. That's the 10-15% you mention. The additional 15-20%, as seen in 64-bit Counter-Strike, is probably attributable to using gcc-3.3; really it's some synergy between the two, and the combined effect of using both is probably more than the sum of the individual gains. I was 99% sure you were using 2.95; that was about the only thing that could account for the performance difference.
In general, 10-15% increase is about right for most applications on average. Applications that are register starved benefit more, and applications that are floating-point intensive will benefit more (no more floating point stack).
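To see how the two ranges quoted here could add up to something near the 30% figure from earlier in the thread, it helps to compound them instead of adding them. This is only back-of-the-envelope arithmetic under the assumption that the architecture gain and the compiler gain are independent and multiply; nothing in the thread measures that, and `combined_gain` is a name invented for this sketch.

```python
def combined_gain(*gains):
    """Compound independent fractional speedups (0.10 means +10%)."""
    factor = 1.0
    for g in gains:
        factor *= 1.0 + g
    return factor - 1.0


# Ranges taken from the post: 10-15% from the architecture, 15-20% from gcc 3.3.
low = combined_gain(0.10, 0.15)   # both at the low end
high = combined_gain(0.15, 0.20)  # both at the high end
print("combined gain: %.1f%% to %.1f%%" % (low * 100, high * 100))
```

Under that assumption the combined gain lands at roughly 26% to 38%, slightly more than the plain sums of 25% and 35%, which is consistent with the "more than the sum" remark above.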
If Valve released a 32-bit build using modern glibc and gcc, the 32-bit builds would also get that 10-20% (or whatever) performance increase, but I won't hold my breath on that. I do not think compiling just the NS shared object will give much of a performance benefit; I suspect most of the serious computation is still done within the engine itself. Most of the benefit, in fact, probably comes from the engine binaries being optimized better. The whole thing with x86_64 is that you abandon compatibility and get a performance boost for doing so by targeting a higher common level of functionality. Blah blah blah. You've been around, you know all this, right?
I appreciate your work on this. The flipside of you having to set up a cross-compiling environment to build 64-bit binaries from 32-bit hardware is that I have to build and maintain a 32-bit environment on a primarily 64-bit system, or I could use emulation libraries and take a performance hit. One way or the other there is extra work, but it is not without benefit. Maintaining a 32-bit chroot in a 64-bit environment is interesting and I haven't quite worked out all the kinks, particularly in giving non-root users access to it without a nasty setuid script.
Whenever you want to test a 64-bit build on the 4th most heavily trafficked NS server in the world (3rd most in the USA), we'll be waiting. We'll get those code paths tested =)
QUOTE (-HK-Heretic @ May 10 2004, 01:57 PM): I do not think compiling just the NS shared object will give much of a performance benefit; I suspect most of the serious computation is still done within the engine itself. Most of the benefit, in fact, probably comes from the engine binaries being optimized better.
In fact, the reality of the matter is exactly that. We know specifically where the biggest 'hog' is and it's in the HL engine (AddToFullPack... grrr) which, unfortunately, we don't have access to.
However, as you surmise, my hope is that simply by having a better optimising compiler we might eke out another few % spent computationally in the NS .so itself. At this stage, every little bit helps.. even if we can eliminate a few stray computations per tic per entity, that helps NS performance immeasurably (as you can imagine).
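The point both posters agree on, that a faster NS .so only helps in proportion to the time actually spent inside it, is Amdahl's law. A rough sketch follows; the share of CPU time spent in the .so is not given anywhere in this thread, so the percentages below are purely hypothetical, as is the function name `overall_speedup`.

```python
def overall_speedup(fraction_in_mod, mod_speedup):
    """Amdahl's law: whole-server speedup when only the mod code gets faster.

    fraction_in_mod: share of CPU time spent inside the game .so (0..1)
    mod_speedup:     how many times faster that share becomes (1.3 = 30% faster)
    """
    if not 0.0 <= fraction_in_mod <= 1.0:
        raise ValueError("fraction_in_mod must be between 0 and 1")
    if mod_speedup <= 0:
        raise ValueError("mod_speedup must be positive")
    return 1.0 / ((1.0 - fraction_in_mod) + fraction_in_mod / mod_speedup)


# Hypothetical splits: no profile numbers are quoted in the thread.
for share in (0.1, 0.2, 0.5):
    gain = (overall_speedup(share, 1.3) - 1.0) * 100
    print("%.0f%% of time in the .so -> %.1f%% faster overall" % (share * 100, gain))
```

With a guessed 20% of time in the .so, even a 30% faster .so works out to under 5% for the server as a whole, which is why the engine side (which the NS team cannot rebuild) dominates the outcome.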
QUOTE (-HK-Heretic @ May 10 2004, 01:57 PM): I appreciate your work on this. The flipside of you having to set up a cross-compiling environment to build 64-bit binaries from 32-bit hardware is that I have to build and maintain a 32-bit environment on a primarily 64-bit system, or I could use emulation libraries and take a performance hit. One way or the other there is extra work, but it is not without benefit. Maintaining a 32-bit chroot in a 64-bit environment is interesting and I haven't quite worked out all the kinks, particularly in giving non-root users access to it without a nasty setuid script.
I appreciate you guys having patience. I realise it's a pain to have your cool 64bit processors "ramped down" and hence playing poor cousin but I assure you it won't be forever and as soon as I can I'll get on to it. As I said, I believe simply the move to gcc 3.3 will be the bulk of the work.. then I can concentrate on getting 64bit builds working.
QUOTE (-HK-Heretic @ May 10 2004, 01:57 PM): Whenever you want to test a 64-bit build on the 4th most heavily trafficked NS server in the world (3rd most in the USA), we'll be waiting. We'll get those code paths tested =)
Great stuff. I'll be on to ye as soon as I can.
joev.
This is interesting stuff, and server admins don't get much detail on this level most of the time. If we were to look at this from the server side what would be the best way to tune a system to make up for this sort of slowdown? More ram? More system l2/l3 cache? Just more CPU?
I guess what I'm asking is: "We all acknowledge that the valve hlds engine is the largest hog of cpu, so how can we tweak our systems to deal with that?" Not specific to 64 bit processors I guess, but more general hlds config tweaks we could implement to chop out some CPU hogging routines.
Also, are there sections of the HL API (such as it is) that AMX and metamod coders should avoid to keep from making high-CPU calls?
The cheapest way, which requires no hardware upgrade, is to run the well-optimized Win32 binary; i.e. don't run a Linux server. That is the unfortunate reality. Processors that run old i387 stack-based floating-point efficiently will be a better choice for hlds_l. Pentium4s feel the brunt of the Linux support blow. AMD processors fare better. For some reason going from hlds 1.1.1.0 -> 1.1.1.1+ incurs a significant performance penalty under Linux.
As it will be a cold day in hell before I run a Windows server on hardware that I personally own, I got the fastest AMD solution within my budget. It should run hlds a lot better than the current Pentium4 in 32-bit mode, and it should perform competitively with the windows server when the 64-bit .so ships. If you're going to make a hardware change to compensate, I would suggest an Athlon64 or Opteron system. If you're stuck with a Pentium4 server, try stomaching Windows.
You can make a nice dual 1U opteron for about $1600 now if you build it yourself.
I'm familiar with most euro game server companies, and they don't use them because of cost (game server companies always take the cheap way out). Maybe the odd single-server one-man band here and there runs 64-bit, maybe a few in the US...
Are there any stats on how many people are running 64-bit? I'd be surprised if more than 5% of the servers in the world were 64-bit.
Am I correct, or should I get ready to be surprised ;-)
btw Heretic, nice server.
In terms of quantity, perhaps. In terms of capital investment, no way. Enterprise Class servers are all pretty much 64-bit. A couple million dollar POWER5 boxes will do the work of about 1000 decent x86 boxes.
If you're just talking about x86-64 versus x86, then yea. But EM64T (Intel's 64-bit) server stuff is out now, so pretty much all -new- x86 servers are going to be 64-bit from here out. Clearly the direction of the future has been determined.
We started with a single 64-bit proc on our cluster to validate the concept and the software platform. Now we're ready to move on to 4- or 8-processor servers; we can use the exact same software environment on a $10K server as we do on this $1.2K server. It took hundreds of hours to set everything up correctly, but the last 6+ months of usage have validated it as a viable, stable platform. We're still working towards a standards-compliant single environment with /lib64 and /lib32. We're not quite there yet, as there are still some libraries we need that don't adhere to the standards, but we get around that by having separate 32-bit environments we can chroot into.
64-bit Counter-Strike and 64-bit Unreal Tournament run well enough. As for this game, seeing as how every release introduces more bugs than they fix, I doubt we'll ever see a non-beta 32-bit build much less any sort of 64-bit build. A 64-bit Linux HL2 and Tribes Vengeance dedicated server will probably be out before too long. Personally, I've lost my interest in NS.
Sorry, didn't make myself clear. I meant how many people run NS on 64-bit hardware. I know a lot of corporates use 64-bit platforms now, but I was specifically asking about 64-bit hardware in terms of running NS servers.
Oh right. Not many. Besides the people in this thread, I know of one other company that runs NS on a dual Opteron server: www.warmfuzzyland.com. You can't really rent a 64-bit server; you pretty much have to own the hardware and colocate it inside a datacenter. This greatly limits the number of servers, as I'd say most x86 hardware is leased.
However, as datacenters buy new servers, we'll be seeing 64-bit leasing plans in the future. I like to keep ahead of the pack. An ns_amd64.so would run faster on the same hardware; however, the hardware is so fast it runs the 32-bit build as fast as anything else: 22 NS players use half an Athlon 3400+. *shrugs* I'll run an NS server as long as people play on it, but with all the sweet games just over the horizon...