Linux Latency Issues
Rotten_Flesh
Join Date: 2002-11-18 Member: 9203Members
<div class="IPBDescription">How it works . . .</div> This post is taken from Valve's hlds_linux mailing list:
<!--QuoteBegin--></span><table border='0' align='center' width='95%' cellpadding='3' cellspacing='1'><tr><td><b>QUOTE</b> </td></tr><tr><td id='QUOTE'><!--QuoteEBegin-->There seems to be a lot of confusion about latency issues in an HL
server: what exactly those pingboost options do, and when and why to
use them. I'm the author of the original UDP Soft pingbooster, so I
know a thing or two about this issue. I'm sure I will be repeating a
lot of info for many of you, but I decided to write it all down once
and for all. All of the recommendations I give here are of course just
my opinions, but I like to think of myself as being well informed
about these things.
First, what is wrong with the normal way a Half-Life server does things?
An HL server normally tries to work like this:
sleep for 1ms
while (it's not yet time for the next frame)
    sleep for 1ms
receive and handle all incoming packets
send updates to clients if updaterate and rate settings allow it
repeat forever
So basically an HL server always sleeps at least 1ms between frames,
and longer if enough time (determined by sys_ticrate) has not passed
since the start of the previous frame. This is supposed to keep the
server frame rate exactly at sys_ticrate. The problem with this approach
is that you can't sleep for just 1ms on a normal Linux system.
Trying to sleep for any amount of time takes at least 10ms. This
is due to the scheduling frequency of the Linux kernel, which is normally
100Hz, so one timer tick lasts 10ms (for a slightly longer explanation,
check out the nanosleep manual page).
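A quick way to check this on your own box is to time a few short sleeps. This is only a rough sketch: Python's time.sleep() stands in for the sleep call the server makes, and the function name measure_sleep is made up for illustration. On a 100Hz kernel like the one described here you would expect results in the 10-20ms range; newer kernels may come much closer to 1ms.

```python
import time


def measure_sleep(requested_s=0.001, samples=20):
    """Return the actual durations (in seconds) of repeated short sleeps."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)  # ask for 1 ms, like the server's frame loop
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    results = measure_sleep()
    print(f"requested 1.000 ms, shortest {min(results) * 1000:.3f} ms, "
          f"longest {max(results) * 1000:.3f} ms")
```

If the shortest value is far above 1ms, the kernel timer granularity is what limits the frame loop, not your hardware.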
So the way an HL server actually works looks something like this:
sleep for 10-20ms
while (it's not yet time for the next frame)
    sleep for 10-20ms
receive and handle all incoming packets
send updates to clients if updaterate and rate settings allow it
repeat forever
This causes some serious latency, because the HL server is sleeping while
incoming packets from clients are stacking up and waiting to be
handled. With this system, it's basically impossible to get more than
50fps out of an HL server (a 20ms sleep per frame allows at most
1000/20 = 50 frames per second), no matter how fast the hardware
running it is.
Now, what my pingbooster software, and all those -pingboost options
do, is try to change the way the server behaves between frames, when
it normally tries to do that "1ms sleep loop".
I thought that if the server has some work to be done (incoming
packets from clients), it should immediately start doing that work,
and not sleep. So in my pingbooster I replaced the sleep call the HL
server was using with a select call, which listens on the network
connection and waits until there are incoming packets. I also set
sys_ticrate to some crazy number like 10000, so that the HL server would
always think that it's time for the next frame. I also raised
sv_maxupdaterate from 60 to 100, because I like getting fresh updates
for every frame in my client :)
So the HL server loop was changed to:
wait until there are incoming packets from clients
receive and handle all incoming packets
send updates to clients if updaterate and rate settings allow it
repeat forever
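The loop above can be sketched roughly like this. It is not the pingbooster or HLDS code, just a minimal illustration of blocking in select() on a UDP socket instead of sleeping. The names serve_frames and handle_packet, the 4096 byte buffer and the timeout are all assumptions made for the example.

```python
import select
import socket


def serve_frames(sock, handle_packet, max_frames, timeout_s=1.0):
    """Run a simplified frame loop that blocks in select() instead of sleeping.

    Returns the number of frames that processed at least one packet.
    """
    frames = 0
    sock.setblocking(False)
    while frames < max_frames:
        # Wait until the socket is readable (or the timeout expires).
        readable, _, _ = select.select([sock], [], [], timeout_s)
        if not readable:
            break  # nothing arrived in time; a real server would keep waiting
        # Receive and handle all packets that are already queued.
        while True:
            try:
                data, addr = sock.recvfrom(4096)
            except BlockingIOError:
                break
            handle_packet(data, addr)
        # A real server would send its client updates here.
        frames += 1
    return frames


if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(b"move forward", server.getsockname())
    serve_frames(server, lambda data, addr: print(addr, data), max_frames=1)
    client.close()
    server.close()
```

The point of the design is that the process wakes up as soon as a packet arrives, so the wake-up delay no longer depends on the timer tick.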
This turned out to be very effective in reducing internal latencies in
an HL server. Replacing the sleep call with waiting for client
packets is also exactly what the -pingboost 3 option does. So if anyone is
still using my pingbooster, you can get the same performance by using
-pingboost 3, and setting sys_ticrate to some extremely high number
(like 10000), and sv_maxupdaterate to 100. With those settings,
your server never waits when it has some work to do. So if you see
99% CPU usage, it just means there is something to do all the time,
and the server is as responsive as it can be with your hardware.
I haven't checked lately what pingboost 1 and 2 options do, but at
least a few months ago they just used different system calls to try to
sleep that 1ms (and failing IIRC), and I wouldn't recommend using those
options. Correct me if I'm wrong.
There is also a way to "fix" Linux to work better with the default HL
server configuration. You can increase the scheduling frequency of
the Linux kernel by editing a file in the Linux kernel source and
recompiling the kernel. The file is include/asm-i386/param.h, and you have to
change the line "#define HZ 100" to "#define HZ 1000". I've been told
that this is actually the default in the 2.5 series of kernels. This should
cause an HL server to behave more or less exactly as it was originally
designed. However, I personally happen to disagree with that design,
and recommend using the -pingboost 3 option regardless :) At least on "one
server/cpu" systems.
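To see why the HZ value matters so much, here is the arithmetic as a small sketch. It assumes, as described earlier in the post, that a nominal 1ms sleep really lasts one to two timer ticks, and it ignores the time spent actually processing a frame. The function names are made up for the example.

```python
def sleep_range_ms(hz):
    """Return (min, max) real duration in ms of a nominal 1 ms sleep.

    Assumes the sleep is rounded up to one or two timer ticks, where a
    tick lasts 1000 / hz milliseconds.
    """
    tick_ms = 1000.0 / hz
    return tick_ms, 2 * tick_ms


def worst_case_fps(hz):
    """Frame rate if every frame pays for the longest (two-tick) sleep."""
    _, longest_ms = sleep_range_ms(hz)
    return 1000.0 / longest_ms


if __name__ == "__main__":
    for hz in (100, 1000):
        low, high = sleep_range_ms(hz)
        print(f"HZ={hz}: sleep takes {low:g}-{high:g} ms, "
              f"about {worst_case_fps(hz):g} fps")
```

With HZ 100 this gives the 10-20ms sleeps and the 50fps ceiling mentioned above; with HZ 1000 the same sleep drops to 1-2ms.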
When you want to run a quality server (and when I say a quality server,
I don't mean the kind of crap that 99% of the HL servers out there are),
your number one priority should be getting it running at over
100fps consistently. No one cares if it runs at 400fps when there is no
activity and everyone is camping. It has to stay over 100fps when all
**** hits the fan, when it really matters. Client input should be
handled as quickly as possible, and the only way to do that is to have
a high frame rate on the server.
Almost all of the CPU time an HL server uses goes to handling
incoming packets and generating updates for clients. So a lot depends
on what kind of settings your clients are using.
If all your players are just average newbies with the default
cl_updaterate 20 and cl_cmdrate 30, you can get away with a fairly
large maxclients value on relatively slow hardware. And they are
not likely to recognize a good server if it bit them in the **** anyway
:) On the other hand, if you have mostly more experienced players with
fat pipes who use settings like cl_updaterate 101 and cl_cmdrate
101, you need a significantly more powerful server to keep it
running at 100fps.
Tips for increasing your server performance:
1. Use -pingboost 3 +sys_ticrate 10000 +sv_maxupdaterate 100
2. If you have more than 1 server/cpu, drop all except 1.
3. Decrease maxclients
4. Decrease the number of metamod mods, or remove metamod completely. They
truly use up incredible amounts of CPU if you consider how little
processing power all the features in those mods should require. Someone
should really look into this.
5. Decrease sv_maxupdaterate.
For reference on what kind of hardware you might need, I've been
testing with a 1GHz AMD processor, and without metamod I can run a 14
player server (with clients on fast connections) without any
significant slowdown. With metamod plus some small mods, that number goes
down to about 10.
Almost all of the information I have about the internal workings of an
HL server I got from using a very simple debugging utility called
strace, and I recommend that anyone interested in learning more
about what makes these programs tick check it out.
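If you want to try this yourself, something like the following builds an strace command line that attaches to a running server. The -p, -T and -e trace= options are standard strace options, but the list of system calls to watch is only a guess at what is interesting for this problem, and the helper name strace_command and the PID are placeholders.

```python
def strace_command(pid, syscalls=("select", "nanosleep", "recvfrom", "sendto")):
    """Build an strace command line that attaches to a running process."""
    return [
        "strace",
        "-p", str(pid),                       # attach to an existing process
        "-T",                                 # show time spent in each call
        "-e", "trace=" + ",".join(syscalls),  # limit output to these calls
    ]


if __name__ == "__main__":
    # 1234 is a placeholder; use the PID of your own server process.
    print(" ".join(strace_command(1234)))
```

Long pauses inside the sleep-type calls between bursts of recvfrom and sendto are the kind of pattern the post describes.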
For example, the performance differences between versions 3.1.1.0 and 3.1.1.1
are clearly visible when you compare the strace outputs of the two
versions. It seems that in version 3.1.1.1, generating an update for a
client can take up to 3 times longer than it does in version 3.1.1.0,
and that's what is causing it to perform so badly.
Also, I would recommend that Valve test using the sched_yield() call instead
of usleep() for the purpose of releasing the CPU to other applications.
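A rough way to compare the two approaches is to time them side by side. This is just a sketch: os.sched_yield() is the Python wrapper for sched_yield() and is only available on Unix-like systems, time.sleep(0.001) stands in for a 1ms usleep(), and the helper names are made up for the example.

```python
import os
import time


def time_call(fn, repeats=100):
    """Return the average wall-clock seconds per call of fn."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def compare_yield_and_sleep(repeats=100):
    """Measure sched_yield() against a nominal 1 ms sleep."""
    yield_s = time_call(os.sched_yield, repeats)
    sleep_s = time_call(lambda: time.sleep(0.001), repeats)
    return yield_s, sleep_s


if __name__ == "__main__":
    yield_s, sleep_s = compare_yield_and_sleep()
    print(f"sched_yield: {yield_s * 1e6:.1f} us per call")
    print(f"1 ms sleep:  {sleep_s * 1e6:.1f} us per call")
```

Yielding gives other runnable processes a chance at the CPU without forcing the caller to wait for the next timer tick, which is presumably why it is suggested here.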
- Zibbo<!--QuoteEnd--></td></tr></table><span class='postcolor'><!--QuoteEEnd-->
Comments
thx!
-f!
On the server of course
<!--QuoteBegin--></span><table border='0' align='center' width='95%' cellpadding='3' cellspacing='1'><tr><td><b>QUOTE</b> </td></tr><tr><td id='QUOTE'><!--QuoteEBegin-->Now, what my pingbooster software, and all those -pingboost options
do, is try to change the way the server behaves between frames, when
it normally tries to do that "1ms sleep loop".<!--QuoteEnd--></td></tr></table><span class='postcolor'><!--QuoteEEnd-->
Zibbo made that ping booster, and as you can see he talks about it in that article.
server configuration. You can increase the scheduling frequency of
the Linux kernel by editing a file in the Linux kernel source, and
recompiling it. The file is include/asm-i386/param.h, and you have to
change the line "#define HZ 100" to "#define HZ 1000". I've been told
that this is actually a default in 2.5 series of kernels. This should
cause a HL server to behave more or less exactly as it was originally
designed. However, I personally happen to disagree with that design,
and recommend using -pingboost 3 option regardless At least with "one
server/cpu" systems.<!--QuoteEnd--></td></tr></table><span class='postcolor'><!--QuoteEEnd-->
This is stating that 1ms is not really 1ms in Linux :) And it is true. You can define a "static" one like that, but if you really want to experiment, use <a href='http://www.kernel.org/pub/linux/kernel/people/rml/variable-HZ/' target='_blank'>http://www.kernel.org/pub/linux/kernel/peo...ml/variable-HZ/</a> and you can echo values into your /proc fs to change the HZ on the fly to see what works best for your server. I have had pingboost 3 be very unstable on my server. I have not yet applied this patch and cannot confirm its stability. If you use this method, it should allow you to run multiple HLDS servers without pingboost (or with pingboost 1) without sacrificing a single CPU. This is understood for NS, but it might also help you if you run a dualie with other mods/games.
Btw, thnx for the sticky
can any <b>experienced linux server admins</b> share some insight into this? i am anxious to try this out, but do not yet have as much information as i'd like before i begin to experiment with this...
thanks!
-f!
wanna explain em?
<b>EDIT:</b> and roughly how much more cpu usage per slot can I expect to have using this with say a 2.2 ghz 1 gig ram with redhat?