Intermittent Short Lock Ups - RAID Card / Caching ?

Gnolevil Join Date: 2012-11-02 Member: 166143 Members
Hello all,

I'm having an interesting issue here. Every few minutes or so, I experience a hard lock-up of about 500 ms, then the game recovers. This only started happening after I installed a RAID controller.
My current setup:
i5 2500k
Asus Z77
8GB DDR3
570 GTX
120GB SATAIII SSD -> On Mobo
LSI 9260-4i RAID Controller
4x WD RE4 2TB in RAID 10 -> On Controller

The RAID 10 is split into 2 arrays: the first is 500GB and the second takes the remainder (for performance).

My guess as to why this is happening (a second opinion here would be awesome) is that the card is trying to read ahead. I have caching enabled on the card for both reads and writes. LSI's 'read ahead' lets the card try to predict what will be read next and cache it.
Would caching have any significant impact in a gaming environment?
The OS and Steam are stored on a different drive. The Steam 'common' folder is a junction pointing to the array.
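
For anyone weighing the read-ahead theory, here is a toy model of the idea in Python. It is only a sketch under stated assumptions: the `ReadAheadCache` class, its `capacity` and `read_ahead` parameters, and the block numbers are made up for illustration and do not describe how the LSI firmware actually works. It shows the general trade-off: on a miss the cache fetches the requested block plus the next few, which pays off for sequential reads and is mostly wasted work for scattered ones.

```python
import random
from collections import OrderedDict


class ReadAheadCache:
    """Toy LRU block cache that prefetches the next `read_ahead` blocks on a miss.

    Hypothetical model for illustration only, not the controller's real algorithm.
    """

    def __init__(self, capacity=64, read_ahead=8):
        self.capacity = capacity
        self.read_ahead = read_ahead
        self.blocks = OrderedDict()  # block number -> True, oldest first
        self.hits = 0
        self.misses = 0
        self.disk_reads = 0  # blocks actually fetched from the disks

    def _store(self, block):
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block

    def read(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)
            return
        self.misses += 1
        # On a miss, fetch the requested block plus the blocks right after it.
        for b in range(block, block + 1 + self.read_ahead):
            self._store(b)
            self.disk_reads += 1


def simulate(pattern, **kwargs):
    """Feed a sequence of block numbers through a fresh cache and return it."""
    cache = ReadAheadCache(**kwargs)
    for block in pattern:
        cache.read(block)
    return cache


if __name__ == "__main__":
    sequential = simulate(range(1000))
    scattered = simulate(random.Random(0).sample(range(100_000), 1000))
    for name, c in (("sequential", sequential), ("scattered", scattered)):
        print(name, "hits:", c.hits, "misses:", c.misses, "disk reads:", c.disk_reads)
```

In this toy run the sequential pattern is served mostly from cache, while the scattered pattern misses almost every time and still pays for the extra prefetched blocks. Whether that cost is large enough to explain a 500 ms stall on real hardware is a separate question that the model cannot answer.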

Any feedback will be appreciated. Even the "You're an idiot" :)

Thanks in advance.

Comments

  • Confused Wait. What? Join Date: 2003-01-28 Member: 12904 Members, Constellation, NS2 Playtester, Squad Five Blue, Subnautica Playtester
    I would try turning on the loadtimes command once you are in the ready room; you should be able to tell quickly whether the stutter is loading related. If it isn't, try r_stats and net_stats, and the source should show up pretty quickly.

    In terms of tweaking your RAID, I've got nothing on that front :)
  • IronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669 Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    If those steps don't produce anything, please let us know.
    I have a command for you to try, but it comes with complicated instructions, so it's a last resort.
  • DC_Darkling Join Date: 2003-07-10 Member: 18068 Members, Constellation, Squad Five Blue, Squad Five Silver
    RAID-wise, I need more info. Start with the basics.

    * Check for driver updates, naturally for the correct 32/64-bit version.
    * How many disks does each stripe (0) array of your total mirror (1) configuration have? (I take it 2 disks in stripe per array?)
    * Is heat an issue with your disks?
    * Did you completely reinstall Windows from scratch if Windows is ON the array? If I read it right, you don't have it on the array. If you do not install it fresh when RAID is enabled, enabling it later on will cause weird issues which are insane to troubleshoot.
    Also, sometimes this can still be a problem if Windows is not on the actual array.
    * You have RE disks... so that's good!
    * I'm also at a loss with your 500GB comment. Last I checked, stripe disks must be equal size, and mirror arrays must be equal size.

    I have no experience with caching on the card. I enabled caching on my disks. (I use an onboard mirror (1) with a mere 2 disks.) Granted, not as shiny.. but it gets done what I want it to do. hehe
    But last I checked, caching can be safely enabled/disabled to test.

    * Did you check with and without hotplugging enabled?

    I would definitely just make a backup, like a Ghost image, and try a clean Windows install. Worked wonders on my setup, even with Windows not being on the array. World of difference.
    Also make sure your OS disk drivers are up to date and in AHCI mode.
  • Gnolevil Join Date: 2012-11-02 Member: 166143 Members
    Thanks for the replies so far guys.
    I'll give the loadtimes a shot.

    @DC_Darkling

    * Check for driver updates, naturally for the correct 32/64-bit version.
    --Up to date

    * How many disks does each stripe (0) array of your total mirror (1) configuration have? (I take it 2 disks in stripe per array?)
    --2x mirrors that are striped. Total of 4 disks.

    * Is heat an issue with your disks?
    --No errors present

    * Did you completely reinstall Windows from scratch if Windows is ON the array? If I read it right, you don't have it on the array. If you do not install it fresh when RAID is enabled, enabling it later on will cause weird issues which are insane to troubleshoot.
    --Windows is on the SSD. It is a relatively fresh install. I think you're on to something here. I used to have a different RAID controller (Adaptec 3405 with 2x 500GBs on it) and it worked fine. If the issues persist, I might as well just go with a clean install. If it means anything, I do have Windows 7 Enterprise installed.

    Also, sometimes this can still be a problem if Windows is not on the actual array.
    * You have RE disks... so that's good!
    * I'm also at a loss with your 500GB comment. Last I checked, stripe disks must be equal size, and mirror arrays must be equal size.
    --Sorry, I was not very clear there. I have 2 RAID 10s across those 4 drives. The first RAID is 500GB, and the second is the remainder of the disks. I did this for performance reasons. As you are more than likely aware (but great info for others :D ), you get the highest I/O performance toward the outer edge of the platters.

    I have no experience with caching on the card. I enabled caching on my disks. (I use an onboard mirror (1) with a mere 2 disks.) Granted, not as shiny.. but it gets done what I want it to do. hehe
    But last I checked, caching can be safely enabled/disabled to test.
    --I plan on trying this later today. I turned off all forms of caching on the array, both read and write.

    * Did you check with and without hotplugging enabled?
    --I have not.

    I would definitely just make a backup, like a Ghost image, and try a clean Windows install. Worked wonders on my setup, even with Windows not being on the array. World of difference.
    Also make sure your OS disk drivers are up to date and in AHCI mode.
    --Just flashed the SSD a week back, and verified all drivers are up to date and running on AHCI.

    Appreciate all the help so far, gentlemen!
    I think technology just hates me :( .. That, or I try to use things in improper ways .. hehe
  • DC_Darkling Join Date: 2003-07-10 Member: 18068 Members, Constellation, Squad Five Blue, Squad Five Silver
    The more I read about your RAID setup, the more it confuses me.

    You have 4 disks.. but you state you have 2 "10" RAIDs.
    If you have 1 mirror in each array, each mirror must have 2 stripes, coming to a total of 8 disks..
    Unless you mean you do not use the full disk for each array, but a % of a disk per array, so 8 parts? I can see the logic in doing this to use the fastest part of the disks, but it still seems to overcomplicate things.
    I have not actually tried that, so I cannot say how it will influence performance or stability.

    Do note that Windows defragments regularly on its own. I don't notice it on my slower RAID 1 setup, but perhaps it matters for yours?
    Also, I set my sector size to max. I'd rather lose disk space than performance. (That is on the partitions; I left the settings in the array on default.)


    Caching usually increases performance. The reason it can be turned off is simply to avoid having data in the cache that can be lost in a power outage. In most cases, cache for performance!

    Good point on your SSD flash. That can work wonders.
    Some SSDs also do not like some controllers. Just a hint there.

    Windows 7 performance is... odd if you don't reinstall after installing new controllers. I would definitely urge you to make it a priority.
    I could give a long list of personal weird shit from before I reinstalled mine for the exact same reason, but let's not. hehe
  • Gnolevil Join Date: 2012-11-02 Member: 166143 Members
    Sorry Darkling, didn't mean to confuse you on that one. I did do what you said and did not use the full disk for each array. Not sure why I was not able to say that in the beginning ...

    But anyhow, I did disable all forms of caching for that array; however, I still experience the lock-ups. I enabled loadtimes and have been checking after every instance of the lock-up. Each time, I see that NS2 is loading one or two items. The most noticeable episode was when I got insta-gibbed by a railgun and the game hard locked for a good second trying to load all the gib models.

    As much as I should re-install Windows, I really do not want to. I've re-installed Windows on this machine at least 6 times in the past month, so I hope you guys understand my reluctance to do so. :)

    Again, thanks for the feedback !
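
The layout confirmed above (two RAID 10 arrays, each using only part of all 4 disks) can be sanity-checked with some quick arithmetic. This is a minimal sketch, assuming nominal decimal sizes (2TB = 2000GB), no space lost to controller metadata, and a controller that spreads each array evenly across every disk. The function name `raid10_layout` and its parameters are hypothetical.

```python
def raid10_layout(disk_count, disk_size_gb, first_array_gb):
    """Estimate how two RAID 10 arrays share one set of disks.

    Assumes each array takes an equal slice from every disk (an assumption,
    not something confirmed for this specific controller).
    """
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")

    # Mirroring stores every block twice, so usable space is half the raw space.
    usable_gb = disk_count * disk_size_gb / 2
    if not 0 < first_array_gb < usable_gb:
        raise ValueError("first array must be smaller than the usable capacity")

    second_array_gb = usable_gb - first_array_gb

    # Each array's raw footprint is twice its usable size, split over all disks.
    first_per_disk_gb = first_array_gb * 2 / disk_count
    second_per_disk_gb = second_array_gb * 2 / disk_count

    return {
        "usable_gb": usable_gb,
        "first_array_gb": first_array_gb,
        "second_array_gb": second_array_gb,
        "first_per_disk_gb": first_per_disk_gb,
        "second_per_disk_gb": second_per_disk_gb,
    }


if __name__ == "__main__":
    for key, value in raid10_layout(4, 2000, 500).items():
        print(f"{key}: {value:.0f}")
```

With 4x 2000GB disks and a 500GB first array, this gives 4000GB usable in total, 250GB per disk for the first array, and a 3500GB second array taking the remaining 1750GB of each disk. Under these assumptions, 4 disks are enough for both arrays.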
  • DC_Darkling Join Date: 2003-07-10 Member: 18068 Members, Constellation, Squad Five Blue, Squad Five Silver
    If it doesn't make a difference, go for cache on, on the disks!
    As for controller cache, I have no experience, so I cannot say.

    The problem here is that possible disk/RAID/AHCI-related errors usually require a reinstall, so I must advise it regardless. (I split most of my OS to a single disk to minimize work after a reinstall.)
    You could make a Ghost image of a clean install next time, right? :P

    Did you check hard disk speed? And I mean actual speed. You should not experience problems loading stuff on the fly.
    Check Event Viewer for errors, especially the disk-related ones it shows.
    Are you running with S.M.A.R.T. enabled?


    In the event that it is NS2 and not your hardware.. run the profiler command. (IronHorse is more capable with it than I am.)
    You cannot use the spacebar while it runs; it pauses or caps the game then (can't remember which).
    Basically, you hit spacebar after a lag period. Look for purple bars and expand them fully so no "+" remains. Post the screenshot. IronHorse or another Playtester will have clues. (I usually do not hehe)


    How are you running RAID 10? Completely controller controlled? Software controlled?
    NS2 is a very very very VERY CPU-intensive game. So if the RAID setup needs CPU time, the game will start hitching.
  • Gnolevil Join Date: 2012-11-02 Member: 166143 Members
    I'll give that profiler command a shot. Thanks for the info on that !

    I went through the Event Viewer events and I do not have any disk errors. I also checked the controller and that is clean as well. As far as the S.M.A.R.T. stuff goes, the short answer is: I have no idea. I have yet to find a way to read S.M.A.R.T. data from a drive sitting on an LSI controller. I'm still looking into a way to do so, but at this time I have no reason to believe these drives have errors on them.

    As for speed, I ran 2 ATTO benchmarks, one at 1MB length and one at 64MB length, both shown below. These results are with caching and read ahead enabled. Part of me is starting to think that the read ahead was never disabled and may still be causing this issue.

    The RAID 10 is sitting on the controller. All of the "work" for the RAID is done by the controller, and not by the CPU.

    1MB
    1mbbench.png
    64MB
    64mbbench.png
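
ATTO measures sequential transfers, while the lock-ups reported earlier coincide with the game loading one or two items. A complementary check is to time how long individual files take to open and read. The following is a rough sketch, not a proper benchmark: the function names and the 0.1 s threshold are assumptions chosen for illustration, and the operating system will serve recently read files from RAM, so only a first pass after a reboot says much about the array itself.

```python
import os
import statistics
import sys
import time


def time_file_reads(root, max_files=200, chunk_size=1 << 20):
    """Read up to `max_files` files under `root`; return a list of (seconds, path)."""
    timings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            start = time.perf_counter()
            try:
                with open(path, "rb") as fh:
                    while fh.read(chunk_size):  # read the whole file in chunks
                        pass
            except OSError:
                continue  # skip files that cannot be opened or read
            timings.append((time.perf_counter() - start, path))
            if len(timings) >= max_files:
                return timings
    return timings


def summarize(timings, slow_threshold=0.1):
    """Summarize read times and list files that took at least `slow_threshold` seconds."""
    durations = [seconds for seconds, _path in timings]
    slow = sorted(
        ((seconds, path) for seconds, path in timings if seconds >= slow_threshold),
        reverse=True,
    )
    return {
        "files": len(durations),
        "median_s": statistics.median(durations) if durations else 0.0,
        "max_s": max(durations, default=0.0),
        "slow": slow,
    }


if __name__ == "__main__":
    # Pass the folder to scan as the first argument; defaults to the current folder.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    report = summarize(time_file_reads(root))
    print(f"{report['files']} files, median {report['median_s'] * 1000:.1f} ms, "
          f"max {report['max_s'] * 1000:.1f} ms")
    for seconds, path in report["slow"]:
        print(f"  {seconds * 1000:.0f} ms  {path}")
```

Pointing this at the game folder on the array and then at a copy on the SSD would give a like-for-like comparison. A handful of files taking hundreds of milliseconds on the array only would be consistent with the loading stalls.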
  • DC_Darkling Join Date: 2003-07-10 Member: 18068 Members, Constellation, Squad Five Blue, Squad Five Silver
    Not entirely sure if it supports S.M.A.R.T. then.. I never tried LSI.

    Running low on ideas at the moment.. transfer rates aren't that bad either.
  • Gnolevil Join Date: 2012-11-02 Member: 166143 Members
    Alright ... Been busy and finally got a chance to hop on.

    I was able to reproduce this 100% of the time by forcing the game to load something. In this case, I had myself (as a skulk) hit by a railgun to cause gibs. Here are a few screen caps taken directly after the lag. Two are with the profiler open, and one shows the loadtimes output.
    2013032900001.jpg
    2013032900002m.jpg
    2013032900003.jpg