Fatal1ty K6+ BSOD - Need some fresh ideas |
heapson, Newbie. Joined: 03 May 2016. Location: Townsville. Status: Offline. Points: 22 |
Posted: 03 May 2016 at 11:17am |
I have a self-built system that is misbehaving badly, and I need to determine the
cause of the issue so I can replace something. I'm convinced it's a
hardware or BIOS issue. I've had 136 Blue Screens on the main build.
I've also had 39 blue screens on a clean installation with a different
hard drive over the course of my adventure.
I strongly suspect a motherboard or RAM compatibility problem, but I'm not sure. The blue screens occur more frequently when copying large files or running long hard drive diagnostics such as the SeaTools Long Generic test or chkdsk /r.
System:
Things I've tried:
So, can anyone suggest another plan of attack, or something I've missed, like a checkbox that says "behave"? I'm really out of ideas. I'm happy to replace anything, but I need to work out what to replace. I now have time to work on this 100%, so I'll keep plugging away and update this thread as I try your suggestions.
Cheers.
Mick.
Edited by heapson - 04 May 2016 at 7:09pm |
heapson |
I forgot to mention anything about the actual blue screens. I had another one just as I was about to post the first version of this reply, so that's 137 now!
Almost all are KERNEL_SECURITY_CHECK_FAILURE (139) in ntoskrnl.exe+142760. The process name varies and appears to be fairly random. I haven't analysed the minidumps from the fresh installation yet. I might have a look at those now and post back. The very first blue screen on that installation, before I'd installed all the drivers from the ASRock site, was BAD_POOL_HEADER. |
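One way to keep track of which stop codes dominate on each installation is a simple tally. The following is only a minimal sketch: it assumes the crash list has been copied by hand into (installation, stop code) pairs, the sample entries are made up for illustration, and `tally_bug_checks` is a hypothetical helper name. The hex values are the standard Windows bug check codes for the two stop codes named above.

```python
from collections import Counter

# Standard Windows bug check codes for the two stop codes named in the post.
BUG_CHECK_CODES = {
    "KERNEL_SECURITY_CHECK_FAILURE": 0x139,
    "BAD_POOL_HEADER": 0x19,
}


def tally_bug_checks(crashes):
    """Count crashes per (installation, stop code name) pair."""
    return Counter(crashes)


# Hypothetical sample data, NOT the real crash history from this system.
sample = [
    ("main", "KERNEL_SECURITY_CHECK_FAILURE"),
    ("main", "KERNEL_SECURITY_CHECK_FAILURE"),
    ("fresh", "BAD_POOL_HEADER"),
]

for (install, name), count in tally_bug_checks(sample).most_common():
    code = BUG_CHECK_CODES.get(name)
    code_text = f"0x{code:X}" if code is not None else "unknown"
    print(f"{install}: {name} ({code_text}) x{count}")
```

If one stop code dominates on one installation while the other installation shows a spread of codes, that difference is worth mentioning when asking for help.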
heapson |
The fresh Windows 10 installation is on a SATA SSD, so I'll call that the SATA SSD, and my original installation is on my M.2 SSD.
The SATA SSD seems to be far less stable than it used to be, even with all other drives disconnected and the M.2 drive removed. I tried to analyse the dump files but it kept crashing. It's up to 49 blue screens now. The crashes seem to be much more random on the SATA SSD than on the M.2 SSD.
I rebooted back to my M.2 SSD and had a quick look at the crashes from both installations. All of the crashes without a bug check string in the images are KERNEL_SECURITY_CHECK_FAILURE.
Another thing I've tried but haven't mentioned is manually setting the memory timings, at the stock 2133MHz 15.15.15.36 1.2V and at the rated but overclocked 3200MHz 16.18.18.36 1.35V. It doesn't change things. The XMP profile still seems to be the most stable.
M.2 SSD (original installation, which used to be stable): [image]
SATA SSD, which has never been stable even with a fresh installation: [image]
Edited by heapson - 03 May 2016 at 1:26pm |
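For reference, the two manual memory settings mentioned above can be compared in absolute terms. The sketch below converts a CAS latency from clock cycles to nanoseconds. It assumes the quoted 2133 and 3200 figures are DDR transfer rates (two transfers per clock, so the real clock is half that number) and that the first value in each timing set is the CAS latency. The function name `cas_latency_ns` is illustrative only.

```python
def cas_latency_ns(transfer_rate_mts: float, cas_cycles: int) -> float:
    """Convert CAS latency from clock cycles to nanoseconds.

    Assumes DDR memory: two transfers per clock, so the clock in MHz is
    half the transfer rate and one cycle lasts 2000 / transfer_rate ns.
    """
    if transfer_rate_mts <= 0:
        raise ValueError("transfer rate must be positive")
    return cas_cycles * 2000.0 / transfer_rate_mts


# Settings quoted in the post (first timing value taken as CAS latency).
settings = {
    "stock 2133 CL15": (2133, 15),
    "rated 3200 CL16": (3200, 16),
}

for label, (rate, cl) in settings.items():
    print(f"{label}: {cas_latency_ns(rate, cl):.2f} ns")
```

Under those assumptions the 3200 setting has the lower absolute CAS latency (10.00 ns against roughly 14.06 ns) despite the higher cycle count. This says nothing about which setting is stable on a given board.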
Xaltar, Moderator Group. Joined: 16 May 2015. Location: Europe. Status: Offline. Points: 25953 |
Welcome to the forums.
First up, good job so far on the troubleshooting. I don't see any mention of reseating the CPU and checking for bent pins. The issues you describe seem pretty random, which inclines me to believe you have a data corruption problem, likely caused by the RAM or memory controller. I would try some different RAM from the support list; generally Kingston and G.Skill work well with ASRock boards, from what I have seen. If you have not already, you should also perform a full CMOS clear.
heapson |
Thanks for the suggestions. I haven't tried either of those but I have flashed the BIOS back to 1.8, the original one it came with, and then back to 2.6 when it didn't help. I'll try a full CMOS reset. It has a button but I'll have to look up how it works.
The system was stable for the first two months of the year, so I can't see how the CPU could be seated incorrectly or have bent pins, and I'm reluctant to change anything. The CPU temperature is about 28°C at idle and rarely gets above 60°C. Do you really think I should reseat it? I do have thermal paste, but the system is currently using the original Corsair thermal compound. |
heapson |
If it was memory corruption or bad RAM, wouldn't Memtest show this? It ran for 34 hours last weekend while I was away, with zero errors or crashes.
Memtest does show different timings. I can't explain that, but it displays 19.15.15.31 even when the timings are manually configured in the BIOS. I live in a remote area, and buying any component such as RAM takes a week, so I want to make sure I buy the right piece of the puzzle. |
Xaltar |
Fair enough. We have seen a lot of RAM compatibility issues with Skylake, and random corruption-related BSODs in the OS quite often stem from the RAM/memory subsystem. In that case I would start by removing the CPU and visually inspecting both it and the socket. Make sure the CPU isn't warped and that there are no bent pins in the socket. Once you have it all reseated, clear CMOS via the battery removal method, then boot into the OS with factory defaults (no changes made in UEFI). If that does not resolve the issue, you should run some CPU stress tests to determine whether the CPU (memory controller) has gone bad. I would recommend IntelBurnTest (IBT) for 10 passes, then Prime95 with standard settings for 6 to 8 hours.
I would not risk updating the BIOS while the system is unstable, so let's leave that as a last resort.
heapson |
Removed the CPU, cleaned the thermal compound, checked for warping and bent pins, removed the CMOS battery for 30 minutes and set the clear CMOS jumpers, re-seated the CPU with Arctic Silver thermal compound.
Booted. Nothing. Reset the CMOS jumpers and booted. Still nothing. Removed the power cable from the motherboard and reconnected it. Booted fine. Loaded the default UEFI settings and set the clock. Booted to my SATA SSD. IE crashed almost immediately and then nothing would open or respond, not even CTRL + ESC. The computer wouldn't restart or turn off. Then it blue screened before I could power it down manually. This one was PAGE_FAULT_IN_NONPAGED_AREA. Reloaded the default UEFI settings and booted to my M.2 SSD. Posted this. I'll try the stress tests you recommend. |
heapson |
Literally less than a second after posting that, KERNEL_SECURITY_CHECK_FAILURE, the standard one for this particular installation.
I'll try loading the XMP memory profile and see if I can get the stress tests to run. It seems to be the most stable configuration for some reason. Thanks again for your time and suggestions. Very much appreciated. |
Xaltar |
The failure to boot after a CMOS clear happens sometimes, especially if the PSU was still holding a charge at the time the jumper was set to clear. Removing the power from the board allows the board to clear any residual charge it may be holding and subsequently reset the power safety features. The same effect could have been achieved by leaving the system powered off (at the wall) for 20 minutes to 1 hour (depending on the quality of the PSU).
There is clearly data corruption occurring, but at this point I can't tell you whether the CPU, RAM or motherboard, or some combination thereof, is at fault. Is it at all possible for you to test another power supply with the system? That is another major potential candidate for BSODs. I should have mentioned that at the start. The first step in any kind of hardware troubleshooting should always be trying another PSU. You shouldn't need an overly powerful one to test with just the CPU, RAM, SSD and motherboard, so long as it has all the requisite power connectors for the system.