Short-circuit in B550M Steel Legend
a13antichrist
Newbie Joined: 24 May 2018 Location: Amsterdam, NL Status: Offline Points: 136 |
Posted: 22 Sep 2023 at 6:02am |
I have had my B550M Steel Legend for about 24 months and it has been plagued by random reboots the whole time. Mostly it is just a flat-out system reset with no warning, blue screen or anything.

Over the past few weeks it has been giving a WHEA_UNCORRECTABLE_ERROR blue screen whose percentage counter never advances, so I always have to hard-reset manually when that happens.

I had suspected a short-circuit somewhere because every now and again, when I plug a USB pen drive into the rear, the whole thing shuts down instantly. Not very often, but often enough to be noteworthy. That is when I got suspicious about the board. On top of that, it would almost -always- crash/reset if I had Polychrome open for long enough, so I got pretty good at flipping it open, configuring quickly and closing it again. I do have a lot of RGB in here, so I thought maybe I had hooked something up incorrectly or crossed a wire. But I have since disconnected all the RGB (unplugged the ARGB headers) and the issue continues.

Now, on a recent crash, my 2TB PCIe Gen4 SSD (installed in the slot under the armor/GPU) didn't come back up again. It's no longer recognised in any slot, case or reader. It's fried. Not nice. So I ordered a new SSD, thinking, well OK, maybe it was an SSD issue and that's what was causing the crashes. But no. The system crashes with the new SSD (a different model) as well. Only 3 times so far, but that's in a week, so not exactly rare either.

There is one curiosity I have noticed: HWiNFO reports wayward readings for the metric Power Reporting Deviation (Accuracy). At the moment it looks like this:

Current: 64.4% | Minimum: 50% | Maximum: 141.3% | Average: 67.6%

For info, the max CPU core temp at this point is 64°C. I have no overclock in the system, except that I am using 3600MHz RAM with the resulting Infinity Fabric matching, I guess. But I have also run the RAM at 2666 just to be sure and the reset still happens.

The specs are:
Ryzen 9 5950X
RTX 3090 FE (but I had a 3080 originally and had the same issues)
64GB (4x16GB) 3600MHz, full-auto
Intel X520 10G NIC
2TB main M.2 under the armor/GPU + 4TB SSD in the second (PCIe Gen3) slot
Lian Li O11D Mini and, well, 14x 120mm fans :E
PSU is a Cooler Master V850 Gold SFX. I also have Gelid extender cables on both the GPU (2x) and the 24-pin ATX.

If I had a spare PSU, I would test swapping that out, but I don't, and anyway there is nothing in SFX over 850W that is remotely good value, so it would be cheaper to swap out the motherboard in that case. Though less convenient, obviously.

So the question is: are there any known issues with this board, or any other reports of similar problems? Or are there any clues here that someone can recognize as potentially indicative? Appreciate any thoughts.
|||
R9 5950X | ASRock B550M Steel Legend | 64GB 3600MHz | RTX 3090 FE
R3 2200G | ASRock Fatal1ty AB350 ITX | Dell Latitude 7470 | QNAP @58TB | MikroTik routers/wifi
|||
eccential
Senior Member Joined: 10 Oct 2022 Location: Nevada Status: Offline Points: 4810 |
I'm not aware of any known issues with the board. I have one and it's been perfectly reliable, like all my ASRock (and one ASRock Rack) systems. I've built 9 of them.

The one with the B550M Steel Legend is my most powerful system, with a 5800X3D, 64GB, four SSDs (two NVMe, two SATA), and two BD-RW drives. My GPU is just an RX 6600. All powered by a Seasonic Fanless Prime PX-500.

One thing I noticed is your RAM setup. You have four DIMMs at 3600. That's probably kind of hard on the memory controller. I only have two DIMMs (32+32), and they're 100% JEDEC-spec 3200MT/s ECC sticks. JEDEC-spec, so 22-22-22 timing. LOL

If you suspect shorting somewhere, I'd remove everything and rebuild, inspecting each component as I re-seat it. Watch out for motherboard standoffs that shouldn't be there and whatnot.

I've had PCIe devices not working due to dirty contacts, but they just didn't work at all, rather than causing intermittent issues. A wipe with alcohol fixes them.
|||
a13antichrist
I've been leaning towards a rebuild, but I have three fans on a tower CPU cooler and it was *really* a b*tch to get it all mounted, so I'm trying to avoid a full disassembly heh.

4 DIMMs @3600 could be a problem, you reckon? You know, I did suspect memory initially.. just because an immediate power-off usually ties to a memory fault. But the thing with the USB drive.. that *has* to be electrical. The other thing is that, at least before my latest refresh (new OS), the main time it would hang was during high GPU load.. which correlates with high memory usage but also potentially with the PSU, since the card draws a lot of power under load. So many variables and no way to really test any of them. :/

Or is it possible to disable memory banks in BIOS.. I don't recall seeing the option.
|||
eccential
You can always try down-clocking (urr, normal-clocking) the RAM. No need to disassemble anything, since it's just a BIOS setting.

*OFFICIALLY*, the max supported speed for 4 DIMMs is 2667MT/s, unless all 4 DIMMs are single-rank. Then, 2933MT/s is officially supported.

Most recent 16GB DIMMs are single-rank, because DRAM density has gone up over the years. If the vendor doesn't give you a good spec, the rule of thumb is: if it has DRAM chips on both sides, it's dual-rank. There should be 16 DRAM chips total (2Rx8), or 18 chips if ECC. If the DRAM chips are only on one side, it's likely single-rank (1Rx8).

Technically, 2Rx16 (8 DRAM chips, each 16 bits wide) might be possible, but I've never seen such a monstrosity. 1Rx16 DIMMs are horrible for performance.
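For anyone who wants the rule of thumb above in one place, here is a rough Python sketch. The function names are made up for illustration, and it only encodes what the post states: chips on both sides suggests dual-rank, chips on one side suggests single-rank, and with 4 DIMMs the official ceiling is 2933MT/s if every DIMM is single-rank and 2667MT/s otherwise. It is a guess aid, not a substitute for the vendor's datasheet.

```python
from typing import Sequence


def guess_ranks(chips_on_both_sides: bool) -> int:
    """Rule-of-thumb guess (hypothetical helper): chips on both sides of the
    DIMM suggests dual-rank, chips on one side only suggests single-rank."""
    return 2 if chips_on_both_sides else 1


def official_max_speed_4_dimms(ranks_per_dimm: Sequence[int]) -> int:
    """Officially supported speed in MT/s for a 4-DIMM setup, using only the
    two figures quoted in the post (2933 all single-rank, else 2667)."""
    if len(ranks_per_dimm) != 4:
        raise ValueError("this sketch only covers the 4-DIMM case")
    return 2933 if all(r == 1 for r in ranks_per_dimm) else 2667


if __name__ == "__main__":
    # Example: four sticks, each with DRAM chips on both sides.
    ranks = [guess_ranks(chips_on_both_sides=True) for _ in range(4)]
    print(official_max_speed_4_dimms(ranks))
```

Either way, 3600 on four DIMMs sits well above the official figure, so dropping the speed in BIOS is a cheap test.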
|||
a13antichrist
My RAM is Crucial Ballistix 16-18-18-38. I do notice it says 1.35V on the sticks, while I seem to recall seeing 1.2V in BIOS. Will check this further.

One thing I found today though, after another random reset that happened while the computer was entirely idle:

Event ID 18: WHEA Error Logger.
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 0
The details view of this entry contains further information.

This might align with the Power Reporting imbalances I mentioned above.
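To compare these entries across crashes, the Event ID 18 text quoted above can be split into its fields. The snippet below is just a sketch: the label list is taken from this one message, the function name is hypothetical, and it assumes the message has been copied out of Event Viewer by hand as a single string.

```python
import re

# Field labels as they appear in the message quoted in this post (assumption:
# other WHEA entries may carry different or additional labels).
LABELS = ["Reported by component", "Error Source", "Error Type", "Processor APIC ID"]


def parse_whea_message(text: str) -> dict:
    """Split a flattened WHEA event message into a {label: value} dict."""
    pattern = "(" + "|".join(re.escape(label) for label in LABELS) + r"):\s*"
    # With a capturing group, re.split returns [preamble, label, value, label, value, ...]
    parts = re.split(pattern, text)
    return {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}


if __name__ == "__main__":
    message = (
        "A fatal hardware error has occurred. "
        "Reported by component: Processor Core "
        "Error Source: Machine Check Exception "
        "Error Type: Cache Hierarchy Error "
        "Processor APIC ID: 0"
    )
    for label, value in parse_whea_message(message).items():
        print(f"{label}: {value}")
```

If the Error Type and APIC ID come out the same after every crash, that would at least say the failures are consistent rather than random.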
|||
eccential
Well, all that says is there was an uncorrectable hardware error.

If you disable "Automatically restart" in the "Startup and Recovery" settings, you might actually see the BSOD, not that it would help you. And if you have it dump memory, somebody might be able to help pinpoint the actual error. But even that might be useless if you're getting a different and unrelated error each time.

Even after just a quick online search, I see all kinds of different things can cause the MCE, and therefore the fixes are just as diverse. One guy fixed it with a new power supply. Others replaced different parts. Even re-seating things (CPU, PCIe cards) might help.

I've personally fixed a system by cleaning the motherboard socket once. I flooded the empty socket with CRC Electronics cleaner and wiggled the lever around to make sure all the contacts got touched by the cleaner. But this was an obvious situation, as someone (not me) had let thermal paste fall into the socket.

Anyway, almost anything and everything can cause an MCE. So I don't think a random nobody like me can really help guide you here.

One of the reasons why all my PC builds use ECC memory is to eliminate unknowns, if only just one.
|||
a13antichrist
I see the BSOD, or at least I did before the SSD assassination. It says WHEA_Uncorrectable_Error ;) You're right, nothing helpful heh. But every crash seems to be followed by this same Processor error. And I do see wayward 'power reporting' figures. So that seems correlated. On the other hand, a 'live' line from the MB to the case that resets the machine if I touch the USB stick to it.. that could literally manifest as any issue at all.
Many things could lead to an MCE. But how many things can lead to an MCE *and* a live case?
I usually keep spare parts of everything on hand to check/test with. Saves the fluffing around. But I'm trying to go minimal now, I'm even getting rid of my stock of RGB :D
|||
Skybuck
Groupie Joined: 18 Apr 2023 Status: Offline Points: 975 |
I believe this may be some Windows 11 related error and might also require some kind of BIOS update... not sure... I think MSI motherboards had this issue... strange to see it on ASRock as well?
There might also be a Windows 11 update/fix for this...
|||
a13antichrist
So I did a bit more playing around.. ripped everything out.. re-seated a few MB standoffs.. not sure that did much.. but I can't seem to crash it now with the USB stick, so that's a good sign heh.
I started running OCCT, and the Memory, GPU and GPU Mem tests all passed.. then I tried CPU. This is what I got:
So.. that looks.. informative. Can I draw anything from that? Or maybe it was a bad test? I'm new to OCCT so not sure if this is conclusive...
|||
a13antichrist
It just flash-reset again, literally while doing nothing but reading over my post above.