Tomahawk b450 max II + Ryzen 7 3700X - fixed crashing with bios setting

geof152002d6 (New member, joined Jan 29, 2024, 8 messages)

Hi,

System:
MSI B450 Tomahawk Max II
BIOS 7C02vHC/2023-10-27
AMD Ryzen 7 3700x
PowerColor Red Devil Radeon 7900 XTX
80 GB ram (32 + 32 + 8 + 8), running at default speeds
NVME: 1TB, 4TB
Windows 11 Pro 22H2 all updates applied
AMD driver 24.1.1
A big Cooler Master CPU cooler


Joined the forum just to post this. For the past month, since upgrading to the latest BIOS and adding 2x32GB DIMMs, the system has been unstable and crashing randomly under load when using the GPU. Sometimes after a few minutes, sometimes after a few hours. The crash was a black screen with all fans and lights still on. Not just a graphics crash, since the machine was unreachable over the network after this happened.

The only way to get the system to work again was to hold down power button for 10 seconds and then power back on. Sometimes after this happened it was also necessary to reinstall the AMD drivers to get the GPU working as well.

First I tested the memory with memtest86 and got some errors - must be the new RAM, right? Nope... loaded BIOS defaults (F6) and retested with the pro edition for 10 cycles/70+ hours - zero errors. Tested the system from an Ubuntu 23.10 bootable USB using s-tui - no errors.

Switched the GPU to run on 3x PCIe power leads since crashes were occurring under GPU load. Not sure if this made any difference.

Next up, test the GPU, since crashes always happened when the GPU was running. Ran FurMark and Unigine Heaven both at the same time - no problems.

Finally went to test the CPU in Windows and got errors after a minute or so:
* Prime95: rounding error
* OCCT: "CPU error"

Looking on the forums, this Prime95 error is normally caused by low CPU voltage when overclocking - but I'm not overclocking, I'm using BIOS defaults.
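The general idea behind this kind of error report can be illustrated with a small consistency check: run the same deterministic computation over and over and flag any run whose result differs from the first. This is only a sketch of the principle, not how Prime95 or OCCT work internally, and a single-threaded Python loop will not load a CPU anywhere near as hard as those tools. The function names `workload` and `count_mismatches` are made up for this example.

```python
import hashlib


def workload(seed: int, rounds: int = 2000) -> str:
    """Deterministic CPU-bound work: repeatedly hash a 4 KiB buffer derived from seed."""
    data = seed.to_bytes(8, "little") * 512
    for _ in range(rounds):
        data = hashlib.sha256(data).digest() * 128
    return hashlib.sha256(data).hexdigest()


def count_mismatches(passes: int = 20, seed: int = 1234) -> int:
    """Repeat the workload and count results that differ from the first run."""
    reference = workload(seed)
    return sum(1 for _ in range(passes) if workload(seed) != reference)


if __name__ == "__main__":
    # On stable hardware identical input must always give identical output.
    print("mismatches:", count_mismatches())
```

Any non-zero count would mean the machine produced two different answers to the same calculation, which points at hardware or settings rather than software.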

Went to BIOS to see if I could increase voltage, and while looking I found the option "Precision Boost Overdrive" set to "auto". It messes with CPU speed and power, which is suspicious. Set it to disabled, rebooted and reran Prime95, OCCT, FurMark and Heaven all at the same time for over an hour - zero errors.

System has now been stable under load for 24+ hours - fixed!

Looks like this BIOS setting is bugged? My next step would have been to start RMAing hardware. I'm not overclocking this system at all and have no interest in ever doing so. Since this setting was auto after loading defaults, perhaps it should be set to disabled since it's listed in the overclocking section? Or perhaps it's some error with the setting on this board?

In all I spent about 5 days looking at this, so hopefully this helps someone or the BIOS can be fixed...

Cheers,
Geoff
 
Already tried that - temps were fine. System is on a TP-Link smart plug too and has a Corsair 1000HX platinum power supply. I caught it crashing while looking at the wall outlet power usage and it was well under 600 W at the time. It's been up almost a day and a half now with no issues. It's totally the BIOS setting that fixed it.
 
Here's a screenshot of current temps and voltages - this system runs at 100% GPU and CPU load 24/7

1706670473917.png
 
This is so strange - after being on for almost a week it's started crashing again. So it looks like that setting helped a little bit, at least initially, but now I'm able to crash the whole system after a couple of seconds of FurMark - it ran for hours with no issues before. Same Windows and Radeon drivers... what the heck is going on 😂

Could this power supply be dying?
 
Hey, I've been having a similar issue since updating since I had an NVMe drive. Tomahawk B450 Max II, Ryzen 5 5500, 32 GB of RAM, RX 6700 XT. I replaced my PSU and it stopped for a while, then started again, then I reseated some RAM, which seemed to help, then it happened again sometimes. Now my PC has straight up just died, CPU EZ Debug LED always on.
 
I fixed my problem and system has now been fully stable for a month or so.

I was able to swap components with another PC and eliminate bad hardware by trial and error. The cause of my problem turned out to be combining memory types/using all four slots. I could not use my full 80GB of RAM - I could only use the 64GB kit or the 16GB kit.

I was so happy with the 64GB kit that I bought another one to max out the ram to 128GB using all 4 slots and this caused crashing again. BIOS selects speed 2400 with this kit by default which crashes 100% of the time. After going to overclock section and selecting 2133 speed system is rock solid - nothing I do including piping in hot air crashes the system. BIOS bug? .. I spent WEEKS working this out.

@gertmai0d1a0230 if your system is not booting something else is going on with your system. This report relates only to random crashes after booting. That said, I did have a similar problem a while ago due to shorted RESET switch on case - so try disconnecting your case power and reset switch and power on by shorting the power jumper. Good luck!
 
I was so happy with the 64GB kit that I bought another one to max out the ram to 128GB using all 4 slots and this caused crashing again. BIOS selects speed 2400 with this kit by default which crashes 100% of the time. After going to overclock section and selecting 2133 speed system is rock solid - nothing I do including piping in hot air crashes the system. BIOS bug? .. I spent WEEKS working this out.

Do you have a use for that much RAM? Because while such an amount of RAM can be necessary for professional applications, like rendering, video processing, lots of VMs, things of that nature, it will do absolutely nothing for gaming or other normal use. Most games don't even use more than 16 GB RAM. If you only want 128 GB because you think "more is better", but you don't actually have any professional workloads that require it, you're only making things more difficult and expensive, with zero benefit from that much RAM being available.

As for RAM not working as well once you populate all four slots, read RAM explained: Why two modules are better than four / single- vs. dual-rank / stability testing

The stress on the system essentially doubles when you use four modules, and the attainable DDR speed goes down the drain. For example, I would never run the RAM at only DDR4-2133. This is way too low of a speed nowadays, creating a performance bottleneck (because everything the CPU does has to go through the RAM first).
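To put rough numbers on that bottleneck, here is a small sketch of theoretical peak memory bandwidth at different DDR4 speeds. It assumes a dual-channel setup with 64-bit (8-byte) channels, as on this AM4 platform; real-world throughput is always lower than this peak, and the helper name `peak_bandwidth_gbs` is made up for this example.

```python
def peak_bandwidth_gbs(data_rate_mts: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers per second x bytes per transfer x channels."""
    return data_rate_mts * 1e6 * bus_bytes * channels / 1e9


for rate in (2133, 2400, 3200, 3600):
    print(f"DDR4-{rate}: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

On paper, DDR4-2133 tops out at a bit under 60% of what DDR4-3600 can deliver.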

The one and only time that maxing out the RAM capacity makes sense is if one of your workloads requires more RAM than, say, 64 GB, and runs considerably slower because of that limitation. Then you'd upgrade to 128 GB, for example, to make it breathe more freely. If you don't hit such a ceiling at 64 GB total, then 128 GB total will only worsen the performance for everything, because now you have it at a very low DDR speed (the sweet spot nowadays is DDR4-3600, with DDR4-3200 also being acceptable).

The difficulty at even DDR4-2400 comes from the simple fact that this setup of using four big so-called dual-rank modules at once is causing enormous stress on the memory bus. You can immediately cut that stress in half by using only two modules in slots A2 and B2, which is the ideal configuration with any board that has the RAM slots connected in a daisy-chained manner, like almost all modern desktop boards have.

Going back to this:

The cause of my problem turned out to be combining memory types/using all four slots. I could not use my full 80GB of RAM - I could only use the 64GB kit or the 16GB kit.

There can only be one set of RAM parameters for the whole memory system, but then it would have to be some compromise that tries to make RAM modules with completely different ICs (memory chips) happy at the same time, even though they need different parameters from the board. And I'm not even primarily talking about the speed and timings here, I'm talking about the electrical parameters. The bigger modules need different parameters than the smaller ones. So the IMC (integrated memory controller of the CPU) has to do "the splits". It sees two sets of different modules being used, with quite different requirements for certain parameters on the memory bus, but it can only set a "middle ground". But there often is no good middle ground between such different modules. Sometimes it might only boot after manual intervention.

Mixing different RAM is like this:
PSM_V38_D791_An_ordinary_bicycle_with_lines_of_force.jpg


When you really want it to be completely even, like this:

800px-Storck_Scenario_Light_01.jpg


So a kit of two matched modules only is best, followed by four matched modules.
But the latter configuration can cause its own difficulties, because of the high stress that imposes on the memory system.
 
I do have a use for 128GB RAM - Kubernetes: 1x 64GB VM + 3x 12GB VMs + Windows + Fusion360 + browser + other random stuff ;D. I don't have much time for games these days but Final Fantasy XII remake works.
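As a quick sanity check of that budget, the arithmetic can be sketched like this. It assumes every VM has its full memory allocated at the same time, with no overcommit or ballooning, which is an assumption and not something stated above.

```python
vm_gb = [64, 12, 12, 12]   # VM sizes from the post
vm_total = sum(vm_gb)      # memory the VMs need in total

for installed in (64, 80, 128):
    left = installed - vm_total
    print(f"{installed} GB installed: {left:+d} GB left for the host")
```

The VMs alone add up to 100 GB, so only the 128 GB configuration leaves anything for Windows, Fusion360 and the browser.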

That's quite an article, nice one! Did not find that in my own searching.

I have been following this advice:
https://www.crucial.com/blog/memory/mix-and-match-dram

And on temperatures (from Crucial support):
We have received an update from our internal team on this matter. They have confirmed that the operating range of the memory is commercial temperature 0-95°C. This means that the memory can function normally and reliably within this range without any errors.

Ideally, yes, you would match the memory, but I built this machine a few years ago so it's just what I had to hand. I would expect the system to use the lowest common denominator for memory settings, so the bike might look like this:
tinybike.png

... but that's how I'm rolling.

I did hear this "do the splits" somewhere else as well, which is why I tried 4x DIMMs of the same model. Not too concerned about performance on this box, just want it not to crash. Performance is now more than adequate for what I'm trying to do, just really expected the BIOS to do this legwork for me and select working memory settings without needing a manual underclock.
 
I do have a use for 128GB RAM - Kubernetes: 1x 64GB VM + 3x 12GB VMs + Windows + Fusion360 + browser + other random stuff ;D. I don't have much time for games these days but Final Fantasy XII remake works.

I'm relieved to hear that you have a real use for it.

That's quite an article, nice one! Did not find that in my own searching.

Thank you.


That Crucial article has some questionable statements, like "Theoretically, if the other traits (generation, speed, latency, voltage) are the same, there should be no issue using DRAM from two different brands."

Generation, of course, otherwise it physically won't fit. Speed and voltage, ok, easy enough. But the latency: first of all, that's the wrong term, they mean the timings. The latency is the result of the timings (given in clock cycles), which at different clock speeds result in different absolute latencies measured in nanoseconds. Then, people may check that the primary timings match, but there are also secondary and tertiary timings, a lot of which are also covered by the XMP profile, or are derived from other timings, and are highly dependent on the type of ICs (memory chips) being used on a particular module.
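The difference between timings and latency is easy to see with a small calculation. DDR memory transfers data twice per clock, so the clock period in nanoseconds is 2000 divided by the data rate in MT/s, and the absolute CAS latency is the CL value times that period. The speed/CL combinations below are just illustrative examples, not the settings of any kit in this thread.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: int) -> float:
    """Absolute CAS latency in ns: CL cycles x clock period (2000 / data rate)."""
    return cl_cycles * 2000 / data_rate_mts


# Illustrative speed/CL pairs only
examples = [(15, 2133), (16, 3200), (16, 3600), (18, 3600)]
for cl, rate in examples:
    print(f"DDR4-{rate} CL{cl}: {cas_latency_ns(cl, rate):.2f} ns")
```

DDR4-3200 CL16 and DDR4-3600 CL18 both work out to 10 ns, so a higher CL number at a higher speed does not have to mean a slower module.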

The RAM makers always want to reduce their cost, so over time, they will switch to ICs with higher density (fewer chips for the same capacity) or to cheaper, more readily available ICs which can manage the same XMP speed/timings. So they keep the XMP profile exactly identical, but with different ICs under the heatspreader. This can then again cause problems when mixing them, because they'd ideally need different parameters again. And those go beyond "generation, speed, latency [timings], voltage". There are dozens upon dozens more parameters that will be adapted to whatever RAM kit is installed during memory training (you can see when the board is doing memory training by the POST phase taking longer). Of course, with two different kits, it will be going to the lowest common denominator, as you rightly expected.
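For speed and one primary timing, the "lowest common denominator" idea can be sketched as follows: take the lowest rated speed of all modules and the longest minimum CAS delay in nanoseconds, then round that delay up to whole clock cycles. This is a heavily simplified model with made-up module values; it is not the board's actual training algorithm and it ignores the many other timings and the electrical parameters mentioned above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    max_data_rate_mts: int   # highest speed the module is rated for
    min_cas_ns: float        # shortest CAS delay the module is rated for, in ns


def common_settings(modules: list[Module]) -> tuple[int, int]:
    """Return (data rate, CL) that satisfies every module's rating."""
    rate = min(m.max_data_rate_mts for m in modules)
    slowest_ns = max(m.min_cas_ns for m in modules)
    clock_ns = 2000 / rate  # DDR: two transfers per clock
    cl = math.ceil(slowest_ns / clock_ns - 1e-9)  # round up to whole cycles
    return rate, cl


# Hypothetical kits, values chosen only for illustration
kit_a = Module("hypothetical 2x32GB", 3200, 13.75)
kit_b = Module("hypothetical 2x8GB", 2400, 13.32)
print(common_settings([kit_a, kit_b]))
```

With these example values the mixed setup ends up at DDR4-2400 CL17, slower than the bigger kit's rated speed and looser than the smaller kit's CL16 at the same speed.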

Performance is now more than adequate for what I'm trying to do,

It's good that you are ok with this RAM performance. For things like gaming, this would be a severe bottleneck. But your workloads sound like they would suffer much more from there being less RAM available than from the RAM running at a low speed.

just really expected the BIOS to do this legwork for me and select working memory settings without needing a manual underclock.

Sadly, the BIOS is not that intelligent. It will try to memory-train whatever you (or the XMP profile) set, and then it will either fail and you get a message about it, or it will pass and you won't get one. It doesn't intelligently try something that might be more appropriate; that's purely on the user to figure out.
 