MPG Z690 Carbon Wifi DDR5, Random BSODs.

saber88156a02d3

New member
Joined
Jun 7, 2023
Messages
5
Hello, I recently started having issues with my computer: random BSODs that all point to different errors. At first I thought it was my GPU, but then one day all of my USB connections just died.
My setup is.
Intel 12900k
MSI MPG Z690 Carbon Wifi DDR5
32GB Corsair Vengeance (2x16) DDR5 5600MHz CL36
MSI Suprim X 4090
Seasonic prime GX 1300W
Samsung 980 pro (with the good firmware)

No OCs or anything except XMP. The system was stable until about 1.5 to 2 weeks ago, when I started having issues, so I updated my BIOS at that point. Then the problems got worse and I had lots of GPU crashes, so I went through the clean install process for the Nvidia drivers with DDU. It was stable for a few days, then it started again.

1686162424211.png
1686162307840.png
1686162411288.png

Some random CPU-Z / GPU-Z screenshots.
I don't know what more to say. I'm working nights, so I'll pop in and provide any info needed when I can.
 
What happens if you turn off XMP? Does it work fine then?
It's been running fine for 1 year on XMP. The issue started randomly about 1 to 1.5 weeks ago. The biggest hardware change was in December, when I installed the new graphics card. But I could try without XMP when I get back home.

When it started, lots of the errors were connected to the GPU, so I used DDU on the GPU driver and reinstalled it clean (at this time I also updated my BIOS). Then it was fine for a week. Then random BSODs again; this time they were more Windows-related, so I did a clean reinstall. It was fine for a few hours and then the issue popped up again.
 
A couple of thoughts for you, can you try that GPU in a different system to see if the behavior follows it? And I wonder if you're getting file corruption, which would explain why it works for a week after a clean driver install and then starts having issues. Have you tested that RAM with memtest or a similar program?
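On the file corruption idea: Windows' own `sfc /scannow` checks system files. For your own files, a rough way to see whether anything changes on disk between "freshly installed" and "crashing again" is to hash a folder twice and compare. This is only a minimal Python sketch, assuming a small folder of files that should never change; `hash_tree` and `changed_files` are hypothetical helper names, not part of any tool mentioned in this thread.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_tree(root):
    """Map each file under root (relative path) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file; fine for a small test folder
            digests[str(path.relative_to(root))] = hashlib.sha256(
                path.read_bytes()
            ).hexdigest()
    return digests


def changed_files(before, after):
    """Files present in both snapshots whose contents differ."""
    return sorted(
        name for name in before if name in after and before[name] != after[name]
    )


if __name__ == "__main__":
    # Demo on a throwaway folder; on a real system you would take the first
    # snapshot right after the clean install and the second after a crash.
    with tempfile.TemporaryDirectory() as folder:
        sample = Path(folder) / "sample.bin"
        sample.write_bytes(b"original contents")
        first = hash_tree(folder)
        sample.write_bytes(b"original c0ntents")  # simulate a flipped byte
        second = hash_tree(folder)
        print(changed_files(first, second))
```

If files that nobody touched show up as changed, that would point toward storage or RAM rather than the driver itself.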
 
A couple of thoughts for you, can you try that GPU in a different system to see if the behavior follows it?

In this case, I would suggest something different, because not many people's PCs can run a 4090. But he also has integrated graphics, so the easiest thing is to just take out the 4090 for troubleshooting and run on the iGPU for a while. If the crashes keep happening, it has nothing to do with the 4090. Otherwise we would have a strong hint pointing to the GPU or perhaps the PSU (although it is quite a good PSU model).
 
True, but if you just try with the iGPU and it works fine, you really haven't pinpointed the issue; it could still be either the GPU or the PSU. Of course if the problem persists on the iGPU you have ruled out the GPU, so there's some advantage. Still, the best test would be trying the GPU in another suitable PC, if available.
 
Let's see if there are some gaming enthusiasts among his friends who have a PSU good enough for a 4090, then. I'd run on the iGPU first, and if that's suddenly stable, he will know what to test next: either borrowing a different PSU good enough for a 4090, or taking the 4090 to a PC that can cope with it and testing it there.
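To spell out the reasoning from the last few posts, here is a small Python sketch of what each outcome of the iGPU test would tell us. The function name and the wording of the results are made up for illustration; the logic is only what was discussed above, and since the crashes are random, the iGPU run has to be long enough to be meaningful.

```python
def interpret_igpu_test(crashes_on_igpu):
    """Summarize what running without the 4090 (on the iGPU) tells us."""
    if crashes_on_igpu:
        # Crashes continue with the card removed: the 4090 is not the cause.
        return {
            "ruled_out": ["GPU"],
            "suspects": ["RAM", "CPU", "motherboard", "storage", "Windows"],
            "next": "keep testing the rest of the system without the 4090",
        }
    # Stable on the iGPU: only a hint, GPU and PSU are both still possible.
    return {
        "ruled_out": [],
        "suspects": ["GPU", "PSU"],
        "next": "test the 4090 in another capable PC, or borrow a PSU "
                "that is good enough for a 4090",
    }


if __name__ == "__main__":
    for outcome in (True, False):
        print(outcome, interpret_igpu_test(outcome))
```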
 
I do still have my 1080 in a box, so I could try that as well. And tonight I will run memtest (no XMP) while I'm at work. Hmm, I do have a friend running a 3090, so I could check with him, though he lives 1 hour away. Living rural sucks. Thanks for all the replies.
 
I would try it with XMP enabled. If it passes you're golden; if it fails, you would need to try again with it disabled. You do normally run with XMP enabled when you're seeing the issue, right?
 
Let's check the sensors with HWiNFO64. Run it and open "Sensors", then expand all sensors by clicking on the little <--> arrows at the bottom. Also widen the sensor columns a bit so everything can be read, and arrange them into three big columns. In the end, it should be a screenshot with all the sensors visible at once, like this:

yes.png


First do a run for CPU-only load. Make sure your power plan in Windows is on "Balanced". Let it run in idle for a while (couple minutes), so the "minimum" baselines for the values are established. After a short time in idle, produce full CPU load with Cinebench R23, and after the 10 minutes, when the CPU temperatures have stabilized at the highest level, take a screenshot.

Then reset the sensors and take another screenshot for a gaming scenario. Take a game where it crashed before, and play it for maybe 15 minutes or so, just to heat everything up nicely and put some proper stress on everything. Then you can ALT-TAB out or close the game and take a screenshot of the sensors. Make sure the GPU sensors are visible.
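As an addition to the screenshots: HWiNFO can also log the sensor values to a CSV file, which is handy for random crashes because the log survives a BSOD. Below is a minimal Python sketch for getting minimum, maximum, and average per sensor out of such a log. It assumes a plain CSV with one header row and one row per sample; the column names in the demo are invented, since the real ones depend on your hardware and HWiNFO version, and a real log may need a different text encoding when you open it.

```python
import csv
import io
from statistics import mean


def summarize_sensor_log(lines, columns=None):
    """Return {column: (min, max, mean)} for the numeric columns of a CSV log.

    lines   -- an open file or any iterable of CSV text lines
    columns -- optional set of column names to keep (default: all numeric)
    """
    values = {}
    for row in csv.DictReader(lines):
        for name, raw in row.items():
            if name is None or raw is None:
                continue  # ragged row: extra or missing fields
            name = name.strip()
            if columns is not None and name not in columns:
                continue
            try:
                number = float(raw)
            except ValueError:
                continue  # timestamps, text, empty cells
            values.setdefault(name, []).append(number)
    return {name: (min(v), max(v), mean(v)) for name, v in values.items()}


if __name__ == "__main__":
    # Invented sample data, only to show the shape of the input.
    demo = io.StringIO(
        "Time,CPU Package [C],GPU Temperature [C]\n"
        "12:00:00,40,35\n"
        "12:00:02,90,65\n"
    )
    for sensor, stats in summarize_sensor_log(demo).items():
        print(sensor, stats)
```

For a real log you would pass an opened file instead of the `io.StringIO` demo, and compare the maximums from the Cinebench run with those from the gaming run.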
 
Did runs both with and without XMP and both cleared. Yesterday before work I tried some gaming, no crashes. It seems to be a very random problem.
The crashing you were having before, though, was that WITH XMP? Or does it still crash without XMP? Also, have you overclocked/undervolted your CPU/GPU at all?

I have also recently had many issues with my Z690 DDR5 system. I had what I thought was a rock solid and stable undervolt on the CPU, a mild undervolt on the GPU, and what I thought was a stable and fairly tame OC on the RAM (considering it's a 6400 kit and I was only running it at 5800). It had been working fine for literally months. Then within the last few weeks I was getting games crashing, BSODs when stress testing, and all kinds of issues.

Since setting the CPU, GPU, and RAM to stock (4800), I'm not getting crashes anymore.

I had also tried taming the CPU undervolt and clocking the RAM even lower (5600), but it just doesn't seem to like it anymore.

Thing is, I tested everything for literally hours before. It's been rock solid and stable for so long, and even now the RAM tests actually pass if I turn on XMP and pass if I put my custom OC back, but games still crash.

I've had that before too, where RAM settings would pass every stress test you can find but then crash games, which is why I settled on 5800 to begin with: anything else was not showing errors in tests, but was not stable in games.

Anyway, I'm just running everything stock now. Realistically, in real-world gaming situations, I haven't noticed a difference. All the benchmarks will say that my CPU runs hotter under insane and unrealistic loads and that the RAM is running much slower, but in games I still get the same average frame rates, and temps IN GAMES are exactly the same as they were with the undervolt.

So maybe just try everything stock and see if games still crash?
 
I've had that before too, where RAM settings would pass every stress test you can find but then crash games, which is why I settled on 5800 to begin with: anything else was not showing errors in tests, but was not stable in games.

This can be an indication that the hot air from the GPU is heating up the RAM modules. Some RAM is known to get unstable at certain higher temperatures.
Also see here.
 
It has been hotter in the UK the last few weeks.

I do keep an eye on temps when gaming too, though, and everything was still at the same temps as normal. My GPU rarely gets to 65, the CPU in games is around 55 to 65, and the RAM is usually around 50ish. When I test the RAM, I also have something stressing the GPU to heat it up too.

I think maybe my system is just borderline, i.e. the CPU literally JUST passed the binning process for a 12700K and the GPU just made it to 3080 Ti. I actually almost had the RAM stable at 6400 by reducing the CPU SA voltage recently too. (I was just testing, as I figured I couldn't make things any worse. I know, stupid idea, since I was already getting issues at 5800, but I did read that some CPUs actually seem to work BETTER with lower SA. And it did seem to work, passing all RAM tests again, but eventually games crashed, and I did get 1 BSOD too.)

So yeah, I'm just gonna keep it all stock now. Finally given up trying haha.
 