Persistent PC Crash - The description for Event ID 153 from source nvlddmkm cannot be found

Status
Not open for further replies.
Hi all,

My system began randomly crashing on 02/07/2024 after about a month of regular use. This is a brand-new system with the following specs:

B650i Gigabyte Aorus Ultra
Ryzen 7800X3D
4070 Ti Super MSI Ventus 3X
32GB of G.Skill Flare X5 DDR5
Samsung EVO 970 Pro

The crashes seemed random: browsing in Chrome, clicking on the desktop, opening Steam, opening Excel. There was no heavy load on the system at all, and all temps were fine. I was on the latest Nvidia drivers and a fully updated Windows install. Since then I have reseated the GPU, checked all the power connections and cables, and done a clean Windows install, which was a struggle. I also went back to Nvidia driver 555.85 using DDU and a clean install. Nothing fixed the issue.

Today I swapped in my previous card, an MSI GTX 1080 Ti, and ran a 16-minute FurMark stress test, and the system is 100% fine. There have been no crashes or issues, so the problem appears to be the 4070 Ti Super.

Has anyone seen similar issues, and are there other possible fixes?
 
Good point, left that out of the specs.

It's a Corsair SF750, also brand new. I had been using this system for about a month before these issues started, gaming a bit on D4 and mostly general PC use. The crashes happen while browsing in Chrome, or just on the desktop when opening the Start menu. I wasn't able to test whether there were issues in games (I assume there would be).
 
Ok not a bad power supply, kudos for not cheaping out on that! :)
Would you be able to test the card in a different system with different PSU?
Or a similarly power hungry card in your system?
Does Event Viewer give any more detailed information about what went wrong?

Have you updated your motherboard's BIOS?
Also is the Windows installation fresh, or did you migrate it from old/previous system?
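
For anyone who wants to pull those Event Viewer entries without clicking through the GUI, here is a minimal sketch. It assumes the crashes are logged in the System log under the provider name `nvlddmkm` (as in the thread title) and uses the built-in Windows `wevtutil` tool. The helper name `build_query` and the default count of 20 are just illustrative choices, not anything from this thread.

```python
import platform
import subprocess


def build_query(provider: str = "nvlddmkm", count: int = 20) -> list[str]:
    """Build a wevtutil command that lists the newest System-log events from one provider."""
    # XPath filter on the event's provider name (assumed to be nvlddmkm here).
    xpath = f"*[System[Provider[@Name='{provider}']]]"
    return [
        "wevtutil", "qe", "System",  # query-events on the System log
        f"/q:{xpath}",               # only events from the chosen provider
        f"/c:{count}",               # limit the number of events returned
        "/rd:true",                  # reverse direction: newest first
        "/f:text",                   # human-readable text output
    ]


if __name__ == "__main__":
    cmd = build_query()
    if platform.system() == "Windows":
        # Only attempt the query on Windows, where wevtutil exists.
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout or result.stderr)
    else:
        print("Run this on the affected Windows machine:", " ".join(cmd))
```

The timestamps in the output can then be lined up against when each crash was noticed and which GPU was installed at the time.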
 
Hi Nichrome.

Testing the 4070 on my old system is on my list, will get to that this week.
Sharing Event Viewer logs from a few setups:
1. The system from my first post, which was completely unstable all week: Latest
2. The system with the 1080 Ti on 2 different Windows installs (where I also had crashes):
Crash with 1080ti
Crash 1080ti 2

There was a BIOS update this week which I have not installed yet. Before then I was on version F30, which was the latest throughout the testing and the previous month.
When the crashes originally began, it was a one-month-old Windows install made from USB installation media created from the W11 site. Since the crashes I have attempted a clean Windows install once with the 4070 Ti installed and twice now with the 1080 Ti.

On the most up-to-date setup, with the 1080 Ti and fresh Windows, the system seems to be stable (I say "seems" because every time I even think of writing this in a post, the system crashes afterwards). For this attempt:
Installed Windows and updated
Disabled the integrated AMD GPU (consistent with all setups so far)
Did not install any Gigabyte Bluetooth or LAN drivers
Did not update the AMD chipset
Apps installed are Chrome and GeForce Experience
Have not updated from Nvidia driver v456.71

Edit: I asked for help in the Microsoft forums during all this mess, and the answer there was that this is an Nvidia driver or GPU issue - https://answers.microsoft.com/en-us...d/498f6124-8592-4156-9083-03779c137e90?page=3
 
Yep, it may be either or both. There are many possibilities, to be fair, and once you figure it out any of them will "make sense". Faulty GPU? Makes sense that you get driver crashes. Bad drivers? Same. Faulty RAM? Maybe the driver was trying to use a specific block of bytes, so that makes sense too.
Let us know how testing the 4070 goes.

Also keep an eye on how the system behaves with fresh Windows installation. Perhaps that was the problem.
 
I may try the system with different combinations of the RAM sticks. With the 4070 Ti, even with a fresh install, on previous Nvidia drivers, and on a different power outlet in the house, absolutely nothing worked. I checked all the power connections from the PSU to the card, checked the 12V adapter, etc. Nothing made the system stable.

The only thing that doesn't fit is that I got similar crashes with the 1080 Ti on the latest Nvidia drivers. That makes me wonder if it could be a different component of the PC, and together with the very weird ethernet malfunctions it makes me worried about the MB.

Does any of this point to a MB issue in your view?
 
List full detailed system specs, see >>Posting Guide<<



Have you measured the voltage of the electricity network when the crash occurs?
About the GPU, can you test it in another PC?
Hi Svet, I'm not much of an expert, so I have not measured the voltages. Maybe it's something I can test.

Working on testing the 4070 in my old system to see if it works.

Board: B650i Aorus Ultra
BIOS: BIOS Version/Date American Megatrends International, LLC. F30, 22/05/2024
VGA: MSI Ventus 3x OC 4070TI Super 16gb
Riser Cable PCIE Gen 4
PSU: Corsair SF750
CPU: AMD Ryzen 7800x3d
MEM: G.SKILL Flare X5 Series (AMD EXPO) DDR5 RAM 32GB (2x16GB) 6000MT/s CL30-38-38-96 1.35V Desktop Computer Memory UDIMM - Matte Black (F5-6000J3038F16GX2-FX5)
This is the Memory in your Computer, not a picture of what type of Memory but the actual sticker on the Ram in your Computer.
Will work on a picture - this is from the memory box
[Attached image: 1720460537456.png]

HDD/SSD: 1x Samsung 980 Pro 1TB M.2 PCIe Gen4 NVMe SSD - 7000 MB/s
COOLER: Water Cooler Cooler Master Masterliquid 240 Atmos Argb 240mm
Keyboard: At the moment a Logitech G pro TKL
Mouse: Logitech Pro Wireless
OC: None
OS: Windows 11 64bit Home version 23H2
Display: ASUS TUF 31.5", 144Hz, 2K QHD, 1ms, DisplayPort and HDMI, FreeSync, HDR
- Is it a TV?: [No]
- How display is connected to the GPU?: DP
 
Managed to test the old system with the 4070 TI Super. I don't have the extensive specs but this is the gist of it:

Windows 11
Will check the BIOS version tomorrow
Z390 Aorus Pro, Intel LGA 1151, ATX, DDR4
i9-9900K
NZXT Kraken X63 280mm
Using the same mouse and keyboard: Logitech G Pro TKL and Logitech Pro Wireless
ASUS TUF 31.5", 144Hz, 2K QHD, 1ms, DisplayPort and HDMI, FreeSync, HDR
Samsung Evo 970 500GB
PSU: not sure, but I believe it's an 80+ Gold 750W EVGA
2x 16GB G.Skill Trident Z 3600MHz

Test 1 - System above with 4070ti Super
I tested the system above for around 2 hours with the 4070 Ti Super and had no issues. I ran FurMark and used the PC the same way I have been using the new one that crashes. All the Windows updates were installed, and I included the system info, FurMark results and Event Viewer logs in the link.
Results: Old system with 4070

Test 2 - New system with 4070ti Super
Since things worked well, I once again reseated the 4070, making sure the PCIe connections were well seated, and even changed the position of the 12VHPWR adapter so it wasn't pressed against the bottom of the case (the new system is built in a Lian Li A4-H2O).
It initially seemed fine, for around 3-4 hours of regular browsing and normal use, until it crashed while essentially idle about an hour ago. I had set the screen sleep and power-off settings to "never", yet I noticed the screen go off and come back on again. Once this first issue happened, I got 3-4 crashes in a row. Attaching Event Viewer logs and a minidump. Before this I had run a FurMark benchmark and things seemed more or less fine.
Results: New System with 4070

Test 3 - Ongoing - New system with 1080ti
Since the crashes are back with the 4070 Ti, I switched back to the 1080 Ti immediately after the crash, and the system is OK so far. This is the 7800X3D and 1080 Ti setup that I will test all day tomorrow. It was also working fine today.

As these crashes are very random, I'm going to run the 4070 Ti in my old system with the i9-9900K for a full day, and keep using the new system with the 7800X3D and the 1080 Ti for a full day, to see if I get any crashes. No idea what is going on. Hopefully the 4070 Ti crashes in the old system after a full day's use tomorrow, otherwise what the hell is even going on here.
 
Ok I see a potential problem.
Your Lian Li case seems to be using a PCIe riser.
Try testing the system outside of the chassis on a cardboard box, and having the GPU plugged directly into the motherboard.
 
Not sure if I have the right setup to test out of the case, worried I might mess something up. I am doing more extensive testing today in both systems.

Still have not been able to replicate the crashes with my old system and the 4070ti super.
 
What Nichrome is really suggesting is to test it without the riser cable.
More or less: remove the motherboard from the case, put it on the motherboard box, add the GPU, connect the power supply, etc., and try it out.

We've seen similar issues before, and since it seems to be happening with every GPU so far, at least to some degree, I'd also assume it's the riser cable.

Even if it isn't, if the 1080Ti is PCIe 3.0, vs the 4070Ti being PCIe 4.0, and the cable is 'marginal', that could make it more likely to fail with the 4070 vs 1080.

Just some thoughts. I decided against using a riser cable and alternative mounting, simply because I've seen too many horror stories with them.
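
To put rough numbers on the PCIe 3.0 vs 4.0 point, here is a small back-of-the-envelope sketch. The per-lane rates (8 GT/s for Gen3, 16 GT/s for Gen4) and the 128b/130b encoding are the standard figures for those generations. The function name is made up for illustration, and the x16 link width is an assumption about how the cards are running.

```python
# Per-lane signalling rate in GT/s for each generation.
RATES_GT_S = {"PCIe 3.0": 8.0, "PCIe 4.0": 16.0}

# Both generations use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_s(gen: str, lanes: int = 16) -> float:
    """Approximate one-direction link bandwidth in GB/s (x16 assumed by default)."""
    return RATES_GT_S[gen] * ENCODING_EFFICIENCY * lanes / 8


for gen, rate in RATES_GT_S.items():
    print(f"{gen}: {rate:.0f} GT/s per lane, ~{link_bandwidth_gb_s(gen):.2f} GB/s at x16")
```

The useful bit is the per-lane rate: the same riser has to carry a signal twice as fast for the 4070 Ti Super as for the 1080 Ti, which is why a cable that is only just good enough could show errors with one card and not the other.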
 
Hi there, oh yeah I know what he means.

It's just that the thought of removing everything from the case, and possibly having to unmount the water cooler, makes me concerned I will mess something up and break this even more. I see your point though: the riser cable seems to be a possible culprit, especially given that the 4070 hasn't crashed yet in the old system, where it is mounted directly on the MB.

However, I used the build in the A4-H2O for a whole month before these issues started happening. During that time I didn't move the PC and didn't change or adjust any of the hardware components. Is it reasonable to assume the riser cable started malfunctioning after 1 month of use with zero problems?

I have a few more options:
1. Memtest86 on the ram
2. Just noticed windows had an update today. So will also update BIOS to the July version.

Other than that, PSU could have some issue, Riser cable as mentioned, something off with the MB.. but all of these I think I will need professional help with.
 
I would generally argue that you don't need professional help, but that's entirely up to you. All they're really going to do is the same things we just suggested.

Realistically, the idea is to remove the most likely culprit with each step.
1. Test without Riser. If no issues, you found your problem.
2. If still issues, test with the 1080ti. If no issues, then the 4070 is the problem.
3. If still issues, test the 7800x3d in another PC. If no issues, then the motherboard is the problem.
4. If still issues, then the CPU is the problem.

At that point, you've pretty much ruled out every component that could be causing the problem.
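
The elimination order above can be written down as a tiny sketch, if that makes it easier to follow. The step list comes straight from the four steps in this post. The names `STEPS` and `find_culprit`, and the `still_crashes` callback (which stands in for "run the system like this and see whether it crashes"), are hypothetical and only for illustration.

```python
from typing import Callable

# (what to test, which component to blame if the crashes stop at this step)
STEPS = [
    ("GPU plugged directly into the motherboard, no riser", "riser cable"),
    ("1080 Ti instead of the 4070 Ti Super", "4070 Ti Super"),
    ("7800X3D moved to another PC", "motherboard"),
]


def find_culprit(still_crashes: Callable[[str], bool]) -> str:
    """Walk the steps in order and return the first component that gets blamed."""
    for test, suspect in STEPS:
        # still_crashes(test) is a stand-in for running that configuration
        # long enough to be reasonably sure, since the crashes are intermittent.
        if not still_crashes(test):
            return suspect
    # Every swap still crashed, so the CPU is what is left.
    return "CPU"


# Example: the crashes stop as soon as the riser is taken out of the picture.
print(find_culprit(lambda test: "no riser" not in test))
```

Because the crashes in this thread sometimes took 3-4 hours to show up, each "no issues" result only means something after a long enough run.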
 
I appreciate the help, just don't have the right setup or time to start testing each component.

Number 2 I have done: no issues with the 1080 Ti. Likewise no issues with the 4070 in the other setup. The riser I will need to take in to be tested, same for the CPU (I don't want to risk breaking it).
 
To me, that alone makes it more likely that the riser setup copes with a PCIe 3.0 card but not with a PCIe 4.0 card.
 
Hello friend, I had the same problem. My config has an RTX 3060 Ti and a Samsung EVO 970 Pro, and between the two there was some heat build-up: when the SSD temperature approached 74 degrees, I got a similar problem. When I installed another video card (I also thought the card was the problem; for example an old 960 or a 4060), everything worked. What helped was installing a heatsink on the SSD: the temperature dropped by 10 degrees and the problem disappeared.