Owners of 13th/14th Gen Raptor Lake CPUs - Media Reports of serious stability issues

FlyingScot

This information is being offered with the intention of keeping owners of these CPUs informed, i.e. knowledge is power. Perhaps if enough people become aware of this situation (and the information provided is indeed accurate) then maybe Intel will offer affected users alternative remedies other than the usual warranty replacement process.

Owners of 13th/14th gen i9s are reporting what looks like rapid degradation over a period of months that leads to sudden instability. Users are then forced to increase voltage and/or drop boost frequencies to stabilize their systems. In some cases, the problem resurfaces and the same approach has to be repeated. Another major cause for concern raised in the video, and in user comments, is the stability of the Integrated Memory Controller (IMC) and its relationship to DDR5 frequencies above 4200. Random instability due to IMC issues, and perhaps degradation of the IMC, cannot be ruled out. This is clearly a story that is still developing. Owners of i9 CPUs, and indeed all Raptor Lake CPUs, should stay tuned. Follow-up videos on this story are promised by Steve at Gamers Nexus.

What’s most alarming about this situation is that even users who took every preventative measure (not overclocking, low PL1/PL2, good cooling, even undervolting) are still reporting instability after just a few months of usage, presumably caused by degradation. And, judging from user comments, these issues may not be limited to the i9s.


I do not own one of these chips, but I have been following this story with great interest.

:stop: UPDATE 1: For folks who want a quick way to see where things stand at the moment, I suggest you start reading this thread from here. In the first post by forum member, CiTay, you will find several relevant links of importance. I will endeavour to keep adding to this shortcut list as events develop.

:stop:UPDATE 2: By now, many of you will be aware that Intel and the motherboard manufacturers have collaborated to release new BIOS code to address the Raptor Lake instability and degradation issues. Should you choose to update to the latest BIOS (with microcode 0x129) and then experience thermal and/or performance issues, you may find this new guide helpful in addressing those concerns.

:stop:UPDATE 3: [Sep 26, 2024] Intel and the motherboard manufacturers have collaborated to release new BIOS code (0x12B) to address the Raptor Lake instability and degradation issues. For more information, jump to the following post in this thread.

:stop:ADDITION 1: Don't forget to stop by our very own online Raptor Lake Survey, where you can see how others have configured their Raptor Lake systems to balance performance and stability, and to reduce the risk of degradation. You can check it out here.

:stop:UPDATE 4: [May 1, 2025]: Intel releases 0x12F microcode, which includes a fix for voltage management issues for PCs that are left powered on for days on end, with long periods of idle. See more information here.
 
Solution
I have now downgraded my BIOS to version 1.6 to get my audio back, but I guess I'm still at risk from the Intel failure, aren't I?

The new Intel microcode, which should prevent any further CPU degradation (from the moment it's been applied via BIOS update), is projected for mid-August, so basically in 2-3 weeks. No current BIOS update really solves anything yet. All they do is try to implement the latest Intel recommendations for certain BIOS settings and include μCode version 0x125 (μ = mikrós / micro, meaning "small" in Greek). That one fixes a bug which may have slightly contributed to the instability issue, but is not the root cause. The root cause seems to be "exposure to elevated...
I think the issue won't be that small. Granted, some of the affected customers, like companies running Minecraft servers, have pretty unusual load scenarios, such as 24/7 low-threaded full load (on just one or a few cores), and sometimes see massive failure rates. So they might be running a scenario that provokes constantly elevated voltages for parts of the CPU, which then greatly accelerates the degradation to the point where the CPUs become unstable at stock speeds.

But still, we have to acknowledge that what first got this problem some wider attention was normal users experiencing crashes when loading a game. These crashes seemed to have to do with the GPU at first, or were attributed to bugs in the games, but turned out to stem from the CPU causing errors during decompression workloads. So it doesn't take much more than normal gaming use to potentially cause degradation. And outside the tech community, the cause of such crashes might not be widely known yet, so I can easily see a situation right now where a bunch of people get weird instability and blame it on everything but the CPU at first. Heck, they might be needlessly replacing other parts.

And why should they immediately think the CPU is deteriorating? It used to be that CPUs far outlived their "useful life". Let's say that after ten years at the very latest, it's time for a new CPU if you do anything other than office work, or if a new Windows version forces you to upgrade. When did we ever have instability from CPU degradation during that time, or even years beyond, just from normal use without overclocking? Intel CPUs used to be bulletproof, before Intel started pushing things too far (using high stock frequencies which are outside the natural comfort zone of that silicon, similar to what I posted here). So once this story is more widely covered and filters through to the average end user, we will see more and more people realizing that their instability really comes from the CPU.

Now, if a user hasn't experienced any instability so far, and somehow learns through the media that the mid-August BIOS update is crucial (and then actually installs it), can they use this CPU for over a decade without any problem, as in the past? Probably, yes. But how many users will learn about the BIOS update with the new microcode? A lot of the people I have built PCs for have no idea what on earth I'm doing whenever I change something in the BIOS while they are present. They could maybe perform a BIOS update if it's done from Windows, but enabling XMP or things like that, no way. And again, how can they know about the BIOS update when they don't visit any tech sites and probably don't know what a BIOS or microcode is in the first place?

So I think the issue of "failing to get the fix to the people" is one thing that could make a lot of people's CPUs show instability over time. They put together a system, update the BIOS once after building it (if that), and that's about it. Or they buy a pre-built system and never touch a thing in the BIOS. Those people might appear on forums and elsewhere in the coming years, asking why their system has developed an instability.

We don't know the true scope of the problem, true, but it could go either way: smaller than we think, or larger than we think.
 
I would add this point. In my post here I summarize the Igor's Lab findings on the variation in VID tables, and I would use that as my guide. I seriously doubt that all Raptor Lake CPUs are going to fail prematurely. But given what we already know, it’s fair to speculate that any CPU that flirts with voltages around 1.4 V or higher is at risk of some kind of accelerated degradation. That would mean most, or all, of the 13th and 14th gen i9s, some of the 14th gen i7s, slightly fewer of the 13th gen i7s and a small percentage of the i5s. That’s where my instincts take me. Of course, nobody really knows yet (maybe Intel does) whether there was a manufacturing defect in addition to a design flaw. If you haven’t watched the videos I posted above, do try to watch them. I think they will help answer your question, although not entirely, because this is a developing issue.
 
I currently have my PL1 set to 125 W (it was 200 W), which drops Cinebench R23 from 37k to 32k but keeps my temps way lower.
As the MAG Z790 Tomahawk WiFi Max still doesn't have the 0x125 microcode update, I'd rather stay on the safe side for now. I ran Cinebench twice and installed the Nvidia drivers twice, and still nothing. Just some weird app lag, which might be caused by the junk I have installed on this PC.
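
To put the PL1 trade-off in numbers, here is a minimal sketch using only the rounded figures quoted above (37k at 200 W, 32k at 125 W). It assumes the CPU actually draws the full PL1 for most of the run, which may not be exactly true, so treat the points-per-watt values as rough. The function name `efficiency_change` is just an illustration.

```python
def efficiency_change(score_before, score_after, watts_before, watts_after):
    """Return (score change %, power change %, points/W before, points/W after)."""
    score_pct = (score_after - score_before) / score_before * 100
    power_pct = (watts_after - watts_before) / watts_before * 100
    return (
        score_pct,
        power_pct,
        score_before / watts_before,
        score_after / watts_after,
    )


# Rounded Cinebench R23 scores and PL1 values from the post above.
score_pct, power_pct, ppw_before, ppw_after = efficiency_change(37_000, 32_000, 200, 125)
print(f"Score: {score_pct:+.1f}%  Power limit: {power_pct:+.1f}%")
print(f"Points per watt: {ppw_before:.0f} -> {ppw_after:.0f}")
```

In other words, roughly 13.5% of the score is given up for a 37.5% lower sustained power limit.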

I'll keep tinkering once the microcode gets pushed, hopefully together with the August patch.

My VIDs really try to pull 1.45 V, but with Lite Load on 9 and an undervolt of -0.110 V, it never gets above 1.33 V. Hopefully it will not fry my CPU before the patch.
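
A small sketch for checking a handful of voltage readings against the roughly 1.4 V figure speculated about earlier in this thread. The threshold is that speculation, not an Intel specification, and the sample readings are hypothetical values typed in by hand. Software polling can also miss very short spikes, so a clean result here is not proof of safety.

```python
THRESHOLD_V = 1.40  # assumption: the ~1.4 V figure speculated about in this thread


def parse_volts(text):
    """Accept '1,45' or '1.45' and return the value in volts as a float."""
    return float(text.strip().replace(",", "."))


def summarize(readings, threshold=THRESHOLD_V):
    """Report the maximum reading and how many readings exceed the threshold."""
    volts = [parse_volts(r) for r in readings]
    over = [v for v in volts if v > threshold]
    return {"max": max(volts), "over_threshold": len(over), "samples": len(volts)}


# Hypothetical samples before and after lowering Lite Load / undervolting.
before = ["1,45", "1,38", "1,42", "1,30"]
after = ["1,33", "1,29", "1,31", "1,27"]
print(summarize(before))
print(summarize(after))
```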

Also note something I think is VERY important: as FlyingScot said, NOT ALL CPUs DIE! If that were the case, every single unit sold would have to be recalled or RMA'd. A single crash in an Unreal game is no sign that the CPU is dead, as long as most other games and apps work perfectly fine (say, one UE5 game crashes but the rest don't).

What I also get from all this information, is that when this bug occurs, and there is damage to your CPU, it turns into a slightly more expensive i7 14700k, is this a correct thought?
 
it turns into a slightly more expensive i7 14700k, is this a correct thought?
That’s one way to look at it! ;) Yes, even CPUs that have degraded to the point that they are too unstable in their current configuration can often be salvaged by reducing core clocks or the ring speed or, in more extreme cases, by manually disabling E-core clusters or individual P-cores. That’s in addition to less drastic options like reducing PL1/PL2 and trying to lower voltage. However, an already degraded CPU will probably have lost its tolerance for lower voltages. If anything, you would normally have to increase voltage. So the former options would likely be the better course of action. And, of course, once you find yourself having to take such drastic measures, how on earth could Intel deny your warranty claim, assuming you are still within the 3-year window? Speaking of warranty, that’s something Intel could do to show that they are acting in good faith: extend the standard warranty to 4 or 5 years. I mean, if they are so confident in the August microcode patch, then what would they have to lose? The alternative is that AMD will need to make a much better deal with TSMC for all the extra chips they’re going to be selling.

Bottom line, I really wouldn’t worry too much at this point. You have already been very proactive in reducing the risk. Let’s hope that the August patch fixes what we users cannot, namely those strange momentary spikes above 1.4 V. I have confidence they will be able to do that.
 
I currently set my PL1 to 125 watt, causing Cinebench R23 to drop from 37k to 32k (PL1 was 200) but it keeps my temps way lower.

Well yeah, for a 14900K, a PL1 of 125 W coming into effect after a minute seriously puts the brakes on this CPU model (at least under full load), but as long as you tweak it further once the BIOS with the new microcode is out, that's OK. I understand you being cautious at this point. Later you can adapt it to your cooling capabilities again (which I hope are higher than that).

What I also get from all this information, is that when this bug occurs, and there is damage to your CPU, it turns into a slightly more expensive i7 14700k, is this a correct thought?

I don't think the bug occurs spontaneously; the bug has been there in the microcode all this time. It then depends on your individual CPU, maybe its silicon quality, luck, usage scenario and amount of time, whether this bug manifests as CPU degradation, and whether that degradation is significant enough to cause any instability. Of course not all CPUs will fail, but for all we know, all CPUs might be on the path to failure with the buggy microcode. Depending on all those factors, it can take anywhere from months to years for it to manifest as a problem.

Lastly, the CPU, for example your 14900K, won't turn into a more expensive 14700K if it's affected. It will still try to perform 100% like a 14900K if you don't change anything in the BIOS; it will just be slightly unstable while doing so. Yes, if you "nerf" it a bit, by lowering the clocks for example, it might regain stability (the two main methods of gaining stability in any IC are to lower the clocks and/or to raise the voltage; the latter might not be the best idea for obvious reasons). But otherwise, the CPUs will keep trying to be the CPU model they always were.
 
If you manage to undervolt your CPU, or lower the Lite Load, without making the CPU unstable, that's a good sign too. What I gathered is that the LL needs to be raised to about 16 (is that why it's the new default?) before the CPU gets stable again. So when you can set the LL to a lower setting (in my case LL9 is still stable), that's a good sign, I take it?

I'm asking these questions, and making these comparisons, because there is A LOT of doom and gloom around these CPUs. Like, "your PC and house will catch fire, you will never be able to play a modern game, you get BSODs at any given time even when it's turned off" kind of statements. These make me very anxious and keep me thinking "maybe I should spend another 1100 bucks to replace mobo, RAM and CPU with an equivalent AMD rig".

FlyingScot, CiTay, you two do a VERY good job of adding nuance with words like "slightly" and by often saying "don't worry too much". This thread is a world of difference from the Reddit folk. I cannot thank you guys enough for all you taught me and for how much your information helps to ease my mind a bit.
 
when you can set the LL to a lower setting, in my case LL9 is still stable, that's a good sign I take it?

The CPU being stable on a lowered voltage is always a good sign, and a good thing to attempt in the meantime, as well as in general. As speculated here, the upcoming fix might end up being nothing more than some generalized voltage cap/limit for certain parts of the CPU, combined with implementing the latest Intel recommendations for certain BIOS settings. So there will be two things that are quite generalized. While they can account for the different CPU models, they can't account for the differences between individual CPUs of each model, so the fine-tuning will always be up to the user. It's possible that lowering the CPU Lite Load mode is among the best prevention methods for CPU degradation available to you. I have always recommended trying to lower this setting; I think it's an easy and effective way to improve how any Intel CPU operates, as long as you keep it high enough that the CPU is still fully stable.

FlyingScot, CiTay, you two do a VERY good job of adding nuance with words like "slightly" and by often saying "don't worry too much". This thread is a world of difference from the Reddit folk. I cannot thank you guys enough for all you taught me and for how much your information helps to ease my mind a bit.

Of course. I think I speak for both of us when I say we try to help wherever we can, and with hardware problems, especially when they are just potential problems that haven't materialized for you yet, there's never any reason to panic. Anything can be solved eventually with a methodical approach, and that which can't be solved can be RMA'd if worst comes to worst.

Intel might panic a bit right now, but here on the forum, we aim to keep the panic to a minimum.
 
I had overlooked your post. Very interesting observations regarding CEP. It likely suggests that one should only ever consider enabling it when undervolting via manual offsets. And, of course, even then it appears to just lull you into thinking that your overly aggressive undervolt is stable when in fact it is not, i.e. because of its clock stretching ability.
You could also not use the CPU Lite Load modes/presets and instead set control to Advanced and dial in your preferred values for the AC and DC Loadline. I haven't tested this but I suppose that then the values should be respected and not influenced at all by whether CEP is enabled or not.
 
You could also not use the CPU Lite Load modes/presets and instead set control to Advanced and dial in your preferred values for the AC and DC Loadline. I haven't tested this but I suppose that then the values should be respected and not influenced at all by whether CEP is enabled or not.

I have to disappoint you there. I just tested it on the Z790 system I'm currently testing after building it for someone. It doesn't matter if you use CPU Lite Load "Normal" or "Advanced": once you lower the value enough, IA CEP will rear its ugly head and cut your performance in half, literally. Set IA CEP to [Disabled] and the performance is back to normal (or even better than normal, because with lower voltage, the CPU can sometimes clock a bit higher, depending on the power and temperature budget).
 
I ran into some issues with MSI PRO-Z690-A WIFI DDR4 + i5-13600K. Not sure if this is related to the CPU problem.

In particular, I noticed that if I started a Steam game, the WiFi network would become very unstable. Running a `ping 8.8.8.8` command would show significantly higher latency.
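
To compare latency with and without a game running, the `ping` output can be summarized with a short script. This is a minimal sketch: the sample lines below are hypothetical, written in the usual Linux and Windows `ping` reply formats, and `summarize_ping` is just an illustrative name. Paste your own captured output in place of the samples.

```python
import re
import statistics

# Matches "time=14.2 ms" (Linux) as well as "time=14ms" and "time<1ms" (Windows).
TIME_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def latencies_ms(ping_output):
    """Extract all round-trip times (in ms) from raw ping output."""
    return [float(m.group(1)) for m in TIME_RE.finditer(ping_output)]


def summarize_ping(ping_output):
    """Return count, median and maximum latency, or None if nothing matched."""
    times = latencies_ms(ping_output)
    if not times:
        return None
    return {"count": len(times), "median": statistics.median(times), "max": max(times)}


# Hypothetical captures: idle (Linux format) and with a game running (Windows format).
IDLE = """64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=14.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=15.0 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=117 time=13.8 ms"""

UNDER_LOAD = """Reply from 8.8.8.8: bytes=32 time=14ms TTL=117
Reply from 8.8.8.8: bytes=32 time=120ms TTL=117
Reply from 8.8.8.8: bytes=32 time=250ms TTL=117"""

print("idle:      ", summarize_ping(IDLE))
print("under load:", summarize_ping(UNDER_LOAD))
```

Comparing the median and the maximum separately helps distinguish a generally slow link from occasional spikes.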

I tried installing the latest BIOS version 7D25v1I1(Beta version), and it appeared that the situation was improved when tested on Windows 11. However, the same issue still persisted when tested on Ubuntu 22.04.

As networking probably also involves compressing and decompressing data, perhaps this is a sign that my i5-13600K is also suffering from the degradation problem? Has anyone here tried the RMA process with Intel? What kind of evidence do you need to provide?
 
Now now, perhaps let's not play connect-the-dots with this specific issue for every problem we encounter. The WiFi/WLAN connection becoming unstable can have a myriad of causes that don't necessarily have anything to do with this issue. Once you see crashes when loading a game, then we could investigate further, but higher latency on the network? I've never seen that mentioned anywhere in connection with this.
 
I have to disappoint you there. I just tested it on the Z790 system I'm currently testing after building it for someone. It doesn't matter if you use CPU Lite Load "Normal" or "Advanced": once you lower the value enough, IA CEP will rear its ugly head and cut your performance in half, literally. Set IA CEP to [Disabled] and the performance is back to normal (or even better than normal, because with lower voltage, the CPU can sometimes clock a bit higher, depending on the power and temperature budget).
Yes, I didn't mean anything about the influence CEP has on performance. Just that if you set CEP = [Enabled] in the MSI BIOS, this will also modify the impedances of the CPU Lite Load mode that you use, i.e. they will simply have other values once you enable CEP. And if you want to make sure you have a certain pair of impedances regardless of whether CEP is enabled or disabled, you should use Advanced and set them there.
Personally, from my quick testing with CEP enabled on Mode 7 (60/60), I got about a 3% score decrease in Cinebench R23, so maybe the AC Loadline was not low enough to cut things in half, like you mention, but I don't doubt it would under certain conditions.
 
I had CPU Lite Load Advanced on AC 30 / DC 103 (this AC value is stable for that CPU, and this DC value matched VID-to-VCore well), and as soon as the first rendering pass in Cinebench finished, the result bar only went half the way with IA CEP Enabled. Disabled, and boom, twice as fast again.
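
As a rough illustration of why the AC value matters for voltage, here is a sketch under two assumptions that are not stated in this thread: that the CPU raises its voltage request by roughly load current times the AC Loadline impedance, and that the MSI BIOS values are in units of 0.01 mΩ. The 200 A load current is a made-up figure, and real behavior involves more factors (VRM loadline, CEP, temperature), so take this as a simplified model only.

```python
def ac_loadline_boost_mv(ac_ll_setting, current_a):
    """Extra requested voltage in mV under a simplified model.

    Assumption: the BIOS value is in units of 0.01 mOhm, so 30 -> 0.30 mOhm.
    """
    milliohm = ac_ll_setting / 100
    return milliohm * current_a  # mOhm * A = mV


LOAD_CURRENT_A = 200.0  # hypothetical all-core load current

# AC values mentioned above: 30 (Advanced) and 60 (Mode 7).
for setting in (30, 60):
    boost = ac_loadline_boost_mv(setting, LOAD_CURRENT_A)
    print(f"AC {setting}: about {boost:.0f} mV added at {LOAD_CURRENT_A:.0f} A")
```

Under these assumptions, halving the AC value halves the extra voltage requested at a given current, which is why lowering Lite Load acts like a load-dependent undervolt.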
 
I don't think the bug occurs spontaneously; the bug has been there in the microcode all this time. It then depends on your individual CPU, maybe its silicon quality, luck, usage scenario and amount of time, whether this bug manifests as CPU degradation, and whether that degradation is significant enough to cause any instability. Of course not all CPUs will fail, but for all we know, all CPUs might be on the path to failure with the buggy microcode. Depending on all those factors, it can take anywhere from months to years for it to manifest as a problem.
I wonder if the date of manufacture of the 13900K matters. I bought a 13900K in late 2022 and have had zero problems; it's the most stable PC I've built to date. This is even with running Prime95, small FFTs, AVX2 enabled. The only thing I will say is that pretty early on I set the power limits to PL1/PL2 = 253 W, and even at maximum load in Prime95 temps don't exceed 82 °C with the cooling setup I have. I also have to wonder if silicon quality plays a role, and if there was some point when they started binning i7-quality chips as i9s.
 
Silicon quality playing a role would only be logical. The worse the silicon quality and the harder it's pushed, the higher voltage it requires for stability to begin with.
 
I thought I’d share my recent thoughts on this situation for those who might be interested.
I have seen an explosion of what I would term “clickbait” on YouTube, etc. in the last couple of days relating to the Raptor Lake issues. I’m sure many of you are seeing the same thing. On the one hand, these videos are not helpful in the least, as they are no doubt driving up the panic. On the other hand, they are the direct result of the awful job Intel has done with communication and customer assurances.

It has become very clear that there is a lot riding on this August patch. I for one will be very interested to see what efforts Intel makes to get the word out to less savvy users via System Integrators like HP, DELL, Lenovo, and the smaller custom PC builders. IMO, Intel needs to coordinate an email campaign with S.I.s, retailers and motherboard manufacturers to let customers know of the issue and then direct them to a dedicated webpage with instructions on how to update their system. If this issue is as large and widespread as we might fear, then it’s not just Intel on the hook; it will be these other companies, too.

While I’m still confident that the August patch will help, I strongly suspect that the post-August world will be a very different one from the pre-August world if Intel bungles this mitigation rollout. Because as bad as it might be for us users, it will be catastrophic for Intel’s reputation and finances if this issue drags on much longer. Intel simply cannot afford to put a foot wrong on this one. They are hemorrhaging money on the manufacturing side of their business, and AMD is getting ready to launch its new generation of client and server CPUs, which are still Intel’s cash cow. So as bad as we users might feel at the possibility of having to replace a CPU, and maybe a motherboard, etc. at our own expense (if the worst happens), Intel has far, far more to lose. I hope that makes some of you a little less anxious. One way or another, Intel is going to have to make this right.
 
I've also seen some Arrow Lake benchmarks suggesting that it banked so heavily on AI performance that general single- and multithreaded performance ranges from marginally faster to even considerably slower than 13th and 14th gen. If this is true, they don't have a solid, stable new product that outperforms Raptor Lake at things like rendering, compiling and compressing.

Companies waiting to upgrade to Arrow Lake don't want to end up downgrading by doing so. The consumer world is screaming AMD superiority, but for me as a video editor, the Intel chip has far more benefits, and the same goes for other creators.

This leads me to believe that Intel is really banking on the microcode to fix it, so that it can keep selling Raptor Lake until the AI-focused processors catch up to those speeds.
 
I just noticed that you say you have deactivated TVB. I think that TVB works the opposite way from what you might think. Someone please correct me if I'm wrong, but I think that if you disable TVB you still get the boosting behavior, only now regardless of temperature. This would drive up your voltage requirements under certain loads. I'm not sure if TVB does anything if you set manual Max Core Ratios via a manual overclock. I'm not sure if you disabled TVB for stability reasons, but it might be better to try to keep it enabled.

EDIT: Ignore my statement above. It’s likely that the above statement is not relevant in the MSI world. There are also several TVB settings, and it’s likely that I got my wires crossed.
EDIT: This was supposed to be a private answer to FlyingScot, not meant for the thread. Not sure how this happened... my mistake!
 