Yes, I’ve read a lot about CPU degradation. I was using the latest version for about a year, and later I became more interested in the topic and kept updating, so I think I reacted in time.
As mentioned, the BIOS versions that came out around July/August '24 were the first ones with any real attempts against voltage spikes in the Intel microcode (albeit incomplete ones, as was discovered later). For your board, that's BIOS 7D89v1D1 (beta version) or 7D89v1D, with µcode 0x129. Anything prior to that would still have hit the CPU with the full extent of the voltage spikes.
So not only is the version you ran for about a year important, but we also must not forget that all the action from Intel and the board makers only came as a result of people's CPUs becoming unstable for no apparent reason. People had been reporting this since early 2024. By mid-2024, Intel and the board makers still had no idea about the cause and were trying various things. Only in Q3 of 2024 did they begin to address the root cause, with the Intel microcode revisions against voltage spikes. So even for people who applied the appropriate BIOS versions right away, their 13th/14th gen CPU might not have gotten away completely unharmed.
However, the one thing in your favor is that your problems only started to appear a few days ago. It might have been unwise not to keep updating the BIOS, since you were well aware of the degradation issues and the fixes kept encompassing more causes of voltage spikes. But taking your latest reply into account, saying that this must be CPU degradation is not so cut and dried anymore.
This computer had high temperatures from the beginning - my friends and I were surprised that it was reaching 90-100°C, but I assumed it was normal because some people say it’s safe and that you can use it like that.
It's normal and not normal at the same time. Normal in the sense that Intel pushed these CPUs way too far from the factory, as I allude to in my guide, and the board makers - for a long time - let the CPUs off the leash way too much, with inadequate presets for the power limits. So if you had just installed your new system, maybe updated the BIOS, and then selected "Water cooler" in the cooler selection prompt (which is really the power limits prompt), the MSI BIOS would completely max out the power limits, letting the CPU completely off the leash.
What does this mean for the 14700K? A native power draw of well over 300W under full load, if left to its own devices. Of course, even most high-end AIOs can't deal with heat like that, so with most of the available coolers, the CPU would firmly be in thermal throttling range, hovering around 100°C and downclocking itself, because the cooling can't get rid of the heat fast enough. This is what most people see if they don't adjust things properly to their situation, and this is part of the reason I wrote my guide in the first place.
Here is what is not normal: when running it like this, you are relying on an emergency mechanism. The lack of cooling (at least for maxed-out power limits), combined with a CPU model that has been pushed to the extreme (for lack of other ways of gaining enough performance), leads to a CPU that's constantly trying to save itself from dying once you apply a high enough load. Bouncing off the thermal limit is not a good way to run the CPU under full load.
Now, you have two ways of counteracting that: Lowering and/or limiting the power draw of the CPU, and improving the cooling.
You already tried to undervolt, and with any CPU Lite Load mode below 11 causing instability, you know that your CPU doesn't have much undervolting potential. This could be because of slight CPU degradation already, but it could also be normal. After all, MSI chose the default values for CPU Lite Load so that all CPU samples of varying quality would be stable. Some CPUs of the same model need a bit more voltage, some a bit less. But what this already tells you is that no matter which undervolting method you try, you won't get far. In fact, if Mode 10 is already unstable and Mode 11 is the first one that's stable, then I would use Mode 12, to have some stability headroom. And that's basically where the undervolting adventure ends with this CPU. There is no need to try a negative VCore offset or anything else, as it will all be limited in the same way.
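To make the "first stable mode plus one step of headroom" rule concrete, here is a minimal Python sketch. The function name, the `headroom` parameter, and the test results are purely illustrative assumptions on my part; this is not a tool that reads anything from the BIOS, you would fill in the results from your own stability testing.

```python
def pick_lite_load_mode(test_results, headroom=1):
    """Pick a CPU Lite Load mode from manual stability test results.

    test_results: dict mapping mode number -> True if that mode was stable.
    Returns the lowest mode from which every higher tested mode was also
    stable, plus `headroom` extra steps (hypothetical default: 1).
    """
    lowest_stable = None
    # Walk from the highest tested mode downwards until the first unstable one.
    for mode in sorted(test_results, reverse=True):
        if not test_results[mode]:
            break
        lowest_stable = mode
    if lowest_stable is None:
        raise ValueError("no stable mode in the tested range")
    return lowest_stable + headroom


# Example values matching the situation described above (Mode 10 unstable,
# Mode 11 first stable); the entries for 9 and 12 are made up.
results = {9: False, 10: False, 11: True, 12: True}
print(pick_lite_load_mode(results))  # 12
```

The point of the extra step is simply that a mode which barely passes a stress test today may not stay stable under every workload.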
That leaves us with the cooling. And for a 360mm AIO, yours is not performing up to par. You have set 200W Short and 175W Long duration power limits, yet the AIO is struggling this much, when it should easily be able to deal with 200W continuously? Something can't be quite right. Either the fan curves need to be optimized further, or something is the matter with the AIO itself.
Judging from your photos of the system, it seems like a reasonable setup: two intake fans in the front, one exhaust in the rear, and the AIO top-mounted as exhaust. This is a standard setup that should deliver good cooling performance. So you need to take a look at the fan curves; it doesn't matter whether you set them in the BIOS or in software. At around 80-90°C, the fans on the radiator should start to go to full speed. For the case fans it depends, also on the fan models and whatnot; sometimes they can run at a little less than full speed.
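As a rough illustration of what such a radiator fan curve looks like, here is a small Python sketch that interpolates fan duty from CPU temperature. Only the "full speed by around 80-90°C" end comes from the advice above; the lower curve points and all names are placeholder assumptions that you would adapt to your fans and noise tolerance.

```python
from bisect import bisect_right

# Hypothetical curve points: (CPU temperature in °C, fan duty in %).
# The lower points are placeholders; the last point reflects the advice
# that radiator fans should reach full speed at around 80-90 °C.
RADIATOR_CURVE = [(40, 30), (60, 50), (75, 80), (85, 100)]


def fan_duty(temp_c, curve=RADIATOR_CURVE):
    """Linearly interpolate the fan duty (%) between the curve points."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return float(curve[0][1])   # below the first point: minimum duty
    if temp_c >= temps[-1]:
        return float(curve[-1][1])  # at or above the last point: full speed
    i = bisect_right(temps, temp_c)  # index of the first point above temp_c
    (t0, d0), (t1, d1) = curve[i - 1], curve[i]
    return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


for temp in (35, 50, 80, 90):
    print(temp, fan_duty(temp))
```

The BIOS and most fan control software do essentially this between the points you set, so the thing to check is that your last point hits 100% no later than the 80-90°C range.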
What you can also check is the CPU temperature in the BIOS, just letting it sit for five minutes. With such an AIO, it should be in the 30s, if not even in the mid-to-high 20s (depending on room temperature of course, assuming around 20-22°C). If you see well over 40°C there, that would already somewhat point to an issue with the AIO. Perhaps double-check how it's mounted, clean the surfaces with isopropanol and use fresh thermal paste, stuff like that. Because the AIO should do better than this. It could be something out of your control, but you first need to check the variables you can control, like the fan curves and the mounting.