Topic: MSI Z170I GAMING PRO AC - WHEA_UNCORRECTABLE_ERROR bricked system

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Windows 10 no longer works, I immediately get WHEA_UNCORRECTABLE_ERROR

It started while I was playing Broforce: the audio stuttered and everything froze
I restarted and kept playing, and it happened again
I restarted once more and it happened right after Windows 10 loaded; at that point I suspected a hardware/BIOS issue
(Windows 10 was on the Balanced power plan)

My second suspect would be the Realtek drivers

Currently my $3500 system is pretty much a brick

I'm running BIOS version 1.2. On V1, 3DMark was freezing and the OS was failing; switching to 1.2 fixed that

Other components:
EVGA FTW 980Ti (79% ASIC, so it should be one of the good ones)
Corsair RM650i PSU
6700K
2x 8GB Corsair Dominator, 3000MHz (happens with and without XMP)
Samsung 850 Pro Raid0

The OS was a bare installation with only the drivers from Intel, Nvidia, Realtek and no bloatware
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Other errors are:
CLOCK_WATCHDOG_TIMEOUT
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (got this just now, after resetting CMOS)
Logged

Svet

  • T9246ED
  • Administrator
  • *****
  • Offline Offline
  • Posts: 73310

Did it work fine before?
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Did it work fine before?

Kind of

Anyway, I found the cause of the issue: XMP

With XMP enabled, the CPU peaks at 90C in Cinebench and usually sits around 87C
With XMP off, the CPU tops out at 70C and usually sits around 67C

So even though XMP is a memory setting, it makes the CPU run hotter

I just hope the damage to my CPU isn't permanent

I booted with a Win10 USB disk, ran chkdsk, found+fixed a problem, after that I was able to start using Win10 again

My theory is, the XMP issue caused an overheat/instability, which caused Raid0/file-system to fail, which chkdsk fixed
-or- The issue is still there, I just haven't experienced it again yet

In any case, it's obvious that MSI's XMP for this motherboard should be avoided
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

It is most likely ring and SA voltages applied for DDR-3000.
80c is not a reason for a crash, and 70c is also very high for cinebench, what cooler do you have?
Logged
Motherboard: ASUS Z97-K R2.0 BIOS ver. 0903 2/26/2016
CPU: i5-4670k
Cooler: CNPS10x Performa
Memory: F3-2400C11D-16GXM
GPU: EVGA GeForce GTX 950 FTW ACX 2.0
SSD: 1. Samsung PM981 512GB, 2.  Samsung 840 EVO 120GB
HDD: WDC WD3200AAKS-00L9A0
PSU: Antec 500W VP500PC
OS: Windows 10 Home 1903 64-bit / Windows 7 Home Premium 64-bit
Case: Compucase 6C60

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

It is most likely ring and SA voltages applied for DDR-3000.
80c is not a reason for a crash, and 70c is also very high for cinebench, what cooler do you have?

Thanks for the reply

I have a Noctua D9L that I installed very carefully. My brother and I built 2 identical machines; the temps are identical, yet his system doesn't crash

The issue is definitely the motherboard's Auto voltages. I don't get how the Auto voltage shown in the BIOS can be 1.27V; it seems too high since stock is 1.2V

I set the CPU voltage to the stock 1.2V, which fixed the issue for now, but I think the CPU is badly weakened; the motherboard did its damage

Ran Prime95 for 10-15 minutes and there were no issues. There were also no chkdsk/sfc issues; I think I misinterpreted the chkdsk output initially, and it was motherboard+CPU all along

Going to test gaming now. I would appreciate voltage setting advice, as I would like to prevent the motherboard from overvolting anything else. Currently I've only set the CPU to 1.2V; everything else is default
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

By the way, with the voltage set to 1.2V the temps are at 55C under load and I get 846cb in Cinebench; it was 866cb with Auto at 67C, and 926cb with XMP at 90C+
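
To put those numbers side by side, here is a minimal Python sketch. It only uses the three score/temperature pairs quoted above; treating "90C+" as exactly 90C is an assumption, and the `compare` helper is a made-up name for illustration.

```python
# Figures quoted in the post: (label, Cinebench score in cb, load temperature in C).
# "90C+" is taken as exactly 90 here, which is an assumption.
runs = [
    ("1.2V manual", 846, 55),
    ("Auto", 866, 67),
    ("Auto + XMP", 926, 90),
]


def compare(baseline, other):
    """Return (score gain in percent, temperature rise in C) of other vs baseline."""
    _, base_score, base_temp = baseline
    _, score, temp = other
    gain_pct = (score - base_score) / base_score * 100
    return round(gain_pct, 1), temp - base_temp


for run in runs[1:]:
    gain, rise = compare(runs[0], run)
    print(f"{run[0]}: {gain:+.1f}% score for {rise:+d}C")
```

On these figures, Auto gives about 2.4% more score for 12C more heat, and Auto + XMP about 9.5% for at least 35C.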

The motherboard is overclocking and overvolting the system with these Auto settings without informing the user

I'm going to request a replacement CPU from MSI Turkey, yet I'm 99% sure they won't just send me a replacement CPU to mend the potential damage
Logged

Svet

  • T9246ED
  • Administrator
  • *****
  • Offline Offline
  • Posts: 73310

here is beta .134 version if you wanna try [at own risk]
* E7980IMS.134.rar (6329.39 kB - downloaded 194 times.)
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

here is beta .134 version if you wanna try [at own risk]

Thanks for sharing this, but without more specific info it doesn't really help

I tested the 1.34 BIOS anyway. At Auto+XMP the values still don't seem right: even with a small benchmark nudge, the CPU power usage approaches 140W. Since I don't overclock, it should be around 95W. Is this normal?

I didn't push it far enough to cause another issue, so Windows 10 still boots. At this point I don't want to let the motherboard damage the components further, so I would really appreciate some specific help
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

I tested the 1.34 BIOS anyway. At Auto+XMP the values still don't seem right: even with a small benchmark nudge, the CPU power usage approaches 140W. Since I don't overclock, it should be around 95W. Is this normal?
Excess voltage at stock is quite normal. You do not OC, but by default the motherboard enables "Enhanced Turbo", which means max turbo ratio on all cores at full load, plus excess voltage.
The same goes for my CPU and board; you can simply undervolt it as you did.

140W looks right if we go by AnandTech:
http://www.anandtech.com/show/9483/intel-skylake-review-6700k-6600k-ddr4-ddr3-ipc-6th-generation/6
Their test shows 100W more total system power than idle with OCCT, which only uses AVX; Prime95 uses FMA3.
Normal applications/games should not be that high.

When you mention "values", be more specific about which values, e.g. vcore, DRAM voltage, ring voltage etc.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

I set all the basic OC voltages to the default values shown in the 1.34 BIOS

I started running Memtest86. There were no issues with 1 CPU, yet with 4 or 8 CPUs (tested with HT on and off) Test 6 consistently fails. My guess is that the 2nd CPU is now faulty; however, with HT, sometimes CPU 3 and sometimes CPU 7 reports errors

Tested the second identical system I mentioned, I also set it to 1.2V to prevent this issue from ever happening, luckily that one is functioning well

I'm not sure what to do at this point, I definitely have to replace the CPU, I'm not sure whether I should replace the motherboard too

Going to test each of the DDR4 sticks alone now to check whether it occurs on both. Assuming they can't both be faulty, that would put the blame on the CPU/motherboard with near 100% certainty
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

Clearing CMOS after a BIOS update is better than setting individual settings to default.

You only have one CPU (with 4 cores). There is a chance that one core is faulty, but memtest performs very simple tasks aimed at stressing the memory, not the CPU. Chances are one module is faulty; test them individually.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Clearing CMOS after a BIOS update is better than setting individual settings to default.

You only have one CPU (with 4 cores). There is a chance that one core is faulty, but memtest performs very simple tasks aimed at stressing the memory, not the CPU. Chances are one module is faulty; test them individually.

By default I mean CPU default, not "Auto" - the motherboard's Auto settings cause a BSOD real quick (VCore "Auto" that is, but I have others set to their default values too, one of them was 0.9 for example, forgot the others)

Memtest's Test 6 is failing randomly, on different CPU cores it seems; as I kept running things, only Test 6 has failed so far

From what I read online, only Test 6 failing might be a sign that the CPU memory controller is faulty or, depending on the address range, it might be something else. My address ranges seem random too, attaching a photo:


I don't think it's a memory issue, as I consistently reproduce the BSOD/shutdown scenario with Auto VCore, yet 1.2 Vcore is relatively stable
Here is my build: http://pcpartpicker.com/p/MqnZYJ
Got Corsair Dominator's as they seemed to be the most reliable ones
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

Thinking is not an option; test the modules one by one.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Swapped the RAM modules between the 2 systems. Let's call the systems system_clean + system_bsod and the RAM kits ram_clean + ram_bsod

Memtest86
The ram_bsod pair produced errors in system_clean
The ram_clean pair didn't produce errors in system_bsod (didn't test at length, I might revisit this)
ram_bsod/1 was previously tested in system_bsod, so I tested ram_bsod/2 in system_clean, hoping it would produce errors; it didn't
Tested ram_bsod/1 again in system_bsod, it still didn't produce any errors

So as it stands, the ram_bsod sticks only fail when they are used together in dual channel. If I can't single out one of them as faulty, I'm going to re-test the ram_clean pair in system_bsod

If the ram_clean pair fails in system_bsod, it would mean there is something wrong with the motherboards (or RAM compatibility). I don't want this either: I really don't want to disassemble things and really want this motherboard to work, and there are no alternative ITX motherboards
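
The swap results above can be written down as a small table and checked mechanically. This is a rough Python sketch, assuming the earlier Memtest86 failures on system_bsod were with the ram_bsod pair in dual channel. The `follows` helper and its rule (a part is only flagged if it failed with at least two different counterparts) are illustrative choices, not an established diagnostic method.

```python
from collections import defaultdict

# (kit, system, layout, errors_seen), transcribed from the swap tests.
# The first row is an assumption based on the earlier Memtest86 failures.
observations = [
    ("ram_bsod", "system_bsod", "dual", True),
    ("ram_bsod", "system_clean", "dual", True),
    ("ram_clean", "system_bsod", "dual", False),   # not tested at length
    ("ram_bsod", "system_bsod", "single", False),  # stick ram_bsod/1
    ("ram_bsod", "system_clean", "single", False), # stick ram_bsod/2
]


def follows(obs, layout="dual"):
    """Return (kits, systems) that failed with every counterpart they were tried with."""
    kit_results = defaultdict(dict)     # kit -> {system: errors}
    system_results = defaultdict(dict)  # system -> {kit: errors}
    for kit, system, lay, errors in obs:
        if lay != layout:
            continue
        kit_results[kit][system] = errors
        system_results[system][kit] = errors

    def always_fails(results):
        # Need at least two counterparts before drawing any conclusion.
        return sorted(name for name, seen in results.items()
                      if len(seen) >= 2 and all(seen.values()))

    return always_fails(kit_results), always_fails(system_results)


print(follows(observations))            # dual-channel results
print(follows(observations, "single"))  # single-stick results
```

With this data the dual-channel failure follows the ram_bsod kit rather than either system, while the single-stick runs flag nothing, which matches the description above.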
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

You said identical systems, the RAM is not identical?
Have you tested both ram_bsod modules in the same slot?
I take it CMD16GX4M2B3000C15 is the RAM you have? It is on the board compatibility list. If ram_clean still produces no errors, best to just RMA the RAM.

No one wants to take his computer apart, but we do if we must. To lower the chance I would have to, I do all the tests before I place the motherboard in the case.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

You said identical systems, the RAM is not identical?
Have you tested both ram_bsod modules in the same slot?
I take it CMD16GX4M2B3000C15 is the RAM you have? It is on the board compatibility list. If ram_clean still produces no errors, best to just RMA the RAM.

No one wants to take his computer apart, but we do if we must. To lower the chance I would have to, I do all the tests before I place the motherboard in the case.

It's this 4-stick RAM package, split 2/2 between the systems: http://www.amazon.com/Corsair-Dominator-Platinum-3000MHz-CMD32GX4M4B3000C15/dp/B00Q85WCE8/ref=cm_cr_pr_product_top?ie=UTF8

I haven't been doing slot based testing when testing single sticks, tested both sticks in random slots, both sticks and slots tested individually, I also haven't tried swapping positions of the RAM sticks in a dual setup

After my last message, I ran memtest86 Test6/7's with ram_clean's on system_bsod, there were no errors after 5-6 runs
I also tested ram_clean's on system_bsod with "Auto"+XMP settings, settings which overheat things too much, there were still no memtest issues, tho the heat was there, didn't get a chance to run benchmarks on Win10

My brother was getting anxious to use his PC, so I returned ram_clean's to system_clean, at this point I'm left with only system_bsod and ram_bsod

Since I tested both sticks individually and together, I'm out of ideas at this point, together they fail, individually they seem to work

I'm open to testing/benchmarking suggestions

I think I'm going to buy some HyperX's locally if I can't solve this issue tonight. I'm wondering whether the "Auto" settings for RAM might be causing the issue; a lot of people online claim they set RAM parameters manually and achieve a reliable system that way, but a) I'm not sure what parameters to set, and b) it seems the RAM should just work with non-XMP defaults, so even if I make things work, I might just be masking a faulty RAM. I'm not sure that's something I want to do, so buying the HyperX's makes more sense (I don't like buying stuff locally as there are very few options, and I really hate to let go of the Dominators)
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

If one set fails on both systems and the other doesn't, I don't question any further.

BTW this RAM is targeted at X99 systems according to Corsair's site http://www.corsair.com/en/dominator-platinum-series-32gb-4-x-8gb-ddr4-dram-3000mhz-c15-memory-kit-cmd32gx4m4b3000c15
Although there are 2-DIMM kits aimed at both platforms, we cannot know whether they changed something (e.g. SPD) or they are simply 2 DIMMs of the same type.

See if you can trade this kit for 2 kits intended for Z170 according to MSI, Corsair, or both.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

I'm extremely curious, so I question things as much as I can
It's easy to replicate the issue on a dual setup, running Prime95 for 30 minutes with any setting replicates the issue, Win10 starts throwing DLL errors etc. - restarting into Memtest86 shows the issue
Did the same thing with the ram_clean/system_clean, it didn't occur, which points further to a RAM issue

However, the irritating thing is, doing the same thing with single sticks of ram_bsod's doesn't reproduce the issue, I even tried XMP etc. - I tried anything I could try, no issues

In any case, I ordered these: http://www.kingston.com/dataSheets/HX428C14SBK2_16.pdf - Only Savage's were available for fast shipping locally, in 2400MHz C12, 2666MHz C13 and 2800MHz C14. The 2666MHz one has an odd 45ns tRCmin, otherwise I would have chosen it. I went with the 2800MHz one as it seemed the more solid choice for non-XMP usage: its XMP and non-XMP timings are close, and its XMP timings are close to the standard DDR4 timings of 15-15-15. At this point I'm aiming for stability only

These Savage's are not on the motherboard's tested memory list; I'm assuming it's because they just debuted, but they were the only decent ones available locally
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

When the new RAM's arrive tomorrow, which BIOS should I use for testing?
I currently have the unreleased 1.34 one that was provided here

I read that MSI sometimes releases new BIOS'es with improved RAM compatibility, this is why I'm asking
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

Avoid flashing BIOS while you can, keep what you have and see how it goes.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Hey everyone, so I got a Trident Z 8GBx2 3000Mhz kit today, first one I bought had a broken capacitor, with great difficulty I got a replacement and installed it now

Initially memtest was ok, didn't let it run completely, booted to Win10 to test some basics, set CPU to 1.23V and DRAM to 1.2V manually from BIOS, to prevent BIOS from doing funky voltages, the remaining voltages are attached

I want to make sure these voltages look right before I start torture/stress testing, currently running a default memtest instead

(Ran 5-6 Cinebench's, no Prime95, the score was 912cb with no XMP and these voltages, better than 896cb with the Dominator's, no issues so far, but like I mentioned, haven't started stress testing)

VCCSA is 0.125V, that doesn't seem right, does it?
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

VCCSA is 0.125V, that doesn't seem right, does it?
Nor do CPU I/O and DIMM. Ring I can't see; it might be VIN4, which doesn't seem right either.
Also motherboard/AUXIN1/2 don't look real.

You may want to ask Martin about it http://www.hwinfo.com/forum/
If you have Command Center installed you may have better readings.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Nor do CPU I/O and DIMM. Ring I can't see; it might be VIN4, which doesn't seem right either.
Also motherboard/AUXIN1/2 don't look real.

You may want to ask Martin about it http://www.hwinfo.com/forum/
If you have Command Center installed you may have better readings.

I didn't think of this before, but I guess I should also try setting one of those explicitly and check the value (safe value for one of them) - in the meantime, I will also check them with MSI Command Center

Apart from these, are there any voltages I should set explicitly to avoid Auto overclocking? (Like CPU voltage for example: I love the dynamics with 1.23V, it performs similar to the ~1.3V Auto voltage and runs much cooler)

The reason I'm asking this is because I'm afraid there is a voltage there I don't know about which will fry some inner components/RAM when left at Auto
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Set SA/IO manually to safe values, checked with hwinfo64, hwmonitor, command center, none reads the values (MSI Command Center only reads the GT values as Auto, but no actual value)

At this point, I think the sane thing to do is to take a leap of faith and use CPU 1.23V, DRAM 1.2V, and GT Auto - does this sound logical?
(GT Auto as I have no idea what else to put there)

My main objective is to achieve an extremely stable system and try to avoid what I experienced initially (in the off chance that auto voltages etc. caused the issue in the first place, although it was likely a faulty RAM)
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

In Command Center, click Information, then HW Monitor; that's where you see the voltages.
No need to set anything manually if not proved to be excessive.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

I checked that Tab but as far as I remember it was extremely basic, just some main temperatures and no detailed ones like GT/IO/SA etc.
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

Not that window then.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

To conclude this issue: after installing the Trident Z, I did some stress testing, memtesting and gaming, and re-installed the OS just in case the faulty memory had corrupted things before

So for the time being, the Corsair Dominators take the blame - I will update this thread if the issue strikes again, otherwise silence = no issues

I would update the OP too, to let people know about the solution, yet as far as I see the forum is buggy, can't edit anything

I might even try XMP/Auto settings in 2-3 months when the dust settles

Thanks a lot for all your help during this issue
Logged

Woomack

  • PRIVATE FIRST CLASS
  • ***
  • Offline Offline
  • Posts: 9

Regarding the error message in the 1st post: it's related to CPU instability or a general stability issue. Some of the reasons are below.
I don't think the memory is faulty; more likely it needs correct settings. Most memory kits on the market are designed for X99 motherboards and have slightly different profiles, which causes them to lose stability on Z170. I had a chance to test ... let's say many DDR4 kits, and recently I test most of them on that MSI board. Here is a link to my tests/reviews if anyone is interested.
If you are looking for memory which works on XMP profiles, then G.Skill Ripjaws V and Trident Z are so far the best of the kits I have tested. Geil Dragon worked well too. Kingston worked fine, but I only had a Fury DDR4-2666 kit. I still have to test the Kingston Savage DDR4-3000 which I received yesterday.
Corsair kits have some issues on Z170 motherboards. Maybe they just repacked X99 kits without proper testing, hard to say.

I was using the MSI Z170I Gaming Pro AC for some time for tests and now it's in my daily PC. It's a really great board for memory overclocking, but the BIOS still has some issues. I checked a couple of BIOS versions (I got betas from support) and all versions above 1.10 have a problem with the power limit. When you leave max wattage and current at default/auto, the CPU throttles or the system shuts down under full load.
The new versions also have huge vdroop, even though the BIOS description says they solve vdroop issues. On my CPUs, when I set 1.25V it runs at 1.20V under full load. This is one of the reasons why you can see the Windows error mentioned.
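
For scale, the droop in that example can be worked out directly. This is a small Python sketch using only the 1.25V/1.20V pair quoted above; the `vdroop` helper is a hypothetical name, and the result says nothing about what this board's load-line settings should be.

```python
def vdroop(set_v, load_v):
    """Return (droop in millivolts, droop as a percentage of the set voltage)."""
    droop_v = set_v - load_v
    return round(droop_v * 1000), round(droop_v / set_v * 100, 1)


# Figures from the post: 1.25V set in the BIOS, 1.20V observed under full load.
droop_mv, droop_pct = vdroop(1.25, 1.20)
print(f"{droop_mv} mV droop ({droop_pct}% below the set voltage)")
```

That is 50 mV, or 4% below the configured value.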

I just finished reviewing a Trident Z 2x8GB DDR4-3200 16-16-16 memory kit which couldn't run stable above DDR4-3333 on an ASUS Maximus VIII Hero motherboard, while on the MSI Z170I Gaming Pro AC it runs @3600 17-18-18 without bigger issues. If you are overclocking DDR4 on Z170 then you know how hard it is to even boot 8GB modules above DDR4-3400 on most motherboards.
I was also able to boot a 2x4GB kit above the DDR4-4000 mark, but my memory is too weak to stabilize it.

I know that part of my post is kinda off topic, but I just wanted to let you know that the board is great, it just needs a better BIOS.
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Thanks a lot for sharing this experience, one quick but important question:

Do you think it's possible that my CPU/motherboard is faulty?
(CPU was my initial suspect, it would need to be ~DOA as the temps were always below 90C, yet after I replicated the issue on the second machine, I eliminated the CPU/motherboard)

Other than this, I've tested the faulty pair on 2 identical systems, they faulted, while the non-faulty pair worked on both, which points to memory issues, but I agree that the lack of Z170 testing with these memories might be to blame too, it's just strange that the issue happens on 2 sticks and not the other 2 sticks, the 4 came in the same package

I agree with your assessment, better to buy memories that are built with Z170's in mind, in fact I got 2x8GB Trident Z 3000Mhz's now, my current aim is to achieve a stable system, so I'm using these at non-XMP for now: http://www.gskill.com/en/product/f4-3000c15d-16gtzb (No issues, stress tested + memtested, + gamed)

Strangely, I also have 2x 8GB 2800Mhz HyperX Savage's, so I'm curious how your results are going to turn out, my brother still uses the Dominator's, I'm either going to get him some Trident Z's too, or replace his Dominator's with Savage's, yet obviously he isn't interested, since we tinkered with the systems too much (Intending to return the Savage's unopened)

Do you suggest I urge him to replace the Dominator's?

I'm currently using the 1.34 BIOS myself and my brother is on the public 1.2. So far both systems are stable: his was stable from day 1, mine only became stable after the Trident Z's. I did a lot of things and the BSOD's are no more, hoping it keeps going this way. I'm staying away from XMP, as with this motherboard it bumps the CPU temp by 15C for me, no need for that much heat

I also see those less-than-set voltages with these BIOS versions: I have the CPU set to 1.23V and it usually reads around 1.2V. Might put the DRAM at the same voltage too
I still think the Dominators are faulty tho: the issue was there from BIOS 1 (I only updated after experiencing it) and it persisted on all BIOS settings and all voltages. After this issue I researched Dominators a lot; it seems Corsair likes to mix and match ICs, and there is no way to know which ICs I have, so I concluded that it's safe to put the blame on Corsair in this matter
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

By the way, I decided to give XMP+Auto-CPU-Voltage another try

With Prime95 running, the cpu temp was around 72C, the DRAM was at 33C's, yet I got a shutdown, shortly after I started Prime95, like 1-2 minutes later, it was pretty shocking

Tested XMP alone, Auto-CPU-Voltage alone, didn't experience the same shutdown, although I didn't push things

So, overall, my system is probably still not stable, but on the bright side, with CPU set to 1.23V and DRAM set to 1.2V with XMP off, I didn't have any issues (gamed, tested for days), however these issues are still bugging me, as they might be the signs of another issue that manifests itself like this

I wish Woomack shared more of his wisdom, but I'm guessing he isn't subscribed to this thread, might PM him later on
Logged

kaansoral (Topic starter)

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

I think either my CPU/Motherboard or PSU is a not-fresh egg (not rotten :) http://pcpartpicker.com/p/dBCnf7

At this point the sane thing to do is to accept this and move on, I also agree with Woomack, the BIOS could/should be improved a lot (I believe the Auto settings/dynamics are too funky)

XMP should've worked with default BIOS settings, G.SKILL also lists the MSI Z170I as compatible with these RAM, but with my machine, anything relatively non-standard-ish doesn't work, as far as I researched, the situation isn't too uncommon, many people reported similar issues with XMP over the years (the issue might not be specific to XMP)

Anyway, I've stress tested the CPU/GPU/RAM for 1-2 hours, with Prime95-mixed + Furmark, no issues, cool temps all around (1.23V CPU, 1.2V DRAM, no XMP, 60C CPU, 76C GPU, 55C Motherboard (PCH climbs very slowly) ), I will consider this a victory and move on

(I've attached an Event Viewer screenshot, it seems the Kernel Power 41's were being recorded even when things were flawless, they might be unrelated, but attaching the log just in case)
(I was going to buy Trident Z's for my brother too, but cancelled that order after this testing. If his RAM works for him, it would be beyond stupid to risk things with new RAM. His system was capable of XMP+Auto too; although he doesn't stress test like me, his regular usage is a stress test on its own. We currently have his system at 1.23V too)

(Failed to screenshot enough of the UI but e1dexpress is an Intel Network card disconnection error, it's always near the Kernel Power errors)
Logged

Chike

  • LIEUTENANT GENERAL
  • ****
  • Offline Offline
  • Posts: 4250

Details of kernel power and network events?
Logged
Motherboard: ASUS Z97-K R2.0 BIOS ver. 0903 2/26/2016
CPU: i5-4670k
Cooler: CNPS10x Performa
Memory: F3-2400C11D-16GXM
GPU: EVGA GeForce GTX 950 FTW ACX 2.0
SSD: 1. Samsung PM981 512GB, 2.  Samsung 840 EVO 120GB
HDD: WDC WD3200AAKS-00L9A0
PSU: Antec 500W VP500PC
OS: Windows 10 Home 1903 64-bit / Windows 7 Home Premium 64-bit
Case: Compucase 6C60

kaansoralTopic starter

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Details of kernel power and network events?

Attaching 2 more screenshots of the Event Viewer

I also checked the second, identical system: it doesn't have any KernelPower events reported, and no similar errors/warnings either

The weird thing is, I only had one shutdown when I re-tried XMP+Auto with the TridentZ's, so I don't know why the other KernelPower 41's are there, and the KernelPower 41 error is pretty vague

My theory: a weak CPU or a weak motherboard, which doesn't let XMP+Auto work

So far I haven't experienced any shutdowns with the CPU set to 1.23V and DRAM set to 1.2V with XMP off. Since I replaced the faulty Corsair RAM, I haven't experienced any BSODs either, and memtest shows no issues. At this point, the shutdown I experienced yesterday with XMP+Auto on my new TridentZ's is what keeps me investigating

Edit: As far as I can deduce, Fast Boot seems to be failing, the system is probably losing power right after startup. That's why I didn't even notice those KernelPower issues, as they happened too fast. From what I gather online, some people have tried to tie these issues to faulty mobos before, but nothing conclusive
Logged

kaansoralTopic starter

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

So I've been seeing these KernelPower errors more often, they get logged almost daily

I also noticed that my Windows time and timezone were lost today, I wonder whether the two things are related

What's strange is that I don't notice these power losses myself

One theory I have is that these Event Viewer issues are unrelated to my previous issues and they happen when the system sleeps or when the system is booted without the monitor connected, tho this theory doesn't explain the loss of time/timezone
Logged

kaansoralTopic starter

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

As a small update, I found the cause of the KernelPower issues: when there is no monitor connected at POST, the OS logs this error. Since I share my monitor between 2 systems and the system boots in 2 seconds, this happens from time to time, as the system boots faster than the monitor can switch

So the only remaining issue is memory incompatibility or another underlying cause. I researched Z170/MSI/XMP issues and the problem seems to be common. It would be great to get a BIOS update aimed at solving the memory issues, and it would also be great if the IO/SA voltages were readable, as I currently can't read them from anywhere. I think they are too high with XMP, and that's probably why the CPU shuts itself off
(In all the cases I've read about, the solution was to get replacement memory tho, while everyone agreed that the issue was with the motherboard BIOS)
Logged

WaltC

  • FIRST LIEUTENANT
  • ***
  • Offline Offline
  • Posts: 338

Reading through this thread I had the following impressions

*PSU problems
*Overvolting problems
*Overclocking problems

As for the overvolting/overclocking, bear in mind that nobody warranties their hardware to do either, although many people do both and often, for some reason I have never understood, seem to think their hardware is warrantied to do both (it's not...;))  Some hardware will let you do both, some will not, but none is guaranteed to overvolt and/or overclock. 

You can damage a cpu by a phenomenon known as electromigration and the smaller the production process is for a given cpu the more vulnerable to electromigration it is--the less over-volting it takes/the less over-clocking it takes before there is permanent damage.  Just something to be mindful of.

It's also possible your ram might be bad...because if it is then even when you clock it down to slower speeds it will still cause errors, just not as many.  Also, by saying your system boots in 2 seconds, I gather what you mean is that it is coming out of sleep mode--the monitor error is caused by the fluctuating power levels not handled perfectly by the drivers.  I would suggest changing that and simply not letting the system sleep at all because hardware drivers often have trouble with sleep/wake states.  I never let my systems sleep (all desktops)--they are either on and operating under Cool'n'Quiet, or shut down completely.  (Unless you have a laptop you want to sleep to save battery--but then, in that case, may the Good Lord help you...;))
Logged
MSI Gaming Pro Carbon  x470 UEFI bios 2.80
Corsair HX-850 PSU (70a x1 12v rail)
AMD Ryzen 5 1600 @ 3.8GHz, voltage set, 1.385V
RAM 16GB 2x8GB, Patriot Viper Elite @3200MHz 16 16 16 36 1T, Gear Down, BankGroupSwapAlt, def voltage
Stock AMD cooler (95W version--thanks AMD!)
LG MultiDrive DVD writer SATA
Boot: Samsung 960 EVO NVMe 250GB (UEFI boot partition)
2x 1TB WD Blue 7200 rpm
ST2000DM SATA III 2TB
ST4000DM004 S3 4TB
AoC U3277PWQU 3840x2160 monitor
AMD RX-590 8GB/RX-480 8GB Crossfire
Realtek 1250 sound w/Nahimic
Win10x64 UEFI (Insider's builds)
Secure Boot ON

**It is well-known that I don't make mistakes, and so if you encounter a mistake in anything I have written then you can be certain that I did not write it...;D

kaansoralTopic starter

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Thanks for the input. I grew wary of anything Corsair, tho the PSU is an RM650i and I couldn't find any negative reviews of it. Since I've read about a lot of similar XMP issues, I've ruled out the PSU myself

I personally undervolt instead, the CPU is at 1.23V for example. The minimum voltage the BIOS gives on Auto is normally 1.26V, which is too much for no gain (Auto ranges over 1.26V-1.4V, better to keep it constant at about 1.2V)
I also think the BIOS's Auto voltages might have caused this issue in the first place, they were too funky and kept power usage too high, setting a constant voltage calmed things down

Anyway, as for the cause of the KernelPower events, it's a cold boot: with Fast Boot, it takes 2 seconds for Win10 to load. I agree about sleep tho, better to prevent system sleep

XMP shouldn't be considered an overclock in normal circumstances. Although it's technically an overclock, it's supposed to be guaranteed to work by both the mobo and the memory

As I also mentioned, this happens with the replacement TridentZ memory too. I'm not going to replace the replacement tho, better to use SPD speeds

TL;DR: The KernelPower events are misleading, the remaining issue is that the mobo can't run XMP stably
Logged

Svet

  • T9246ED
  • Administrator
  • *****
  • Offline Offline
  • Posts: 73310

Try this new BIOS, which should help with this issue:
https://www.dropbox.com/s/tbc32l04mtiwgfs/E7980IMS.143?dl=0
Logged

kaansoralTopic starter

  • SECOND LIEUTENANT
  • **
  • Offline Offline
  • Posts: 119

Thanks a lot for checking up on me/this-thread Svet <3

I received the same BIOS from MSI support today too, as I had an XMP related ticket open. Indeed, it seems to solve the XMP issues: I've stress tested the system for 1-2 hours and also gamed for 1-2 hours, no issues

At the start of this thread, I was very anxious about these kinds of issues, but as I've learned now, they usually occur for small reasons and there are almost always easy solutions through the BIOS (worst case scenario, underclocking manually a bit)

Tested the latest BIOS with Auto+XMP, 1.2V+XMP, 42x-1.23V+XMP, 44x-1.26V+XMP (ICCmax flag - returned to 42x), all performed well
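
For reference, this is roughly what the 42x and 44x settings work out to, as a minimal sketch. It assumes the base clock (BCLK) is left at the stock 100 MHz, which I didn't state above, and `core_clock_ghz` is just a hypothetical helper name.

```python
BCLK_MHZ = 100.0  # assumption: base clock left at the stock 100 MHz


def core_clock_ghz(multiplier, bclk_mhz=BCLK_MHZ):
    """Core clock in GHz = multiplier x base clock."""
    return multiplier * bclk_mhz / 1000.0


# (multiplier, fixed core voltage) pairs from the test list above
tested = [(42, 1.23), (44, 1.26)]
for mult, vcore in tested:
    print(f"{mult}x @ {vcore:.2f}V -> {core_clock_ghz(mult):.1f} GHz")
```

So under that assumption the 44x run was 200 MHz above the 42x setting I returned to.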
Logged