Intel X99 Motherboard Goes Up in Smoke For Reasons Unknown
This morning I woke up bright and early to benchmark some DDR4 memory kits, and instead of waking up to Folgers in my cup I got the smell of burnt electronics after loading the XMP profile on a memory kit and restarting the system. Let me tell you what happened, as best I can.
On Friday I spent the day wrapping up the benchmarks on the Kingston HyperX Predator DDR4 16GB (4x4GB) 3000MHz quad-channel memory kit (part number HX430C15PBK4/16) that runs at 1.5V. This morning I wanted to get another kit of DDR4 memory tested, so I powered down the system, reset the CMOS, installed a G.Skill 16GB (4x4GB) 3000MHz memory kit (F4-3000C15Q-16GRR) and powered the system back up. It posted fine, so I went into the UEFI and set it to run at the only XMP profile on the kit. The UEFI changes were saved and the system restarted. It was during the next few seconds that both the board and the processor would be killed off in a rather unspectacular death. The system came up, hung for a very short time and then powered off with an audible click from the Corsair AX860i power supply. If you’ve ever heard the loud click of the Over Current Protection (OCP) shutting down the PSU you know exactly what click I heard. Now when I press the power button on the motherboard, the system clicks off after being on for a split second. I unplugged all the cables on the power supply and ran the built-in self-check, and it passed with flying colors. I swapped out the PSU for a backup Corsair AX860i and heard the same click. After clearing the CMOS and removing the memory, SSD and video card, the system still wouldn’t post. At that point I switched to a non-digital power supply (Corsair AX1200) and it did the same thing, although this time the OCP took a little longer to kick in. There were some audible crackling noises, followed by some smoke near the CPU VRM heatsink. So, the heart-shattering smell of burnt electronics filled the room and I knew my day wasn’t going to be a good one.
I removed the board from the test bench and started a visual inspection, but couldn’t see anything wrong with any of the components on the front or back of the board. I knew where the smoke came from, so I removed the VRM heatsink and the burnt electrical smell got stronger. There was some discoloration next to where one of the MOSFETs sits on the thermal pad, so clearly it was a failure of the CPU voltage regulation system, and one of the eight 60 Amp phases (Dr. MOS IOR 3550M MOSFETs) appears to have failed.
It isn’t burnt badly at all, but you can see some of the signs of an electrical failure on the second power phase from the bottom.
Looking at the board we can see that the failed component in question is part of the PQ1004, which is part of the VCCIN or basically the processor input voltage. Crap! On these Haswell-E processors, Intel has moved the voltage regulation on-CPU as part of the new Fully Integrated Voltage Regulator (FIVR). Previously there were five separate input voltages the motherboard handled: Vcore, Vgpu, VCCSA, VCCIO, and the PLL. On Intel Haswell-E processors all five internal power rails are pulled from the single VCCIN and the components on ours just had a nuclear meltdown.
You can tell things got pretty hot as there were actually solder balls where they weren’t supposed to be!
Our worst fears were confirmed when we pulled out our backup ASUS X99 Deluxe motherboard and put the original Intel Core i7-5960X processor in and the system wouldn’t post. The board’s debug display showed Q-Code 00, which is a bad sign. We tossed in our backup Intel Core i7-5960X processor and the system booted up just fine, so we were off and able to benchmark again. The bad news is that I managed to kill an ASUS X99 Deluxe motherboard ($398.99 shipped) and an Intel Core i7-5960X processor ($1049.99 shipped) after using them for less than two weeks, which is a bit unusual and is why I am sharing information about this failure with the readers of Legit Reviews. It is not an everyday occurrence that $1450 in hardware gets put out to pasture.
LR isn’t the only site that has had a board go up in smoke as Michael Larabel over at Phoronix had an X99 board go up in smoke as well. He was not using the same brand of motherboard or power supply model, but to see X99 boards failing this early in the game is alarming.
I’ve been in contact with ASUS, Intel, Corsair and Kingston and no one is exactly sure what happened to our system. Was running 1.50V on the DDR4 memory too much? Is there something wrong with the VRM design or did we have a bad component on our motherboard? Was the power supply faulty? We aren’t sure, but we are going to be overnighting this board back to ASUS Taiwan on Monday (9/8/2014) and we have arranged for Corsair to put our PSU on their scopes and test equipment to make sure it is working properly.
We’ll let you know what happens if we find anything out in the weeks ahead!
Update 9/9/2014 – ASUS has received the X99 Deluxe motherboard that failed along with the power supply that was in the unit at the time of the failure. An ASUS employee will be taking the PSU to Corsair for testing on 9/10/2014. The board will be looked at here in the US and then will be shipped to ASUS HQ in Taiwan. In the meantime we have two ASUS X99 Deluxe motherboards being used by staff members that are still up and working properly. We have also been told by ASUS that they have replicated our exact system, tested it overnight and haven’t experienced any failures. We’ll keep you updated and will post the actual Chroma test results from the PSU once we get them from Corsair. We also learned in the past 24 hours that the ‘high’ DDR4 voltage (1.35 to 1.5V) that we were using on the board shouldn’t have caused any issues, since the memory controller in Haswell-E can actually support both DDR3 and DDR4, although only DDR4 is being implemented. DDR3 memory kits run at 1.5V, so that almost eliminates the memory voltage as the issue. The good thing about posting publicly about this failure is that all the companies involved are taking this seriously and are working overtime to ensure there isn’t an issue somewhere. We are really glad about that as we have recommended the board, processor and power supply and want to ensure that the platform works if our readers purchase the parts based on our recommendations.
Update 9/11/2014 – ASUS HQ in Taiwan has the motherboard and is looking at it now. ASUS and Corsair were unable to get the power supply tested on Tuesday and scheduled a time with Corsair to get the power supply on the Chroma tester on Friday 9/12/2014. This was the earliest that the Chroma tester was available for use during business hours.
Update 9/13/2014 – On Friday the 12th, Corsair along with ASUS tested the pair of AX860i power supplies that Legit Reviews was using on the test bench at the time of failure. Both power supplies passed the initial Chroma tests, but we learned something that we previously did not know. When the Corsair AXi series of power supplies came out in 2012 they featured a single rail design. Corsair switched to a multiple rail design for the power supply series in 2013 (previously unknown to us). This is obviously a significant difference in the design of the power supply. We also learned that the earlier single rail power supplies did not have OCP enabled by default. One would have to install the Corsair Link Software package and manually set the OCP limits for that to function on the earlier models. We were not using the Corsair Link Software on the test bench, so our power supply could potentially have had 90A or more running down the rail. This might have exacerbated the damage to the CPU’s VR circuit if there was a bad component or solder ball joint present.
We looked around at Newegg, Amazon, Scan and other major online retailers and they all list the Corsair AX860i as having a single +12V rail. Heck, even Corsair.com has it listed as having a single rail! Hopefully now that this has come to light the folks at Corsair can update their own site and get retailers to properly list the specifications of this PSU in the listings.
ASUS is still working with the failed board and is going to replicate our setup with the power supplies in Taiwan. Yes, these two power supplies have now been shipped to Taiwan, where ASUS HQ will be able to test with our first-revision AX860i power supplies. The testing ASUS did this past week to replicate our system, which produced no failures, used the newer 2013-and-later power supplies. The exact cause of the fault is not known, but much is being learned by everyone and most of it is valuable information that will help the community. We’ll report back with more once ASUS is ready to give us an update that is ready for public consumption. That will happen after additional testing is done on the board over the days ahead.
Update 9/16/2014 – We are still working with Corsair to find out more on the firmware update that was done on their power supplies back in 2013. We have asked for dates and power supply lot numbers, so users can find out if they have one of the original ‘old’ AXi series power supplies that has no OCP by default. We also pointed out to Corsair that there is no mention of this in the instruction manual and that many users might not be aware that their flagship PSU has features that aren’t enabled unless they do so manually. From the sounds of it Corsair just updated the firmware and went to a multi-rail configuration. We’ve talked to several people about this issue and it was unclear if there was a hardware change and that is still being looked into. The bad news is that the firmware is not end-user upgradable. We have asked Corsair what if anything current customers can do since the firmware can’t be upgraded in the field. If you have an AXi series power supply we highly suggest downloading the Corsair Link software and programming the OCP setting.
Kingston Technology contacted us today and informed us that the pre-production DDR4 memory kits that were sent out at 1.5V will be lowered to 1.35V when they ship to consumers. Kingston has never shipped any DDR4 memory kits to consumers at 1.5V and won’t be doing so. It doesn’t appear that the memory running at 1.5V had anything to do with our failure, but ASUS is still testing. We haven’t heard from ASUS in the past 48 hours, and last we heard they were still looking into things in Taiwan now that our ‘old’ power supplies without OCP have arrived. ASUS said they will be giving us an official statement about the failure when the research is completed and we hope that will be sometime soon.
Update 9/17/2014 – Replacement ASUS X99 Deluxe motherboard was delivered (11 days from point of failure to replacement board being delivered).
Update 9/18/2014 – Corsair has gotten back to us with answers to some questions that we asked earlier this week. It turns out Corsair shipped AX760i/AX860i/AX1200i power supplies for about four months before they changed the firmware on them without notice. The firmware is not field upgradeable and Corsair will not be offering exchanges for anyone with an ‘older’ model that wants to swap out a PSU for one with the latest firmware on it. Corsair also said that by the motherboard maker’s [ASUS] own admission, the X99 Deluxe motherboard was the root cause of the failures. Corsair also said this, which we will directly quote: “Would an OCP-defaulted AXi or a competitor OCP-enabled PSU have save the CPU? Were skeptical, but maybe.” So, right now it looks like the board had a failure and then when the system was restarted the PSU without OCP may or may not have taken out the CPU through the board’s failed VR circuit. We are still waiting on ASUS to give us an official statement as to what happened to the board and were told that a typhoon in the region this week has slowed things down. In the meantime, here are the answers to a Q&A that we sent to Corsair.
– When did Corsair change the firmware on the AXi series of power supplies?
AX760i/860i implementation date 3/15/2013 Lot#:13119560
AX1200i implementation date 3/8/2013 Lot#:13099520
Corsair shipped the AX760i/AX860i/AX1200i for about four months before they changed the firmware on them. If you bought one of these models when they first came out you likely have one with old firmware. The Corsair AX860i first was made available for sale with Amazon on November 1st, 2012, so just a heads up to early adopters.
– Can you please highlight what all changes with the new firmware?
PSU set to multi-rail (which by definition is OCP).
– So, you went from a default configuration of one +12V rail with no OCP to a virtual multi-rail setup with OCP enabled by default?
Yes.
– Why was this change not made public?
We saw no need for an announcement. The PSU design and its features stayed the same and this isn’t a design fault.
– Can end users with the original PSU design update their firmware at home?
No.
– How can an end user know what firmware is on his/her PSU? (Can users identify by the serial number what PSU they have? )
By the serial number. The first four digits are the date code. The first two digits are the year and the next two digits are the week of the year in which the power supply was produced. The image above shows a Corsair AX860i Power Supply with serial number 1249954 that was made in the 49th week of 2012 and would be running the original firmware.
AX760i/860i implementation date 3/15/2013 – First Lot number was: 13119560
AX1200i implementation date 3/8/2013 – First Lot number was: 13099520
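For readers who would rather not decode the date code by hand, here is a minimal Python sketch of the rule Corsair describes above. The function names are our own for illustration, and it assumes that the first four digits of the "first lot" numbers follow the same year/week layout as the serial number date code, which Corsair did not explicitly confirm. Treat the result as a hint, not a guarantee.

```python
# Hypothetical helper names; only the YYWW date-code rule and the two
# "first lot" numbers come from Corsair's answers above.
FIRST_NEW_FIRMWARE_LOT = {
    "AX760i": "13119560",
    "AX860i": "13119560",
    "AX1200i": "13099520",
}


def parse_date_code(serial: str) -> tuple[int, int]:
    """Return (year, week) from the first four digits of a serial number."""
    digits = serial.strip()
    if len(digits) < 4 or not digits[:4].isdigit():
        raise ValueError(f"serial {serial!r} does not start with a 4-digit date code")
    year = 2000 + int(digits[:2])   # first two digits: year
    week = int(digits[2:4])         # next two digits: week of the year
    if not 1 <= week <= 53:
        raise ValueError(f"week {week} is out of range in serial {serial!r}")
    return year, week


def likely_original_firmware(model: str, serial: str) -> bool:
    """True if the unit was built before the first new-firmware lot.

    Assumption: the lot number's leading four digits use the same
    year/week format as the serial number date code.
    """
    cutoff = parse_date_code(FIRST_NEW_FIRMWARE_LOT[model])
    return parse_date_code(serial) < cutoff


if __name__ == "__main__":
    # The AX860i from the example above: week 49 of 2012.
    print(parse_date_code("1249954"))
    print(likely_original_firmware("AX860i", "1249954"))
```

If the check says your unit likely has the original firmware, the practical step is the same one we suggest elsewhere in this article: install the Corsair Link software and set the OCP limit yourself.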
– If users cannot upgrade the firmware at home, can users exchange their PSU for a model with OCP enabled by default?
No.
– How many Amps does the OCP default to on the AXi series? I heard it is different for each PSU.
By default, 40A. This is configurable.
– I was told that Intel Haswell-E processors are using up to 47A when overclocked to 4.4GHz and that it exceeds the OCP on some PSUs. Some motherboard makers are telling us to stay away from certain PSU’s. What are your thoughts on this?
When you have a PSU with multiple +12V rails, OCP can easily trip if the CPU is overclocked and running over load. This is why Corsair PSUs with Link Digital allow the user to disable OCP and why all other Corsair PSUs feature a single +12V rail.
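To put the two numbers from this exchange side by side, here is a small Python sketch of the arithmetic. It assumes the 47A figure we were quoted is current drawn from a single +12V rail and that the whole CPU load lands on one virtual rail; neither point was confirmed to us, and the function names are ours.

```python
RAIL_VOLTAGE = 12.0  # nominal +12V rail


def ocp_headroom_amps(ocp_limit_a: float, draw_a: float) -> float:
    """Amps left before the OCP limit is reached (negative means over the limit)."""
    return ocp_limit_a - draw_a


def would_trip_ocp(ocp_limit_a: float, draw_a: float) -> bool:
    """True if the draw exceeds the configured per-rail OCP limit."""
    return draw_a > ocp_limit_a


def rail_power_watts(draw_a: float, volts: float = RAIL_VOLTAGE) -> float:
    """Power delivered on the rail at the given current."""
    return draw_a * volts


if __name__ == "__main__":
    default_limit = 40.0    # AXi default per Corsair's answer above
    overclocked_draw = 47.0  # figure quoted in our question
    print(ocp_headroom_amps(default_limit, overclocked_draw))
    print(would_trip_ocp(default_limit, overclocked_draw))
    print(rail_power_watts(overclocked_draw))
```

Under those assumptions the overclocked draw sits 7A above the 40A default, which is why a configurable limit matters for heavy overclocking.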
– ASUS designed the VR circuit on their X99 platform with 60A components. Corsair came out with the AXi series in 2012 with an adjustable OCP that was off by default. Was Corsair foreseeing a situation in the future where end users could customize the OCP setting depending on what motherboard they were using?
Initially, Corsair was simply following our existing trend of providing power supplies with a single +12V rail. Since OCP is most beneficial during the initial build stage of putting together a PC, it made sense for the PSU to have the OCP on by default and therefore we decided to make the change.
Update 9/22/2014 – ASUS informed Legit Reviews that they will need 2-5 more days before releasing an official statement on the failure.
Update 10/01/2014 – ASUS needs more time to discuss what they have found out with Corsair before making a public statement. We have been told that it will be another day or two until they are able to say something. It sounds like some issues have been found during testing and the results are being looked into by both companies.
Update 10/6/2014 – The two replacement Corsair AX860i power supplies were just received today, which just happens to be one month to the day from the platform failure. ASUS informed us this morning that Corsair is requesting more information on their test setup and that they need more time again. We are hopeful we’ll get an answer soon, but it appears that ASUS and Corsair are having trouble agreeing on how the test data relates to the failure. It sounds like multiple things went wrong and of course no company wants to take more blame than they have to.
Update 10/9/2014 – Just received word from ASUS that the Corsair engineering group received the additional information that they wanted about the testing ASUS did and now ASUS is waiting to hear back. If ASUS gets an answer tomorrow they are hopeful that they will be able to provide a detailed response on Monday (10/13/2014) about the failure or failures that they believe happened to our platform.
Update 10/13/2014 – We talked with ASUS and Corsair over the past several days and there was mention of needing a few more weeks to figure things out. It appears that Corsair is disagreeing with the ASUS findings about the power supply we were using and the testing of it at ASUS HQ. Legit Reviews has been kept in the dark for the past several weeks and can’t really do anything other than wait. Right now it appears that something might have been off with our power supply. Corsair gave us the Chroma test reports from their testing, but we have not been given anything by ASUS. We don’t know what was seen in the latest round of testing, but we do know that ASUS tested the PSU on a system and not just a PSU test machine. What did they find out that Corsair disagrees with? It feels like some people are trying to drag this out and hope it blows over. Is it working? We are still seeking answers from both ASUS and Corsair.
Update 10/22/2014 – No update from ASUS USA or Taiwan, but we did find out that someone at ASUS Singapore posted a comment at Hardwarezone about our failure. ASUS@SG stated that our board failed due to a faulty pre-production BIOS/UEFI. They claim that the bug was discovered by Intel and the fix was done by ASUS before the first official UEFI release. Now Intel is being blamed? The same ASUS eCustomer Service Center employee then followed that post up with a post stating that Intel could have had a bad batch of Intel Haswell-E processors. Very interesting. Legit Reviews is still waiting on our official answer from Taiwan. It is disappointing to see possible answers to our issues on other sites as ASUS hasn’t said anything about a bad BIOS/UEFI to us since this whole ordeal began.
Update 10/22/2014 Part 2 – ASUS USA told us they aren’t sure why ASUS@SG would post such comments and told us that is not the primary reason for our board failure. They didn’t deny having a UEFI/BIOS bug though, so this is starting to get interesting. Could we be getting close to finding out why our ASUS X99 Deluxe motherboard failed? Hopefully ASUS will give us something since they have representatives telling people what happened on forums, but then the ASUS employees we are working with are telling us that is not the truth. Ugh!
Update 10/23/2014 – ASUS released UEFI 1004 for the ASUS X99 Deluxe today and we have been told that this update includes an EC (Embedded Controller) firmware update that fixes something discovered as a result of our board failing here at Legit Reviews. We don’t have the official response from ASUS yet, but Legit Reviews highly suggests that all ASUS X99 Deluxe owners update to UEFI build 1004 due to the fixes implemented in it for the way the board power is being handled. The build date on this UEFI is 10/16/2014, so it had been around for a week before it was made public. ASUS also reprogrammed the memory tables after receiving new microcode from Intel. That made a world of difference on our board when running memory kits beyond 3000MHz with 1T Command Rates. Here is a list of the key changes:
ASUS X99-DELUXE BIOS 1004 Change Log:
1. Update EC FW
2. Fix crash free issue
3. Fix Xonar card compatibility issue
4. Revise Thunderbolt memory resource
5. Enhance Xeon CPU compatibility
6. Rebuild SteamOS boot option
We have been told to expect the final answer on our failure from ASUS on Friday.
Update 10/24/2014 – Legit Reviews was just sent the failure analysis response on our ASUS X99-Deluxe Motherboard. You can read it in its entirety below:
Hi Nate,
We have determined the primary cause of failure for the pre-production ASUS X99 Deluxe you were testing on September 6, 2014 along with a secondary cause gathered during the investigation phase. Our initial analysis of the VRM Phase-4 MOSFET/Driver package failure is a bad solder point that was also present at the VRM Phase-3 location resulting in the failure you described along with the presence of solder balls. Additional analyses lead us to believe this was the secondary cause for the failure described.
After extensive testing and collaboration with leading power supply manufacturers and our VRM supplier (International Rectifier) it was determined that the new VRM design on the X99 Deluxe board needs a firmware update to balance start-up and shutdown power loads and sequencing across the MOSFET/Driver packages. We determined that higher loads were placed on the VRM Phase-4 package when the processor was drawing less than 70 amps of current during start-up and shutdown sequences based on the original firmware. This along with some older power supply OCP/Shutdown anomalies results in a bad component combination that randomly (very) leads to a VCCIN spike and power surge on the VRM Phase-4 MOSFET/Driver package that could cause component failures. We still have not replicated this failure in our test labs after thousands of hours of testing across a significant number of component combinations. However, we believe this was the primary cause for the initial board shutdown and the solder point issues exacerbated the subsequent component failures in the manner you outlined when using the secondary power supply. The good news is that we have a solution to this potential issue.
We are releasing a new EFI (build 1004) today that addresses this issue by balancing start-up and shutdown power loads across all VRM Phases when the processor is drawing less than 50 amps. This will greatly mitigate the chance of a VCCIN spike or power surge in rare instances based on extensive testing. Our new balancing/sequencing rules will decrease overall power efficiency results by a few percent based on processor loading under 50 amps but otherwise the boards overall performance will not change. In addition to the new power rules, EFI release 1004 features a host of performance improvements with significant improvements in the area of memory overclocking using the 100 strap at speeds up to and past DDR4-3300.
We highly recommend that all users of the ASUS X99 Deluxe board download and install EFI 1004. This EFI is available from our support site starting today – X99-Deluxe EFI 1004. Please follow the proper instructions for updating your EFI. The update guidelines are available at ASUS USB BIOS Flashback Guide or follow the instructions in the user manual when utilizing USB BIOS Flashback or EZ Flash 2.
ASUS is firmly committed to supporting our class leading X99 Deluxe motherboard that features unmatched performance and options like our patent-pending OC Socket, 5-way Optimization for one click overclocking and fan control, 3×3 802.11ac, Fan Expert 3 and Crystal Sound 2.
Sincerely,
ASUS
When our ASUS X99-Deluxe motherboard failed back in September during a system restart, things just didn’t add up. We’ve spent tens of thousands of hours testing hardware components over the past 12 years and we posted our findings in case others in the enthusiast community might have similar issues. When this post went online it got mixed feedback from many in the community. ASUS asking us to take it down until the cause was figured out was expected, but we were shocked that other hardware reviewers posted negative comments on the article (still shown below) about how we were doing ASUS an injustice. Unfortunately, we weren’t the only people in the world to have an ASUS X99-Deluxe motherboard fail, and almost all the failures reported online were by folks that also were using Corsair AX/AXi power supplies. Something was obviously going wrong with that combination and no one could figure out what was causing the failures at first.
ASUS went out and bought Corsair AX860i power supplies to try to replicate our problem, but it was later discovered that the power supplies being sold today were different than the ones we were using. It turns out Corsair changed the firmware on their power supplies and the change was rather significant. The firmware changed the default configuration of the power supplies in this series from one +12V rail with no OCP to a virtual multi-rail setup with OCP enabled by default. This obviously didn’t cause the board to fail (the bad solder point did), but it could have contributed to the processor blowing up after the solder failure as OCP wasn’t enabled on our PSU.
Our failure caused ASUS to bring together power supply manufacturers and their VRM supplier (International Rectifier), and they looked at the new VRM design on the X99 Deluxe motherboard for weeks. They went through the design from top to bottom and figured out that they did indeed need to balance start-up sequences as the loads weren’t even. Guess which VRM phase had the highest load? Yes, the one that failed on our board during a shut-down/start-up sequence. It appears that our Corsair AX860i power supply was one of the power supplies that had these anomalies, and the combination of the PSU design and the original VRM balancing/sequencing on the ASUS X99 Deluxe motherboard was just right to cause the failure we saw in the field. ASUS has released UEFI 1004, which changes the way the balancing/sequencing is done on the ASUS X99 Deluxe motherboard. They believe this will prevent any failure from happening to others and we are excited and happy that ASUS was able to come up with a fix. ASUS has proven that they will go the distance to make sure that their boards are stable. We are also glad that we published this article. We learned a great deal by doing so and it brought a situation to light that might not have been otherwise.
To summarize:
- ASUS X99-Deluxe Motherboard owners should immediately update to UEFI 1004. (No other ASUS Intel X99 board is impacted since the ASUS X99 Deluxe had a unique VRM design and firmware setup)
- Corsair AXi Power Supply users should use the Corsair Link software to enable OCP on their PSU if it was made before March 15th, 2013.
- The ASUS X99-Deluxe Motherboard is safe to buy. We got another ASUS X99-Deluxe board up and running the day this one failed and it has not failed. We had some DDR4 stability issues with memory kits running 3200MHz, but ASUS fixed that with new memory tables in UEFI 1004. This board is one of the most thoroughly looked at Intel X99 boards on the market today and that should bring comfort to many. We know many of our readers were waiting on this situation to end before buying a $392 board and we don’t blame you.
This concludes our ongoing coverage of our ASUS X99 Deluxe motherboard failure. We’d like to thank our readers that stood up behind us on day one and laugh at those that lashed out for posting this article. Legit Reviews was created to bring the truth out and to make the enthusiast community better. This article is an example of what drives Legit Reviews and we hope our readers appreciate our vision. Also we’d like to thank the many companies that were involved in solving this issue. You could have left us in the dark, but you didn’t. You trusted our information and spent time and money to come up with a solution to the issue. You don’t get support or trust like that with all companies, but ASUS, Corsair and Intel are industry leaders for a reason.