Question NOT GUARANTEED RESOLVED, BUT SEE POST #50: Source of intermittent, occasional critical stop crash and shutdown -- Event ID 41. Could it be RAM?

BonzaiDuck

Lifer
Jun 30, 2004
15,732
1,461
126
I've posted a thread recently in "Power Supplies" about the possibility that a bad UPS was the cause of a critical stop that would occur about once every two weeks.

My sig system has been refitted with a Kaby Lake 7700K, and I swapped out four TridentZ 3200 RAM sticks for two (2x16) which "seemed" to have the same specs: DDR4-3200, 14-14-14 and 1.35V.

Perhaps a couple months after installing and testing, I had this critical stop on June 27, 2021. I had gone through a makeover after damaging the motherboard with a static charge to a USB-charging vaping pen. (Stupid -- I know). In July, I made the mistake of leaving a non-bootable install CD in the optical drive, and the system wouldn't boot because it would halt at the drive enumeration phase, trying to figure out what was going on with the optical disc. I put the problem on hold for a few months, then recognized the issue with the CD, took it out, and everything was fine. I'd also replaced the PSU just days before resolving the CD problem. Critical stops continued, and it was not too long before I realized the first had happened -- as I said -- in June.

So, having eliminated the PSU, I began to suspect the UPS and swapped it out. It seemed as though this had rectified the problem, but no -- the critical stop occurred again after two weeks, and again a few days ago.

I ran the Intel diagnostics on the processor, and everything is good. The motherboard was replaced at the time of the vaping-pen accident. Further, I can't imagine how the motherboard would cause these intermittent crashes. I conclude that it could be the RAM, possibly the video card (but not as likely), or a driver or software misconfiguration or conflict. The BIOS is up to date. All the other drivers are up to date. I'm running the latest feature upgrade to Windows 10.

Some web forums and threads indicate RAM as a source of the problem. In my case, the crashes seem more serious than what others describe as blue screens. The system shuts down. I cannot cold-boot by hitting the switch. I have to unplug the PSU after switching it off, wait a minute, then plug it back in and switch on. Note that all the overclock settings have been set back to the ASUS "optimal defaults", but the problem persisted.

THE RAM

I looked carefully to replace my 4x8 set of TridentZ with a 2x16. Never mind my profligate and unnecessary spending. I just wanted to do it. The original modules were 3200 14-14-14 GTZ kits. The replacement 2x16's were 3200 14-14-14 GTZR. There was no reliable information about compatibility with my Z170 board from the G.SKILL configurator page. I sent an e-mail to G.SKILL: their techs are always quick to respond and very helpful. But this time, their tech said "Oh! It will PROBABLY work. You may have to loosen the timings and run them at a lower speed. . . " At first, everything seemed fine at the spec settings, but I remembered the e-mail response. Checking the web again, there was never any mention that the GTZR's were compatible with the Z170 boards. They seemed to be made for the Z270 or later.

So I think I should swap the original GTZ RAM sticks back in, perhaps even just one pair (2x8).

Anyone with comments or insight? An error that occurs every couple weeks takes patience to monitor and cure. So far, no hard disk corruption or other problems.

One more thought. For years I've been using Romex Primo-Cache to speed up SATA storage, and I was therefore inclined toward 32GB of RAM or more to use in L1 caching. There's also a 256GB Sammy 960 EVO NVME drive as L2 cache, and I haven't attended much to keeping it trimmed. So I've deleted my caches for the time being until I get this resolved. Could the caching drive have something to do with this? I don't know. Could it be the graphics card? It seems tip-top, but I can't be sure until I test that component as well.

As for the motherboard. It would seem to me that a problem with the board would show itself in more frequent and consistent crashes. No matter but for the potential inconvenience: I have a spare returned by ASUS under warranty RMA. Of course, I'll have to go through reactivations with software. I don't think I lack any in the way of spare hardware to fix this, PROVIDED that it's a hardware problem. It will be a pain-in-the-*** to swap out the motherboard. Just because . . . . it will be a P-I-T-A, that's all . . . .
 

BonzaiDuck

What BSODs are you getting? Nirsoft's BlueScreenView is helpful for compiling a neat list.
That's just it, Mikey. I always use BlueScreenView when I can. But these shutdowns don't create mini-dumps or blue screens, so bringing up BSV just shows a blank window of nothing but the title and menu bars. If there were mini-dumps, and IF it were a RAM or driver problem, there would be indications in BSV. There would be a stop code indication right on the Blue Screen window message if I actually got a "Blue Screen". But -- no mini-dump, no indications, can't be sure what it is. It apparently wasn't the PSU, or the UPS. CPU checks out OK. I'm hoping that it's just RAM, or I'll have to pull expansion cards (NVME and an SATA controller) and partially disable my system for a few weeks before I'd know anything. A graphics card swap won't be a problem, but the storage devices will be inaccessible until I can be sure.

Again -- I can't imagine a 1-year-old motherboard throwing crashes like this every two weeks. The mobo was only about two or three months old before it started to happen. There would be more indications of malfunction than "power event" shutdowns. When it happens, the system is at idle. I'll come back to my machine after a few hours away, and it will either be shut down as I described, or the login screen will be waiting for my password. But only about every two weeks -- occasionally after one week or so. RAM is constantly in use -- it's "volatile". That's my hope right now. Sticks are easy to pull and replace. And RAM swapouts don't require re-activation of the OS or the software . . .

Whether the fix is easy or troublesome, I need to nail this down -- not for being an "enthusiast" but rather for daily, regular, serious personal and business computing. It's my guess that a lot of mainstreamers might not focus keenly on shutdowns that occur every two weeks. To me, it's an indication of a problem that needs to be resolved.
 

BonzaiDuck

Yo, Mikey and all reading members.

I just discovered something that might have to do with swapping out 8GB modules for 16GB units.

I was thinking to change the RAM speed back to what it should be. I'd dropped it down to DDR4-3100 from 3200, remembering what G.SKILL tech support had said in their e-mail response last spring. I thought to set it back to 3200, so I did a review of BIOS settings when I made this change.

There is a setting I'd never paid attention to. It was labeled "DRAM current capability" -- the default setting at 100%, with options of 110, 120% etc. The built-in explanation in the BIOS states that a higher setting may be needed for "OC'd" RAM or HIGH DENSITY RAM modules. And it dawns on me that all the TridentZ sticks I'd used were "OC'd" even for the spec RAM speed, but this new set of 2x16's has posi-lutely abso-tively got to be "high density". So I bumped it up to 110%, and we'll have to wait and see.

The BIOS description also specifically noted that an insufficient setting would cause the system to immediately shut down. That's at least consistent with my observations -- "immediate shutdowns" such that one must fiddle with either the CMOS reset pins or the PSU just to get it to boot up -- as one might experience with an unstable OC setting.

I can "report back" here in two weeks or so. This is going to take time, whether I found something with the BIOS setting or parts need to be swapped. It isn't going to happen until at least a week hence, possibly, likely -- two weeks.

This all started when I got careless charging a cannabis vaping pen. People could say "what have you been smoking, for being stupid enough to charge such a device on a PC USB port?" The answer there is explicit in my admission. But if it hadn't happened, I'd still be running a perfectly stable system with an i7-6700K and four TridentZ RAM sticks. I just decided to get loose with my wallet and swap in more new hardware than was needed for just the motherboard failure after the USB controller went south more than a year ago.
 

fidgety

Junior Member
Jun 11, 2020
1
0
11
Hi, if you can afford to not use the PC for a couple of days, (you said you used it for business) I'd recommend running memtest86 for 48 hours. If it finds a problem, you've probably got the cause of the BSOD (or at the very least you have a problem that needs to be fixed). The free version of memtest86 will need to be restarted once in a while, I'm guessing a couple of times a day for 32GB.
 

mikeymikec

Lifer
May 19, 2011
17,816
9,815
136
You mentioned event ID 41. According to MS, if a Bugcheck occurred, you ought to be able to find the bugcheck code in those entries. I'd also disable the automatic reboot on system failure option which is annoyingly enabled by default. I'd go through the logs of the last few times such a system 'blip' occurred and find out whether Windows just thinks it wasn't shut down correctly (so a spontaneous reboot/power loss is probable), or whether an actual crash occurred, etc.
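
For anyone who wants to script that log check, here is a minimal sketch in Python. It assumes the built-in `wevtutil` tool is on the PATH, and that the Kernel-Power event stores `BugcheckCode` as a decimal number (0 meaning no stop error was recorded), which is how Microsoft's notes on Event ID 41 describe it. The function names are invented for illustration.

```python
import subprocess
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def parse_kernel_power_events(xml_text):
    """Return (timestamp, bugcheck_code) pairs from wevtutil XML output."""
    # wevtutil prints one <Event> element after another with no root,
    # so wrap them to get a single well-formed document.
    root = ET.fromstring("<Events>" + xml_text + "</Events>")
    results = []
    for event in root.findall("e:Event", NS):
        created = event.find("e:System/e:TimeCreated", NS)
        stamp = created.get("SystemTime") if created is not None else None
        code = 0
        for data in event.findall("e:EventData/e:Data", NS):
            if data.get("Name") == "BugcheckCode" and data.text:
                # Assumption: the value is decimal; base 0 also accepts 0x.. form.
                code = int(data.text.strip(), 0)
        results.append((stamp, code))
    return results


def describe_bugcheck(code):
    """Turn the stored code into the hex stop code shown on a blue screen."""
    if code == 0:
        return "no bugcheck recorded (power loss or hard hang is more likely)"
    return f"bugcheck 0x{code:X}"


def query_event_41(max_events=10):
    """Windows only: fetch the newest Kernel-Power 41 events from the System log."""
    cmd = [
        "wevtutil", "qe", "System",
        "/q:*[System[Provider[@Name='Microsoft-Windows-Kernel-Power'] and (EventID=41)]]",
        "/f:xml", f"/c:{max_events}", "/rd:true",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_kernel_power_events(out)

# On the affected machine:
#   for stamp, code in query_event_41():
#       print(stamp, describe_bugcheck(code))
```

If every entry comes back with a code of 0, that would fit the "Windows just thinks it wasn't shut down correctly" case rather than an actual crash.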

memtest86 wouldn't go amiss. v7.4 can be configured to repeat forever if you want to go as long as @fidgety suggests, but the default option of 4 passes in my experience tends to find memory problems if they're there.

I'm not used to troubleshooting memory issues with lots of high-end RAM. There are two tactics that I employ:

1 - test all the RAM and see if anything bad happens.
2 - break down the testing of the RAM into portions (e.g. one at a time).

Option 2 becomes necessary if option 1 goes awry in order to figure out which module(s) are problematic.

Another potential option is to switch off the aggressive timings and go for something really plain and simple with the RAM.

Why did you switch RAM btw?
 

BonzaiDuck

I originally had the idea that 32GB in a two-stick, high-density kit would allow me to tweak the command-rate to 1. It is a noticeable performance improvement, or at least I'd noticed it when I'd done it. It didn't seem to work as I'd anticipated, so I didn't move forward with more tweaking to get it to work. In fact, my suspicions about the cause of this problem might be the root of the difficulty with the command-rate tweak.

Again -- I could've addressed the "vaping-pen-static-charge" accident by simply replacing the motherboard under warranty RMA. I was in too much of a hurry. At the same time, I had stimulus money that we didn't need, and I began collecting parts. I have enough to build a complete twin to this system, including the computer-case, but I haven't had the time. Of course, some would say, why build another using the same 5-year-old Gen 6 or 7 technology?

But even if I build the twin, once done -- I'd only need to swap out the motherboard, RAM and processor to upgrade to a hexa- or octo-core Rocket Lake or current generation. These Skylake cores are plenty fast for me and what I do with them.

Now I have the idea that I could add a second kit of DDR4 2x16=32GB for a total of 64, eliminate the L2 cache and 256GB NVME I use with Primo-Cache. But the RAM would have to be perfectly error-free.

I use HCI-Memtest-64 for testing. This RAM had been thoroughly tested through 500% or 5 iterations when I first put it in the system. The first critical stop error likely occurred within a couple weeks of the new-parts upgrade.

All the symptoms -- how it shuts down, even infrequently -- and the BIOS item "DRAM current capability" -- make the BIOS item suspect. Before I do anything more, such as you rightly suggest, I need to determine whether this is or isn't the problem. But it seems likely. As I said, also addressing Fidgety's remarks, a 500% HCI-Memtest-64 is pretty thorough. Let's see if the "current capability" tweak resolves this.
 

BonzaiDuck

I had high hopes I wouldn't be coming back to this thread for at least two or three weeks, thinking to announce that my BIOS setting fixed the problem. No cigar. System crashed/rebooted today, only 4 days or so after the last time.

I'm going to pull one RAM stick at a time -- re-socketing the remainder in the appropriate slot. If it's not RAM, I'll have to pull other components.

So -- time to follow MikeyMikec's prescriptive diagnosis plan -- all very sensible. Still need to check the network driver for update. Everything else is current . . .
 

BonzaiDuck

In the process of trying to fix this, I thought I'd ask for insight on the issue of my Kaby Lake's IMC or memory controller.

When I first installed this 2x16GB TridentZ GTZR kit a year ago, I didn't pay attention to how an "auto" setting would affect the VCCIO voltage for the IMC. I began by running HCI Memtest-64 through about 3 iterations or "300%" on the 32GB of RAM, which means the IMC was being run at the "auto" voltage setting for maybe 2 days. I checked later and found that the "Auto" setting was giving the VCCIO 1.3V. Most advice about the VCCIO voltage says that this is the upper limit of safety for the IMC.

Anyone have any ideas about how two days of running a memory test on the Kaby Lake IMC at 1.3V might damage the IMC? Other information says that too much VCCIO might take a month to damage the processor.

In the meantime, I'm going to swap out the GTZR RAM for a known good set of 2x8=16GB. I really hope that my problem isn't the CPU. The only thing I could swap in there is the original i7-6700K, which I was saving to build a second system as needed or desired.

As I said, increasing the "DRAM Current Capability" from 100% to 110% apparently didn't help, but there was an observable change in the time between Critical Stops. 4 days is the shortest time between crashes that I'd seen over the months I was seeing them happen.

UPDATE: I'm hoping that my problem is the TridentZ GTZR 16GB RAM modules. Here, the goal of testing them is secondary. The primary objective is to prove that the RAM is indeed the problem -- not which module. I put a 2x8 TridentZ in to replace the GTZR's. I can wait a couple weeks to see if there are any more critical stops. If not, I can pop in a set of 2x16 RipJaws and test those thoroughly, then send the GTZR's for RMA replacement.

Otherwise, I have to worry that the problem is not RAM, but something else.
 

BonzaiDuck

April 25, 2022

The last critical stop was 5 days ago. It occurred a mere 4 days after the one before it, and before that, they were happening between 7 and 14 days apart. It seemed to show some dimension of random variation. To the point -- now that I've swapped in a known good set of 2x8 TridentZ RAM, I have another two weeks or so to wait before I know for sure whether the answer I seek is right: that it's a RAM problem with the 2x16 TridentZ GTZR sticks.

This morning, I did another web query about "symptoms of defective RAM".

The article listed about 8 symptoms.

1. Infamous Blue Screen of Death. [My critical stops wouldn't throw blue screens. The computer would just shut itself down. But my symptom still falls within a general category that includes #1. I noticed that the setting under "Startup and Recovery" needed to be changed to allow small memory dumps -- which was not previously the case. So I changed it to that setting.]

2. Sporadic PC freeze. [ This is also part of the general category, but in specific it has not happened. Again -- the shutdowns are as described in "1". Sometimes, the system would just shut down; other times, it would reboot to the logon screen.]

3. Declining PC Performance -- particularly slow or sluggish behavior. [Bingo. I could put my system to sleep with maybe ten EDGE tabs open, my accounting software up and running together with my document management software. Raising it from sleep (or even hibernation, for that matter) would show the system slow to respond to mouse clicks on "Start" or menus therein. The software would respond slowly. I would have to "restart" to stop this behavior.]


4. Attempting to install new programs fails. [I haven't seen this; but I haven't been installing new software frequently.]

5. Random reboots. [Again -- a description of the shutdowns, sometimes occurring as described in "2."]

6. Files get corrupted. [As far as I can tell, performing disk checks on my boot drive -- not yet.]

7. Missing RAM. [Nope -- it would show up in BIOS properly to the full amount, and equally -- in a monitor program like HWInfo64.]

8. Computer beeps, or unexplainable beeps. [Nope.]

My symptoms seemed to overlap 1, 2, 3 and 5 -- four out of eight (although there are other miscellaneous symptoms unknown to this PC).

So far, the various symptoms -- for instance, the sluggish behavior -- are no longer there with the replacement RAM. If the system continues another 2 weeks with this RAM, then I assume I've found the trouble.

We'll have to wait and see. YOU can wait and see, or wait until I come back to this thread to report. Or don't hold your breath and don't wait at all. After all -- it's not your problem: it's mine. In this strategy, I'm the one who must wait . . .
 

CuriousMike

Diamond Member
Feb 22, 2001
3,044
543
136
A random crash every 4-7 days would be driving me absolutely nuts - if you've swapped P/S, motherboard and RAM... what else is there?
Software?

If you could afford / borrow a second machine to test these pieces in it might help narrow it down.

If it's a mission-critical machine (business) I'd have tossed it in the eHeap long ago and just replaced it with a laptop / other completely new machine.
 

BonzaiDuck

Actually, that's almost what I did. I bought a laptop this summer just to configure for the work I was doing on the desktop, as a backup. And I've used it that way, as necessary. Because I had this agenda in mind, it was a hasty purchase and I might have had something more to my liking for 60% of what I spent. I bought a $1,700 LG 17" Gram. I wish I'd bought an $800 Acer Nitro and spent another $200 on extra NVME and RAM.

Again, to recap -- this all started after the vaping pen accident a year ago, when the board, processor and RAM were replaced. It would've been more prudent for me to simply keep the RAM I had. Still waiting to prove my suspicions about the RAM, I'm growing confident with each day that passes that RAM is indeed the problem.

I can see your point -- "put it on the eHeap" and take it to the recycler. But it had just been such a great machine until the vaping pen accident. I'd put a lot of work and planning into building it. Since the processor has been abandoned by MS for support by Win 11, the system has perhaps three years before support ends. On the other hand, I've been running some Win 7 systems, and I've got a friend in Virginia who's running one of those and an old XP setup. OR maybe it was the OS following XP but before 7. I've even forgotten the name of that OS. Oh! Vista.

I've looked at all the drivers, and updated the Intel LAN driver. Graphics is up to date. All of the last or final versions of the other drivers have been installed. Unless I can get a dump file, it won't be so easy finding a software problem. I think there are some items, however, that I'd rather not have and which can be uninstalled.

Well, given this strategy of "needing" to use it in the interim, and simply determine whether or not RAM is the problem, I have to take a day at a time. As I said, if it gets through three weeks without a power event, I can almost be sure I've solved the problem and fixed it.

So I'll certainly be coming back to these threads as I find out more . . . .
 

mikeymikec

Knowing what the BSOD messages are would be of some help. BSODs happen for a tonne of reasons, just knowing one has happened means 'something bad happened': not exactly premium-grade evidence to act on.
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
7,450
2,490
146
At this point, my guess would be motherboard or CPU. Like it could be a partially bent pin, or a flaky memory controller. Have you tried a CPU stress test? I would also run memtest86 to confirm memory is ok, as mentioned already.
 

BonzaiDuck

Shmee, you could be right, and I have spare replacement parts. As I've said elsewhere, swapping RAM is much easier than taking the box apart, pulling the mobo, reinstalling a CPU, and putting it back together. Then -- there are a handful of activation obstacles for OS and software. So you can see why I'm hanging my hopes on RAM as the culprit. This is day six, with no critical stops. Another eight days will be two weeks. If it goes a month without an error, then the known-good RAM that I swapped in both identifies the problem and makes the correction.

Nothing to do but wait and see, as I said earlier.

Mikeymikec -- Yes! You are absolutely, positively correct about that aspect. I'd LOVE to have a dump file that I can see through BlueScreenView, but it hasn't created any. I've set the parameters in Windows to create a Mini-dump, and maybe I'll get one if it crashes again. BSV usually will indicate enough information to see if it's hardware or software.
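
A quick way to confirm whether a mini-dump ever got written is to look in the dump folder directly. This is only a sketch: it assumes the default small-dump location (the `Minidump` folder under the Windows directory), and `list_minidumps` is a made-up helper name.

```python
import os
from datetime import datetime
from pathlib import Path


def list_minidumps(dump_dir=None):
    """Return (filename, modified time) for each .dmp file, newest first."""
    if dump_dir is None:
        # Assumption: default small-dump folder; SystemRoot is normally C:\Windows.
        dump_dir = Path(os.environ.get("SystemRoot", r"C:\Windows")) / "Minidump"
    dump_dir = Path(dump_dir)
    if not dump_dir.is_dir():
        # No folder at all usually means no small dump has ever been written.
        return []
    dumps = sorted(dump_dir.glob("*.dmp"),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p.name, datetime.fromtimestamp(p.stat().st_mtime)) for p in dumps]

# On the affected machine (may need an elevated prompt to read the folder):
#   for name, when in list_minidumps():
#       print(when, name)
```

An empty result after the next critical stop would suggest the machine lost power or hung before Windows could write anything, which points away from an ordinary driver crash.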

I still think it was the RAM sticks. None of the peripheral symptoms that would arise when I was getting these crashes now show at all. No sluggish behavior, first of all. That was definitely a symptom of RAM causing a problem.
 

mikeymikec

Just knowing the BSOD message (the bit in all caps) is the bit you need the vast majority of the time (I can't remember the rest ever being relevant for my needs). Memory issues tend to give themselves away with random BSODs but there are some ideal ones like MEMORY_MANAGEMENT and PFN_LIST_CORRUPT. Hopefully you've already disabled the 'auto restart after BSOD' setting so you'll actually be able to read the message.

What's the SMART data like on your internal drives? Any dodgy sectors of any sort? CRC errors?
 

BonzaiDuck

Hmm . . . inventory . . . . Two Sammy 960 NVME-s -- 1TB and 256GB -- latter for caching. All the caching has been paused, and the caching NVME has been wiped clean. One 2TB Crucial SATA SSD, a 1TB 2.5" HDD and a 2TB 2.5" HDD -- latter for daily Macrium backup. Windows finds no issues with those -- any of them. CrystalDiskInfo shows all of them "Good" in "health status", which I assume would draw from the SMART info of each one.

Of course, the prevailing wisdom in something like this is to remove all the PCIE cards, then add them back in one at a time until the error reappears -- if a PCIE card is the problem. One could swap out the graphics card, and of course, the RAM is easy. But this sort of thing would be even more frustrating for seemingly random errors that occur with a frequency of only 14 days to 7 days.

Basically what you say is absolutely correct. I'd want to see the blue-screen stop code. For instance, I think a string ending in "9C" would point to RAM (my memory is failing -- not the RAM -- MY memory of stop codes. But all that is available online to supplement my aging memory . . . ). I'm going back into the "recovery" screen to see if auto restart is enabled.

We're at "day 7" with no events on the known-good RAM. I figure in another two weeks, I can either start worrying about another possible hardware source, maybe get a BlueScreenView that will point to some flaky driver or software, (etc.), or proclaim that RAM caused it, problem solved.

Otherwise, how ya gonna replicate a problem to occur within a couple hours, when these critical stops have occurred mostly between 7 and 14 days apart?

JUST A SIDE BAR: VirtualLarry often chastises me for holding on to dated hardware too long. More than once, he's advised "Why don't you get a current-gen processor and board?" Well, I have a lot on my plate right now. Until the last couple years, my annual personal budget included $1,000/annum for hardware and software, even if I didn't spend all of it.

So I'll post a general question on the "PC building" forum. I'll call it "The Last PC", like certain movie titles.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,402
10,083
126
JUST A SIDE BAR: VirtualLarry often chastises me for holding on to dated hardware too long. More than once, he's advised "Why don't you get a current-gen processor and board?" Well, I have a lot on my plate right now. Until the last couple years, my annual personal budget included $1,000/annum for hardware and software, even if I didn't spend all of it.
For the record, all of my Ryzen PCs crash occasionally. It's just a thing with Ryzen, I guess. Sigh. Or my mining is really just too hard on it.

Edit: Regarding PC part aging and mysterious crashes - "bathtub curve" - it's a real thing.

Edit #2: Friend had a refurb Dell-based Sandy Bridge gaming PC that I had assembled for him a few years back, had a mobo go bad on him fairly recently. ("Bathtub curve", again.)
 

mikeymikec

None of my Ryzen builds have a tendency to crash. Admittedly only one is high-end (5800X), the rest are 2x00G or 3x00G.

Re old hardware - as far as I'm concerned, it's good until it starts to have weird problems, at which point one should troubleshoot it as if it were new, IMO.

Re 7-14 days apart - I'd say if it clears a month without issues then I'd start gaining confidence that the problem is fixed. Two months is pretty safe. Do you know for a fact that it's about 7-14 days apart (e.g. does the Windows Reliability Monitor back you up on that)?
 

CuriousMike

None of my Ryzen builds have a tendency to crash. Admittedly only one is high-end (5800X), the rest are 2x00G or 3x00G.

Yeah, I've run a 1700 for years without issue.
My 5600X was run hard for about a year before being set aside as the gaming machine.
My main box, a 5700G, is my daily work machine and driver... never turned off (but I do use sleep on it. Hmm. I did replace the Gigabyte mobo with an Asus because the Gigabyte would lose bluetooth on awake. But that's not Ryzen.)
 

BonzaiDuck

I'm pleased to answer your question more precisely. Again, in review, motherboard, processor and RAM were replaced in late spring 2021. Not noticing the first critical stop in June, I mistook a problem with a CD forgotten in the optical drive for a hardware source of failing to boot. Removed the non-bootable CD in November after leaving the system shut down and on the "back-burner" for the elapsed months.

A STATISTICAL SCATTER -- CURRENTLY AT DAY 8 OR 9 WITH NO ISSUES

6-17-21 -- [first critical stop]
11-25-21 -- [PC was set aside from July 15 through November, planning to deal with a problem found to be caused by non-bootable install disc in the optical drive.]
12-21-21 -- time-lapse = 26 days
2-6-22 -- time-lapse = 47 days
2-18-22 -- time-lapse = 12 days
2-23-22 -- time-lapse = 5 days
3-8-22 -- time-lapse = 13 days
3-16-22 -- time-lapse = 8 days
3-25-22 -- time-lapse = 9 days
4-1-22 -- time-lapse = 7 days
4-16-22 -- time-lapse = 15 days [set BIOS DRAM current capability to 110%]
4-20-22 -- time-lapse = 4 days [Swapped out RAM for known good pair, set BIOS back to 100%]

MEAN or AVERAGE = 14.6 days

I can look at the variance or standard error for my own curiosity, and I can come back and post it here. EDIT/UPDATE: Standard deviation: about 13 days. Kick out the first two observations as "unrepresentative" or "outliers" and it should be significantly less . . . Assuming a 13-day standard deviation, I think this would mean I'd be 66% confident in a month's time -- about 27 days, the mean plus one standard deviation. Using the time-series less the first two events, I'd get to 66% after a shorter time.
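
The arithmetic above can be checked with a short Python sketch. It takes the dates from the list, leaves out 6-17-21 because the PC was shelved for months afterward, and uses the sample standard deviation (the n-1 form), which is an assumption about how the 13-day figure was obtained.

```python
from datetime import date
from statistics import mean, stdev

# Critical-stop dates from the list above (month-day-year in the post).
# 6-17-21 is omitted: the PC sat unused until November, so that gap
# is not a real time-between-crashes.
events = [
    date(2021, 11, 25), date(2021, 12, 21), date(2022, 2, 6),
    date(2022, 2, 18), date(2022, 2, 23), date(2022, 3, 8),
    date(2022, 3, 16), date(2022, 3, 25), date(2022, 4, 1),
    date(2022, 4, 16), date(2022, 4, 20),
]

# Days between each pair of consecutive events.
intervals = [(later - earlier).days for earlier, later in zip(events, events[1:])]

all_mean, all_sd = mean(intervals), stdev(intervals)

# Same statistics with the first two intervals (26 and 47 days) dropped.
recent = intervals[2:]
recent_mean, recent_sd = mean(recent), stdev(recent)

print("intervals:", intervals)
print(f"all intervals:        mean {all_mean:.1f} days, std dev {all_sd:.1f}")
print(f"without first two:    mean {recent_mean:.1f} days, std dev {recent_sd:.1f}")
```

Run as-is, this should reproduce the 14.6-day mean and the roughly 13-day standard deviation. With the first two intervals removed, the mean drops to about 9 days and the spread to about 4, so most of the 13-day figure comes from those two long gaps.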

The first two observations may be outliers, or they may be the beginning of a trend, perhaps showing progressive deterioration in a suspect component -- perhaps the RAM.

As another poster noted, I can only wait a month or more and re-evaluate. I've posted a "question" thread on "PC Building" asking advice for an ASUS motherboard, current or last year's K processor, and G.SKILL RAM. But all the essential components currently installed are a year old, including the RAM. Kaby Lake was delidded/relidded by Silicon Lottery with Grizzly Conductonaut before SL announced they were going out of business because Intel had returned to use of Indium solder . . . Given the fact that they've had 10,000 customers and only about 10 problems using liquid tape with the Grizzly, I'm not as suspicious about it as some might be.

SIDE NOTE -- I see some posters are inclined -- and suggest -- to throw the dated-tech parts in the e-Heap, and as I said, I'm beginning to investigate a mobo-CPU-RAM bundle to swap in or to install in an exact twin of the subject machine -- case, fans, PSU -- everything. All new parts.

All good things come to those who wait. [Hannibal Lecter -- to Clarice Starling in "The Silence of the Lambs"]

4-29-2022 It will take longer than ten days to see if any trendline has been broken, but the sucker is running smooth so far.
 

BonzaiDuck

Lifer
Jun 30, 2004
15,732
1,461
126
April 30, 2022

I guess I thought I was "on a roll", and that swapping in known good RAM would resolve my troubles.

So today, I was sitting here at about 8AM (just now), and the system went through a shutdown again. This is about the second or third time that there was a message on the screen when it auto-rebooted. The motherboard anti-surge protection has always been enabled in BIOS, and so after this type of event -- those two or three times -- it reported a "power supply surge -- [everything protected] -- press F1 to enter BIOS setup".

I'm not ready at the moment to build a new PC -- "The Last PC" as I suggested on my thread under "PC building". It's not the money. I want to research the parts I choose, and should only have to buy processor, motherboard and RAM (but that's really enough now, isn't it? Probably just under a Grand.)

So I'm faced with alternative next steps.

I have the pristine RMA replacement motherboard sitting in its box. I also have the original i7-6700K processor. I also have a spare Seasonic Titanium PSU. I don't have a lot of time in my average day, though.

1. Swap another PSU into the system, wait and see.
2. Swap back in the original graphics card -- a twin Gigabyte OC mini GTX-1070.
3. Swap the motherboard; keep the Kaby Lake processor.
4. Swap the motherboard and the Kaby for the Skylake.

And of course, we all agree -- this COULD be some driver or software problem, but there are no mini-dump files, and I'm not out of the woods with the hardware.

3 and 4 mean I'll have all the activation annoyances, probably with my Win 10 OS (but it wasn't so much trouble the last time), my Windows Office Pro and a few other things.

What a g*****n pain in the a**.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,402
10,083
126
So, are you saying that these aren't actually power-off crashes, but Asus Anti-Surge forced emergency shutdowns? That's a different ball of wax.

Either your PSU is going out of spec, or, if the mobo is "old", equally likely that the Asus anti-surge controls on the mobo are going out of spec / tune, and that you should simply disable them in BIOS for peace-of-mind.
 

BonzaiDuck

Lifer
Jun 30, 2004
15,732
1,461
126
Roger, Dodger, Mikeymikec

Your scenario cannot apply here. I stopped using power strips, even surge-suppressor models, back in 1994.

UPS battery backups are an insurance policy. The premiums are paid in occasional battery replacement. And sometimes -- often when the electronics are flaky or go south, but also when I'm in a hurry and given the price of batteries -- I just buy a new unit.

All my units are APC. Some, as for this troublesome PC, are "personal" models. I've got two of the 1500VA server models -- one for my server (which won't work with Personal PowerChute under Win 2012 R2), the other for my Home Theater or entertainment electronics.

We've been through all that for swapping UPS systems on this box. I'm also wondering again about the PSU -- which was new in November. You see, I'm old, and my recollections are getting flakey. I had said that the first instance of this problem occurred before I swapped out the PSU. But I vaguely remember an instance when I was puttering around in my room, and I may have accidentally put my hand on the PC power switch -- causing an "event".

Meanwhile, I'm examining strategies for swapping additional parts. I'll start with the PSU (again!). This is obviously not a RAM problem. I'll swap the original graphics card back into the system.

After that, there are choices. If I swap the Kaby Lake to put back the Skylake that was in there at the time of my January 2021 vaping-pen accident, I won't have to worry about software activation issues. If I swap the motherboard -- I will. The processor swap requires removal of the motherboard anyway -- to pull the heatsink.

I'm going to think about this today. Our house is a clutter, owing to my departed brother's growing disability in his last years and my Mom's bed-bound state and dementia: I'm the only Merry Maid here. Other unoccupied parts of the house look like homeless encampments or just junkyards. I've got stuff piled up in the dining room and the living room -- effectively, my bedroom. But my garden patio is clean, covered against rain and sun, with electrical access and a table large enough to pull apart a PC.

I will move with great deliberation and calm confidence. . . .
 