Windows 10

I haven’t been on the Insider track since my last clean install (I had to do a clean install a few months ago because something on the Insider track broke Windows Update for me and I was stuck on a build from last Fall that refused to update).

I downloaded the latest ISO and did one of those restores that only repairs Windows and doesn’t touch anything else, but I’m still getting the random reboots. How random? Well I played Bioshock Infinite for about 4 hours a couple days ago with no issues. Then I launched Portal today and it immediately rebooted. Then I launched Portal and played for an hour without issue. Then I launched a game in PCSX2 and it immediately rebooted. Then I put the PC to sleep and when I woke it up it immediately gave a BSOD and rebooted.

If there’s any common thread that I can think of it’s that for the most part it happens when I’m launching a game, exiting a game, loading a new level, or bringing the computer out of sleep mode. It does also happen when I’m just, say, walking around in Skyrim, though.

God I really didn’t want to do another clean install already…

That sounds less like an OS/driver problem and more like a RAM or PSU problem. That said, driver issues can be insidious, so I wouldn’t overlook the possibility.

Those were my first two guesses as well. All memory tests came back clean and I purchased a new 700 watt eVGA PSU which didn’t solve the problem (but I’ve been wanting to replace my crappy old PSU anyway).

I just ran DDU and installed nVIDIA’s 375.57 drivers, so we’ll see if that solves the issue.

I’ve had drivers cause that kind of odd behavior, and not just GPU drivers either. Oddly enough, I had a Wolf King keyboard really mess with my computer once… completely uninstalled that puppy. And of course a BIOS update is my last step if there is one available.

Is it randomly rebooting? Or is it blue-screening and rebooting?

It’s only blue screened once. Every other time it’s a black screen reboot.

I have also been having a weird issue with Windows 10. Does Windows 10 have “stealth” updates that require multiple reboots to be applied? By stealth, I mean that it’s not telling me it’s applying an update.

This is what has been happening to me. Every once in a while when I boot up my computer, it will start booting until I see the Windows logo and then it will reboot. Sometimes it will do this 2 or 3 times and then finally Windows will load.
After this, I have shut down and rebooted Windows to see if it repeats, and it does not.
Once booted, my computer will run rock solid without any issues at all.

I thought it might be a hard drive issue and downloaded some software and it says everything is fine.

Does anyone know what is causing this? I am hoping it’s just some kind of Windows 10 stealth update and not a hardware issue. Again, it only happens every once in a while.

A related thing that might be part of the cause is that a few months ago I got an APC UPS. I have the monitoring software installed that will shut down my computer if there is a power failure and my UPS is about to run out of power. The last time this booting thing happened, my UPS started beeping during the boot-up process. Maybe it is somehow a cause?

Sort of. It auto installs them for me. I know they’re going to update when I reset or restart the computer which I rarely do. If you’re asking if Windows 10 asks if you want to install an update… not that I can see. I would have turned that off and updated when I felt like it.

The reboot thing you’re talking about, that I have not experienced. With an update it just tells me it’s updating, no guess work.

Do you see anything if you check reliability history? Just search for Reliability from the start menu.

Updates on Windows 10 will sometimes auto-reboot your system, but that shouldn’t happen more than once or twice a month at most - and in the anniversary update you should see notifications prior to it happening. And they’d only require one reboot, not two or more like you’re describing.

What you’re describing instead sounds like crashes, which would be reflected in reliability history if so.

That is what I kind of expect: if Windows were applying an update during boot-up, it would do its usual thing, like “Updating Step 3 of 3 99% Complete” or whatever.

I’ll have to search for that Reliability thing from the start menu. I was wondering if there was something that would tell me what was going on. My computer never auto reboots once I am on the desktop. It’s just during the initial startup from a cold boot that this happens every once in a while.

The clean driver install didn’t solve my problem. Just rebooted in the middle of playing Skyrim again.

Here’s my reliability chart! Keep in mind I just restored the system so in reality this actually all started on October 5th, not the 29th.

http://i.imgur.com/KUhLR6M.png

Some of the entries give me stuff like:

Except I don’t see any memory.dmp in the Windows directory.

The majority of them just say “Windows was not shut down properly”. The event viewer is full of critical errors that just say “Event 41, Kernel-Power”.

Well, you could try submitting the dump files to MS. I know there’s a forum where you can do that. I googled it the other day.

The other option is to bite the bullet, do a true clean install after nuking the partition.

If you still have an issue after that, you can be pretty certain it’s a hardware issue.

That particular stop error (0x116) appears to be associated with a graphics driver crash.

The 41 kernel-power always shows up with an unexpected shutdown, no information to be found there.
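One thing that may still be worth a look: as far as I know, the details of an Event 41 entry include a BugcheckCode field, and Windows records it in decimal, so it has to be converted to hex before it lines up with the usual stop codes. A value of 0 would mean no stop code was recorded for that shutdown. A minimal Python sketch of the conversion (the function name is made up for illustration):

```python
def event41_bugcheck_to_stop_code(bugcheck_code: int) -> str:
    """Convert the decimal BugcheckCode from an Event 41 entry to a hex stop code."""
    if bugcheck_code == 0:
        # Nothing was recorded, e.g. the machine lost power or hard-reset.
        return "no stop code recorded"
    return f"0x{bugcheck_code:X}"


print(event41_bugcheck_to_stop_code(278))
```

If every Event 41 entry shows 0 here, then the event really does carry no extra information for this machine.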

It certainly could be the video card then. The stop errors that I’m seeing here are 116, 119, and 10e, which are all video card related.
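For reference, here is a small Python lookup for the three stop codes listed above. The names are the ones I believe Microsoft's bug check reference uses; the table is deliberately incomplete and the function name is just for illustration.

```python
# Display-related stop codes mentioned in this thread.
# Some tools label 0x116 as VIDEO_TDR_ERROR instead.
VIDEO_STOP_CODES = {
    0x10E: "VIDEO_MEMORY_MANAGEMENT_INTERNAL",
    0x116: "VIDEO_TDR_FAILURE",
    0x119: "VIDEO_SCHEDULER_INTERNAL_ERROR",
}


def describe_stop_code(code: str) -> str:
    """Accept a hex stop code such as '116', '10e' or '0x119' and name it."""
    value = int(code, 16)
    name = VIDEO_STOP_CODES.get(value)
    if name is None:
        return f"0x{value:X}: not in this (incomplete) table"
    return f"0x{value:X}: {name} (display related)"


for code in ("116", "119", "10e"):
    print(describe_stop_code(code))
```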

Try to underclock your card, reduce its voltage, and/or increase fan speeds. Use MSI afterburner.

Alright, I ran that reliability thing. It simply lists a hardware error. No other information. Is there some other way I can figure out what happened? This reliability report only goes back to 10/6/16. That day also has a similar report, although it is the only other day with a hardware error report. Both days have multiple instances reported at the exact same time (per day, not that each day’s failure occurred at the same time as the other day’s).

I just found that there is a bit more detail if it means anything to anyone:
Source
Windows

Summary
Hardware error

Date
11/3/2016 4:33 PM

Status
Not reported

Description
A problem with your hardware caused Windows to stop working correctly.

Problem signature
Problem Event Name: LiveKernelEvent
Code: 141
Parameter 1: ffffbd846cb59010
Parameter 2: fffff806ebafd15c
Parameter 3: 0
Parameter 4: 4
OS version: 10_0_14393
Service Pack: 0_0
Product: 256_1
OS Version: 10.0.14393.2.0.0.256.48
Locale ID: 1033

Well it runs FurMark without incident (gets up to about 80c with the fan climbing to only 53%), so I figure the GPU itself is fine, but I suspected the video memory so I looked for some GPU memory tester programs and the two I found didn’t produce any negative results.

However, I downloaded eVGA’s OC Scanner tool, and using the “Furry E (GPU memory burner::3072MB)” test, I can get a repeatable reboot once it climbs to about 3500 MB during the loading phase. Does this indicate the problem, or is this a normal side effect of the 3.5 GB + 500 MB memory issue on my GTX 970?

Re: stusser

Is the purpose of that to induce a crash or prevent one? It’s definitely not heat or fan related because it often crashes when launching a game when the card is idling at 40c.

I have been doing some research, and I found an application called WhoCrashed. In a way, I am pleased, because it means that my main hardware is fine, but my relatively new NVIDIA card may not be :(

On Thu 11/3/2016 4:33:44 PM your computer crashed
crash dump file: C:\WINDOWS\Minidump\110316-25234-01.dmp
This was probably caused by the following module: nvlddmkm.sys (nvlddmkm+0x962130)
Bugcheck code: 0x116 (0xFFFFBD84676544A0, 0xFFFFF806EC302130, 0xFFFFFFFFC000009A, 0x4)
Error: VIDEO_TDR_ERROR
file path: C:\WINDOWS\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_848dea456d3c865e\nvlddmkm.sys
product: NVIDIA Windows Kernel Mode Driver, Version 375.70
company: NVIDIA Corporation
description: NVIDIA Windows Kernel Mode Driver, Version 375.70
Bug check description: This indicates that an attempt to reset the display driver and recover from a timeout failed.
A third party driver was identified as the probable root cause of this system error. It is suggested you look for an update for the following driver: nvlddmkm.sys (NVIDIA Windows Kernel Mode Driver, Version 375.70 , NVIDIA Corporation).
Google query: NVIDIA Corporation VIDEO_TDR_ERROR
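If it helps with searching, the bugcheck line in a report like this can be pulled apart with a few lines of Python. This is only a sketch: the regular expression and function name are my own, and the reading of the third parameter rests on my understanding that, for 0x116, it holds the error code of the last failed operation. Its low 32 bits here, 0xC000009A, match the NTSTATUS value STATUS_INSUFFICIENT_RESOURCES.

```python
import re

# Matches "Bugcheck code: 0x116 (p1, p2, p3, p4)" as printed in the report.
BUGCHECK_LINE = re.compile(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def parse_bugcheck(line: str):
    """Return (stop_code, [parameters]) as integers from a report line."""
    match = BUGCHECK_LINE.search(line)
    if match is None:
        raise ValueError("no bugcheck code found")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


line = ("Bugcheck code: 0x116 (0xFFFFBD84676544A0, 0xFFFFF806EC302130, "
        "0xFFFFFFFFC000009A, 0x4)")
code, params = parse_bugcheck(line)
# The report prints the status sign-extended to 64 bits; keep the low 32.
status = params[2] & 0xFFFFFFFF
print(hex(code), hex(status))
```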

Now, what to do about it? It’s odd, though, since my computer never has issues once it has booted up.

Do the stuff I said. See if that fixes it.

I underclocked the card as much as it allowed and put the fan speed at 75% fixed and…it still crashed at the exact same point. There doesn’t seem to be an option to lower core voltage—only increase it.

So I guess the interesting thing here is that, while this crash is completely repeatable and certainly appears to be an issue with video memory, I’m not sure it explains the other crashes. Because there’s no way that, say, Portal is using 3.5 GB of video memory during start-up (or ever). So maybe that means it’s just trying to utilize the “bad” memory addresses? If that’s the case, then why don’t the 1 GB and 2 GB stress tests ever fail? Why only the 3 GB test?
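A rough bit of arithmetic on the last question, sketched in Python. It assumes the commonly cited GTX 970 layout of a 3.5 GB (3584 MB) fast segment plus a 512 MB slow segment, and it only shows how much other video memory use each test would need on top of its own allocation before anything lands past the fast segment. It does not prove that the upper region is what triggers the crash.

```python
FAST_SEGMENT_MB = 3584  # 3.5 GB, assumed size of the GTX 970 fast segment


def extra_usage_needed_mb(test_size_mb: int,
                          fast_segment_mb: int = FAST_SEGMENT_MB) -> int:
    """MB of other VRAM use needed before a test spills past the fast segment."""
    return max(0, fast_segment_mb - test_size_mb)


for size in (1024, 2048, 3072):
    print(f"{size} MB test: needs {extra_usage_needed_mb(size)} MB of other usage")
```

Under that assumption the 3 GB test only needs about 512 MB of other usage to reach the upper region, while the 2 GB test would need about 1.5 GB, which would fit with only the 3 GB test failing. It still leaves the Portal crashes unexplained.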

The card is still under warranty at least, so if I can rule out a driver issue I can send it back and get a replacement.