Screwy hard drive? Bad memory? Gremlins?

One of my new computers has been acting up. It crashes frequently, whether I’m in a web browser, a 3D intensive game like LOTRO, or even an old 2D game like Warlords Battlecry III. Sometimes, it even crashes and does weird things when I boot it up, with error messages about “failing to initialize application” properly, after which it simply hangs on a black screen with a cursor until I hard reboot with the power button. At one point, it booted into a low-res 16-bit color screen. At another point, it seemed like the network stuff was entirely gone. It’s as if Windows is occasionally not finding certain drivers.

It’s a new install of Windows, without much stuff on the system beyond a few games, the DIVX codecs, and Firefox. I ran full virus scans using AVG and an online McAfee check, both of which came up clean.

I’ve been running error checking on the disk, which brings up some odd messages. For instance, during the “verifying files” stage, it’ll return this sort of thing:

Deleting corrupt attribute record (128, “”)
from file record segment 112463

or:

Deleting orphan file record segment 124606

During the “verifying indexes” stage, it’ll return this sort of thing:

Deleting an index entry from index $O of file 25.
Sorting index $I30 in file 73957

or:

Recovering orphaned file (filename.doc) (7049) into directory file 6676

During the “verifying security descriptors” stage, it’ll return something like:

Inserting data attribute into file 42316.

I have no idea what any of this is, but fortunately, the system is under warranty by Dell. I suspect bad memory, or a bad hard drive. Does anyone have any suggestions for how to zero in on the problem before I work my way through the script monkeys at Dell?

-Tom

Work your way through the script monkeys and see what they say.

http://www.memtest.org/

Yeah, I’m letting Memtest run for a few hours. But I figured maybe someone would see those checkdisk errors and say something like ‘That’s totally a bad hard drive!’

-Tom

Those errors coupled with the other behavior sound more like something is feeding your hard drive bad data than the hard drive being bad itself.

I would be tempted to replace the HD with another one, ghost over your current HD image, and see if the errors continue.

Obviously if you have known good spare parts and can swap them around, that’s one way to narrow your field.

Just based upon gut instinct, I am thinking maybe power supply, although at this point it could be a lot of things. Good luck with that.

to check the actual hard drive you’d do a chkdsk /r to check for bad clusters. Or go to the manufacturer and download a disk checker.

the error messages you’re getting are just chkdsk telling you you have file corruption. you have them cause you crashed while writing data.

your problem is system instability - the errors in file system corruption are just a symptom.

if you DO get a bad cluster, that’s an automatic RMA. modern drives (like the last 10 years) have a reserve of clusters so the good ones get mapped over the bad ones invisibly. It’s not like the old days of 20 gb hd’s where you just mark it with chkdsk and go on your merry way.
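
If you want to see how much of each kind of corruption chkdsk is reporting from one run to the next, here is a rough Python sketch that tallies the message types quoted earlier in the thread. It assumes you have copied the chkdsk output into a text string yourself; the function name and the labels are made up for illustration, and the patterns only cover the sample lines posted here, not everything chkdsk can print.

```python
import re
from collections import Counter

# Patterns are based only on the sample lines quoted in this thread;
# real chkdsk output has many more message types than these.
PATTERNS = {
    "corrupt attribute record": re.compile(r"^Deleting corrupt attribute record"),
    "orphan file record segment": re.compile(r"^Deleting orphan file record segment"),
    "index entry deleted": re.compile(r"^Deleting an index entry"),
    "orphaned file recovered": re.compile(r"^Recovering orphaned file"),
}


def tally_chkdsk_messages(log_text):
    """Count how many lines of log_text match each known message type."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        for label, pattern in PATTERNS.items():
            if pattern.match(line):
                counts[label] += 1
                break  # each line is counted under one label at most
    return counts


# Sample lines taken from the first post in this thread.
sample = """\
Deleting corrupt attribute record (128, "")
from file record segment 112463
Deleting orphan file record segment 124606
Recovering orphaned file (filename.doc) (7049) into directory file 6676
"""

for label, count in tally_chkdsk_messages(sample).items():
    print(label, count)
```

If the counts keep growing between runs with no crash in between, that would point away from "it crashed while writing" and toward something in the hardware corrupting data.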

You have Gremlins…

That’s a seriously difficult problem to troubleshoot without spare parts. I’d start with temperature, clean fans, monitor the motherboard and power supply voltages. Memtest only puts part of the system under load. Once everything else boots up, if you have a power supply that’s not stable, you might see freaky stuff like this.

Does your MB have a monitoring utility so you can see what the temps and voltages are?

He has spare parts that are called three other known-good-identical PC’s

How would I know that? That wasn’t in any of his posts. That doesn’t change my advice either. I’ve had a PSU that caused major disk corruption and a CPU that overheated, both of which can cause that symptom.

If it’s under warranty I’d let the Dell “Techs” have a crack at it. After they’re done screwing everything up they’ll send you a new machine.

If it’s under 30 days he can just return the thing no questions asked, right?

The worst thing that could happen is he returns it, they reimage windows on a new hard drive, and they reship it back. it ends up being a bad ide cable, a lousy PSU, bad ram, a screw stuck somewhere, etc etc and nobody ever finds it, and Shoot Club is very unhappy

sorry fidgaf, didn’t mean to sound snarky. should have added a smilie. Tom has had a few posts recently about the Saga of new PC’s for Shoot Club which have been pretty amusing

He can return it no questions asked, but he pays shipping.

edit: Oh, I didn’t read your whole post. He can return it no questions asked and pay shipping. He cannot (or is not supposed to be able to) request an exchange. The only reason they are supposed to do an exchange is because they troubleshot the problem, found a solution, and it required a replacement of a motherboard or power supply.

I can’t even remember the last time I’ve seen such chkdsk errors. This is NOT just caused by an application crashing while writing – NTFS can handle that without corrupting the file system and creating orphaned directory entries. Even when I had a hard system lock-up or spontaneous reboot under XP, the following chkdsk never found any trouble.

So either the entire system is fucked up (e.g. unreliable RAM) or there is a hardware defect in the drive subsystem. That includes the hard disk itself but also cables, the controller on the motherboard, and perhaps the power supply.

So, having never seen Memtest before, I was letting it run its merry course, racking up a list of red stuff on the bottom half of the screen. At one point, I looked a bit closer at all the info and saw an entry for “Errors”. Memtest had clocked about 60,000 of them.

So it looks like bad memory. Dell is sending me new RAM. Later today, I’ll swap out the RAM from another system to make sure that addresses the problem.

-Tom
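
For anyone else who has never seen a memory tester before, the basic idea behind that error count can be sketched in a few lines of Python: write a known pattern to every cell, read it back, and count every mismatch. This is only a toy illustration and not how Memtest itself is implemented; the `StuckBitMemory` class is a hypothetical stand-in for a bad stick of RAM, and the pattern list is just a common-sense choice.

```python
def pattern_test(memory, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, count mismatches."""
    errors = 0
    for pattern in patterns:
        # Fill the whole region first, then verify, so a cell is not
        # checked immediately after it was written.
        for addr in range(len(memory)):
            memory[addr] = pattern
        for addr in range(len(memory)):
            if memory[addr] != pattern:
                errors += 1
    return errors


class StuckBitMemory:
    """Hypothetical faulty RAM: one bit at one address always reads as 1."""

    def __init__(self, size, bad_addr, bad_bit):
        self._cells = bytearray(size)
        self._bad_addr = bad_addr
        self._mask = 1 << bad_bit

    def __len__(self):
        return len(self._cells)

    def __setitem__(self, addr, value):
        self._cells[addr] = value

    def __getitem__(self, addr):
        value = self._cells[addr]
        if addr == self._bad_addr:
            value |= self._mask  # the stuck bit corrupts every read
        return value


good = bytearray(1024)
bad = StuckBitMemory(1024, bad_addr=100, bad_bit=3)
print(pattern_test(good))  # 0
print(pattern_test(bad))   # 2 (the 0x00 and 0x55 patterns have bit 3 clear)
```

A single stuck bit only trips the patterns where that bit is supposed to be 0, which is why testers cycle through several patterns. On good RAM the count stays at zero no matter how long the test runs.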

Carry the RAM around in antistatic baggies and for god’s sake don’t rub your feet on the carpet :) also, unplug power supply before yanking ram out. Off is not off. there’s power running on the PCI slots (otherwise wake on lan and power from keyboard wouldn’t work. ram slots could have power too)

Usually, the power supply can be turned off with a switch on the PS itself, next to the power plug. You should turn off this switch but not unplug the power supply because the connection to the wall outlet provides grounding for the case!

Wear rubber gloves.

Why would you say rubber gloves? I know rubber is a good electrical insulator, but it kills manual dexterity. I also remember something about rubbing a rubber balloon with a plastic comb for lots of static. Seeing as there’s a lot of plastic parts on the PC I wouldn’t try that.

Chris Nahr’s real smart though, good point!!

Don’t forget the goat leggings and the voodoo stick.

Obviously Dell forgot to be nude while making that PC.