BSODs Stink. But They Are Solvable Problems, usually;)

Over the last 8 months I’ve experienced something rare and precious: a BSOD-free existence.

In computer terms I typically play at the fringes of “safe.” I find the best-value hardware I can, custom build fast but not overly expensive computer systems, and then push these systems to their limits. I also install LOTS of utility software to do very specialized things, like file encryption and image manipulation/compression.

The more components, either hardware or software, you put into a computer, the greater its chances of developing a quirk: some kind of conflict that leads to crashes and BSODs (Blue Screens of Death).

BSODs are a computer’s last-ditch attempt to “save” itself from something worse, such as corrupted data or a potentially permanent hardware failure/overheat. They spill a lot of technical data across the blue screen that is supposed to help us figure out the cause. Before BSODs, computers would crash without any kind of explanation, sometimes fatally!

But why are BSODs so much rarer these days?

BSODs were much more common back in the pre-Windows XP era. As hardware and operating systems grew up, so did their ability to handle “glitches”. Modern systems running Vista/7/8 are often able to save the computer from an impending crash by simply shutting down the offending component (a program or hardware driver) before things get to a BSOD. Sometimes they even tell you the cause. Yea!

However, a few days ago I encountered a BSOD. And it kept occurring, with differing system errors involving memory and driver resources, until I was able to isolate the cause: a faulty RAM module. After finding the specific bad memory module and replacing it, my computer was again up and running. Total fix=about an hour.

This led me to a question: why was it so easy to find the cause? In the past I’d spend hours/days/weeks trying to debug crashing systems and often had to resort to EXTREME measures to deal with the problems. So why was this latest problem so simple and almost nonintrusive? Dumb blind luck, maybe? ;)

The answer=experience. I have had so many variations of problems throughout my twenty years of dealing with systems, and their crashes, that I can sorta sense the potential root causes. And after so many “failures” at fixing problems in the past, I can almost(?) approach each new one without FREAKING OUT!

So here is some advice for dealing with these less common, but no less frustrating, BSODs and other hardware/software issues in modern computer systems.

  • Take a Deep Breath and stay calm. Attempting rash and emotional “fixes” is usually a bad idea and often makes the problems worse. The problems are often simple, and the solutions require patience and reasoning. FREAK-OUTS have added countless extra hours/days to fixes because the cure often ends up WORSE than the disease.
  • Think. Look at possible causes. Have you recently installed any hardware/software? Anything you added could have caused these issues. If your computer is bootable and you can at least make it into “safe” mode within Windows, sometimes a simple system restore to an earlier state will fix the issue.
    • Anything recently installed or added (even Windows updates) can impact system stability. The operating system is a go-between for the computer’s hardware and the software that we use. Software developers write code that the operating system translates into instructions the hardware can act on. If those instructions get “fouled” anywhere between the software, the OS, and the hardware because of improper or conflicting code, the result can be hardware instability and BSODs.
    • If uninstalling the hardware/software doesn’t work, sometimes a system restore to a pre-installation date will correct the issues. System restores are usually reversible. Just make sure you let the computer complete the process, as aborting a restore CAN actually break an operating system beyond repair, forcing a reinstall of the entire operating system and all of the hardware drivers and existing software.
    • Software/driver updates can sometimes cause these problems. As the instructions between the various hardware components and the operating system change, there’s always a risk of an unknown conflict developing. Rolling back the update/install is often easy, but it does require patience and some thought.
  • Have you had a power outage? Check the cables; make sure nothing came unplugged (like a mouse/keyboard/monitor); make sure the power cable is securely plugged into both the computer and the wall. Sometimes power problems can cause disk drive issues/errors, and a simple disk scan (ScanDisk/chkdsk) can correct the errors.
  • A common cause of computer failures is power/heat issues, especially in small systems like notebooks. If you’ve had the computer a really long time (many years), the fans often clog with dust or stop working, causing overheating. This would require cleaning or replacing the fan, which is often a simple but technical repair.
    • Power supplies that convert the power from the AC outlet to the DC power needed to run the computer do wear out. As the power supplies lose efficiency the power requirements of the computer hardware are often not met and this leads to instability. Power supply replacement is, again, often simple but technical.
    • Notebook batteries can also wear out. Many modern batteries have LED gauges on them to indicate lifespan remaining, or on-screen gauges that with a simple mouse click show whether the batteries are still able to hold a charge.
  • RAM module failure is a very common issue. If the system memory is unstable or corrupting data, anything you do on a computer can result in crashes and BSODs. There are simple software tools like memtest86+ that can be burned onto a bootable CD/DVD and can detect most RAM problems easily. RAM replacement is often the easiest “fix” for a computer system, depending entirely on how easy it is to get into the case and reach the motherboard. Again, this requires some technical prowess/knowledge.
  • Other hardware cards, like graphics and sound cards, can cause these problems. Disabling these devices in Windows Device Manager can help troubleshoot the causes. Of course, disabling a graphics card means you need another video source for the computer monitor, so this should only be done by someone familiar with computers and how they work. If one of these cards is the problem, it may require a small investment, but it is usually a quick and painless fix.
  • Any issues involving processor and motherboard failures require much larger investments in time and money. The board usually has to be removed, along with all the add-on cards and often the power supply as well. Modern CPUs are firmly mounted under large, specifically designed heat-sinks/coolers, and getting at them often means removing the motherboard.
    • Any motherboard replacement is an elaborate procedure that may need a partial system reinstall. Motherboards contain the chipsets that interact with all the components, and the drivers for the old motherboard and the new one may conflict in Windows.
    • One should never replace a CPU or motherboard unless they are absolutely sure of the need.
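
The list above runs roughly from the cheapest, least invasive checks to the most expensive ones, and that ordering is most of the trick. Here is a minimal sketch of the idea in Python. The check names, the cost ranks, and the helper functions are all my own hypothetical illustration (the ranks are rough guesses, not measurements), so treat it as a thinking aid rather than a diagnostic tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    cost: int      # rough effort/expense rank, 1 = cheapest (assumed, not measured)
    question: str  # what to ask yourself at this step


# Hypothetical checklist mirroring the bullets above.
CHECKS = [
    Check("cpu or motherboard", 6, "Has everything cheaper been ruled out?"),
    Check("recent changes", 1, "Did I install or update anything lately?"),
    Check("ram", 4, "Does a memory test report errors?"),
    Check("cables and power", 2, "Is everything plugged in? Was there an outage?"),
    Check("add-on cards", 5, "Does disabling a card stop the crashes?"),
    Check("heat, fans, battery", 3, "Are the fans clogged or dead?"),
]


def triage_order(checks):
    """Return the checks sorted from cheapest to most expensive."""
    return sorted(checks, key=lambda c: c.cost)


def next_check(checks, ruled_out):
    """Return the cheapest check not yet ruled out, or None if none remain."""
    for check in triage_order(checks):
        if check.name not in ruled_out:
            return check
    return None


if __name__ == "__main__":
    step = next_check(CHECKS, ruled_out={"recent changes"})
    print(step.name, "->", step.question)
```

The point of keeping a "ruled out" set is the same as the advice about staying calm: you only move on to the expensive, invasive steps after the cheap ones have actually been eliminated.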

Computers are very complex systems, and this inherently makes them difficult to troubleshoot when problems develop. But most problems don’t start without some warning. Hardware failures usually develop progressively. Lock-ups, program failures, and sound and video artifacts or glitches may appear days/weeks before the actual BSODs or hard crashes begin. If you’ve noticed these symptoms in advance, they can greatly help you diagnose the problem when it becomes more urgent. How a system fails gives clues about the causes.

In the end, a failed computer is not the end of the world. If you have access to another internet-capable device, you can often find simple solutions/diagnostic options by Googling the issue. Remember, twenty years of posts from a multitude of users with an unlimited number of configurations will rarely match your situation EXACTLY. Look for solutions from people using similar hardware/software/OS. And never do anything invasive without first consulting several sources, or perhaps a friend or co-worker who is more technical/computer literate. But there are lots of resources and many “free” software diagnostic tools available.

An operating system reinstall may seem like a good idea after hours/days of unsuccessful troubleshooting. But what happens if you reinstall the operating system and the problems reemerge? All of the system settings and installed software are gone, but the problem that forced you to remove them is still there.

I do recommend a clean system sweep every so often to remove software/OS bulk–but modern tools like CCleaner can help with this without having to completely wipe a system and start from scratch.

Get comfortable with technology. It is meant to serve our needs and we should not work for IT. However, we do have to be able to root around within it from time to time to adjust to the glitches that keep these devices from functioning 100%.

If the problem is “un-fixable”, and your patience is worn out, a simple replacement computer from Dell/Dell Outlet or even the local box-store is usually only a few hundred dollars away. I build crazy systems because I like to. But for 99% of the world, a computer is a simple tool for simple tasks, so the latest/greatest (and most expensive) is not only unnecessary, it is unwise.

With some time and some patience, almost every problem a computer user faces is surmountable.

I look forward to the day when these technological monstrosities (er, marvels) are no longer needed and we get simpler, and more reliable(?), devices for our software/computing needs.

Chad Schulz

October 2013