Meherchilakalapudi.. writes for u….

Identifying and Troubleshooting System Component Errors

Posted by meherchilakalapudi on February 9, 2009

When you strip a computer system down to its basic components, there are not really very many components that we have to worry about, at least in terms of troubleshooting. In fact, many system errors can be isolated to one of a very few system components. The real trick when troubleshooting is being able to identify which failure symptoms are associated with which component. For instance, how does a system behave when the memory is failing, or a processor? Over time, you will be able to quickly identify common failure symptoms and easily isolate the component at the root of the problem. Don’t expect to develop this level of troubleshooting awareness right away; it takes time working with computers before you will be able to do that. To give you a few ideas of what you are looking for in terms of component failure, we will take a look at some of the more common system components, the signs and symptoms of their failure, and a few ideas on how to correct those problems.

Making It Easy

When a computer or component fails, the mind of a PC technician will often become full of potential problems and complex resolutions. In reality, the solution to most of our computer problems is far from complex. When approaching any system problem, look for the simple things like an unplugged cable or an incorrect user name and password combination. Keep it simple and look for the easiest and most obvious solutions first. For example, if a monitor fails to turn on, check that it is plugged in before replacing the video card. This probably sounds ridiculous, but you might be surprised at how people go out of their way to make things more difficult than they need to be. In the CompTIA exam, be ready for questions that look for the easiest solution first.

I/O Ports and Cables

The ports on the back (and front) of a PC are the source of many troubleshooting tasks, as are the cables that plug into them.
Fortunately for the PC technician, the associated troubleshooting steps are relatively straightforward.

Troubleshooting Cables

However it happens, cabling always seems to find a way of failing. Sometimes cables are shipped faulty; other times they get inadvertently damaged. Troubleshooting cable-related problems is not always easy and is often overlooked. If you suspect a cable-related problem, there are a number of tests you can perform. The most common is to swap out the cable with a known working one to see if the problem is remedied. If you do not have a spare cable with you, the suspect cable can be temporarily swapped for a working one from another device. When inspecting cables for problems, consider the following:

• Bent or damaged cables: Any bends or crimps in the cable could indicate damage to the cable. Better to replace than to wonder.
• Loose connectors: Not all cables are equal, and some are simply not as durable as others. Test the ends; if they are loose, they might not be making an adequate connection.
• Cable length: There are standards that specify how long certain types of cable can be. These maximum lengths are quite adequate for most uses, but they can be exceeded. To stay within the limits, be aware of the length requirements for your cable type.
• Cable location: Signal integrity can be compromised depending on where the cable is located. Cables near air conditioners, motors, or any large electrical devices might suffer from interference. This does not, of course, apply to media such as fiber-optic cable.

NOTE: It is a good idea to have known working spare cables on hand to swap out at a moment’s notice when troubleshooting cable-related problems.

Troubleshooting Ports

Troubleshooting ports is a little more involved than troubleshooting cabling because, in addition to the physical aspect, there is also a BIOS or software consideration that can affect the configuration.
Serial Ports

Serial ports are one of those PC components likely to cause you few problems. Typically they either work or they don’t, but mostly the former. That said, when serial or COM ports are causing you problems, there are a few more steps in the troubleshooting process than there might be with other port types. In addition to the normal considerations like interrupt request (IRQ) assignments, it is also possible to configure serial port characteristics from within the operating system. The settings for a serial port can include the number of data bits, the number of stop bits, and the parity setting for the port. The most common setting for these parameters is 8,1,None, as in 8 data bits, 1 stop bit, and no parity. On Windows systems, the configuration for the serial ports is managed through the Ports icon in Control Panel.

NOTE: When you are having a problem with a device such as a modem or mouse that is connected to the serial port, don’t discount the settings for the port as a possible cause.

Parallel Ports

In the real world of PC troubleshooting, you will not find yourself spending a lot of time troubleshooting parallel ports. Of course, now that we’ve said that, the parallel port in your system (and ours too, most probably) is bound to fail. The parallel port is the 25-pin female DB connector on the back of your system. Most often we use the port for printers, but it can also be used for many other external devices such as scanners, Zip drives, and even external CD-ROMs.

MORE INFO: For a picture of 25-pin DB connectors, refer to Chapter 1.

When troubleshooting a parallel port connection, there are three key things to keep an eye on: the physical cable, the cable connections, and the settings for the parallel port in the BIOS or operating system. The best place to start when troubleshooting a parallel port problem is with the physical cabling.
This includes verifying that the cable is securely attached to the computer and the connected device, and that the connection points do not have bent pins. An often-used strategy when testing the physical connectivity is to swap out the cable itself with a known working one to quickly eliminate a broken cable as the cause of the malfunction.

Once you are convinced that the cable and the physical connections are not at the root of your problems, you will need to verify that the parallel port itself is recognized by the operating system. One area to check is the system’s resources in the BIOS and the operating system. The first parallel port (LPT1) typically uses I/O address 378 and IRQ 7. If other components are attempting to use these resources, the parallel port might not work. Barring any resource conflicts with other devices, the parallel port should be visible and accessible from within the OS. If there is a resource conflict with IRQs or I/O addresses, the resource assignments will need to be changed before you can access the parallel port. However, parallel port conflicts are rare because these resources are reserved for the port and, by default, not used by other devices. If there aren’t any resource conflicts, verify that the parallel port has not been disabled in the BIOS.

NOTE: If the parallel port is not motherboard mounted and has stopped working after you have been rummaging around inside the case, double-check that you have not inadvertently unplugged the parallel port cable from the motherboard.

USB Ports

Like many of us, USB devices rely completely on plug and play. The end result of this reliance is that when it comes to USB devices, there is little we need to do to control or configure them. In your travels you are likely going to find that USB is one of those technologies that works most of the time, especially with newer operating systems.
However, for those times when you just can’t get the USB working, consider the following:

• Installation: For the most part, USB devices will install easily. If they do not, you will need to check in Windows Device Manager to be certain that the USB controllers are functioning correctly. If any of the entries listed under USB Controllers is displayed with an exclamation point in a yellow circle, it is not functioning and the USB devices attached to it cannot be used. The problem can often be corrected in the system’s BIOS. From within the BIOS, verify that the USB controller is being assigned an IRQ.
• Device drivers: Being Plug and Play, USB will automatically try to install the drivers for your USB device. When the OS cannot find the drivers, you will be prompted to supply them. Ensure that you are using the correct and most recent drivers from the manufacturer’s Web site.
• Firmware: In normal operation, USB devices are added to and removed from the system seamlessly. Sometimes, though, you will encounter situations where this does not happen. For example, you might find that when a USB device is removed and then re-added, two instances of that device are detected by the system. To get around such peculiarities, you might need to update the firmware for the USB device and, in fact, might even need to update your BIOS firmware in the process.
• Power: Some USB devices require external power to operate. External hard disks are an example of such devices. Other devices, such as USB memory sticks, require very little power and plug directly into the system, drawing their power from the system itself. When troubleshooting, ensure that the devices that require external power are in fact receiving it.

IEEE 1394/FireWire

Troubleshooting considerations with FireWire ports are very similar to those associated with USB in that you need to check provision of power, cabling and physical connections, and software configuration.
If you are using a hub, you need to make sure that the hub is powered on.

Infrared

Like most of the other ports discussed here, infrared ports have both a software and a hardware configuration. If you are experiencing infrared connectivity problems, you will need to check the port configuration of both the sending and receiving devices. You must also check the physical path between the two devices. In nearly all cases, a direct line-of-sight path is required between them. Any obstacles between the two points, no matter how small, can affect the performance of the infrared connection. In addition, the two devices must be roughly in line with each other. A generally accepted rule is that the two devices must be within 30° of each other on both the horizontal and vertical axes. Further, you should also consider that the range of infrared in data communication applications is relatively short. If you are having problems, try moving the devices closer together; the closer the better.

SCSI

By SCSI ports, CompTIA is almost certainly referring to the fact that SCSI is a commonly used interface for external devices such as hard disk arrays, tape drives, and even scanners. These external devices connect to the system using external SCSI ports. Some of the types of ports used for this purpose were discussed in Chapter 1. Using external SCSI devices involves many considerations, not the least of which is that all of the normal SCSI installation issues, such as configuration of SCSI IDs, device compatibility, and bus length, must be observed, in addition to the following:

• Port compatibility: A number of different cables and connectors are used with external SCSI devices. You will need to make sure that you have the correct cable for the application and that it is securely connected.
• Termination: External SCSI devices extend the SCSI bus beyond the confines of the system case, making it necessary to remove termination from the SCSI host adapter, which will no longer be the end of the SCSI bus.

Because SCSI troubleshooting can be somewhat difficult, in addition to these considerations you should also check other SCSI specifics such as cable length, correct assignment of SCSI IDs, power for external and internal devices, and the configuration of the SCSI host adapter, which is discussed later in this chapter.

Hardware Loopback Plug

Hardware loopback plugs, also known as hardware loopback connectors or loopback adapters, are simple devices designed to redirect outgoing data signals back into the system. The effect is that the system believes that it is both sending and receiving data. The loopback plug is used in conjunction with diagnostic software that tests the incoming and outgoing signals. The end result is that the port can be tested to see if it is sending and receiving information correctly. It is a simple and easy way to see if a port is working correctly. Hardware loopback plugs are available for a number of different ports including serial ports, RJ-45 ports, parallel ports, and SCSI ports. Figure 2-1 shows a serial loopback adapter.

Figure 2-1. A hardware loopback plug tests the functioning of a port.

TEST SMART: For the exam, recall that the function of the loopback plug is to test the functioning of a system’s ports.

Motherboards

Since the motherboard is the major component in the system, problems with motherboards have a tendency to impact the operation or performance of the system in a significant way. One of the most common sources of problems with motherboards is the BIOS/CMOS.

BIOS/CMOS

As discussed in Chapter 1, the BIOS maintains the system’s hardware settings, and the CMOS is a type of memory that holds these settings. The CMOS memory is kept powered by an onboard battery, which is what lets it retain the BIOS settings while the system is off.
One of the more common errors you will hear about regarding the BIOS or CMOS is the loss of system settings or system time. This is almost certainly the result of a dead or dying CMOS battery. When the battery fails, the BIOS settings stored in the CMOS are erased when the system powers down. The fix is simply to replace the CMOS battery.

TEST SMART: An indicator that the CMOS battery might be going is the CMOS checksum error. This error is often displayed on boot-up when the CMOS battery begins to fail.

A second problem with the BIOS has to do with installing new hardware on a system with an older BIOS. The BIOS might not be able to accommodate the newer hardware, in which case the hardware cannot be used by the system. This is often seen with new hard drives when the entire capacity of a hard drive is not recognized by the BIOS. The only solution in this case is to refer to the manufacturer’s Web site and download an update for the BIOS. Updating the BIOS is referred to as flashing the BIOS. There will not always be an update for the BIOS, and it might be that the hardware you are trying to install will not completely work with the current BIOS. Your choice from there is to upgrade the entire system.

WARNING: Flashing the BIOS incorrectly might cause irreparable damage to your system. Ensure that you follow the manufacturer’s guidelines when flashing the BIOS.

POST Audible/Visual Error Codes

When troubleshooting a system, one of your best friends along the way might prove to be the power-on self test (POST) process. The POST process provides a mechanism to test whether the hardware you installed is actually recognized by the system. In essence, the POST is a self-diagnostic routine that ensures lower-level hardware is present and accounted for. A full POST routine runs every time the system performs a cold boot, which means turning the system off and then back on again.
We have all seen this process when our system powers up and the system’s memory is counted and the drives detected. This is all part of the POST magic.

TEST SMART: A complete POST check is performed only from a cold boot of the system. A warm boot will run through portions of the POST process, such as the memory check, but not the entire thing.

As we discussed in Chapter 1, anything that is amiss can be reported by the POST with a series of beeps or information messages. Interpreting beeps can be very difficult because the beep codes are not universal among BIOS manufacturers. To discover the exact beep codes for the BIOS you are using, you might need to refer to the BIOS manufacturer’s Web site. To give you some idea of what to expect from the beeps, Table 2-1 provides examples of beep codes for AMI and Phoenix BIOSs and their meanings. Keep in mind that the ones used on your system might vary, and you should consult the system documentation to ensure that you are “reading” the codes correctly.

TEST SMART: Each and every POST process must verify the processor, video, and memory.

POST Cards

As a PC troubleshooter, it is essential to understand what is going on in the POST process. Sometimes it is difficult to understand the POST error messages, and in many cases a system might lock up before giving any error message. In such cases, we have another tool we can use to troubleshoot the POST errors: the POST diagnostic adapter card. POST diagnostic cards work with any operating system and display detailed information on the POST process. POST diagnostic cards are expansion cards that are plugged into the system, and once the system is restarted, error codes are reported via onboard LEDs. POST diagnostic cards are very useful and can isolate a problem easily, and in the process might prevent you from upgrading or replacing the wrong component.

Table 2-1. Examples of POST Beep Codes for AMI and Phoenix BIOSs

Number of Beeps          Error                                          Possible Cause
1 short beep             Memory refresh error                           Faulty memory or incorrectly installed memory
2                        Parity error                                   Faulty or incorrect memory
3                        Base memory error                              Faulty or incorrectly installed memory modules
4                        Timer not operational                          Possible main board failure
5                        CPU error                                      Faulty or incorrectly installed processor
6                        Keyboard error                                 Faulty keyboard or keyboard controller
7                        Processor exception interrupt                  Possible faulty processor
8                        Video card error                               Faulty or incorrectly installed video card
9                        ROM checksum error                             Incorrect version of BIOS ROM
10                       CMOS shutdown error                            Faulty main board
11                       Cache memory error                             Faulty or incorrectly installed cache chips
1 long, 3 short beeps    Conventional or extended memory test failure   Faulty or incorrectly installed memory
1 long beep              Successful completion of POST                  System should boot normally

Numeric Error Codes

In addition to the audible beep codes, during the boot process we might also see a series of displayed error codes. Of course, we would rather not have these error codes flash on our screens, but they can certainly make troubleshooting a bit easier. Numeric error codes work in classes or ranges. For example, motherboard numeric errors range from 100 to 199. Table 2-2 shows the numeric code classes and Table 2-3 shows some specific error codes you won’t want to see within those classes.

Table 2-2. Numeric Error Code Classes

Error Code Class    Description
100–199             Motherboard errors
200–299             Memory or RAM errors
300–399             Keyboard errors
400–499             Monochrome video errors
500–599             Color video errors
600–699             Floppy drive errors
1700–1799           Hard drive errors

Table 2-3. Specific Numeric Error Codes

Error Code    Description
161           Dead CMOS battery
201           Failed or incorrectly installed memory
301           Keyboard not plugged in or is faulty
601           Floppy drive or floppy drive controller has failed
1101          Serial card is faulty
1701          Hard disk drive controller is faulty

The error codes listed in Table 2-3 are not used in modern Pentium-class systems; rather, we use simple text error messages. Makes you wonder why we didn’t do that in the first place. The text messages will tell you straight what the issue is. For instance, you might receive something like “HDD Controller Failure” or “Floppy drive failure.” Just how easy can they make it for us technicians?

Peripherals

Any piece of equipment that is connected to a PC could be considered a peripheral. For that reason, including it as a heading in the objectives makes it difficult to determine what CompTIA is referring to in terms of troubleshooting procedures. So, rather than trying to cover troubleshooting steps for a specific peripheral, here are some general points to consider:

• Don’t forget the system, don’t discount the peripheral device: One of the first things you should determine when troubleshooting a peripheral device is whether the problem is with the peripheral or with the system. You need to determine which it is at the first opportunity, as it will have a major bearing on your investigations and potential solutions.
• RTFM (read the free manual): Almost without exception, computer peripherals come with a manual that is laden with all manner of facts. Very often, they also include a troubleshooting guide. Given that, you would think that the manual would be the first stop for someone troubleshooting a problem. But it isn’t. For some reason, many techs feel that reading the manual is akin to giving up. It isn’t, and don’t be drawn into ignoring this most useful of resources.
• Visit the manufacturer’s Web site: If you are having a problem, chances are that someone has had it before you. In this case, it is likely that the manufacturer has looked at the problem and formulated a solution. In many cases, a trip to the manufacturer’s troubleshooting knowledge base will yield the information you need to cure your peripheral problems.

There is more general troubleshooting information later in this chapter.

Computer Case

In the great scheme of computer problems, cases are unlikely to be the cause of very many headaches. Apart from the fact that computer cases are generally supplied with power supplies, which we will discuss next, they are basically little more than a metal box. That said, they do perform a very important function in that they provide the physical means to hold all of the computer components in sufficient proximity that they can be connected. The design of the computer case also has a lot to do with the thermal dynamics of the system. The processor fan can pull the excess heat away from the processor, but that heat then needs to be dumped out of the box. That is why computer cases have vents and grills on them.

An additional consideration for cases is that they come in many shapes and sizes. Smaller cases that have the space for, say, three or four storage devices might not make the ideal case for a server system. In contrast, a server case with space for 10 or 12 storage devices, additional fans, and a large power supply will be intrusive, from both a space and noise perspective, in a general office setting. When it comes to cases, one size does not fit all.

NOTE: Some higher-end systems have tamper sensors on the case. If the cover is removed while the system is running, the tamper switch senses the removal and immediately cuts the power. Of course you should never remove a cover while the system is running, but the fact that tamper-resistant switches exist should discourage you even further.
In terms of straight troubleshooting, the biggest consideration with the computer case is that the covers and any blanking plates are in place. Without them the cooling characteristics of the system can be affected.

Power Supplies

As you will see in Chapter 3, power supplies are one of those components that we do not spend our time fixing; we simply replace them. Power supplies contain capacitors that hold a charge and can thus deliver a potentially fatal electric shock. With that in mind, our job is not to repair the power supply but simply to isolate it as the cause of a problem and replace it.

When it comes to power supply failure, the power supply itself might fail outright, which is easy to determine because the system does not get power and nothing starts. Another common symptom of a failed power supply is a system that periodically turns off. This is often because the power supply is not supplying enough consistent power to the system. The only solution in both instances is to replace the power supply.

A little more subtle is the power supply fan. These fans often begin to fail well before the power supply unit itself fails. The result of a faulty fan is a unit that cannot adequately cool itself, and either the power supply will burn out or the system might shut down unexpectedly. Power supply fans typically make a lot of noise when they are beginning to fail, often giving the technician ample time to replace the power supply. Also, some motherboards now come with sensors that can trigger an alarm if the fan stops operating or the processor starts overheating.

TEST SMART: A computer system that periodically and spontaneously reboots can mean that the power supply is not providing enough power to the system to keep it going.

WARNING: If just the fan fails inside the power supply, you might be tempted to simply replace the fan and carry on.
This is not recommended because you will need to open the power supply to do so, which as we described earlier can be a hazardous proposition. Even if only the fan fails, it is a best practice to replace the entire unit.

Using a Multimeter

A multimeter is used for troubleshooting or testing power-related issues in the system. It’s used to test four key power areas: AC voltage, DC voltage, continuity, and resistance. The multimeter itself has two probes, one negative (black) and one positive (red), an analog or digital display, and a switch to select the type of test you want to perform. There are many ways of using a multimeter, but one of the most common operations is testing the power output from the power supply, which is a relatively simple process. To test the power output:

1. Turn the PC off, but leave the power cable connected to the wall outlet.
2. Take one of the power connectors (the ones that power a hard disk, floppy drive, or CD-ROM are the easiest) and insert the black probe of the multimeter into the connector at one of the black wires.
3. Insert the red multimeter probe into the connector at the red wire. Set the multimeter to the 20-volt DC range and turn the PC on. The readout of the multimeter should be around +5V.
4. Turn the PC off and repeat the procedure, this time connecting the red probe of the multimeter to the yellow wire of the power connector. When you power the PC on, the multimeter should give you a reading of approximately +12V.

In each case, some deviation in the voltage is acceptable, but not too much. In the case of the 5V test, the reading should be between +4.8 and +5.2V. In the 12V test, it should be between +11.5 and +12.6V. Any other figure indicates either that you are measuring the power incorrectly or that the power supply is having a problem.

In the world of PC troubleshooting we typically use the multimeter to measure the voltage, continuity, and resistance (ohms) between two points.
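
The acceptable ranges quoted above for the 5V and 12V tests can be written down as a small lookup, which is handy if you log readings from several machines. The following is only a minimal Python sketch: the names `RAILS` and `check_rail` are hypothetical, and the thresholds are simply the figures given in the text, not values taken from any power supply specification.

```python
# Acceptable voltage ranges as quoted in the text, keyed by the colour of the
# wire probed with the red lead (the black lead stays on a black wire).
# RAILS and check_rail are illustrative names, not part of any real tool.
RAILS = {
    "red": {"nominal": 5.0, "low": 4.8, "high": 5.2},
    "yellow": {"nominal": 12.0, "low": 11.5, "high": 12.6},
}


def check_rail(wire_colour, measured_volts):
    """Return "ok" if the reading is inside the quoted range, else "out of range"."""
    rail = RAILS[wire_colour]
    if rail["low"] <= measured_volts <= rail["high"]:
        return "ok"
    return "out of range"
```

For example, `check_rail("red", 5.05)` reports a healthy 5V reading, while `check_rail("yellow", 11.2)` flags a 12V reading that is too low. As the text notes, an out-of-range figure can also mean the measurement itself was taken incorrectly.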
Figure 2-2 shows an example of a digital multimeter.

Figure 2-2. A digital multimeter is used to test the power in a system and its components.

Measuring Resistance

When troubleshooting, measuring resistance is one of the tasks you are going to become more familiar with. To test the resistance of a device such as a power connector or cable, you must first ensure that there is no power running through it. For instance, if you are testing a power cable, ensure that it is completely unplugged from the system and the wall before starting. If you are testing a component that is inside the case, you must ensure that the power supply is unplugged from the wall socket. Once the power is confirmed to be off, set the multimeter to measure ohms and touch the multimeter probes to the circuit you want to test. The result will be displayed on the multimeter dial or display.

To test a cable such as an IDE cable, place one of the probes into pin 1 at one end of the cable and the other into pin 1 at the other end. If the cable is working, you should receive a reading indicating that there is continuity. If no reading is present or if a faulty reading is presented, there might be a faulty connection with the probes or a bad cable. We wouldn’t suggest throwing out too many “faulty” cables until you really know how the multimeter works.

Measuring Voltage

If you suspect that a power supply or a battery is faulty and want to test your theory, the multimeter is your tool. When measuring voltage with the multimeter, you need to ensure that the positive side of the multimeter connects to the positive side of the target device and the negative to the negative, much like connecting jumper cables to your car battery. Ensure that the multimeter is set to measure voltage; when the probes are attached, the voltage will be displayed.

WARNING: Be sure that you fully understand the procedure for testing power output before starting your diagnosis.
Power supplies have potentially dangerous voltages that can at best damage your multimeter and at worst cause fatal injury.

TEST SMART: For the A+ exam, remember that looking for a defective or broken cable is done by setting the multimeter to test ohms, and testing power supplies and batteries is done by testing the voltage.

Slot Covers?

OK, here is a real mind bender. The CompTIA objectives list slot covers in the troubleshooting section, but we are not sure how you would do that! Slot covers are those metal tabs we remove to make room for expansion cards. The simple rule is to use a slot cover when there isn’t an expansion card in the slot. This seals the opening, preventing dust from getting inside the system and also allowing better air flow in the case to keep the system cool. For the test, you might need to know why and when we would use slot covers, but it’s unlikely that you will be required to troubleshoot a slot cover.

TEST SMART: Though they might seem not to have a purpose, slot covers must be in place to prevent dust from entering the case and to maintain proper air flow in the case. This serves to prevent overheating.

Front Cover Alignment

Like slot covers, the front cover alignment of a PC might seem like an odd topic in a discussion of troubleshooting, but also like slot covers, it is actually a natural inclusion. Having the front cover of a system incorrectly aligned can affect the cooling characteristics of the case, as well as obstruct access to buttons and, in some extreme cases, even prevent access to CD-ROM drive drawers or floppy disk drives. If a user complains that they are unable to insert a floppy disk into a recessed floppy drive like that found on some computers, you might want to check the front cover alignment.

Storage Devices and Cables

Today’s computer systems come at the very least with a floppy drive and a hard drive, a CD-ROM drive, and in many cases a CD-RW and/or DVD-ROM drive.
In addition, new storage devices are finding their way into desktop systems, including DVD writers, external storage, and USB static memory devices.

Floppy Drive

We no longer rely on floppy disk drives (FDDs) as much as we once did; most of the stuff we need to save exceeds the 1.44-MB capacity of floppy disks. However, almost every computer still has a floppy drive, so it is another device that you, as a PC technician, will be required to support.

Troubleshooting floppy drives is typically not a difficult process. Sometimes a floppy drive will fail completely, but this is not very common. Other times it might work intermittently, reading some floppy disks and not others. In such cases, it is often best to clean the floppy drive heads to see if the problem is resolved. There are two ways to do this. The easy way involves using a floppy disk drive cleaning disk, which can usually be bought from an office supply or computer store. The other way is to remove the case of the floppy drive and clean the heads with some denatured alcohol and a cotton swab. The bottom line, however, is that floppy drives are so inexpensive (and non-serviceable) that they’re not devices we would normally take apart if broken. Instead, we are more likely to replace one at the first sign of a problem.

The following is a list of a few symptoms of a failing floppy drive and possible resolutions.

• Floppy drive light stays on: This is a very common problem and easy to resolve. If the drive does not function and the floppy LED stays on, it is often because the floppy drive cable has been attached incorrectly. Refer to Chapter 1 for the proper procedure for installing a floppy drive. As with some of the other errors discussed in this chapter, this is only likely to occur on a drive that has been newly installed.
• Floppy drive light does not come on: In contrast to the LED staying on is the problem of the light not coming on at all. This indicates that the drive is not getting power.
Check the power connections to the floppy drive.

• System halts saying a floppy drive cannot be located If the system cannot detect the floppy drive during the POST process, the system might stop and indicate that the floppy drive cannot be found. This happens because the BIOS has configuration settings for a floppy drive, and if the system cannot find the drive, it stops. You can then either change the BIOS settings to remove the floppy drive configuration (if the system doesn’t have one) and boot normally, or power down the system and ensure that the floppy drive is correctly installed.

• Error message when accessing the floppy drive If you are working within the operating system and attempt to use the floppy drive, you might get an error indicating that the disk in the drive is not formatted or that the device cannot be accessed. If this happens with a single floppy disk, the disk itself might simply be faulty rather than the drive. If the same error message occurs with several disks, it is more likely that the floppy drive has failed. Cleaning the floppy drive heads will sometimes correct the problem; if it does not, the floppy drive needs to be replaced.

Hard Drives

Of all the components in the computer, hard disk drives (HDDs) can be the most unnerving to troubleshoot. This is not because they are more difficult to troubleshoot than any other component, but because with hard drives the stakes are much higher. Hard drives are the component responsible for maintaining user data, and in most environments, the loss of data is far more costly and more difficult to remedy than any problems with all the other computer components combined. Given the importance of hard drives and the data they contain, extra caution needs to be taken when troubleshooting them, which will often include making a full backup of the contents of the drive before trying to fix it. Better to be safe than sorry.

There are two types of hard drives you will be working with, SCSI and IDE.
In terms of troubleshooting, IDE drives are much easier to work with; troubleshooting SCSI devices can leave even the most seasoned PC troubleshooting veterans scratching their heads.

Troubleshooting IDE Devices

Many of you already have experience working with and troubleshooting IDE devices, whether it was upgrading to a larger hard drive or adding a second one to a system. IDE hard drives are the most commonly used today, and you are all but guaranteed to troubleshoot one at some point. When troubleshooting IDE hard drives, there are two areas to keep in mind: external considerations such as cabling, and internal factors, including the overall health of the hard drive and any damaged clusters or sectors. Damaged clusters and sectors are areas of the hard disk platters that have been damaged and are no longer suitable for storing data.

In Chapter 1, we reviewed the procedures and methods for installing IDE hard disks, but a quick review is in order before talking about troubleshooting an IDE hard drive installation. If two devices are connected on an IDE chain, one must be set as the master, and the other must be set as the slave. You set the master and slave settings by using jumpers, typically located at the back of a hard drive. If the Cable Select option is being used, set the jumpers accordingly. There can only be one master and one slave on each IDE channel. With that in mind, here are some areas to be aware of when troubleshooting an IDE hard drive.

• Ensure that pin 1 of the cable is aligned correctly to pin 1 of the IDE connector. On most cables pin 1 is denoted by a red stripe running down one side of the cable.
• Verify that the master/slave jumpers are correctly set.
• Confirm that the IDE disk is getting power and the power connector is securely attached.
• Check the IDE cable to ensure that it has not become disconnected from the motherboard.
• Sometimes the IDE cable itself might be faulty.
To determine if this is the case, swap it out with a known working one. NOTE Believe it or not, you might have everything configured correctly on the IDE chain and it still fails. This is sometimes because some devices and hard disks are not compatible with each other. If you are sure everything is correctly set and things still do not work, the problem might lie with incompatible hardware. To verify that the hard drive is installed and recognized by the computer system, you can go into the system’s BIOS settings and confirm that it detects the hard drive. Modern BIOSs automatically detect the presence of a new hard drive and add the settings for it. Older BIOSs might require that you do this step manually. If the hard disk is detected but is shown as the wrong size, it is often not a problem with the physical installation, but means that the BIOS needs to be updated to accommodate the size of the new hard disk(s). TEST SMART If after you install a new and larger hard drive, you find that the BIOS does not recognize the drive’s full size, you might need to upgrade the BIOS. If after installing a new and larger drive, the operating system doesn’t recognize the full capacity of the larger drive, you might need to update the OS software. Tracking Down Bad Sectors and Clusters If you haven’t yet encountered a hard disk with bad sectors and clusters, don’t worry—you will. Even when an IDE hard disk is physically installed correctly, you are not out of the troubleshooting woods yet. Sometimes damage can occur to the actual hard drive, creating bad sectors on the disk itself, and in turn corrupting the hard disk and the data contained on it. This might happen through everyday wear and tear or it can be a result of the hard disk being dropped, kicked (or drop-kicked), or otherwise abused. Even improperly shutting down the computer system, sustaining power surges or spikes, or being victimized by viruses might cause the problem. 
The results of such events can range from destroying the entire hard disk to damaging just a few files contained on it. Either way, it’s not good. Many symptoms can indicate that there are problems with your hard disk. Some of these include:

• Missing or corrupt files
• Clunking or grinding noise caused by the read/write heads coming in contact with the platters
• Inability to run or execute a program
• System does not boot and indicates an error such as a missing operating system, or the inability to find Command.com

Such indicators can point to a failing hard disk, but not always. For example, a virus could easily cause any of the problems just listed. So how can you tell the difference? That’s where software troubleshooting utilities come in. Ruling out virus activity as the cause of hard disk problems is as easy as installing a virus checker and scanning your computer. If you suspect that there are bad sectors or other such anomalies, operating systems include utilities that test the hard disk thoroughly to see what is actually going on. The utility primarily used with Windows client systems such as Windows 9x and Windows Millennium Edition (Windows Me) is ScanDisk. On computers running Windows 2000 and Windows XP, the program is called CHKDSK, commonly referred to as Check Disk. This program can be set to run automatically as part of regular maintenance of the hard disk, or it can be run manually when a problem is suspected. When run, the program can scan your hard disk and identify any bad sectors or clusters on it. This process can take a very long time, but be patient; it is worth it.

Does This Look Infected to You?

One of the more difficult tasks when troubleshooting system errors is determining whether the problem is virus, hardware, or software related. Unfortunately, there are no black-and-white guidelines that can easily distinguish between them, but over time it will be easier to identify where the problem lies.
One thing that helps in the process is to always ensure that you are using a robust and up-to-date antivirus solution. To do otherwise will make the troubleshooter’s life a whole lot more difficult. Another tactic PC technicians use to help win their battle with viruses is to stay on top of what viruses are out there and what they are designed to do. Armed with this knowledge, you will find it easier to identify the presence of a virus. To learn about virus threats, refer to Web sites from companies such as Symantec (www.symantec.com) and McAfee (www.mcafee.com). These sites offer detailed information on viruses and the damage each of them can do to a computer system.

More Info More information on these tools can be found in Chapter 9.

CD/CD-RW Drives

On most of our computer systems, the CD-ROM drives get quite the workout. As with many of the other components listed in this section, CD-ROM problems can be a result of actual failure of the device or incorrect installation. Of the two, an incorrect installation is most often the cause. As with hard disks, CD-ROMs can either be SCSI or IDE; however, the cost of SCSI CD-ROMs makes them less popular than their IDE counterparts. With performance being similar between the two, SCSI CD-ROMs are not often seen. Troubleshooting a SCSI CD-ROM follows the same basic principles as troubleshooting SCSI hard drives, discussed previously. This includes verifying termination, SCSI IDs, cabling connectors, and cable length. There is also a chance that the SCSI CD-ROM device itself is damaged, although this is not common.

Troubleshooting IDE CD-ROM drives is not complicated. Checking the physical connection follows the same guidelines as verifying the physical connectivity with IDE hard drives. This includes checking the master/slave settings and cable connections. If the CD-ROM drive is physically attached to the system and you are still unable to access it, the drive itself might be damaged.
It is often a good troubleshooting step to swap out the CD-ROM drive with a known working one and see if the problem persists. If the problem is that the drive gives errors while reading, you can buy special cleaning CDs, which can be used to clean the lens within the drive. Often, all that is needed to “fix” a CD-ROM drive that is operating erratically is to run it for a couple of minutes with the cleaning CD and all is well. In most cases, the same steps that are relevant to troubleshooting CD-ROM drives apply to troubleshooting CD-RW drives as well. There are also some other considerations, however. One of the most significant of these might be that with CD-RWs, the correct, device-specific, software drivers are normally required. Whereas most OSs can detect the presence of a CD-ROM drive and often use a generic driver quite satisfactorily, in many cases the OS software will install a CD-RW as a CD-ROM drive and so you will not be able to use it as a writer. So, if you are installing a CD-RW into a system, make sure that you install the correct drivers for the device. NOTE Although many CD-R and CD-RW drives advertise high write speeds, in practical use the actual maximum write speed might be slower. In some cases it might be that the media doesn’t like being written at such a high speed. If you are consistently having problems writing a CD, try bringing the write speed setting down a notch or two. DVD/DVD-RW Drives Nowadays, it is very common for systems to come with a DVD drive and increasingly DVD-RW units. Because DVD units are almost identical in their physical and logical operation to CD units, troubleshooting these devices follows almost exactly the same processes. Tape Drives Generally speaking, tape drives tend to be the domain of server systems rather than desktop PC systems, making the need to troubleshoot them relatively uncommon. The following are some of the things that you should consider when troubleshooting tape drives. 
• Check physical connections Tape drives can be either internal or external units. In either case, if you are having problems you should check the data cable and the power connections. • Use a cleaning cartridge Cleaning the mechanism on a tape drive is considered a routine maintenance task, but if you are having problems reading or writing to a tape it is also the first step in the troubleshooting process. Use the correct cleaning tape for the drive—for the best results periodically use a new cleaning cartridge. • Don’t discount the cartridge As unpalatable as it might be, there are times when there might simply be a problem with the tape that you are trying to read from. In these instances, you can try the tape in another tape drive, but the likelihood is that the data is gone. This is just another good reason why you should always keep more than one copy of critical data. Removable Storage Devices Any problems you encounter with removable storage are likely to be with the connection method rather than the devices themselves. For that reason, if you are having problems with a removable storage device, check the physical connection and, if the device has one, the external power source. Outside of that, the general troubleshooting steps for your removable storage device are much the same as those for nonremovable storage. Perhaps the only exception to this advice is that many external storage devices require special software drivers. You should ensure that these drivers are correctly installed and configured. Cooling Systems The cooling systems within a PC are vital to its correct operation. Without any one aspect of the cooling, your system might overheat, causing permanent damage to key (and typically expensive) components. For that reason, it is essential that PC technicians understand the cooling components within a system and the functions they perform. 
Fans

Fans are quite simply the most important component in keeping your system operating at an acceptable temperature. A system generally has at least two fans, one on the CPU and another in the power supply, but it is becoming increasingly common to have one or even two additional fans in the system case to further aid the cooling. For a system to be cooled adequately, all of the fans should be connected and operating. If one of the auxiliary fans fails, it might be OK to continue running the system for a short while until it can be replaced, but if the fan in the power supply or on the CPU fails, the system should be powered off immediately and not used again until the fan has been replaced. The heat generated by a CPU or a power supply can burn that component out very quickly if it is not cooled correctly. Modern system boards have sensors that can detect when a CPU fan fails, causing an alarm to sound. If your system board has such a feature, you should make sure you understand how it works and that it is enabled.

If you have to replace a fan, make sure that you use the same size fan and that you install it in the correct orientation. It is possible to install a fan backward so that, for example, an exhaust fan blows air into the system case rather than drawing it out. As you can imagine, this is not a good thing.

Heat Sinks

Given that a heat sink is an inanimate lump of metal, troubleshooting it is a relatively straightforward process. Provided that the heat sink is seated squarely on the chip that it is cooling and has good contact with the chip, usually through a layer of thermal paste, it should not pose any problems. Do look out for broken fins, though. These fins serve to increase the surface area of the heat sink, and although one or even two being broken off is unlikely to cause a problem, a large number of broken fins can reduce the surface area enough to impair its cooling capability.
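The fan guidance earlier in this section amounts to a simple triage rule, which can be sketched in a few lines of Python. This is only an illustration: the function name and the location labels ("cpu", "power_supply", "case") are made up for the example and do not correspond to any real monitoring tool.

```python
def fan_failure_action(location: str) -> str:
    """Return the recommended response when the fan at `location` fails.

    The location labels are hypothetical names chosen for this sketch.
    """
    critical = {"cpu", "power_supply"}
    if location in critical:
        # CPU and power supply fans protect parts that burn out quickly.
        return "power off immediately and replace the fan before reuse"
    # Auxiliary case fans: running for a short while is tolerable.
    return "keep running for a short while, but replace the fan soon"


for where in ("cpu", "power_supply", "case"):
    print(f"{where}: {fan_failure_action(where)}")
```

The point of the split is that only the CPU and power supply fans are treated as stop-everything failures; any other fan is handled as a scheduled replacement.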
Liquid Cooling Although fans and heat sinks do a good job of conducting the heat away from a CPU, they still rely on one thing—air—to move the heat around. With CPU speeds increasing, it looks like we are starting to push the limits of what this air-based cooling system has to offer, which means that we will have to look for something stronger. The answer is liquid cooling. In much the same way that cars use a liquid-filled system with a radiator to pull heat away from the engine, systems for CPUs that use a liquid-filled heat sink and a “radiator” on the rear of the PC are starting to appear in PC systems. At this point the systems are expensive, difficult to install, and still evolving, but over the next couple of years you can expect to see more and more of these liquid cooling systems appearing in high-end server and desktop PCs. One secondary benefit of these systems is reduced noise. Whereas fans, and the airflow they produce, create noise as they run, liquid-based cooling systems use an impeller within the liquid that is almost silent. One thing to note is, at this point, liquid-based cooling systems are only being considered for CPU cooling. It is likely that other areas of the PC such as the power supply will continue to be cooled by fans for some time to come. Temperature Sensors Because of the potentially damaging effects of heat buildup, many motherboards now include heat sensors that can be configured to trigger alarms or even shut off devices if the temperature within the system case reaches a certain level. If your system board (or any other device installed in your system) comes with these temperature sensors, you should make yourself familiar with how they work, and also with how they are enabled and disabled. Typically, you will leave temperature sensors enabled and set to conservative levels to prevent damage to the system in case of overheating. TEST SMART Configuration of temperature sensors is typically performed through the motherboard BIOS. 
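To make the alarm and shutoff behavior concrete, here is a minimal Python sketch that compares a case temperature reading against an alarm level and a higher shutdown level. The threshold values and the function name are assumptions invented for this example; real limits are set in the BIOS and depend on the hardware.

```python
ALARM_C = 50.0      # hypothetical alarm threshold, degrees Celsius
SHUTDOWN_C = 65.0   # hypothetical shutdown threshold, degrees Celsius


def sensor_response(temp_c: float, enabled: bool = True) -> str:
    """Classify a temperature reading as 'ok', 'alarm', or 'shutdown'.

    A disabled sensor reports 'ignored', which is why the sensors
    should normally be left enabled.
    """
    if not enabled:
        return "ignored"
    if temp_c >= SHUTDOWN_C:
        return "shutdown"
    if temp_c >= ALARM_C:
        return "alarm"
    return "ok"


for reading in (35.0, 55.0, 70.0):
    print(reading, sensor_response(reading))
```

Setting the two levels conservatively, as recommended above, simply means choosing lower values for both thresholds so that the alarm fires well before any damage can occur.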
If you receive an alarm from one of the temperature sensors, you should investigate as soon as possible. Typically, if a system has been running for some time without a problem and then the heat sensor alarm goes off, a component like a cooling fan has failed or something is blocking one of the vents on the system case itself. If the alarm goes off on a new or recently upgraded system, it could be that there is simply not enough cooling capacity. In this case, you will need to look at adding additional fans or improving airflow to accommodate the increase in heat production. Processor/CPU In the years that we have been involved in repairing and maintaining computer systems, we can count the number of failed processors on a single hand. Most processor failures we have encountered are the result of overclocking (configuring the processor to run at a higher speed than it was designed for) or overheating, which are for the most part preventable processor disasters. With that in mind, there are two key areas to watch when working with potentially failed processors: installation and overheating. NOTE If you are working with a newly installed processor, you should not discount the possibility that it might be faulty. Processors are extremely sensitive devices, and it’s not unheard of for them to be faulty upon delivery, or to become faulty during the installation process. Processor installation was covered in Chapter 1, but as far as troubleshooting is concerned, if the processor was installed incorrectly it might not work at all or work at the wrong speed. Some people overclock their processors to increase processor performance, although this practice is not as prevalent as it once was. While this might work, for a while at least, it also increases the work the processor has to do, forcing it to operate beyond what it was designed to do. Overclocking a processor can destroy it immediately or it can shorten the life of the processor. 
We don’t recommend overclocking the processor. A second problem associated with processors is overheating. Today’s processors operate at very high temperatures and use a heat sink and fan to dissipate the heat. The problem is that processor fans periodically fail, and when they do, the processor cannot keep cool. If a processor operates too long without a fan, perhaps only minutes, it can overheat and burn out. There is no way to recover a burned-out processor; it is reduced to an item for show and tell.

You do not need to keep taking off the case of a computer to see if a fan is failing. A failing fan typically makes a noise that an astute PC technician will recognize as a sign of a potential problem. If you do hear an unusual noise coming from the computer system, pop off the case and see if the fan is still functioning. Processor fans are cheap, and replacing them is a straightforward process. So if you suspect a fan is failing, it might be better to just replace it to be sure.

TEST SMART Nowadays, many processors have fans that are fed with power directly from the motherboard. This makes it possible to monitor the fan status via software.

Memory

Memory errors are among the more common errors you will encounter as a PC technician. The problem with memory errors is that they are deceptive and might not look like memory-related errors at all. Some symptoms, such as beep codes during the POST process, can readily indicate a memory problem, but less obvious symptoms such as page faults and system lockups can also indicate a memory error. These symptoms do not necessarily mean that the problem is related to memory, only that memory is a suspect.

SEE ALSO Refer to the section titled “POST Audible/Visual Error Codes” earlier in this chapter for information on POST errors.
When you do believe that memory is at the root of the problem, there are some specific troubleshooting steps you can perform, including:

• Verify memory compatibility If you have newly installed RAM, confirm with the manufacturer’s Web site that it is compatible with your current configuration.
• Confirm memory configuration Depending on the motherboard, it might be necessary to match equal capacity memory modules in available banks. Refer to your documentation or the motherboard manufacturer’s Web site to determine what the configuration requirements are. Refer to Chapter 1 for memory installation.
• Confirm installation Sometimes a memory problem is nothing more than a poorly seated module. When troubleshooting memory, it’s a good idea to carefully remove the RAM and reinsert it. Make sure that it is seated properly in the slot and that it is securely in place.
• Replace the modules It might be necessary to remove and replace the memory modules systematically to help determine the one that is causing the problem. In other words, remove memory modules one at a time (or one bank at a time in older systems) to see if you can identify the defective module. If you have spare RAM that is known to be good, this process is slightly easier.
• Clean contacts and connectors If the memory in your system is old or the inside of the case is particularly dirty, the contacts and connectors might need to be cleaned. Refer to Chapter 3 for more information on how you can safely and effectively clean connectors.

TEST SMART If your system is freezing periodically and you suspect memory is the cause, a good strategy is to swap out the RAM and see if the problem is corrected.

In addition to these steps, there are software utilities that can be used to test RAM, although this assumes that the system is running in the first place, enabling you to run the program.
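The one-module-at-a-time elimination described above can be sketched as a short Python routine. It is a sketch under stated assumptions: `system_stable` is a hypothetical stand-in for the manual step of powering on the system with a given set of modules installed and watching for errors, and the module names are invented.

```python
from typing import Callable, List, Sequence


def find_suspect_modules(
    modules: Sequence[str],
    system_stable: Callable[[List[str]], bool],
) -> List[str]:
    """Remove one module at a time and note which removals fix the system.

    `system_stable` is a hypothetical callback standing in for a manual
    boot-and-observe test with the listed modules installed.
    """
    suspects = []
    for removed in modules:
        remaining = [m for m in modules if m != removed]
        # A system cannot be tested with no memory installed at all.
        if remaining and system_stable(remaining):
            suspects.append(removed)
    return suspects


# Simulated example: pretend DIMM2 is the defective module.
bad = {"DIMM2"}
result = find_suspect_modules(
    ["DIMM1", "DIMM2", "DIMM3"],
    lambda installed: not (set(installed) & bad),
)
print(result)  # ['DIMM2']
```

Note that this approach only pinpoints a single defective module. If two modules are bad, no single removal makes the system stable, which is where spare known-good RAM makes the process easier.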
Display Device When it comes to troubleshooting the display, there are two possible scenarios you will encounter: no display and bad display. Of the two, staring at a blank screen seems more dramatic, but it’s often easier to isolate and troubleshoot. Bad display can be a bit trickier. Troubleshooting a Blank Display Staring at a blank screen can seem like a huge problem but in reality, when you have no on-screen display, the problem is often quite easy to isolate. As we know from Chapter 1, there are two separate components that give us our display, the monitor and the video card. Therefore, when we are troubleshooting a display problem, we have two components to look at…so to speak. Monitors Troubleshooting monitors is typically a simple process with most problems being easy to isolate. The first step we would suggest when troubleshooting display problems is to verify that there is a connection between the monitor and the video card. When a connection is made between the monitor and the video card, assuming that the monitor is one of the modern power-saving models and that the PC is turned on, the LED on the front of the monitor will turn a solid green. When a connection is not detected, the LED is often a flashing or solid amber. On older monitors, the LED will most likely be green no matter what the state of the connection. NOTE Don’t forget—check the simple things first. If you are troubleshooting a monitor issue, check that the power cable has not become disconnected from the monitor. If the problem is not the cable, another place to look is the monitor’s settings. Monitors typically have a series of buttons on the front, allowing you to change the brightness, contrast, and other settings of the monitor. Oftentimes, the settings might have been changed and are preventing a display. Perhaps the simplest example of this is when the brightness has been turned all the way down. The picture is there—you just can’t see it! 
TEST SMART One simple way to isolate the cause of a display problem is to swap out the monitor with a known working one. If your new monitor works, the problem lies with the old monitor and not the video card. Troubleshooting Poor Display Far more common than troubleshooting a completely blank screen is troubleshooting poor display. You will often be faced with the problem of a blurry screen, a screen with low resolution, or a screen that flickers constantly. Like troubleshooting a system with no display, the problem with a poor display can be traced to either the monitor or the video card. To troubleshoot poor on-screen display, consider the following: • Verify that the latest drivers are being used for the video card. This simple procedure can fix a number of display anomalies. To get the latest drivers for the video card, go to the manufacturer’s Web site. • Verify the settings on the monitor. New monitors allow you to make very specific configuration changes, normally through an on-screen menu system. Adjusting these settings will change how the monitor displays information on the screen. You should also check the monitor settings in the OS to ensure that your monitor and video card are configured correctly. If the on-screen image is fuzzy, this can often be fixed by degaussing the monitor. Over time, the monitor picks up a weak magnetic charge that interferes with the system’s display. To remove this charge, you use a degauss button located on the monitor or select the degauss option from the monitor’s menu selection screen. Some smaller monitors might not have a degauss function, but larger ones do. Sometimes, though not often, a bad monitor cable can cause on-screen errors. The symptom of a bad monitor cable is a display with a blue, red, or green tint to it. Wiggling the cable will often return the display to normal but this cable will have to be replaced eventually. 
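The display symptoms covered above can be collected into a small lookup, sketched here in Python. The symptom labels and the function name are hypothetical, chosen only for this illustration; the suggested steps restate the advice given in this section.

```python
from typing import List

# Hypothetical symptom labels mapped to the steps suggested in the text.
DISPLAY_REMEDIES = {
    "fuzzy": ["degauss the monitor"],
    "tinted": ["check the monitor cable", "plan to replace the cable"],
    "flicker": ["update the video card drivers", "verify the monitor settings"],
    "blurry": ["update the video card drivers", "verify the monitor settings"],
    "low_resolution": ["update the video card drivers",
                       "check the monitor settings in the OS"],
}


def suggest_display_fix(symptom: str) -> List[str]:
    """Return the suggested steps for a symptom, or a generic fallback."""
    # Unknown symptoms fall back to the swap test from the TEST SMART tip.
    return DISPLAY_REMEDIES.get(
        symptom, ["swap in a known working monitor to isolate the fault"]
    )


print(suggest_display_fix("tinted"))
print(suggest_display_fix("no_picture"))
```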
Input Devices

Input devices are where “the rubber hits the road” so to speak in the interaction with the user. As a result, as well as the normal wear and tear considerations, you must also consider the human factor when troubleshooting these generally trouble-free devices.

Keyboard

One of the first things we do when troubleshooting a keyboard problem is to reseat the cable connection at the back. Keyboards are hardy devices, and little things such as loose cables are more often the cause of a keyboard failure than the keyboard itself. If the keyboard is connected correctly and you still can’t type, the next thing you might try is to swap the keyboard out with a known working one. This tells you whether the keyboard itself is at the root of the problem. If the known working keyboard also fails, you are probably looking at a failed keyboard controller. This is not good, as the controller is built right onto the motherboard and generally is not replaceable. If the keyboard controller has failed, instead of replacing the motherboard you can use a keyboard that doesn’t use that port. For example, if the faulty keyboard controller is for a PS/2 port, perhaps use a keyboard with a USB connector.

TEST SMART If you are called to fix a keyboard that continually beeps even after the system is restarted, you have a stuck key or there is something pressing down on a key.

Mouse/Pointer Devices

Try using your computer system without the assistance of the mouse and you will discover how awkward life is without it. Fortunately, mice are fairly simple devices and they do not regularly fail. The most common problem with mice is erratic or unresponsive cursor movement on the screen. This is often an indicator that the mouse ball and/or the inside of the mouse is dirty and in need of a good cleaning. The rollers on the inside of the mouse might need to be scraped and cleaned.

More Info For more information on cleaning the mouse, refer to Chapter 3.

Some mouse problems are too complex for a simple cleaning to fix.
Of course a severed mouse cable would qualify as such, but other problems are a little less obvious. If you find that you boot up your system and you are unable to move the cursor around the screen, there are a few things to look for. First and most obvious is to check that the mouse is actually plugged into the computer. If it is, try to reseat the cable connection; sometimes it just comes loose. A second possible problem is a resource conflict. Each device connected to the computer system requires system resources such as an IRQ and I/O address. When two devices use the same resources, there is a conflict preventing one or both devices from functioning. Of course, chances are that a resource conflict will only occur if a new device has been installed or you have reconfigured a device already installed in the system. Much as people would have you believe otherwise, computers do not reconfigure themselves, and resource conflicts on a system where nothing has been changed are extremely rare. A PS/2 mouse typically uses IRQ 12, so if you power up your system and the mouse isn’t working, another device might be trying to use IRQ 12, creating a resource conflict. A serial mouse can also be involved in a resource conflict with other devices such as a modem that use serial port resource assignments. If your serial mouse is connected to COM1 or COM3, it will be using IRQ 4; if it’s connected to COM2 or COM4, your mouse will be using IRQ 3. You need to make sure that if devices like internal modems are installed in your system, they are not conflicting with other serial devices. If the mouse is properly connected and you have confirmed that there is not a resource conflict, the best course of action is to swap the mouse out with a known working one to help isolate the exact cause of the problem. As well as mice, many people now use other pointing devices with their systems including trackballs and graphics tablets. 
Trackballs are almost like an inverted mouse, so many of the same troubleshooting steps that apply to mice apply to trackballs. Graphics tablets generally employ a special pointing device or a pen that is used to draw on a flat surface—the “tablet.” The most important consideration with a graphics tablet is that the surface of the tablet must be kept very clean. The best way to do this is with a damp cloth. Touch Screen There is something really cool about touch screens—you just peck away at the screen with a fingertip and away you go (so much more 21st century than using a keyboard). In fact, touch screens are a relatively mature technology, having been around for well over a decade. Touch screens are commonly found in point of sale (POS) systems and in other specialized applications such as Internet kiosks and information terminals. They are not often found connected to PCs in normal office environments. In addition to the standard video cable, which provides the communication of the picture between the system and the screen, touch screens also utilize an additional connection, either through the serial port or via a proprietary interface card, to receive and process the information from the touch screen itself. Because touch screens are specialized pieces of equipment, actual servicing of the screen itself is best left to those with the relevant experience. As far as basic troubleshooting goes, in addition to checking that the physical connections are secure, you will also need to determine that the software configuration for the screen is correct. Adapters The vast array of adapters used in today’s PCs makes troubleshooting them a varied and frequent task. In the following sections we explore some of the more common troubleshooting procedures with some of the more common adapters. Network Interface Card Network interface cards (NICs) are another one of those components that, if handled and installed correctly, will give you very few hassles along the way. 
When a network card does fail in a system, the computer will no longer be able to access the network to which it was connected, whether this is the Internet or an internal network. If you suspect a faulty network card, check the following settings before replacing the card:

• I/O settings: If any new hardware has been added, check to see if there is an I/O address conflict. Use the operating system’s hardware review tools (for example, Device Manager in Windows) to see if this is the case.
• Network card drivers: An incorrect driver might make a card work improperly. Even if the driver installed is assigned by the operating system, visit the manufacturer’s Web site to get the most recent driver available.
• IRQ conflicts: Although IRQ conflicts are less frequent now that plug and play is common, they are still a concern. Review your operating system logs to ensure that there are no conflicts.
• Card settings: Make sure all of the card’s network settings are correct. A single wrong setting can prevent the card from working.
• Network card LED: Most modern network cards have an LED on them, called the link light, which is used to tell if there is an active connection to the network. If you cannot access a network, check to see if the LED is lit. If it is not, there might be a problem with the network card installation or setup.

NOTE

In some environments, network support technicians have spare network cards available to be swapped into a system at a moment’s notice. As much as possible, use duplicate network cards to those already in the machine. This will prevent driver-related problems and decrease potential downtime associated with swapping out a card.

Sound Card

In the old days of computers, fighting with sound cards and audio problems was a common occurrence, leaving many of us to make our own sound effects for those early dungeon games. Today, installing and assigning resources to a sound card is normally no more difficult than physically installing the sound card and turning your computer on.
If you are using legacy sound cards, the same cannot be said. Many of these old cards still use jumpers to set their resources. You might need to check with the manufacturer’s documentation if you are setting system resources manually using jumper settings.

When troubleshooting sound problems, there are three key areas to focus on: the physical installation, the sound card drivers, and the software setup.

In Chapter 1, we explored the installation of sound cards. If there is no sound coming from the system, the sound card might not be correctly installed. Of course, if your sound card is built into the motherboard, this is not possible. Sound cards are installed into the system in the same way that other expansion cards are, with older sound cards using legacy ISA slots and newer sound cards using the PCI slots. You might need to remove and reseat the card to ensure that the connections are being properly made between the card and the expansion slot. One common error in the physical setup is to connect the sound jack from your speakers to the wrong port, such as the microphone port. It is worth double-checking to ensure that cables are correctly attached.

A second area to look at when there is no sound coming from the computer is the sound card drivers. Even if you are using the drivers that shipped with the audio card, you might need to take a trip to the manufacturer’s Web site to get the latest drivers. Only with the latest drivers can you be sure that you have the correct ones.

A final area to check is the software you are using within the operating system. Believe it or not, a large majority of sound-related problems are fixed by simply turning the volume up or on in the operating system. As we said before, troubleshooting is often about finding the easiest solution.

Video Card

Video cards are one of those components that, once they are installed and working correctly, rarely go bad.
If you suspect that the video card has failed, confirm that it has been properly installed and securely attached in the appropriate expansion slot. If it is an old one that has quit working, replace the video card with a known working one. If the new video card works, the old one is simply faulty. The old card is typically thrown out, or returned to the manufacturer if still under warranty.

Modem

Over the years, we have probably fought more with modems than any other computer component. In the days before plug and play, modems were often a nightmare to configure and resource conflicts were a common occurrence. Today, the assignment of resources to modems is handled automatically, at least for newer modems that support plug and play. For all those legacy modems that still use jumpers and are not plug and play, you might need to roll up your sleeves and go a few rounds with your modem.

As with other components, when troubleshooting a modem it is very important to start with the simple fixes. Though not always the solution, these simple fixes can save a lot of your valuable troubleshooting time. The simple fixes as far as modems are concerned include:

• Verify that the phone jack and power cable (on external modems) are correctly installed.
• Modems often share a phone line, and if someone else is using the phone you might not be able to dial out.
• You need a dial tone to use the modem. If you are unable to connect using the modem, ensure that you have a dial tone.

If one of these simple fixes does not get the modem working, you will need to dig a little deeper. The first thing to check is whether the operating system detects the modem you have installed. All major operating systems provide the mechanisms to do this; in Windows you can view installed hardware from within Device Manager. If your modem is detected, it will be listed there.
If it is detected but does not function, it might be that you need to update your modem drivers and verify that there are not any resource conflicts. If your modem is not detected by the system, you might need to confirm that it is physically installed correctly. Table 2-4 provides a summary of the areas to check when troubleshooting a modem.

Table 2-4. Summary of Modem Troubleshooting

Symptom: The modem stops working.
Possible solution: If the modem stops working, check whether new hardware has been added that might be causing a resource conflict. If new software was added, verify that the modem settings have not changed.

Symptom: The modem keeps hanging up after making a connection.
Possible solution: Verify that the modem settings are correct. Also, you might need to download and install the latest modem drivers.

Symptom: The modem is not connecting at correct speed or is operating slowly.
Possible solution: If a modem is connecting more slowly than it should, verify that you have the latest drivers installed. The problem can also be a result of poor phone line quality (often referred to as line noise), which is something you cannot do very much about.

Symptom: The modem makes a connection but cannot authenticate.
Possible solution: This is often caused by the wrong user name/password combination. Verify that you are inputting the correct settings.

Symptom: The operating system cannot find the modem on a COM port.
Possible solution: This can often be fixed by moving the modem onto another COM port or checking the configuration of the port to make sure it’s correct.

Symptom: The modem dials but cannot connect.
Possible solution: Ensure that the correct phone number is being dialed.

Symptom: The modem gives an error that it cannot make a connection to the server.
Possible solution: Verify that the protocol settings are correctly configured to access the server.

As you can see from the preceding table, there are plenty of areas to look at when troubleshooting the modem connection. Once you have fought with one or two of them, it gets easier.
But when you are just getting used to troubleshooting modems, don’t be afraid to call tech support or your ISP; it can and will save you a huge amount of time.

Modem Commands

When working with and troubleshooting modems, we often use something called the AT command set to help in the process. The AT commands are used from a communications application, such as the HyperTerminal utility in Windows, and talk directly to the modem. The modem in turn responds to the commands, providing us with information we can use in the troubleshooting process. Some of the most commonly used AT commands are included in Table 2-5.

Table 2-5. Commonly Used AT Commands

Command: Result
ATA: Answers an incoming call
ATH: Hangs up an active connection
ATDT phone number or ATDP phone number: Dials the specified phone number using tone (T) or pulse (P) dialing
ATZ: Resets the modem
ATI3: Displays the name and model of the modem
ATX: Selects which result codes and call-progress checks (such as dial tone and busy detection) the modem uses

TEST SMART

Before taking the A+ exam, make sure that you know the functions of the commonly used AT commands.

SCSI

Working with SCSI devices can be a frustrating experience, and if things start to go wrong you might find yourself burning the midnight oil, as it were. If it’s any comfort at all, most SCSI-related difficulties can be isolated to a very few causes. In light of this, perhaps the most important thing to remember when a system that uses SCSI devices starts to go down is not to start pulling things apart too soon. The key is to take a step back, take a breath, and start with the basics. The following sections describe some of the basic considerations when troubleshooting SCSI hard drives.

NOTE

As with all troubleshooting procedures, when you’re working with SCSI systems, be sure to make only one change at a time. Making multiple changes confuses the issue and prevents you from knowing what the exact remedy to the problem was.

Termination

Many SCSI-related problems can be traced to improper termination.
The installation and termination of SCSI devices were discussed in Chapter 1, but in a nutshell, each of the physical ends of a SCSI bus must be terminated to prevent signal reflection. Improper termination can be tricky to isolate because the bus might appear to work at first, and because the problems created by improper termination might be intermittent. What you might notice is that data is being lost from time to time, or the system periodically hangs. It might even be that the system will not boot at all, or some of the SCSI devices won’t be recognized. To prevent these often hard-to-track-down problems, it’s absolutely necessary to ensure proper termination from the start. Remember, a SCSI bus needs termination at both ends of the bus and nowhere else.

NOTE

When troubleshooting SCSI termination, it might be necessary to connect and test devices one at a time. Finding termination problems on a fully loaded SCSI bus can be difficult to do.

Cable Connections

Given the complexity of connecting SCSI devices, it’s often easy to overlook the obvious. But when troubleshooting a SCSI problem, the whole issue could be as simple as whether the SCSI cables are securely attached or connected at all. Before calling your SCSI vendor and complaining of faulty equipment, take a quick look to see if all of the cables are properly attached. It only takes a second and can reduce downtime if it is that simple.

Like improper termination, loose cables can cause intermittent problems. All external SCSI cables come with a means to securely connect them to the system. Whether the mechanisms are the locking clamps of the Centronics connectors or the thumbscrews of the D-shell connectors, they need to be fastened down securely.

Broken Cable

Though this is most often not the case, SCSI cables do fail, and if low-quality SCSI cables were purchased initially, this is even more of a concern.
If you suspect a faulty cable, visually inspect the cables to ensure that the pins are not bent and that they make a good connection. It is a good idea to check for strains or cracks on the ends of the cable. Some of the more expensive cables have a strain relief mechanism designed to protect the cable, but this doesn’t always provide the necessary protection.

Cable Length

Exceeding the recommended SCSI cable length is not a good idea. If the total length of the SCSI cable exceeds the recommended length, do not be surprised when problems arise. Cable-length specifications for the various SCSI standards were listed in Chapter 1.

SCSI IDs

Conflicts are sure to arise if two SCSI devices on the same bus try to use the same ID. Each device on a SCSI bus must have a unique ID. So when troubleshooting your SCSI configuration, ensure that each of the devices has a unique ID assigned to it.

TEST SMART

If after installing a new hard drive, whether a SCSI device or IDE, the device does not work, the first area to check is the physical installation. This includes the cabling and hard drive settings.

IEEE 1394/FireWire, USB

IEEE 1394 and USB adapters are used when the system needs more ports, the system does not have any of its own, or the system ports are no longer functioning. IEEE 1394 and USB adapter cards are typically very basic, with few or no configuration options at the hardware level. From a software perspective, the cards will either come with specific drivers or rely on the drivers supplied with the operating system. For this reason, you should check that the correct drivers for your OS are either supplied with the card or downloadable from the Internet. Within the operating system, the additional IEEE 1394 or USB ports provided by the card will appear as if they are standard system devices.

Portable Systems

Portable systems present their own unique troubleshooting problems because in general they are all proprietary.
That is to say that you generally can’t just go to your local computer store and buy replacement parts off the shelf. Generally speaking, replacement parts have to be specially ordered, and they are normally only available through an authorized dealer.

PCMCIA

PCMCIA cards are designed to be rugged devices, making them very reliable. Because PCMCIA cards can be plugged and unplugged while the system is powered on, the main problems that you will experience with PCMCIA cards are related to drivers or the detection process by the OS.

The biggest problem with PCMCIA cards is that not all cards are supported by all OSs. You need to be very careful with PCMCIA cards, especially if you are trying to use an older PCMCIA card with an older OS such as Windows 9x. If you are buying a new card and using a current or recent OS, you shouldn’t have any problems, but it is still worth checking.

Another thing to consider with PCMCIA cards is correct insertion. The card must be fully inserted in order to make the correct contact with the pins within the interface. Remember from Chapter 1 that PCMCIA cards come in a variety of sizes. You must ensure that you are using the correct card for the correct slot.

Batteries

One of the major differences between batteries and other components is that while other parts might continue operating indefinitely, batteries are guaranteed to fail sooner or later. Even though the modern battery technologies used in today’s portable systems are designed to withstand irregular charging schedules, eventually their capacity to hold a charge will diminish to the point where they must be replaced.

NOTE

Although software-configurable power schemes such as those found in Microsoft Windows might increase the usable time available on each charge, they generally don’t improve the longevity of the battery.

In nearly all instances, batteries in a laptop must be replaced with a model-specific battery.
For this reason, you will often have to order the battery through an authorized reseller. In certain cases you might be able to buy a generic replacement battery, but this is becoming less and less common.

Before you replace a battery for a portable system, you should make sure that the battery is indeed the problem. Often your local electronics store will be able to check a battery to see if it is dead. There is nothing worse than spending upward of $100 on a new battery for your laptop only to find that it is the internal charger or some other related component that is the problem, and not the battery at all.

Docking Stations/Port Replicators

Like batteries, docking stations and port replicators are specific to the portable system that they were designed to be used with. Therefore, the first troubleshooting step for these is to make sure that the correct docking station or port replicator is being used. Once this is determined, you can then move on to more specific considerations.

If you are experiencing a problem with a single function or peripheral, you should concentrate on determining that the correct settings are being used for that device and that any drivers are installed and configured. Because certain operating systems can be used in multiple configurations, often to cater specifically to docked/undocked configurations, you should make sure that the device is enabled in the current profile being used. If you have done all of this and you are still having problems with the device, you might need to turn your attention to the device itself.

If none of the devices or functions provided by the docking station or port replicator are available, you should concentrate on making sure that any external power connections are secure and that the connection between the portable system and the docking station is complete and correct.
The interface between the portable system and the docking station or port replicator may be difficult to see, making it necessary to “feel” that the connection has been completed rather than seeing it. If this is the case, try removing the portable system from the docking station and reinserting it.

Finally, always consider that the docking station or port replicator adds an extra layer of complexity to the troubleshooting process. In the past we have encountered devices that simply will not work when connected to a docking station but that do work when connected directly to the portable system. For that reason, you should always try a direct connection, thereby eliminating (or implicating) the docking station or port replicator as the source of the problem.
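
To make the resource-conflict checks from the mouse and SCSI sections a little more concrete, here is a minimal Python sketch. It is not a real diagnostic tool: it only models the rules quoted earlier (a PS/2 mouse on IRQ 12, COM1 and COM3 on IRQ 4, COM2 and COM4 on IRQ 3, and one unique ID per device on a SCSI bus). The function names, the device names, and the two sample configurations are hypothetical and exist purely for illustration.

```python
from collections import defaultdict

# IRQ assignments as stated in the text: PS/2 mouse -> IRQ 12,
# COM1/COM3 -> IRQ 4, COM2/COM4 -> IRQ 3.
PORT_IRQ = {"PS/2": 12, "COM1": 4, "COM3": 4, "COM2": 3, "COM4": 3}


def find_conflicts(assignments):
    """Group device names by the resource value they claim and return
    only the groups where more than one device claims the same value."""
    claimed = defaultdict(list)
    for device, resource in assignments.items():
        claimed[resource].append(device)
    return {res: sorted(devs) for res, devs in claimed.items() if len(devs) > 1}


def irq_conflicts(device_ports):
    """device_ports maps a device name to the port it is attached to."""
    irqs = {device: PORT_IRQ[port] for device, port in device_ports.items()}
    return find_conflicts(irqs)


# Hypothetical system: a serial mouse on COM1 and an internal modem on COM3.
devices = {"serial mouse": "COM1", "internal modem": "COM3"}
print(irq_conflicts(devices))  # {4: ['internal modem', 'serial mouse']}

# Hypothetical SCSI bus on which two drives were both set to ID 2.
scsi_ids = {"host adapter": 7, "disk A": 2, "disk B": 2, "tape": 4}
print(find_conflicts(scsi_ids))  # {2: ['disk A', 'disk B']}
```

The same grouping function serves both checks because an IRQ clash and a duplicate SCSI ID are the same kind of problem: two devices claiming one resource value. On a real system you would read the actual assignments from the operating system (for example, Device Manager in Windows) rather than from a hand-written dictionary.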
