In this video you will learn how to identify a problem; research the problem using a knowledge base or the internet (if applicable); establish a theory of probable causes; test the theory to determine the cause; establish a plan of action to resolve the problem and identify potential effects; implement the solution or escalate as necessary; verify full system functionality and, if applicable, implement preventive measures; and finally, document your findings, lessons learned, actions, and outcomes.
The first thing to understand when troubleshooting is that the symptoms are not the problem. The actual problem is what’s causing the symptoms to appear. To identify the problem, use the following procedures as needed:
Before making any changes, make sure you safeguard current settings:
Gather Information
Start the process by gathering information. Some of the information you need may be obtained from log files created by the operating system. You can also compare the current settings for the device to its default settings.
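The two checks described above (reading log files and comparing current settings to defaults) can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the setting names, the sample log lines, and the keyword list are all assumptions made for the example, not part of any particular operating system or tool.

```python
def diff_from_defaults(current, defaults):
    """Return every setting whose current value differs from its default."""
    changed = {}
    for key in sorted(set(current) | set(defaults)):
        cur = current.get(key)    # None if the setting is missing
        dflt = defaults.get(key)  # None if there is no default
        if cur != dflt:
            changed[key] = {"default": dflt, "current": cur}
    return changed


def find_log_problems(lines, keywords=("error", "fail", "warning")):
    """Return the log lines that mention any keyword (case-insensitive)."""
    return [line for line in lines if any(k in line.lower() for k in keywords)]


if __name__ == "__main__":
    # Hypothetical device settings, for illustration only.
    defaults = {"duplex": "auto", "paper_size": "Letter", "sleep_minutes": 15}
    current = {"duplex": "off", "paper_size": "Letter", "sleep_minutes": 15}
    print(diff_from_defaults(current, defaults))

    # Hypothetical log excerpt, for illustration only.
    sample_log = [
        "09:01 INFO spooler started",
        "09:02 ERROR could not open port USB001",
        "09:03 INFO job queued",
    ]
    for line in find_log_problems(sample_log):
        print(line)
```

Only the settings that differ from the defaults are reported, which keeps attention on what is unusual about the device rather than on its whole configuration.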
Other information sources include:
Duplicate the Problem, If Necessary
If possible, duplicate the problem. Try the same tasks with the same files and output devices that were originally involved. Record any error messages or dialogs that are displayed, using a screen capture utility where you can.
Question Users
Ask the users who reported the issue to provide details about the problems they are experiencing. When questioning users, remember that some of them may be afraid that they did something they were not supposed to do. Remind them, in a compassionate and understanding manner, that you are simply there to help them solve the issue they are experiencing with the network and/or their device.
Some of the ways to establish rapport and build a good relationship are:
Identify Symptoms
As you are talking to the customers, you should also be trying to identify symptoms. Some possible symptoms might include:
Some questions you could ask the user:
Be mindful that these are just questions to help get the ball rolling towards identifying the problem.
Determine If Anything Has Changed
Determine if anything has changed (device settings, upgraded hardware, updated operating system or app, cables, etc). The change might be the reason for the failure you are trying to troubleshoot. Some ways to determine if anything has changed are as follows:
Approach Multiple Problems Individually
Multiple problems could be the result of a common issue, such as a fault on the network. Unless you know for sure that there is a common cause, though, it’s easier to work out the solution to a single problem before moving on to the next.
Always remember, the knowledge base and/or the internet are your friends when it comes to troubleshooting. If your organization has its own knowledge base, start your research there first. If not, then head out to the internet to conduct research. When you are out there searching for possible reasons, keep these items in mind:
Once you think you have successfully researched solutions & identified potential issues that may be causing the problem, it is now time for you to establish a theory of probable cause.
Question the Obvious
Sometimes the solution to a problem is something very simple that goes unnoticed. For example, a user calls the helpdesk to report that their screen suddenly went black. You arrive to investigate and notice that the power light on the monitor is not on. You look behind the monitor and see that the power cord is plugged into the monitor. You follow the power cord toward the wall and discover that it is not plugged into the outlet. When you tell the user, they recall accidentally kicking something under their desk, which more than likely pulled the power cord out of the outlet. You plug the power cord back into the wall and secure it to the desk with zip ties so the user cannot accidentally kick it loose in the future. Problem solved.
Consider Multiple Approaches
A rule you should tell yourself (and only yourself) is the K.I.S.S. rule: Keep It Simple, Stupid. Sometimes there are multiple approaches to solving a problem, but it is best to go with the simplest one that is easiest to implement. For example, if a user is experiencing problems with their keyboard such as sticky keys, simply swap out the keyboard for another one so the user can continue with their work, instead of taking apart the keys to remove the stickiness.
Divide & Conquer
Sometimes a problem involves a component and its various subsystems. Take a printer, for example. From the time a user hits the print button until a document is actually printed, several subsystems are involved, and any of them could cause a print failure. The printing subsystem includes the printer, the USB cable between the printer and the computer, the USB port, the printer driver in the operating system, and the application. The first thing you could do is check whether the printer is turned on and, if so, whether it has ink or toner. If it does not, that may be the problem. If it does, check that the USB cable is plugged into the correct ports on both the printer and the computer. If it is, move on to the next subsystem, and continue until you isolate the problem. This is called the “divide & conquer” technique, which allows you to find and fix problems in a systematic manner.
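The divide & conquer walk through the printing subsystem can be sketched as an ordered list of checks that stops at the first one that fails. This is only an illustration: the check names and the `observed` values are hypothetical stand-ins for what a technician would actually inspect by hand.

```python
def isolate_fault(checks):
    """Run checks in order and return the name of the first one that fails.

    `checks` is a list of (name, function) pairs; each function returns
    True when that subsystem looks healthy. Returns None if all pass.
    """
    for name, check in checks:
        if not check():
            return name
    return None


if __name__ == "__main__":
    # Hypothetical observations recorded while inspecting the printer.
    observed = {
        "printer_on": True,
        "has_ink_or_toner": True,
        "usb_cable_seated": False,
        "driver_installed": True,
    }

    # Ordered from the printer back toward the application.
    checks = [
        ("printer power", lambda: observed["printer_on"]),
        ("ink or toner", lambda: observed["has_ink_or_toner"]),
        ("USB cable", lambda: observed["usb_cable_seated"]),
        ("printer driver", lambda: observed["driver_installed"]),
    ]

    print("First failing subsystem:", isolate_fault(checks))
```

Because the checks run in a fixed order and stop at the first failure, each subsystem is ruled out before the next one is examined, which is the point of the technique.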
Once you believe you have established a probable cause, it is time to test your theory to determine the cause. To test a theory, change what you think is causing the problem. Some examples are as follows:
After you make a single change in the system, retest it to see if the problem is solved.
Side Note: If you don’t record the current configuration of the system’s hardware and software before you make a change to test your theory, you will not be able to reset the system to its previous condition if your first change doesn’t solve the problem.
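The habit described above (record the configuration, make one change, retest, and put things back if the change did not help) can be sketched as follows. This is a minimal sketch under stated assumptions: the configuration is modeled as a plain dictionary, and `try_single_change` and `problem_is_fixed` are hypothetical names chosen for the example.

```python
import copy


def try_single_change(config, key, new_value, problem_is_fixed):
    """Apply one change, retest, and restore the old settings on failure.

    `config` is modified in place. `problem_is_fixed` is a function that
    takes the config and returns True when the retest passes.
    """
    snapshot = copy.deepcopy(config)  # record settings before changing anything
    config[key] = new_value           # make exactly one change
    if problem_is_fixed(config):      # retest
        return True
    config.clear()                    # the change did not help:
    config.update(snapshot)           # reset to the recorded settings
    return False


if __name__ == "__main__":
    # Hypothetical settings and a hypothetical retest, for illustration only.
    settings = {"driver": "v1", "port": "USB001"}

    def retest(cfg):
        return cfg["driver"] == "v2"

    print(try_single_change(settings, "port", "USB002", retest), settings)
    print(try_single_change(settings, "driver", "v2", retest), settings)
```

Restoring the snapshot after a failed test means the next theory is tested against the original system, not against a system that still carries an earlier, unhelpful change.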
Once the Theory is Confirmed (Confirmed Root Cause), Determine the Next Steps to Resolve the Problem
If your theory is confirmed, it’s now time to resolve the problem. Here are some examples from the previous section:
If the Theory is Not Confirmed, Establish a New Theory or Escalate
If testing shows that your theory was wrong, the next thing you should do is develop a new theory and test it. If you are still confident that you have identified the correct problematic subsystem, move on to testing the next component in that subsystem.
For example, if you remove a USB cable from a USB port, plug that same cable into a different yet similar system, and that system works fine, then the issue might not be the USB cable but the USB port on the original system. Possible causes include damaged contact pins or a buildup of dirt inside the port. If you conclude that there is no dirt and there are no damaged contact pins inside the USB port, your next step may be to escalate the problem to the next support tier.
Once you have identified the problem and discovered a solution, it is time to establish a plan of action to resolve the problem and identify potential effects. An example of how to deal with a malware outbreak is as follows:
If you are responsible for implementing the plan of action, follow it carefully. Be sure to note any problems with the plan or any additional problems you observe. If you are not responsible for implementing the plan of action, escalate it to the department that is responsible.
Once you have implemented the solution, the next step is to check to make sure that the system, peripheral, or device actually does what it is supposed to do. An example of a full system functionality test is as follows:
In IT, you are more than likely going to encounter similar problems over and over again. Instead of approaching each problem as if it were the first time you have seen it, document your findings, lessons learned, actions, and outcomes from each problem you solve, so that you build a repository of solutions for the problems you are bound to encounter again. Be sure to add any figures (screen captures, diagrams, photos, etc.) that will help you or others solve similar problems next time. Detailed documentation is your friend when it comes to solving problems in IT.
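One possible shape for such a repository is sketched below: a small record for each solved problem and a keyword search across past records. The field names and the `search_records` helper are assumptions made for illustration; a real organization would more likely use its ticketing system or knowledge base.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TroubleshootingRecord:
    """One documented problem: findings, actions, outcome, lessons learned."""
    problem: str
    symptoms: list
    root_cause: str
    actions: list
    outcome: str
    lessons_learned: str = ""
    attachments: list = field(default_factory=list)  # e.g. screenshot file names


def search_records(records, keyword):
    """Return the records that mention the keyword in any field."""
    needle = keyword.lower()
    return [r for r in records if needle in json.dumps(asdict(r)).lower()]


if __name__ == "__main__":
    # Record based on the monitor example from this video.
    record = TroubleshootingRecord(
        problem="Monitor went black",
        symptoms=["Screen suddenly black", "Monitor power light off"],
        root_cause="Power cord kicked out of the wall outlet",
        actions=["Plugged cord back in", "Secured cord to desk with zip ties"],
        outcome="Monitor working",
        lessons_learned="Check power before anything else",
    )
    for match in search_records([record], "power cord"):
        print(json.dumps(asdict(match), indent=2))
```

Keeping every record in the same structure makes past solutions easy to search the next time similar symptoms appear.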