What is Incident Management?
When a service fails to deliver its expected functions or performance during service hours, the priority is to restore that service so that the company’s day-to-day operations can continue. If the failure could result in a security breach or degradation of the service, responses should already be in place that are triggered to prevent the failure from disrupting the company’s workflow.
Incident management describes the processes that keep services functional for as long as possible and repair them as quickly as possible when they fail. This is its main objective. Level 1 support, often carried out by service desk staff, typically involves the following:
- Identifying incidents
- Logging incidents
- Categorizing incidents
- Prioritizing incidents
- Diagnosing incidents
- Resolving incidents
- Closing incidents
- Escalating incidents if necessary to level 2 support
- Communicating with users throughout the incident
Incident management focuses on restoring a service as quickly and efficiently as possible and typically doesn’t involve analysing the root cause of the incident. This means that the “fix” is often a temporary solution or workaround that keeps the service functioning, sometimes at reduced efficiency. The goal is to get the service running again before the incident is passed on to higher-level support, who will diagnose it and search for the cause.
The entire process is recorded in case the problem repeats itself, and the data is stored in the known error database, or KEDB for short. This way, known workarounds and fixes can be deployed immediately when the same issue recurs, so it is resolved sooner. A predefined way of handling a recurring type of incident is often known as an incident model and includes the following:
- The steps that should be taken to ensure that the incident is resolved quickly
- The sequence in which these steps should be taken and the responsibilities involved
- Precautions that should be taken before the incident is resolved
- The expected timeframe for the resolution
- Any escalation procedures involved before pushing the issue to level 2 support
- Methods for preserving and reporting the data collected
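As a rough illustration, the elements of an incident model listed above could be captured in a small record type. The sketch below is in Python. The class name `IncidentModel`, its field names, the in-memory `kedb` dictionary and the example values are all hypothetical stand-ins chosen for this example, not part of any particular service desk product.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Dict, List, Optional


@dataclass
class IncidentModel:
    """A predefined way of handling one recurring type of incident."""
    name: str
    steps: List[str]                  # steps to take, in the order they should run
    responsibilities: Dict[str, str]  # step -> role responsible for it
    precautions: List[str]            # checks to make before resolving
    expected_resolution: timedelta    # expected timeframe for the resolution
    escalation: str                   # procedure before pushing to level 2
    reporting: str                    # how collected data is preserved and reported


# Hypothetical in-memory stand-in for the KEDB: category -> stored model.
kedb: Dict[str, IncidentModel] = {}


def register(category: str, model: IncidentModel) -> None:
    """Store a model so it can be reused when the incident recurs."""
    kedb[category] = model


def lookup(category: str) -> Optional[IncidentModel]:
    """Return the stored model for a category, or None if there is none."""
    return kedb.get(category)


# Example values below are invented purely for illustration.
register("password_reset", IncidentModel(
    name="Forgotten password",
    steps=[
        "Verify the user's identity",
        "Reset the password",
        "Confirm the user can log in",
    ],
    responsibilities={"Reset the password": "service desk analyst"},
    precautions=["Check the account is not locked for security reasons"],
    expected_resolution=timedelta(minutes=15),
    escalation="Escalate to level 2 if the account cannot be unlocked",
    reporting="Record the outcome on the ticket",
))
```

With a store like this, level 1 staff could call `lookup("password_reset")` when a matching ticket arrives and follow the recorded steps, falling back to normal diagnosis when `lookup` returns `None`.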
The incident management model is closely related to other service management processes such as:
- Problem management. Problem management also relies on accurate data collection and manages the KEDB which can assist in incident management.
- Change management. Many incidents are caused by changes, so recording which incidents result from changes is crucial. The number of such incidents is a key performance indicator for change management.
- Service level management. Service level agreements, known as SLAs, often define timescales and escalation procedures for multiple types of incidents including the breach of a service level.
- Service asset and configuration management. The configuration management system identifies the relationships that each service component has while also providing integration of configuration data with problem and incident data.
What’s Classed as an Incident?
An incident is defined as an unplanned interruption to an IT service. Reduced quality of service can also be defined as an incident that should be documented and reported to the appropriate service level.
Incident management differs from problem management and request management. An incident involves an interruption to a regular service or to its quality, whereas a problem is typically identified after multiple incidents with similar issues occur. Service requests are formal requests, often requiring approval from a senior position, for things such as training, hardware, licenses and other services. Problem management delves into the root cause of a problem, while incident management attempts to restore functionality even if a workaround or temporary solution must be used.
Both problems and service requests differ from incidents, which interrupt normal service. The interruption can be complete, such as a computer breaking down or failing to connect to a network. However, partial faults and a degraded service also count, for example an internet connection that is slow enough to be a nuisance without stopping work entirely. In short, if it’s an unplanned issue that prevents or hinders someone from doing their job and requires a service provider to fix, it’s an incident.
The Role of the Service Desk
As explained previously, incident management is typically handled by the service desk or help desk of a business. This is an easily accessible point of contact that users can turn to when reporting incidents and disruptions to business-critical services. Without a service desk, users are forced to contact support staff directly, which can lead to congested communication lines, an inefficient service and long waits until a service is brought back online. Such ad hoc contact also lacks prioritization and categorization, which means that high-priority issues may be pushed back, ignored or even forgotten about.
A well-structured service desk enables support staff to see everyone’s issues, categorize them and prioritize them based on their importance and severity. It also has a number of other advantages: support staff can share knowledge among themselves and create self-service models that don’t require third-party assistance, and the desk assists in the collection of data that can be submitted to and used in the KEDB.
Service desks are often divided into several different tiers of support:
- Tier one incidents typically involve basic issues. This can include forgotten passwords, PC troubleshooting and other commonly recurring problems. These can be converted into incident models since fixing these issues is often routine and simple.
- Tier two incidents often require specialist training or skills in order to find the root issue of a problem. Something that involves networking hardware may require tier-two escalation and this is where specialists are brought in.
In most cases, an incident that could disrupt business operations, halt day-to-day work or cause a potential security breach is defined as a high-priority tier two issue. This is because such incidents either put the business at risk or slow down productivity enough that the business cannot function normally. If the issue is only preventing one or two employees from working, then it is often a low- to medium-priority incident, and the exact priority depends on the affected employee. For instance, if a VIP member of an organization cannot access their computer during a crucial time, then priority will be given to fixing that employee’s issue because of the role they play in the company.
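The prioritization rules described here can be sketched as a small function. This is only one possible reading of the text, written in Python. The function name `prioritize`, its parameters, the three-level `Priority` scale and the cut-off of two affected users are assumptions made for this example. Real service desks usually define their own priority matrix in their SLAs.

```python
from enum import IntEnum


class Priority(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def prioritize(affected_users: int,
               halts_operations: bool,
               security_risk: bool,
               vip_affected: bool = False) -> Priority:
    """Return a priority for an incident using the rules sketched in the text."""
    # Anything that stops the business or risks a breach is high priority.
    if halts_operations or security_risk:
        return Priority.HIGH

    # Assumption: "one or two employees" is the cut-off for low priority.
    base = Priority.LOW if affected_users <= 2 else Priority.MEDIUM

    # A VIP being affected raises the priority by one level, capped at HIGH.
    if vip_affected:
        return Priority(min(base + 1, Priority.HIGH))
    return base
```

For example, `prioritize(1, False, False)` yields `Priority.LOW`, while the same incident with `vip_affected=True` yields `Priority.MEDIUM`. Using an ordered enum keeps comparisons such as `priority >= Priority.MEDIUM` readable when sorting a ticket queue.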
Incidents will typically go through the following process:
- The incident is identified and logged as a ticket
- It is then categorized
- The service desk will then prioritize it based on its severity
- A response is generated to fix the issue
- The response starts with an initial diagnosis
- A temporary fix or workaround is used if possible
- The incident is escalated if it cannot be fixed
- An investigation begins and a diagnosis is made
- The incident is resolved and a recovery is made
- Closure includes recording the incident for future reference
This isn’t a set-in-stone process, but it is recommended for most companies that use a ticket-based incident management system. In short, the service desk plays an important role in categorizing reported incidents and prioritizing them by severity.
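In a ticket-based system, the process listed above is often enforced as a set of allowed state changes. The following Python sketch shows one way this might look. The state names, the `TRANSITIONS` table and the `Ticket` class are hypothetical and simplify the list above: diagnosis and workaround attempts are folded into a single `DIAGNOSING` state, and the level 2 investigation happens while the ticket is `ESCALATED`.

```python
from enum import Enum, auto


class State(Enum):
    LOGGED = auto()
    CATEGORIZED = auto()
    PRIORITIZED = auto()
    DIAGNOSING = auto()   # initial diagnosis and any workaround attempt
    ESCALATED = auto()    # level 2 investigates and diagnoses
    RESOLVED = auto()
    CLOSED = auto()


# Which states a ticket may move to from each state (an assumed workflow).
TRANSITIONS = {
    State.LOGGED: {State.CATEGORIZED},
    State.CATEGORIZED: {State.PRIORITIZED},
    State.PRIORITIZED: {State.DIAGNOSING},
    State.DIAGNOSING: {State.RESOLVED, State.ESCALATED},
    State.ESCALATED: {State.RESOLVED},
    State.RESOLVED: {State.CLOSED},
    State.CLOSED: set(),
}


class Ticket:
    """A minimal incident ticket that records every state it passes through."""

    def __init__(self, summary: str) -> None:
        self.summary = summary
        self.state = State.LOGGED
        self.history = [State.LOGGED]  # kept for future reference at closure

    def advance(self, new_state: State) -> None:
        """Move to new_state, rejecting moves the workflow does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot move from {self.state.name} to {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)


# Example: an incident that level 1 cannot fix and has to escalate.
ticket = Ticket("Office network switch unreachable")
for step in (State.CATEGORIZED, State.PRIORITIZED, State.DIAGNOSING,
             State.ESCALATED, State.RESOLVED, State.CLOSED):
    ticket.advance(step)
```

Keeping the allowed transitions in a single table makes it easy to adapt the workflow, which fits the point that the process is a recommendation, not a fixed rule.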
What is ICORE TECHNOLOGIES Incident Management?
When it comes to incident management, the process typically involves handling and escalating incidents as they occur. Where possible, service should be restored to a usable state, even if that means using workarounds and temporary fixes. If finding the root cause or a permanent solution is out of scope, incident management escalates and reports the issue instead of pursuing it.
Once incident management processes have been established, they provide businesses with a quick and efficient method of resolving service interruptions. Most companies will be replacing a slower existing communication method, such as emailing support technicians, with a formal ticketing system that can prioritize and categorize support requests. In addition, the system tracks data and records that accumulate in the KEDB, which can be accessed in the future to help resolve recurring issues and to escalate the problem to senior staff who can begin the problem-solving process.
In larger companies, incident management is often handled by a dedicated member of staff. In small and medium-sized businesses, however, the task is delegated to service desk staff. The value of an incident management process is easy to see: no one enjoys waiting in line or communicating via slow emails to resolve a service issue, so handling such issues quickly and efficiently makes staff more productive, which benefits the company.
Summary
An incident management system plays a very important role in the overall productivity and effectiveness of a large corporation. Without it, resolving issues can be time-consuming and congest communications within a business. With the right processes and a knowledgeable service desk, incident management can be simple to handle and will help create a more efficient business.