--draft--
Scope of Incident Management Process
An incident is an unplanned interruption of, or reduction in the quality of, the Seamless Access service, or a failure of one of the Seamless Access service components that has not yet impacted the service. The purpose of incident management is to restore normal service operation as quickly as possible and to minimize the adverse impact on business operations, thus ensuring that agreed levels of service quality are maintained.
Two types of incidents are defined:
- Generic incidents
- TIER2 infrastructure incidents
The handling of security incidents and the resolution times are defined in the Seamless Access Operational Level Agreements (OLA). This process describes them in more detail.
Generic Incidents
Start points:
- S1: Incident detected by a monitoring tool and reported as an event to SUNET NOC
- S2: Incident reported by users
- Open question: the process for handling user-reported incidents is not yet defined.
Tasks:
- IM 01: SUNET NOC creates an incident in status.seamlessaccess.org whenever it receives an alarm from Nagios XI, Pingdom or another source.
- IM 02: The incident should be updated at least once a day.
- IM 03: Identify whether the incident is critical or minor.
- IM 04: If the incident is critical, the Service Operations Manager, Product Manager and Technical Lead are notified, as specified in section 6.2, Incident handling.
- IM 05: If the incident is minor, the NOC checks whether it is related to the software running in the Docker images or is an operational incident that the NOC can handle itself. In the first case, the incident is escalated to the Technical Lead/layer 3 support.
- IM 06: In case of an operational incident, the person in charge at SUNET NOC tries to resolve it using common knowledge or with the help of the SA documentation at https://wiki.sunet.se/display/sunetops/SeamlessAccess.
- IM 07:
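The routing described in IM 02 to IM 06 can be sketched in a few lines of Python. This is only an illustration of the decision flow, not part of any SUNET tooling: the names Incident, Severity, Cause, route_incident and update_overdue are hypothetical, and reading "at least once a day" as a 24-hour window is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import List, Optional


class Severity(Enum):
    CRITICAL = "critical"
    MINOR = "minor"


class Cause(Enum):
    SOFTWARE = "software"        # software running in the Docker images
    OPERATIONAL = "operational"  # can be handled by the NOC itself


@dataclass
class Incident:
    """Hypothetical record of an incident, for illustration only."""
    summary: str
    severity: Severity
    cause: Optional[Cause] = None  # determined by the NOC for minor incidents


def route_incident(incident: Incident) -> List[str]:
    """Return the roles that act on the incident (IM 04 to IM 06)."""
    if incident.severity is Severity.CRITICAL:
        # IM 04: critical incidents are notified to these three roles.
        return ["Service Operations Manager", "Product Manager", "Technical Lead"]
    if incident.cause is Cause.SOFTWARE:
        # IM 05: software-related minor incidents are escalated.
        return ["Technical Lead/layer 3 support"]
    # IM 05 / IM 06: the NOC classifies the incident and, if it is
    # operational, resolves it itself.
    return ["SUNET NOC"]


def update_overdue(last_update: datetime, now: datetime) -> bool:
    """IM 02: True if more than 24 hours (assumed window) have passed."""
    return now - last_update > timedelta(days=1)
```

For example, route_incident(Incident("login page down", Severity.MINOR, Cause.SOFTWARE)) returns the Technical Lead/layer 3 support role, while the same incident with Cause.OPERATIONAL stays with SUNET NOC.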