
Why MSP incident response breaks under real attacks

MSP incident response usually breaks under real attacks because teams act from habit instead of following one shared, documented process. When detection, triage, communication and evidence capture all live in different tools and people’s heads, every serious incident becomes a scramble, and you have nothing simple or consistent to show customers, insurers or auditors when they ask how you stayed in control.

Clear process beats heroic effort when seconds count.

In many MSPs, incident response “grew up” informally. Senior engineers know what to do, but their approach lives in chat threads, unstructured tickets, personal checklists and war stories. Service desk staff raise tickets in their own way, SOC analysts use different severity scales, and account managers talk to customers based on what they happen to have heard. The result is inconsistency: two similar incidents in different tenants are handled in completely different ways. That inconsistency is not just an operational nuisance. It also conflicts directly with ISO 27001’s expectation, set out in its clauses on planning, operation and documented information, that key security activities follow defined, repeatable procedures rather than informal habits.

Most organisations in the 2025 ISMS.online survey said they had already been impacted by at least one third‑party security incident in the past year.

Multi‑tenant platforms magnify the risk. A failure or compromise in a shared RMM, identity service, backup platform or monitoring tool rarely affects just one customer. Without a unified view, teams see dozens of tickets that all look local, rather than one coordinated multi‑tenant incident that needs central ownership. That makes it harder to see blast radius, harder to coordinate containment and much harder to give consistent answers to all affected customers. Community incident reports from CSIRTs such as DIVD have shown how weaknesses or compromises in widely used MSP tools can quickly cascade across many customer environments at once, underlining why structured, cross‑tenant incident handling matters.

Another common fault line is the blurred line between firefighting and incident management. Engineers are rightly rewarded for restoring service quickly. Under pressure, they may bypass steps such as classification, notification decisions, proper logging of actions taken or preservation of evidence. Work gets done, but the story of what happened, who approved what and whether obligations were met is incomplete.

Finally, documentation is rarely designed with reconstruction in mind. Timelines, key decisions, customer calls and internal debates live in multiple places. If a regulator, board or major customer later asks for a precise, defensible narrative of an event, teams end up piecing it together manually. That is slow, stressful and prone to gaps that erode trust.

An ISO 27001‑aligned incident response runbook template addresses these problems by giving your MSP one shared model: common lifecycle, common definitions, common roles and common records. It does not replace engineering skill; it turns that skill into repeatable behaviour that can be demonstrated. Implementation guidance from certification bodies and standards organisations, including providers of ISO 27001 training and audits such as BSI, consistently emphasises the value of having standardised, documented incident processes rather than relying on individual habits. When that runbook lives inside a structured ISMS platform such as ISMS.online, the same actions that resolve incidents also generate the evidence you need for audits, customer assurances and continual improvement.

What “good” looks like when you replay your last serious incident

“Good” looks like being able to replay a serious incident as one clear, consistent flow from first alert to lessons learned. You should be able to trace detection, triage, communications, technical actions, approvals and improvements in a single narrative, no matter which tenant was affected.

In a mature MSP, that replay is boring in the best possible way. The first responder knows how to log the event, which questions to ask and when to escalate based on a clear severity model. A named incident manager takes ownership once agreed criteria are met. The team uses prepared checklists for the relevant incident type. Customer communications follow pre‑approved templates. All actions are logged against the incident, and evidence is preserved according to policy. After recovery, a post‑incident review captures root causes, improvements and any changes to risk or controls.

If your actual replay feels nothing like that, and instead involves hunting through chat logs, arguing about who owned what or struggling to remember which customers were told what, then your organisation is running on intuition rather than a standard. That is precisely the gap an ISO 27001‑aligned runbook template is designed to close.

Why ISO 27001 turns nice to have process into a business requirement


The ISO 27001 backbone for MSP incident response

The ISO 27001 backbone for MSP incident response is the set of clauses and Annex A controls that define how you plan, operate, evidence and improve your incident process. When you design your runbook around that backbone, you stop writing standalone procedures and start building a visible, auditable incident management system that aligns with clear expectations.

Earlier, you saw how undocumented response creates audit pain and leaves you scrambling for records. The ISO 27001 backbone is how you fix that in a way regulators, customers and auditors all recognise. An ISO 27001‑aligned runbook lets you point to one coherent system instead of a patchwork of habits and ad‑hoc documents.

An ISO 27001‑aligned incident response runbook is essentially a practical expression of the standard’s planning, operational control and incident‑management controls. It translates clauses and Annex A controls into headings, fields and workflows your team can actually follow. Instead of writing procedures in isolation, you design the runbook as part of your Information Security Management System.

At the planning level, clause 6 of ISO 27001 asks organisations to determine information security risks and opportunities and to plan actions to address them. For incident response, that means understanding what kinds of incidents are relevant to your MSP, which assets and services are most critical and what objectives you have for detection, response, communication and learning. Those objectives then drive the content of the runbook and the metrics you later track.

Clause 8, on operational planning and control, raises the bar further. It requires you to plan, implement and control the processes needed to meet information security requirements, and to maintain documented information as evidence that those processes are being carried out as intended. An incident response runbook is one of the clearest ways to show that your incident process is defined, controlled and backed by records.

Annex A controls 5.24 to 5.28 focus specifically on information security incident management. In the 2022 revision of ISO 27001, these controls group together planning and preparation, event assessment and decision‑making, incident response, learning from incidents and evidence handling, replacing the older Annex A.16 structure. Overviews of the Annex A updates, such as the summary published by IT Governance, note that this makes expectations clearer for organisations that manage incidents regularly, such as MSPs. An MSP runbook that aligns with these controls will therefore need sections devoted to each of those themes, with clear links to roles, workflows and records.

For a managed service provider, these requirements must be applied through the lens of multi‑tenancy and shared responsibility. Your runbook needs to answer not only “how do we handle an incident?” but also “how do we define what is in scope for us versus the customer or a third party?”, “how do we reflect SLAs and regulatory obligations for each tenant?” and “how do we show auditors that this is consistent across our portfolio?”. For privacy and legal officers, the same backbone provides assurance that regulatory reporting, evidence standards and data‑protection duties are embedded in the process rather than bolted on.

Mapping clauses and controls into clear runbook sections

You can make ISO 27001 traceable in daily work by mapping clauses and Annex A controls into simple, named sections in your runbook. Each section becomes both a practical guide for staff and a visible bridge to specific requirements during an audit, so you spend less time explaining and more time showing how things work.

A concise, ISO‑aligned structure might include:

  • Purpose and scope: incident types, environments, services and tenants in scope.
  • Roles and responsibilities: key internal and external roles, mapped to specific actions.
  • Lifecycle overview: high‑level phases from detection to post‑incident review.
  • Procedures: stepwise guidance for detection, assessment, containment, recovery and review.
  • Evidence and records: minimum logs and artefacts to capture at each stage.
  • Governance: ownership, review frequency, change control, training and testing.

Purpose and scope mainly support clause 4.3 and clause 6.1. Roles and responsibilities help you meet clause 5.3. Lifecycle, procedures and evidence sections show how you satisfy clause 8.1 and Annex A controls 5.24–5.28 in concrete terms. Governance closes the loop with clause 9 on performance evaluation and clause 10 on improvement. Implementation guides for ISO 27001 often illustrate similar mappings between documented procedures and specific clauses and controls, while stressing that organisations are free to choose section titles that suit their context, as long as the underlying requirements are covered in a traceable way, as reflected in overviews from organisations like BSI.
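
One way to keep that mapping traceable is to hold it as data next to the runbook. The sketch below simply restates the mapping from the paragraph above; the names `SECTION_REFERENCES` and `sections_for` are illustrative, not part of any standard or product.

```python
# Illustrative only: runbook sections mapped to the ISO 27001 references
# named in the text. Adjust titles and references to your own ISMS.
SECTION_REFERENCES = {
    "Purpose and scope": ["Clause 4.3", "Clause 6.1"],
    "Roles and responsibilities": ["Clause 5.3"],
    "Lifecycle overview": ["Clause 8.1", "Annex A 5.24-5.28"],
    "Procedures": ["Clause 8.1", "Annex A 5.24-5.28"],
    "Evidence and records": ["Clause 8.1", "Annex A 5.24-5.28"],
    "Governance": ["Clause 9", "Clause 10"],
}


def sections_for(reference: str) -> list[str]:
    """Return the runbook sections annotated with a given clause or control."""
    return [
        section
        for section, references in SECTION_REFERENCES.items()
        if reference in references
    ]


print(sections_for("Clause 8.1"))
```

During an audit, a lookup like this lets you answer "where is clause 8.1 addressed?" by pointing at named sections rather than searching the document.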

To make this feel like a concrete template, you can define a standard incident record layout. Typical fields include incident ID, tenant, services affected, incident type, severity, status, owner, key timestamps (detected, acknowledged, contained, recovered, closed), linked risks and controls and attachments for evidence. When every incident uses the same field set, it becomes much easier to compare events and satisfy ISO 27001’s documentation expectations.
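
As a rough sketch, the field set described above could be expressed as a single record type. The field names follow the text; the class name, the types and the `minutes_to_contain` helper are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class IncidentRecord:
    """Hypothetical standard incident record; fields follow the text above."""
    incident_id: str
    tenant: str
    services_affected: list[str]
    incident_type: str
    severity: str
    status: str
    owner: str
    detected_at: datetime
    acknowledged_at: Optional[datetime] = None
    contained_at: Optional[datetime] = None
    recovered_at: Optional[datetime] = None
    closed_at: Optional[datetime] = None
    linked_risks: list[str] = field(default_factory=list)
    linked_controls: list[str] = field(default_factory=list)
    evidence_attachments: list[str] = field(default_factory=list)

    def minutes_to_contain(self) -> Optional[float]:
        """Minutes from detection to containment, or None if not yet contained."""
        if self.contained_at is None:
            return None
        return (self.contained_at - self.detected_at).total_seconds() / 60
```

Because every incident shares the same timestamps, comparisons such as time to containment can be derived from the records themselves rather than reconstructed afterwards.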

Each of these sections can be annotated internally with references to the relevant clauses and controls, making it easy to show in an audit how you have interpreted the requirements. For engineers and operations staff, the value lies in the concrete headings and checklists; for auditors and compliance leads, the value lies in the traceability.

Keeping the runbook usable while still audit‑ready

An ISO‑aligned runbook only adds value if your teams actually use it when pressure is high. The goal is a document that is light enough to follow in real time while still rich enough to satisfy auditors and legal review, so it earns trust without slowing down real work.

A practical way to achieve this is to separate concept from action. Policy‑level statements and detailed rationales can live in supporting ISMS documents, while the runbook itself stays focused on operational steps, decision points, prompts and references. That means writing in the language your engineers already use, keeping steps simple and sequential and tailoring examples to the incident types your MSP actually sees.

Integrating the runbook into the platform you use for your ISMS, rather than leaving it as a static document on a file share, makes this easier to sustain. Your ISMS platform can manage ownership, version control, training records and links to real incident logs and corrective actions, while the runbook stays focused on guiding day‑to‑day behaviour.

As you refine the template, aim for a balance: enough structure and mapping to satisfy ISO 27001, but not so much verbosity that teams abandon it during high‑pressure events. Short, focused runbooks for common incident types, all hanging from the same ISO‑aligned framework, are usually more effective than a single, encyclopaedic procedure.






End‑to‑end ISO‑aligned incident lifecycle: detection to post‑incident review

An ISO‑aligned incident lifecycle gives your MSP a predictable, measurable path from first signal to lessons learned, with each phase clearly defined and leaving the right records. When that lifecycle is documented in your runbook and aligned with recognised models such as ISO 27035 and NIST‑style incident response, while still reflecting your own tools, teams and tenants, you get something familiar enough to use under pressure and structured enough to show auditors, customers and executives exactly how incidents flow through your organisation.

At a high level, the lifecycle will always include some version of the following stages: detection and reporting, assessment and classification, containment, eradication and recovery, closure and post‑incident review. ISO 27001 does not dictate the exact names, but it does expect that events are assessed, incidents are responded to and learning is fed back into the ISMS. Community guidance on the Annex A incident‑management controls makes the same point: you are free to label your phases as you wish, provided you can show that each of those expectations is met. A runbook template built around these phases gives you a natural way to satisfy those expectations and align with Annex A controls 5.24–5.28.

The lifecycle is also where you make hand‑offs explicit. Each phase should have a clear entry condition (what makes this phase start), defined activities, responsible roles and an exit condition (what must be true before moving on). That structure turns a messy, continuously evolving incident into a series of controlled steps, each of which can generate the records your ISMS needs while keeping responders focused on the work in front of them.

For busy MSP teams, the most important test is whether the lifecycle is understandable and usable in the middle of the night. Phase names should match words your engineers already use. Activities should be described in the order they will actually be carried out. Decisions should be framed so that first responders know when to escalate rather than hesitating.

Designing lifecycle phases with clear hand‑offs and records

Design each lifecycle phase around four elements: purpose, triggers, key activities and required records. This repeatable structure makes the lifecycle easy to teach, adapt and audit as your MSP grows.

For example:

  • Detection and reporting: capture events consistently, log key context and decide whether they are information security incidents.
  • Assessment and classification: determine severity, impact and scope, then decide who should be involved in the response.
  • Containment, eradication and recovery: apply agreed technical actions to limit harm, remove causes and restore services safely.
  • Closure and review preparation: confirm monitoring is clean, notifications are complete and documentation is ready for review.
  • Post‑incident review: analyse causes, decide improvements and link actions to risks, controls and owners.

To make this more concrete, you can attach a short checklist to each phase in the template. For instance, the “Detection and reporting” section might include prompts such as “Record who reported the issue”, “Capture affected tenant and service”, “Attach initial logs or screenshots” and “Set a provisional severity”. That level of detail keeps the phase grounded in what front‑line staff actually do.

When these elements are included in the runbook template, each incident naturally generates the evidence ISO 27001 expects: logs of events, decisions, actions and improvements. Management reviews can then draw directly from those records rather than relying on anecdote.
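
The idea of explicit exit conditions can be sketched as a small state machine that refuses to move on until a phase's required records exist. The phase names follow the list above and the first record set reuses the checklist prompts quoted earlier; every other record name, and the `IncidentLifecycle` class itself, is a hypothetical placeholder.

```python
from dataclasses import dataclass, field

PHASES = [
    "detection_and_reporting",
    "assessment_and_classification",
    "containment_eradication_recovery",
    "closure_and_review_preparation",
    "post_incident_review",
]

# Assumed minimum records per phase; replace with your own runbook's list.
REQUIRED_RECORDS = {
    "detection_and_reporting": {
        "reporter", "tenant_and_service", "initial_logs", "provisional_severity",
    },
    "assessment_and_classification": {"severity", "scope", "responders"},
    "containment_eradication_recovery": {"actions_taken", "approvals"},
    "closure_and_review_preparation": {"monitoring_clean", "notifications_complete"},
    "post_incident_review": {"root_cause", "improvement_actions"},
}


@dataclass
class IncidentLifecycle:
    phase: str = PHASES[0]
    records: dict[str, set[str]] = field(default_factory=dict)

    def record(self, item: str) -> None:
        """Log a record against the current phase."""
        self.records.setdefault(self.phase, set()).add(item)

    def missing(self) -> set[str]:
        """Required records not yet captured for the current phase."""
        return REQUIRED_RECORDS[self.phase] - self.records.get(self.phase, set())

    def advance(self) -> str:
        """Move to the next phase only if the exit condition is met."""
        missing = self.missing()
        if missing:
            raise ValueError(f"Cannot leave {self.phase}: missing {sorted(missing)}")
        index = PHASES.index(self.phase)
        if index == len(PHASES) - 1:
            raise ValueError("Already in the final phase")
        self.phase = PHASES[index + 1]
        return self.phase
```

In practice most teams would enforce this through ticket workflow rules rather than code, but the logic is the same: the hand-off is blocked until the evidence for the phase exists.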

Making the lifecycle real for multi‑tenant MSP operations

For an MSP, the lifecycle must also handle cross‑tenant and cross‑team realities. A single incident may involve multiple internal teams (service desk, SOC, platform engineering, account management) and multiple external parties (customers, vendors, regulators). The runbook has to describe not only what happens, but who is responsible at each step and how that responsibility shifts as the incident evolves.

A simple but powerful technique is to add a RACI view for each phase, tailored to your MSP. For example, in assessment and classification, the SOC analyst might be responsible, the incident manager accountable, the customer’s security contact consulted and the account manager informed. In containment, platform engineering might be responsible for shared services, while the customer’s IT team is responsible for client‑side actions. Documenting this once, and refining it over time, removes guesswork in the middle of incidents.
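
A per-phase RACI view can be kept as plain data and checked automatically. In the sketch below, the assessment roles come from the example above; the accountable, consulted and informed roles for containment are assumptions, and the "exactly one accountable role" rule is a common RACI convention rather than an ISO 27001 requirement.

```python
RACI = {
    "assessment_and_classification": {
        "responsible": ["SOC analyst"],
        "accountable": ["Incident manager"],
        "consulted": ["Customer security contact"],
        "informed": ["Account manager"],
    },
    "containment": {
        "responsible": [
            "Platform engineering (shared services)",
            "Customer IT team (client-side actions)",
        ],
        "accountable": ["Incident manager"],   # assumption, not stated in the text
        "consulted": ["Customer security contact"],  # assumption
        "informed": ["Account manager"],       # assumption
    },
}


def phases_without_single_owner(raci: dict) -> list[str]:
    """Return phases that do not have exactly one accountable role."""
    return [
        phase
        for phase, roles in raci.items()
        if len(roles.get("accountable", [])) != 1
    ]


print(phases_without_single_owner(RACI))
```

Running a check like this whenever the RACI view changes catches phases where ownership has become ambiguous before an incident exposes the gap.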

The lifecycle also needs to express how multi‑tenant incidents are handled differently from single‑tenant ones. For example, a shared tool outage that affects many tenants may have a central master incident with linked child tickets per customer, ensuring both a global view and tenant‑specific communications. Building that pattern into the runbook prevents your team from reinventing it under pressure and gives leadership and auditors a clear demonstration of structured portfolio‑level control.
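
The master and child pattern can be sketched as a simple data structure. The class and method names here are hypothetical, and the closure rule (a master incident closes only once every linked tenant ticket is closed) is an assumed policy, not one stated in the text.

```python
from dataclasses import dataclass, field


@dataclass
class ChildTicket:
    tenant: str
    status: str = "open"


@dataclass
class MasterIncident:
    incident_id: str
    shared_service: str
    children: dict[str, ChildTicket] = field(default_factory=dict)

    def link_tenant(self, tenant: str) -> ChildTicket:
        """Create a child ticket for a tenant, or return the existing one."""
        if tenant not in self.children:
            self.children[tenant] = ChildTicket(tenant)
        return self.children[tenant]

    def open_tenants(self) -> list[str]:
        """Tenants whose child ticket is still open (the remaining blast radius)."""
        return sorted(t for t, c in self.children.items() if c.status != "closed")

    def can_close(self) -> bool:
        """Assumed rule: close the master only when all child tickets are closed."""
        return bool(self.children) and not self.open_tenants()
```

The master record gives the global view, while each child ticket carries the tenant-specific communications and status.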

For internal and external stakeholders, these explicit hand‑offs become part of your assurance story. You can show that incidents follow a tested, role‑based pattern that scales as you grow and does not rely on individuals remembering what to do in the moment.




Detection and analysis in a multi‑tenant MSP environment

Detection and analysis decide how quickly you spot real incidents and how much noise you can safely ignore across many tenants, so they largely determine your response speed and accuracy. For MSPs, this stage is complicated by varied customer environments, different monitoring tools and a mix of on‑premises, cloud and third‑party services. An ISO 27001‑aligned runbook template helps by standardising how you capture events, triage them and decide what counts as an information security incident, turning noise into meaningful signals without overriding analyst judgement or breaching contractual obligations.

At minimum, detection and analysis need to cover how events are captured, how they are logged, how they are triaged and how you decide whether they are information security incidents. For an MSP, those steps must also respect tenant boundaries, consider SLAs and contractual obligations, and recognise blind spots where you depend on client or vendor monitoring.

A good template will prompt first‑line staff to collect a consistent set of information whenever they log an event: who reported it, which tenant and service are affected, what the observable symptoms are, when the issue started and how it was detected. It will then guide analysts through a standard triage process that uses a common severity model while allowing for tenant‑specific parameters.

The aim is to prevent both overreaction (treating every alert as a critical incident) and underreaction (dismissing weak signals that later turn out to be serious). By codifying what “normal” triage looks like, and when to escalate, you create a more reliable front door for your incident process and support Annex A controls on event assessment and decision‑making.

Normalising signals and setting consistent triage rules

Normalised signals give diverse alert sources a shared language so analysts can compare and prioritise incidents across tenants. With clear incident types, severities and triage questions, you reduce uncertainty for first‑line staff and make prioritisation decisions easier to defend later.

In a multi‑tenant MSP, alerts may come from many sources: endpoint agents, firewalls, identity systems, cloud workloads, user reports, vendor notifications and more. Without a common language, each team interprets these signals differently, and it becomes hard to compare or prioritise across tenants.

Your runbook template can address this by defining:

  • A standard incident taxonomy covering types such as malware infection, unauthorised access, data loss, denial of service, configuration error and third‑party breach.
  • A severity model combining impact (on data, services and customers) and urgency (time sensitivity, regulatory or contractual drivers).
  • Default triage questions that help analysts rapidly assess each event: is there evidence of active exploitation, which tenants are affected, which critical services or data are involved and are any regulatory reporting thresholds in play?

The template can then show how tenant‑specific factors modify those defaults. For example, a disruption to a monitoring tool used by all tenants might be classified as high severity even if no data has yet been lost, whereas the same pattern in a pilot service with limited scope might be lower. For regulated tenants, certain categories of personal data or service impact may always raise severity.
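
A severity model of this kind can be sketched as a small function. The 1 to 3 scales, the score thresholds and the two modifiers below are assumptions chosen to mirror the examples in the text (a shared service never drops below "high", regulated data raises severity by one level); your own model will use different values.

```python
LEVELS = ["low", "medium", "high", "critical"]

# Assumed mapping from impact + urgency (each scored 1-3) to a base level.
_BASE_INDEX = {2: 0, 3: 1, 4: 1, 5: 2, 6: 3}


def classify(impact: int, urgency: int, *,
             shared_service: bool = False,
             regulated_data: bool = False) -> str:
    """Combine impact and urgency, then apply tenant-specific modifiers."""
    if impact not in (1, 2, 3) or urgency not in (1, 2, 3):
        raise ValueError("impact and urgency must be 1, 2 or 3")
    index = _BASE_INDEX[impact + urgency]
    if regulated_data:
        # Regulated tenants: raise severity by one level, capped at critical.
        index = min(index + 1, len(LEVELS) - 1)
    if shared_service:
        # Tools shared by all tenants never classify below "high".
        index = max(index, LEVELS.index("high"))
    return LEVELS[index]


print(classify(1, 1))                       # low
print(classify(1, 1, shared_service=True))  # high
```

Writing the rules down this explicitly, whether in code or in a table, is what makes individual triage decisions easy to defend later.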

By normalising signals in this way, you make triage more predictable and defensible. Over time, the pattern of triage decisions and outcomes can also feed into your metrics and improvements and demonstrate alignment with ISO 27001’s risk‑based approach.

Handling uncertainty, blind spots and shared responsibilities

Handling uncertainty and blind spots well is a sign of maturity. Rather than pretending you see everything, your runbook should show analysts how to act responsibly when information is incomplete and responsibilities are shared between you, customers and vendors, so you can avoid both over‑reaction and silent failure.

Real incidents rarely present with perfect information. Analysts often face grey‑area situations where activity looks suspicious but not conclusive, or where monitoring is incomplete. A good MSP runbook template acknowledges that uncertainty and provides a consistent approach.

In the 2025 ISMS.online State of Information Security survey, about 41% of organisations named managing third‑party risk and tracking supplier compliance as a top security challenge.

For suspected but unconfirmed incidents, the template might prescribe creating a provisional incident record, increasing monitoring, setting a review time and managing customer expectations carefully. It can also define conditions under which these provisional incidents are closed, escalated or converted into full incidents.

The template should also explicitly recognise monitoring blind spots. These might include legacy systems without modern agents, third‑party SaaS where you rely on vendor logs or customer‑owned infrastructure outside your direct control. For each blind‑spot category, the runbook can describe how to escalate: who to inform, what to ask for and how to record limitations in your assessment.

From an ISO 27001 perspective, being honest about uncertainty and limitations is better than pretending you have full visibility. When those realities are reflected in the runbook and your incident records, they show that your process is systematic and risk‑based, not ad hoc. They also give you a basis for improving monitoring coverage or clarifying shared responsibilities in contracts and data‑processing agreements.






Containment, eradication and recovery across many client environments

Containment, eradication and recovery are where you balance speed, safety and commercial impact across many client environments. This is where MSPs most keenly feel the tension between protecting customers quickly and minimising disruption across the portfolio. A standard, ISO‑aligned runbook that defines common patterns, clarifies roles, sets approvals and pre‑agrees options with each tenant turns those difficult trade‑offs into well‑understood choices, rather than improvised decisions that might cause unnecessary disruption or breach agreements with customers and vendors.

There are three broad categories of incidents you need to handle: those originating in your own platforms and tools, those originating in a tenant’s environment and those caused by third parties such as cloud providers or software vendors. Each category has different implications for control, communication and accountability. A good template will make those distinctions explicit and provide branching paths for each.

Across all categories, containment is about stopping further harm, eradication about removing the cause and recovery about restoring services safely. In a multi‑tenant MSP, you must also consider cross‑tenant spread, shared infrastructure and regulatory or contractual requirements that apply differently to each customer.

Without a standard approach, engineers may improvise containment measures that are technically effective but commercially problematic, such as shutting down a shared platform without clear communication or approvals. Conversely, they may delay strong action because they are worried about SLA penalties or customer reaction. The runbook template provides a framework for making these decisions in a consistent, documented way.

Standardising playbooks and pre‑agreeing tenant‑specific options

Standardising playbooks means turning your most common containment and recovery responses into reusable patterns, then clarifying how they apply to each tenant. Once those patterns and tenant‑specific options are agreed, engineers can act quickly without guessing or renegotiating under pressure.

Begin by listing the common containment and recovery patterns you use, such as:

  • Isolating endpoints or servers showing clear compromise indicators.
  • Suspending or resetting user accounts with suspected credential theft.
  • Disabling risky integrations or network paths until risk is understood.
  • Failing over to alternate infrastructure or restoring from known‑good backups.

For each pattern, your template can specify preconditions, required approvals, dependencies and follow‑up checks. You can then decide which patterns are safe to apply by default and which require explicit customer agreement. For example, you might standardise immediate isolation for hosts with active ransomware indicators, whereas shutting down a shared line‑of‑business application always requires consultation with the customer’s leadership.

Your runbook template can include a tenant profile section that captures these nuances: critical systems, maintenance windows, regulatory obligations and acceptable containment options. That way, when an incident occurs, engineers consult a structured set of agreed parameters rather than guessing or negotiating from scratch.
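
A tenant profile lookup might look like the sketch below. The action names, the `DEFAULT_SAFE_ACTIONS` set and the rule that critical systems always need a tenant-specific pre-approval are all assumptions for illustration; the real parameters come from what you have agreed with each customer.

```python
from dataclasses import dataclass, field


@dataclass
class TenantProfile:
    name: str
    critical_systems: set[str] = field(default_factory=set)
    pre_approved_actions: set[str] = field(default_factory=set)


# Assumed default: host isolation is safe to apply without asking first.
DEFAULT_SAFE_ACTIONS = {"isolate_host"}


def approval_needed(action: str, target: str, tenant: TenantProfile) -> bool:
    """Return True when explicit customer agreement is needed before acting."""
    if target in tenant.critical_systems:
        # Assumed rule: on critical systems, only tenant pre-approved actions
        # may proceed without consultation.
        return action not in tenant.pre_approved_actions
    return action not in (DEFAULT_SAFE_ACTIONS | tenant.pre_approved_actions)


acme = TenantProfile(
    name="acme",
    critical_systems={"erp"},
    pre_approved_actions={"reset_credentials"},
)
print(approval_needed("isolate_host", "laptop-17", acme))   # False
print(approval_needed("shutdown_application", "erp", acme)) # True
```

The point of the structure is that the engineer's question during an incident becomes a lookup against agreed parameters, not a negotiation.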

For incidents originating in your own platforms, the template should describe how you manage portfolio‑wide containment and recovery. That might involve creating a master incident record, assessing impact across all tenants, coordinating with vendors and issuing consistent updates. For tenant‑specific incidents, the focus may be on guiding client administrators through remediation while protecting your own shared infrastructure.

Practising recovery and defining return‑to‑service criteria

Recovery should be defined by clear, testable criteria, not by a vague feeling that things “look okay again.” Your runbook can spell out what needs to be true before systems, accounts or services are put back into normal use, so you do not reintroduce risk while trying to restore service quickly.

Recovery is often treated as just getting things back online, but ISO 27001 and good practice require more than that. Recovery steps should ensure that systems are restored from trustworthy sources, that vulnerabilities are addressed and that monitoring is in place to detect any recurrence.

Your runbook template should therefore define clear return‑to‑service criteria. These might include verification that malicious code has been removed, patches applied, configurations corrected, credentials refreshed, logs reviewed for residual activity and controls adjusted where appropriate. For certain incident types, you may also require confirmation from a second pair of eyes before declaring recovery complete.
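
Return‑to‑service criteria lend themselves to a simple checklist gate. The criteria names below are taken from the paragraph above; the function name and the way the second review is handled are illustrative assumptions.

```python
from typing import Optional

RETURN_TO_SERVICE_CRITERIA = [
    "malicious_code_removed",
    "patches_applied",
    "configurations_corrected",
    "credentials_refreshed",
    "logs_reviewed",
    "controls_adjusted",
]


def ready_for_service(
    checks: dict[str, bool],
    second_reviewer: Optional[str] = None,
    require_second_review: bool = False,
) -> tuple[bool, list[str]]:
    """Return (ready, outstanding items) for a recovery decision."""
    outstanding = [c for c in RETURN_TO_SERVICE_CRITERIA if not checks.get(c, False)]
    if require_second_review and not second_reviewer:
        outstanding.append("second_review")
    return (not outstanding, outstanding)
```

Returning the outstanding items, rather than a bare yes or no, gives the incident manager something concrete to record in the incident log when recovery is deferred.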

Because multi‑tenant recovery can be complex, testing is crucial. Tabletop exercises, simulated incidents and controlled failover or restore drills help reveal gaps in your steps, approvals and communications. The runbook template can double as the script for those exercises, ensuring that practice is realistic and directly applicable to live operations.

From a business point of view, practising containment and recovery using the runbook builds confidence that your MSP can handle major incidents without improvisation. From an ISO perspective, it demonstrates that your incident procedures are not just written but tested and improved in line with Annex A controls on disruption and continuity.




Communication, escalation and evidence: making incidents auditable

Communication, escalation and evidence handling are what make incidents understandable and defensible to customers, regulators and auditors. Even the most technically competent response can be undermined by weak practice in these areas. ISO 27001 expects you to plan how you communicate internally and externally and to maintain records that show what happened. If you script who you inform, what you share, when you escalate and how you capture proof for different audiences and jurisdictions, you remove much of the stress and ambiguity from major events and make your responses easier to trust.

An incident response runbook template should therefore include a dedicated section on communication and escalation. This section describes who needs to know what, when and through which channels, for different types and severities of incidents. It also specifies who approves messages, how conflicting views are resolved and how all communications are logged in a way that will stand up to later scrutiny. Clause 7.4 on communication and the standard’s documented‑information requirements make this explicit: organisations must determine what, when and with whom to communicate, and retain records that demonstrate what actually happened.

Evidence handling is the other half of auditability. The runbook must describe what evidence should be captured at each stage, how it is protected from tampering and how long it is retained. For multi‑tenant MSPs, evidence may include both your own logs and artefacts supplied by customers or third parties. Chain‑of‑custody considerations may apply where legal or regulatory proceedings are possible, which is particularly important to privacy and legal officers.
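
One common way to make an action log tamper‑evident is to chain entries together with hashes, so that altering an earlier entry invalidates everything after it. The sketch below uses Python's standard `hashlib` and `json` modules; the entry fields and function names are assumptions, and this illustrates tamper evidence only. It is not a substitute for forensic chain‑of‑custody procedures or legal advice.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log: list[dict], actor: str, action: str, timestamp: str) -> dict:
    """Append an entry whose hash covers its content and the previous hash."""
    body = {
        "actor": actor,
        "action": action,
        "timestamp": timestamp,
        "previous_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify(log: list[dict]) -> bool:
    """Return True if no entry has been altered, removed or reordered."""
    previous_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["previous_hash"] != previous_hash or entry["hash"] != _digest(body):
            return False
        previous_hash = entry["hash"]
    return True
```

A log built this way can be checked at review time; storing the latest hash somewhere the responders cannot edit strengthens the guarantee further.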

Without clear guidance, responders may over‑share sensitive information, under‑inform key stakeholders or fail to secure the very evidence needed to understand and prove what happened. A well‑designed template reduces those risks by providing default patterns that can be adapted but not ignored.

Structuring stakeholder communication and approval paths

Structured stakeholder communication turns ad‑hoc status updates into a predictable flow of information that matches each audience’s needs and obligations. When you design those flows in advance, you reduce the chances of panicked over‑sharing or damaging silence and give every stakeholder confidence that they will be kept appropriately informed.

Start by identifying your incident audiences: internal executives, operations and security teams, account managers, customers’ technical and business contacts, regulators, data protection authorities, data subjects where applicable and key vendors. For each audience, your template can outline:

  • Triggers for communication: which severities or incident types require notification.
  • Timeframes: expected windows for initial and follow‑up updates.
  • Content: the level of technical detail, impact description and commitments that are appropriate.
  • Channels: email, portals, phone calls, status pages or other agreed methods.
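One way to make these four attributes concrete is to hold them as structured data rather than prose. The following is a minimal sketch only: the `CommsRule` type, the audience names, the 1-to-4 severity scale (1 being most severe) and every timing value are hypothetical placeholders, and real values would come from your contracts and regulatory obligations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommsRule:
    """One row of a hypothetical communication matrix."""
    audience: str
    max_severity: int         # trigger: notify when severity <= this value (1 = most severe)
    initial_update_mins: int  # timeframe for the first update
    followup_mins: int        # interval between follow-up updates
    detail_level: str         # content: "technical", "business" or "regulatory"
    channels: tuple           # agreed channels for this audience


# Example values only, not recommendations.
COMMS_MATRIX = [
    CommsRule("internal executives", 2, 60, 240, "business", ("phone", "email")),
    CommsRule("customer technical contact", 3, 30, 120, "technical", ("portal", "email")),
    CommsRule("regulator", 1, 1440, 4320, "regulatory", ("email",)),
]


def audiences_to_notify(severity: int) -> list:
    """Return the rules whose trigger is met for the given severity."""
    return [rule for rule in COMMS_MATRIX if severity <= rule.max_severity]


for rule in audiences_to_notify(2):
    print(rule.audience, rule.initial_update_mins, rule.channels)
```

Keeping the matrix as data means the same table can drive both the written runbook and any notification tooling, so the two are less likely to drift apart.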

The runbook should also define who drafts, reviews and approves messages. Technical teams might draft incident summaries, while legal and privacy teams review regulatory notifications and public statements. Account managers may be responsible for tailoring templates to specific customers while keeping core messaging consistent.

Disagreements will occur, especially around whether to notify, how much to disclose or when to declare an incident closed. Your template can address this by defining an escalation path: which roles are involved in the decision, how risks are weighed and how a final call is made and documented. That framework prevents disputes from being handled informally in chat threads where they are hard to reconstruct later and supports Annex A control expectations on communication.

Defining evidence requirements and chain‑of‑custody expectations makes it far easier to reconstruct incidents and defend your actions later. Your runbook should make it clear what to capture, where to store it and how to protect its integrity, so people do not have to improvise under pressure.

From an ISO 27001 perspective, incidents must leave a trace. The runbook template should list the minimum evidence set for significant incidents, such as system and application logs, security alerts, configuration snapshots, forensic images where appropriate, timelines of key events, decision logs and approvals.

It should also set expectations for preserving the integrity of that evidence. That may involve restricting access, recording who handled which artefacts, using secure repositories and avoiding actions that overwrite or destroy useful logs. For MSPs, this is particularly important when incidents may lead to contractual disputes, insurance claims or regulatory investigations.
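As an illustration of "recording who handled which artefacts" and detecting tampering, the sketch below fingerprints an artefact with SHA-256 at capture time and keeps an append-only handling log. The `EvidenceItem` type, its field names and the sample log line are assumptions made for the example, not a prescribed record format, and a production system would also need access control and secure storage.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Fingerprint an artefact so later modification can be detected."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class EvidenceItem:
    """Hypothetical evidence record: name, hash at capture, handling history."""
    name: str
    sha256: str
    custody_log: list = field(default_factory=list)

    def record_handling(self, handler: str, action: str) -> None:
        """Append who touched the artefact, what they did and when (UTC)."""
        timestamp = datetime.now(timezone.utc).isoformat()
        self.custody_log.append((timestamp, handler, action))

    def verify(self, data: bytes) -> bool:
        """True if the artefact still matches the hash taken at capture."""
        return sha256_of(data) == self.sha256


# Illustrative artefact content only.
log_bytes = b"2025-01-01T00:00:00Z login failure user=admin\n"
item = EvidenceItem("firewall.log", sha256_of(log_bytes))
item.record_handling("first responder", "captured")
item.record_handling("incident manager", "reviewed")
```

Calling `item.verify(...)` on the stored bytes before evidence is handed to a customer, insurer or regulator gives a simple check that it is unchanged since capture.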

Where customers or vendors provide evidence, the template should describe how it is integrated into your records. That might include linking or importing logs into your own repository, recording the source and date received and noting any limitations on use. For privacy‑sensitive data, the runbook can refer to your data protection policies and any additional restrictions.

If your runbook and incident records already live in your ISMS platform, those linkages, approvals and retention rules become part of normal work rather than separate admin. You can then show auditors and regulators a clean chain from event to evidence and improvement without building manual document packs each time.




ISMS.online supports over 100 standards and regulations, giving you a single platform for all your compliance needs.




Post‑incident review, root cause and continual improvement KPIs

Post‑incident reviews and metrics turn painful events into concrete improvements, and over time this is where the real value of an incident response runbook emerges. ISO 27001 explicitly requires continual improvement, and many practitioners treat incidents as one of the richest sources of insight into how effective your controls and processes really are in practice. Commentary on implementing clause 10 of ISO 27001 often highlights incident learnings as a key input to that cycle.

Post‑incident reviews (sometimes called lessons‑learned meetings or after‑action reviews) should not be blame sessions. Their purpose is to understand what happened, why it happened, how well the response worked and what should change. For an ISO‑aligned MSP, this includes linking incidents back to risk assessments, statements of applicability and control improvements.

About two‑thirds of organisations in the 2025 ISMS.online survey said that the speed and volume of regulatory change are making compliance increasingly hard to sustain.

Metrics give you a quantitative view of how your incident process is performing. Common measures include mean time to acknowledge (MTTA), mean time to resolve (MTTR), frequency of incidents by type and tenant, recurrence of similar incidents, SLA impact and the completeness of documentation. Tracking these over time shows whether your runbook and training are having the desired effect and helps you demonstrate improvement in management reviews.

A runbook template that embeds post‑incident review questions and metric fields ensures that each incident contributes to this feedback loop. Over time, you can show executives, auditors and customers that similar incidents are being handled faster, with less impact and fewer surprises.

Making post‑incident reviews meaningful and actionable

Post‑incident reviews are valuable only when they lead to specific, owned actions and visible change. A structured review format in your runbook keeps discussions focused on facts, causes and improvements rather than blame, so people feel safe to be honest about what went wrong.

Your template should define when a full post‑incident review is required, typically for high‑severity incidents, multi‑tenant events or any incident that exposes a serious gap. For lower‑severity but frequent incidents, you might use a lighter‑weight review, perhaps batching them into periodic thematic analyses.

A structured review format helps teams focus. Common elements include:

  • Factual timeline: what happened when, based on logs and records.
  • Detection and analysis: how the incident was discovered and assessed.
  • Response effectiveness: what worked well and what caused friction or delay.
  • Root causes: technical, process and human factors.
  • Control evaluation: whether existing controls were adequate or need adjustment.
  • Corrective and preventive actions: what will change, who owns it and by when.
  • Communication lessons: feedback from customers, regulators or internal stakeholders.

Linking review outcomes directly to your risk register and control set closes the loop. For example, if an incident reveals that multi‑factor authentication was not enforced consistently, the review might drive updates to your access control policies, technical hardening and customer guidance. Your runbook template can include fields or checklists to ensure those linkages are made.

To avoid reviews becoming talking shops, it is important to track follow‑through. Actions agreed during reviews should enter the same planning and tracking systems used for other work, with clear owners and due dates. When your runbook is part of an ISMS platform, reviews and actions can be linked to specific risks and controls, making progress easier to monitor and easier to present in management review meetings.

Choosing and using metrics that demonstrate real improvement

Choosing the right metrics helps you prove that your incident response is improving in ways that matter to different stakeholders. Your runbook can suggest a small set of measures that reflect both operational reality and ISO 27001 expectations, so you avoid tracking numbers that look impressive but do not change behaviour.

To make metrics directly usable, define what each one means and how you will calculate it. For example, MTTA might be “average time between the first alert or ticket creation and assignment of an incident owner”, while MTTR could be “average time between incident creation and confirmation that services are restored and monitoring is clear”. Documentation completeness might be measured as “percentage of major incidents where all required fields and attachments are present before closure”.
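To show how definitions like these translate into calculations, here is a small sketch that computes MTTA, MTTR and documentation completeness using the wording above. The incident records, field names and timestamps are invented for illustration, and the code assumes at least one incident and one major incident exist.

```python
from datetime import datetime

# Hypothetical incident records; field names and values are illustrative only.
incidents = [
    {"created": datetime(2025, 3, 1, 9, 0),
     "owner_assigned": datetime(2025, 3, 1, 9, 10),
     "restored": datetime(2025, 3, 1, 11, 0),
     "major": True, "docs_complete": True},
    {"created": datetime(2025, 3, 5, 14, 0),
     "owner_assigned": datetime(2025, 3, 5, 14, 30),
     "restored": datetime(2025, 3, 5, 18, 0),
     "major": True, "docs_complete": False},
]


def mean_minutes(records, start_key, end_key):
    """Average elapsed minutes between two timestamps across records."""
    deltas = [(r[end_key] - r[start_key]).total_seconds() / 60 for r in records]
    return sum(deltas) / len(deltas)


# MTTA: creation to assignment of an incident owner.
mtta = mean_minutes(incidents, "created", "owner_assigned")
# MTTR: creation to confirmed service restoration.
mttr = mean_minutes(incidents, "created", "restored")
# Completeness: share of major incidents with all required documentation.
major = [r for r in incidents if r["major"]]
completeness = 100 * sum(r["docs_complete"] for r in major) / len(major)

print(f"MTTA {mtta:.0f} min, MTTR {mttr:.0f} min, completeness {completeness:.0f}%")
```

Writing the calculation down next to the definition removes arguments later about which timestamps a reported figure was based on.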

A simple table can help align perspectives:

Perspective | Primary concern | What the runbook and ISMS deliver
MSP founder or director | Business risk, reputation and growth | Evidence of controlled incidents and improving resilience trends
Security and compliance | Control coverage and audit readiness | Clear records and mappings from incidents to controls and risks
Operations and service | Usable playbooks, SLA adherence, engineer load | Consistent workflows, metrics and reduced firefighting

By making these concerns explicit, you can choose metrics that matter to each group. For founders, this might include the number of major incidents, impact on revenue or customer satisfaction after incidents. For security leads, it might include coverage of incident types, percentage of incidents with complete evidence sets or time from incident to control changes. For operations, it might include engineer time spent per incident, ticket reassignments or communication quality scores.

The runbook template should specify where and how these metrics are captured, often directly within incident records or linked dashboards. When metrics sit alongside incidents in your ISMS, they can be surfaced in management views and used in formal management reviews, reinforcing the role of incident response in your overall ISMS and showing continual improvement over time.




Book a Demo With ISMS.online Today

An ISMS.online demo shows how a live, ISO 27001‑aligned incident runbook works in practice across your multi‑tenant MSP. In a short session you can see how one governed environment holds the runbook, incidents, evidence, risks and corrective actions in a way that feels natural for your teams.

In the 2025 ISMS.online survey, almost all organisations said that achieving or maintaining security certifications like ISO 27001 or SOC 2 is a top priority.

Within an integrated ISMS platform such as ISMS.online, you can typically bring your master runbook together with playbooks for common incident types, captured incidents and their supporting evidence, linked risks and controls and tracked corrective actions, so that incident handling and assurance activity reinforce one another. Ownership, version control, training records and review schedules all sit alongside the content itself, so your teams always know which procedure to follow and your auditors can always see how it is maintained.

For multi‑tenant MSPs, the platform also makes it easier to parameterise the runbook per customer. Tenant profiles can record critical systems, SLAs, regulatory obligations and agreed containment options, while the underlying lifecycle and roles remain consistent. That gives engineers clarity under pressure and gives customers reassurance that their incidents are handled within a disciplined, ISO‑aligned framework.

A practical next step is to take one significant incident from the last year, especially one that felt chaotic, and sketch how it would look if it had flowed through the template described here. From there, you can pilot a structured runbook inside ISMS.online with a small group of engineers and one or two key tenants, refining it based on real usage rather than theory.

Choosing to invest in this structure is not about adding bureaucracy. It is about giving your teams a shared playbook, giving your customers a consistent experience and giving your leadership and auditors a clear view of how you protect the services that matter. A short, exploratory demo of ISMS.online, built around your last major incident, is often enough to see how an integrated incident runbook could work in your own environment and whether now is the right time to move away from fragmented habits toward a single, trusted way of handling incidents.

What an integrated incident runbook looks like in ISMS.online

An integrated incident runbook in ISMS.online brings together procedures, ownership, records and improvements in one place, so every incident tells a complete story from first alert to final action. You move from separate documents and tickets to a single, joined‑up view that anyone with the right role can understand and reuse across future events.

In practice, that means your ISO‑aligned runbook becomes a living object in the platform. You define phases, roles and checklists once, and link them to projects, risks and controls. When an incident occurs, responders work inside that structure: they follow the steps, capture evidence as they go and trigger communications and approvals from the same screen.

As the incident progresses, you can see status, outstanding actions and impact across tenants without jumping between systems. Once the event is closed, the incident record stays linked to its root causes, corrective actions and relevant controls. That traceability is exactly what auditors and regulators look for, and it also makes internal debriefs and board reporting far easier.

How to pilot this with one real incident

Piloting an integrated runbook with one real incident lets you prove value quickly without committing to a large‑scale change from day one. The aim is to learn from a controlled experiment and then scale what works, rather than trying to redesign everything at once.

A simple approach is to choose a recent, meaningful incident and rebuild it in ISMS.online. Create or import the runbook structure, log the incident, attach key artefacts and map it to relevant risks and controls. Then compare this structured record with the way you originally captured the event across chats, tickets and documents.

Next, run a small simulation with the same team using the rebuilt record as the script. Ask what would have been clearer, faster or easier if the runbook and platform had been in place at the time. Capture feedback from responders, account managers and compliance staff, and use it to refine the template.

Once you can see the difference for a single incident, it becomes much easier to build a case for broader adoption. Leaders can see how the approach reduces risk and improves assurance, practitioners can see how it cuts manual effort and customers can see how it strengthens trust. At that point, booking a full demo of ISMS.online is less about exploration and more about planning how quickly you can move your wider incident process into a governed, ISO‑aligned system that feels natural to use every day.




Frequently Asked Questions

What is an ISO 27001‑aligned incident response runbook template for an MSP?

An ISO 27001‑aligned incident response runbook for an MSP is a single, reusable playbook that turns the standard’s incident requirements into clear, repeatable workflows your teams can follow for every customer. It walks you from first alert through triage, containment, eradication, recovery, closure and review, while spelling out who does what, for which tenants, in what order, and what needs to be recorded for customers and auditors.

Which sections make a runbook genuinely usable under pressure?

You want a template that makes sense at 2 a.m. as well as in an ISO 27001 audit. At minimum it should include:

1. Scope, definitions and triggers

Define:

  • Which environments, services and tenants are covered.
  • What counts as an information security incident (authorised vs unauthorised change, security‑relevant outages, suspected compromise).
  • Clear triggers for “declare an incident now” vs “raise a ticket and monitor”.

That removes ambiguity and stops teams arguing about whether something “really is” an incident.

2. Roles, lifecycle and severity

Set out:

  • Concrete roles such as incident manager, first responder, platform engineer, account manager, customer security contact and vendor contact.
  • A simple lifecycle (for example: detect → assess → contain → eradicate → recover → close → review).
  • A straightforward severity model that influences response times, escalation paths and communication expectations.

This gives you a backbone your engineers can memorise and reuse across incident types.

3. Phase‑by‑phase steps, communication and evidence

For each phase, include:

  • Tasks and decision points written in the language your responders already use.
  • Communication prompts (who to notify, by which channel, within what timeframe).
  • Evidence requirements (minimum logs, artefacts and approvals to capture).

If you design the template once and apply it consistently across customers, you reduce improvisation, shorten training time and give yourself clean, comparable records. When you store and run that template from a platform such as ISMS.online, you can also manage version control, assignments and links into your wider Information Security Management System (ISMS) instead of relying on static documents.


How should an MSP incident runbook line up with ISO 27001:2022 and Annex A?

Your incident runbook should make it straightforward to show how everyday response activities satisfy ISO 27001:2022 and Annex A without forcing responders to think in clause numbers. You want to be able to take an auditor from a requirement in the standard to the exact template sections, records and improvement actions that demonstrate how you meet it.

Which ISO 27001 clauses and controls should directly influence the runbook?

A few areas of the standard are particularly relevant to MSP incident response:

Context, planning and operations (Clauses 4, 6 and 8)

These clauses expect you to:

  • Understand your organisation’s context and interested parties (including customers, regulators and key suppliers).
  • Plan how you will treat information security risks.
  • Operate controlled processes that meet information security requirements.

In practice, that means your runbook should:

  • Reference how incidents support risk treatment plans (for example, each incident record links to the underlying risks and controls it touches).
  • Reflect different stakeholder needs, such as notification timelines in customer contracts or regulatory reporting thresholds.

Annex A incident management controls (A.5.24–A.5.28)

These controls cover incident preparation, assessment, response, learning and evidence:

  • A.5.24 – Planning and preparation: show how you prepare for incidents, define classifications, resource the function and keep the runbook itself up to date.
  • A.5.25 – Assessment and decision: reflect triage, severity scoring and criteria for escalating, de‑escalating or closing incidents.
  • A.5.26 – Response: describe containment, eradication and recovery options you have at MSP and tenant levels.
  • A.5.27 – Learning: require a consistent post‑incident review process that leads to corrective and preventive actions.
  • A.5.28 – Evidence collection: define what must be logged and retained to support investigation, reporting and learning.

If you maintain a simple mapping table that links each runbook section to these clauses and controls, your ISMS lead can answer “where is A.5.27 implemented?” in seconds by pointing to your review process and real MSP incidents. At the same time, engineers continue to work with clear prompts rather than standards language, which makes adoption much more likely.
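A mapping table of this kind could be as simple as a lookup from runbook sections to control identifiers. In the sketch below the control IDs are the Annex A controls listed above, while the runbook section names and the two helper functions are hypothetical and would be replaced by your own template headings.

```python
# Section names are hypothetical; control IDs are the Annex A controls listed above.
CONTROL_MAP = {
    "Preparation and runbook maintenance": ["A.5.24"],
    "Triage and severity": ["A.5.25"],
    "Containment, eradication and recovery": ["A.5.26"],
    "Post-incident review": ["A.5.27"],
    "Evidence capture and retention": ["A.5.28"],
}


def sections_for(control_id: str) -> list:
    """Answer 'where is this control implemented?' from the mapping."""
    return [section for section, controls in CONTROL_MAP.items()
            if control_id in controls]


def unmapped_controls(required) -> list:
    """List required controls that no runbook section claims to cover."""
    mapped = {c for controls in CONTROL_MAP.values() for c in controls}
    return [c for c in required if c not in mapped]


print(sections_for("A.5.27"))
print(unmapped_controls(["A.5.24", "A.5.25", "A.5.26", "A.5.27", "A.5.28"]))
```

The second helper is useful as a periodic check: if a control is added to your statement of applicability but no section maps to it, the gap shows up before an auditor finds it.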


How can an MSP adapt a single runbook to multi‑tenant incidents and shared platforms?

An MSP rarely deals with neatly isolated incidents. A single misconfiguration in a remote monitoring tool or backup platform can affect dozens of customers at once. If your runbook assumes a single‑tenant, single‑team scenario, you risk inconsistent actions, mixed messages and accidental disclosure across your customer base.

Which patterns help you manage incidents across multiple tenants?

A robust template can make complex, multi‑tenant situations feel organised rather than chaotic if you bake in a few design patterns:

1. Incident origin and impact types

Define categories such as:

  • MSP‑originated: incidents rooted in your shared tooling, processes or central infrastructure.
  • Tenant‑originated: incidents primarily located in a customer’s environment (for example, a compromised workstation or misconfigured local firewall).
  • Third‑party: incidents caused by vendors providing platforms or cloud services you rely on.

For each type, specify:

  • Who leads the response (MSP, tenant, or shared).
  • Which containment levers you can use centrally vs on the customer side.
  • Basic notification and escalation expectations.

This stops debates about “ownership” and clarifies what you can and cannot directly control.

2. Master and child incident structure

When a shared platform problem affects many customers, structure your records as:

  • A master incident for portfolio‑level investigation, coordination with vendors and overall messaging.
  • Child incidents per tenant, capturing impact, local actions and customer communication.

Your runbook can then:

  • Provide fields for linking child records to their master.
  • Differentiate central tasks (such as disabling a faulty integration) from tenant‑specific ones (such as restoring a particular workload).

That keeps systemic issues visible at the MSP level while preserving tenant‑level context and confidentiality.
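The master and child structure can be sketched as a small data model. Everything here is an assumption for illustration: the `Incident` and `IncidentRegister` names, the ID format and the sample incidents are hypothetical, and a real ticketing or ISMS platform will have its own linking mechanism.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Incident:
    incident_id: str
    summary: str
    tenant: Optional[str] = None     # None for the portfolio-level master
    master_id: Optional[str] = None  # set on child records only
    tasks: list = field(default_factory=list)


class IncidentRegister:
    """Hypothetical in-memory register linking child incidents to a master."""

    def __init__(self):
        self._by_id = {}

    def add(self, incident: Incident) -> None:
        # A child must point at a master that already exists.
        if incident.master_id is not None and incident.master_id not in self._by_id:
            raise ValueError("child incident references an unknown master")
        self._by_id[incident.incident_id] = incident

    def children_of(self, master_id: str) -> list:
        """Portfolio view: every tenant-level record under one master."""
        return [i for i in self._by_id.values() if i.master_id == master_id]

    def tenant_view(self, tenant: str) -> list:
        """Tenant view: only that tenant's own records, for confidentiality."""
        return [i for i in self._by_id.values() if i.tenant == tenant]


register = IncidentRegister()
register.add(Incident("M-1", "Faulty RMM integration", tasks=["disable integration"]))
register.add(Incident("C-1", "Agent offline", tenant="tenant-a", master_id="M-1",
                      tasks=["restore workload"]))
register.add(Incident("C-2", "Backup job failed", tenant="tenant-b", master_id="M-1"))
```

Central tasks sit on the master record and tenant-specific tasks on each child, while `tenant_view` returns only a single tenant's records, which is the separation the confidentiality rules depend on.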

3. Confidentiality and tenant‑specific parameters

Make privacy explicit by:

  • Stating rules that forbid sharing other customers’ names, identifiers or detailed logs in updates, screenshots or attachments.
  • Using structured tenant profiles within your ISMS to hold SLAs, key contacts, sector‑specific regulations and agreed containment preferences.

Responders then follow the same core process while the system supplies the right “settings” per tenant. If you maintain those profiles and runbook mappings in ISMS.online, it becomes much easier to prove to customers and auditors that your multi‑tenant incident handling is both consistent and controlled.


How do you define roles, RACI and hand‑offs so incidents stay controlled rather than chaotic?

When you review difficult incidents, the root cause is often less about the technology and more about unclear ownership: several people act in parallel, but no one is clearly accountable, and customers get different stories from different contacts. A well‑designed MSP runbook reduces that risk by tying each phase to specific roles, a simple RACI model and visible hand‑off points.

What does a practical role model look like for MSP incident response?

You do not need a complex governance chart in the runbook, but you do need enough structure to remove guesswork:

Role catalogue based on real work

Define roles by what they do, for example:

  • Incident manager.
  • First responder or on‑call engineer.
  • Platform or infrastructure engineer.
  • SOC analyst (if applicable).
  • Account manager or customer success lead.
  • Customer security contact.
  • Vendor contact for critical platforms.

Have your template reference these roles rather than named individuals, so the model survives staff turnover and rota changes.

Phase‑specific RACI and transitions

For each lifecycle stage (detection, assessment, containment, eradication, recovery, closure, review):

  • Assign responsible and accountable roles.
  • List who must be consulted, such as legal, privacy or service ownership.
  • Identify who needs to be informed, including specific customer contacts, your own leadership and any regulators or partners where contractual or legal requirements apply.

Support this with:

  • Entry and exit criteria (for example, “incident declared and incident manager assigned” or “all affected tenants notified and post‑incident review scheduled”).
  • Short hand‑over checklists when roles change or when incidents roll across time zones and shift boundaries.
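A phase-by-phase RACI can also be checked mechanically for gaps. The sketch below uses the lifecycle stages named above; the role assignments are invented examples, and the rule that each phase has exactly one accountable role is a common RACI convention assumed here rather than something the runbook text mandates.

```python
PHASES = ["detection", "assessment", "containment", "eradication",
          "recovery", "closure", "review"]

# Hypothetical assignments for two phases; a real runbook would cover all seven.
RACI = {
    "detection": {"R": ["first responder"], "A": ["incident manager"],
                  "C": [], "I": ["account manager"]},
    "containment": {"R": ["platform engineer"], "A": ["incident manager"],
                    "C": ["legal"], "I": ["customer security contact"]},
}


def raci_gaps(raci: dict) -> list:
    """Report phases with no RACI, no responsible role or unclear accountability."""
    problems = []
    for phase in PHASES:
        entry = raci.get(phase)
        if entry is None:
            problems.append(f"{phase}: no RACI defined")
            continue
        if not entry.get("R"):
            problems.append(f"{phase}: no responsible role")
        # Assumed convention: exactly one accountable role per phase.
        if len(entry.get("A", [])) != 1:
            problems.append(f"{phase}: needs exactly one accountable role")
    return problems


for problem in raci_gaps(RACI):
    print(problem)
```

Running a check like this whenever the template changes helps catch a phase that has quietly lost its owner.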

If you implement this structure inside ISMS.online, you can mirror it in assignments, escalations and notifications. That way, the system helps enforce the RACI instead of relying solely on people remembering what the spreadsheet says.


How does using a standard template improve ISO 27001 audits, evidence and learning for an MSP?

The same structure that keeps your team calm during an incident can dramatically simplify audits and continuous improvement. When your runbook builds documentation, traceability and learning into each step, responders do not have to remember separate reporting tasks, and you avoid the “we fixed it and forgot to write it up” pattern that leaves you short of evidence later.

What should every incident record capture as standard in an MSP context?

You can keep the burden reasonable while still satisfying ISO 27001 by standardising a focused set of fields:

1. Evidence by phase and ISMS links

Require, for each incident:

  • Minimum logs, screenshots, tickets and approvals per phase, so responders understand what “sufficient evidence” looks like.
  • Links to affected assets, services, risks and controls in your ISMS.

This gives you built‑in traceability from real events to your risk register and statement of applicability, making it much easier to update these when you see recurring patterns.

2. Post‑incident review and metrics

Include in the template a lightweight but structured review that prompts for:

  • Root cause(s) and contributing factors.
  • What went well and what should change.
  • Agreed corrective and preventive actions, with owners and due dates.
  • Quantitative measures such as time to acknowledge, time to contain, time to recover, business impact, SLA breaches and evidence completeness.

Managed through ISMS.online, those fields sit in the same environment as your broader ISMS, so you can:

  • Raise and track improvement actions directly from incidents.
  • Pull consistent incident summaries into management reviews and audit reports.
  • Demonstrate that you are treating incidents as learning opportunities, which resonates well with auditors and customers.

Over time, that dataset becomes one of your strongest proofs that your MSP is not only compliant with ISO 27001 but also improving resilience in a visible, measurable way.


How can ISMS.online help your MSP embed and run an ISO 27001‑aligned incident runbook?

Designing a runbook once in a document is the easy part; keeping it current, usable and visible across changing teams, tooling and customers is where many MSPs struggle. ISMS.online gives you a central environment where your template, live incidents, evidence, risks and actions all sit together as part of your Information Security Management System, rather than as disconnected files and tickets.

What does good day‑to‑day use of the runbook look like in ISMS.online?

MSPs that run incident response through ISMS.online tend to follow a consistent pattern:

1. Treat the runbook as a controlled asset

You store the master template as a managed document, with clear ownership, review dates and version history. Updates are tested and approved rather than appearing ad hoc. That alone reassures auditors that your incident process is not static or informal.

2. Log and run incidents against the template

When something happens:

  • Responders pick the right playbook from within ISMS.online.
  • They work through the phases, completing mandatory fields and checklists and attaching evidence as they go.
  • Roles and responsibilities from the template are reflected directly in task assignments and notifications.

This helps your team operate consistently under pressure without hunting for documents or wondering what to fill in.

3. Link incidents into the wider ISMS and tailor by tenant

From inside the same platform you can:

  • Link each incident to specific assets, risks and controls.
  • Raise corrective and preventive actions straight from the review and track their completion.
  • Parameterise details per tenant (SLAs, regulatory obligations, communication paths) so the same core template flexes automatically for each customer.

That keeps your ISMS closely aligned with reality while respecting each customer’s commitments.

4. Report directly from the system

Because incidents, actions and ISMS artefacts live together, you can:

  • Generate audit‑ready packs for ISO 27001 and related standards from current data.
  • Prepare customer governance or board packs with accurate incident statistics and improvement progress.
  • Replay incidents with your teams to refine the runbook, training and controls.

If you want to test how much difference this could make, you can start by rebuilding a recent complex incident inside ISMS.online using a structured template and comparing the clarity and traceability you get. Many MSPs find that exercise is enough to justify moving incident handling fully into their ISMS, so the next major event feels controlled, consistent and visibly aligned with ISO 27001, rather than improvised around a shared inbox and a spreadsheet.



Mark Sharron

Mark Sharron leads Search & Generative AI Strategy at ISMS.online. His focus is communicating how ISO 27001, ISO 42001 and SOC 2 work in practice - tying risk to controls, policies and evidence with audit-ready traceability. Mark partners with product and customer teams so this logic is embedded in workflows and web content - helping organisations understand and prove security, privacy and AI governance with confidence.
