Part 4: Incident management and problem solving
“All credit and debit card transfers of XXXX are down,” yelled the headlines of the local news. “Salaries of YYYY are not in employee accounts on time,” screamed the next one. I felt slight sympathy for our competitors. My counterparts there won’t be sipping their morning coffee peacefully with family as I am. System integration sure makes showy headlines when it doesn’t work. I hope those brothers-in-integration had their incident management well designed and trained. With this real-life experience, I’ll begin part four of my “Establishing Integration Competency Center” series.
I felt I could spend a few lines of this post on that example, because there isn’t that much to ramble about regarding the two processes mentioned in the title. Simply make sure that your ICC designs and decides on:
- A process for incident management (use e.g. ITIL if it fits)
  - During the design phase of incident management, ask yourself: for how long can we afford to keep processes down? Plan incident management according to the most critical process. If your answer is measured in seconds, mere processes won’t do; you’ll need a high-availability platform with redundant resources.
- A way to test it (or did you plan to test it in the production environment, with a case like the one at the beginning of this post?)
  - During the design phase, ask yourself a second question: can I test my incident management process and high availability? If the answer is no, you should reconsider and invent a way to test both of them.
- Service Level Agreements that correspond to your business’s criticality level (e.g. 8-16/5 or 24/7)
  - If your platform has 24/7 monitoring, does it also cover your processes, i.e. the solution level? Typically hosting parties do not understand your business integration processes. Get your own people, or a third party that can support your process level, to be on call or otherwise ready according to your business’s criticality.
- A process to identify and eliminate the root causes of repeating incidents or errors. This is also known as problem solving.
  - Ensure that you have the means to catch repeating errors: how do you log and monitor them?
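As a rough illustration of catching repeating errors, here is a minimal Python sketch. It assumes your incident log can be reduced to a timestamp, an interface name and a normalized error code; the `Incident` class, its field names, the 30-day window and the threshold of three are all hypothetical choices made for the example, not something prescribed by ITIL or by any particular integration platform.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """One logged integration incident (hypothetical record layout)."""
    timestamp: datetime
    interface: str   # the integration flow that failed
    error_code: str  # normalized error identifier taken from your logs


def recurring_incidents(incidents, window=timedelta(days=30), threshold=3, now=None):
    """Return ((interface, error_code), count) pairs seen at least
    `threshold` times within `window`, most frequent first."""
    now = now or datetime.now()
    cutoff = now - window
    # Count only the incidents that fall inside the look-back window.
    counts = Counter(
        (i.interface, i.error_code)
        for i in incidents
        if i.timestamp >= cutoff
    )
    # most_common() sorts by count, descending; keep the repeat offenders.
    return [(key, n) for key, n in counts.most_common() if n >= threshold]
```

Calling `recurring_incidents(log)` on a list of `Incident` records gives a short list of candidates for root cause analysis. What counts as "repeating" is a judgment call, so tune the window and threshold to your own criticality level.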
These processes are for the ICC to design and for the Production Manager to manage and be held accountable for.
After your ICC has designed its processes (series parts 2, 3 and 4) and set tasks for the mandatory roles (Part 1), it is ready to plan the content of its daily, weekly, monthly, bi-yearly and yearly meetings and communications. Clear and short agendas are the way to go for the sake of efficiency.
More to come in part 5 of the Establishing Integration Competency Center blog post series.
- Establishing Integration Competency Center, part 1 (integrationwarstories.com)
- Establishing Integration Competency Center, Part 2 (integrationwarstories.com)
- Establishing Integration Competency Center, part 3 (integrationwarstories.com)
- Lean Integration and ICC (integrationwarstories.com)