DevOps Command Center
Why is this important?
Modern teams communicate via real-time messaging throughout the day, but not all messages are created equal. Unlike social messaging, workplace messaging is often process-driven rather than completely ad hoc, and these processes can be critical to the success of the organization. We’ve learned from our customers that they often have little visibility into these processes, let alone the ability to facilitate and optimize them. As a result, they rely on tribal knowledge and piecemeal solutions, which don’t scale well.
That is why we want to help teams collaborate to solve problems that are time-sensitive, recurring, and situational.
What are the use cases?
We believe incident response is one of the best use cases for real-time messaging. To start, we are focused on these three areas:
Build and deploy pipeline
Information security
Site reliability
For example, a business-critical web application may be reported as unresponsive, and its underlying architecture spans multiple teams that now need to stand it back up before SLAs are breached. Or a user suspects their data may have been compromised, and the security team needs to investigate and cover all possibilities to minimize damage.
Who is this for?
We are currently focused on organizations that have most if not all of the following characteristics:
Primarily uses Mattermost (regardless of edition), at least within some group of users
Operates and maintains systems that have reliability and availability requirements
Aims to scale procedures and protocols in order to keep up with growth or meet mandates
Coordinates many (between 10 and 50) contributors and stakeholders during incident responses
Here are some characteristics that signal an organization may not be a great fit right now:
Builds end-to-end custom solutions
Has little need for context transfer between contributors and stakeholders
E.g., small teams, rare external collaboration, strong tribal knowledge
Does not have an incident playbook or post-mortem procedure
What are other similar solutions?
Here is an overview of comparable solutions that we’ve found so far:
Theme | Examples | Pros | Cons |
---|---|---|---|
Standalone incident management | | | |
Chat-centric incident response | | | |
Security-specific orchestration | | | |
Ticketing workflow | | | |
How is this different?
Mattermost has a couple of superpowers that allow us to solve the problem a bit differently.
Teams are already familiar with their messaging tool, and Mattermost can provide this functionality out of the box. This way, the solution is already deployed and works with all of your existing integrations.
We also found that chat is where real-time collaboration is anchored, especially in the use cases that we are targeting. It is the most complete record of process execution, from sharing information and holding discussions to making decisions and taking actions. As a result, it also has the most leverage to get teams to follow protocols.
Lastly, Mattermost is open source and can be self-hosted for better security and reliability, which matters given the sensitive nature of the data that is generated.
What is this going to do?
Our roadmap is organized around three components that are common across domains:
Reactive: Incident response
Centralize communication to reduce information fragmentation
Improve context-transfer to minimize confusion
Prompt contributors with next steps to speed up resolution
Inform stakeholders of active incidents to raise awareness
Retroactive: Post-mortem and reporting
Visualize incident timeline to recognize bottlenecks in the process
Replay investigations, discussions, and actions taken to extract lessons
Chart trends in aggregate to identify broader changes that are needed
Export channel transcript to save externally or further analyze
Proactive: Playbook and process design
Template checklist prompts to provide teams structure
Auto-send messages with resources to ramp up new contributors
Prepare multiple playbooks to save time looking up documentation
Trigger actions in other tools to reduce repeated work
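To make the playbook and checklist idea more concrete, here is a minimal sketch of how a reusable playbook template could produce a per-incident checklist and prompt contributors with the next step. This is only an illustration under stated assumptions: the names `Playbook`, `Incident`, `ChecklistItem`, `start_incident`, and `next_step` are hypothetical and do not describe an actual Mattermost data model or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ChecklistItem:
    title: str
    done: bool = False


@dataclass
class Playbook:
    """Hypothetical reusable template; not a Mattermost data model."""
    name: str
    steps: List[str]

    def start_incident(self, incident_name: str) -> "Incident":
        # Copy the template steps so each incident tracks its own progress.
        items = [ChecklistItem(title=step) for step in self.steps]
        return Incident(name=incident_name, playbook=self.name, checklist=items)


@dataclass
class Incident:
    name: str
    playbook: str
    checklist: List[ChecklistItem] = field(default_factory=list)

    def complete(self, title: str) -> None:
        # Mark the first checklist item with a matching title as done.
        for item in self.checklist:
            if item.title == title:
                item.done = True
                return
        raise KeyError(f"no checklist item named {title!r}")

    def next_step(self) -> Optional[str]:
        # The first unchecked item is what contributors would be prompted with.
        for item in self.checklist:
            if not item.done:
                return item.title
        return None


if __name__ == "__main__":
    outage = Playbook(
        name="Site outage",
        steps=["Acknowledge alert", "Assign commander", "Post status update"],
    )
    incident = outage.start_incident("Checkout unresponsive")
    incident.complete("Acknowledge alert")
    print(incident.next_step())  # prints "Assign commander"
```

Keeping the template (`Playbook`) separate from its running instance (`Incident`) is one possible design choice: it lets a team prepare several playbooks ahead of time while each incident keeps its own record of completed steps for later review.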
How to get more info?
Become a community contributor (WIP)