7 Feb 2021
This contribution is a new feature.
Backstage + Splunk On-Call Plugin
In order to understand the initial issue you need to have a basic understanding of what a plugin is.
Backstage is an application built on top of a set of plugins.
A plugin lets you expose any kind of infrastructure or software development tool in Backstage.
This means you can write your own plugins to add new functionality to Backstage.
Do you know what happens when your service breaks down? Does someone get notified? Can that person fix the problem?
These questions can occur anytime a service encounters a problem.
Splunk On-Call helps you answer these questions by automating incident management.
Splunk On-Call dashboard
Before continuing, you need to have a basic familiarity with incident management concepts.
An incident management system applies logic to an incident. It can determine:
- Who should be alerted to this particular incident
- What method of notification is used (SMS, email, etc)
- The state of the incident and execution of secondary workflows
Here are the “Five Phases” of the Incident Management framework:
Incident management framework
1. The initial step of the incident lifecycle is knowing about the problem.
2. The second step helps you establish the severity and priority of the problem.
We can split it into three points:
3. The third step helps you understand the factors that led to the incident while you remediate it. For example, a small "fix" can have implications elsewhere in your system, and you must be aware of this.
4. The fourth step will allow you to make a complete analysis about the incident:
5. The fifth step provides metrics to review for further improvements.
It will allow your team to improve the way they respond to incidents.
Here are some terms used in this article, with their definitions:
Rotation: A rotation is a recurring schedule, consisting of one or multiple work shifts, with team members alternating through a work shift.
Incident escalation: This is what happens when a person can't resolve an incident themselves and needs to report the incident to someone else (team, person, etc.).
Escalation policy: This answers the question of how your organization handles these incidents.
It defines who should be notified when an incident is triggered, and who the incident should escalate to if the first responder isn’t available.
Once one person has responded, the escalation policy will stop escalation, and no further notifications will be sent.
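To make the escalation logic concrete, here is a minimal sketch of how a policy could be walked step by step. It is only an illustration: the EscalationStep type, the runEscalation function and the responds callback are hypothetical names and do not come from Splunk On-Call.

```typescript
// Hypothetical model: a policy is an ordered list of steps, and each step
// names the responders that are notified together.
type EscalationStep = { responders: string[] };

type EscalationResult = {
  notified: string[]; // everyone who received a notification
  acknowledgedBy: string | null; // first person who responded, if any
};

// `responds` stands in for "this person acknowledged the notification
// before the step timed out".
function runEscalation(
  steps: EscalationStep[],
  responds: (responder: string) => boolean,
): EscalationResult {
  const notified: string[] = [];
  for (const step of steps) {
    // Everyone in the current step is notified.
    for (const responder of step.responders) {
      notified.push(responder);
    }
    // As soon as one person responds, escalation stops.
    for (const responder of step.responders) {
      if (responds(responder)) {
        return { notified, acknowledgedBy: responder };
      }
    }
  }
  // Nobody responded at any step.
  return { notified, acknowledgedBy: null };
}

const policy: EscalationStep[] = [
  { responders: ['alice'] },
  { responders: ['bob', 'carol'] },
  { responders: ['dave'] },
];

// bob responds in the second step, so dave is never notified.
const outcome = runEscalation(policy, name => name === 'bob');
console.log(outcome.acknowledgedBy, outcome.notified);
```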
Splunk On-Call Escalation policy creation
For the moment only one incident management plugin exists, PagerDuty.
Some people use Splunk On-Call (formerly VictorOps) as their on-duty rotation manager and would like it to be integrated into Backstage.
The goal of the issue is to implement a new plugin for Splunk On-Call.
This plugin should provide:
- A list of incidents
- A way to trigger a new incident to specific users and/or teams
- A way to acknowledge/resolve an incident
- Details about the people on call
The code blocks are intentionally incomplete for the sake of readability.
If you want to read the full code you'll find it in the PR link at the top.
Since this PR is still open, some parts are likely to change.
I will keep the article updated if any changes are made.
To create a new plugin, the Backstage CLI already has a command that we can use:
This will set up a new Backstage plugin with the ID we provided.
We now have a working example plugin that we will use as a base for adding our features.
Our plugin will be created with the createPlugin method, which will create a new plugin instance:
The plugin looks like a separate package, it has a
package.json and a
This keeps the plugins independent of each other, so we can deploy them separately and work on them in isolation from the rest of the application.
Note that the index.ts files are there to let us import from the folder path and not specific files.
The marketplace is used to:
- List the available plugins with information (title, description, tag, etc)
- Show who contributed it (user or company)
- Link to appropriate documentation
To add the plugin to the marketplace, we need to create a file in microsite/data/plugins with our plugin's information.
To connect our plugin to the Splunk On-Call API, we will keep our API calls separate from the UI part.
Here is the list of all the routes we are going to implement.
- Fetches a list of incidents.
- Fetches the list of users in an escalation policy.
- Triggers an incident to specific users and/or specific teams.
- Resolves an incident.
- Acknowledges an incident.
- Fetches a list of users for the user organization.
- Fetches a list of teams for the user organization.
- Fetches a list of escalation policies for the user organization.
The proxy will allow us to redirect calls from /splunk-on-call to the Splunk On-Call API at https://api.victorops.com/api-public and add authentication information in the headers.
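As an illustration of what this redirection amounts to, here is a small sketch of the rewrite a proxy rule like this performs. It is not how the Backstage proxy is implemented or configured: proxyRequest and its parameters are hypothetical, and the X-VO-Api-Id and X-VO-Api-Key header names are assumed to be the ones the Splunk On-Call API expects for credentials.

```typescript
// Prefix and target are taken from the paragraph above.
const PROXY_PREFIX = '/splunk-on-call';
const TARGET = 'https://api.victorops.com/api-public';

type ProxiedRequest = { url: string; headers: Record<string, string> };

// Hypothetical helper: maps an incoming proxy path to the upstream URL and
// attaches the credentials so the frontend never has to know them.
function proxyRequest(path: string, apiId: string, apiKey: string): ProxiedRequest {
  if (path.indexOf(PROXY_PREFIX) !== 0) {
    throw new Error(`Path ${path} is not handled by this proxy rule`);
  }
  return {
    // Keep everything after the prefix and append it to the target.
    url: TARGET + path.slice(PROXY_PREFIX.length),
    headers: {
      'X-VO-Api-Id': apiId, // assumed header name
      'X-VO-Api-Key': apiKey, // assumed header name
    },
  };
}

const proxied = proxyRequest('/splunk-on-call/v1/incidents', 'my-api-id', 'my-api-key');
console.log(proxied.url);
```

The point of this design is that the API ID and key live on the backend side only, while the plugin's frontend code calls a local path.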
By default, the proxy is already added to the default Backstage project:
To add our proxy config, we will put our configuration under the
proxy key of the
To understand how the API works, we will take the getIncidents method as an example.
Note that the factory method fromConfig takes discoveryApi as a parameter, which allows us to retrieve some variables related to the app configuration (this.config.discoveryApi.getBaseUrl('proxy') refers to the proxy base URL).
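The pattern described above can be sketched as follows. This is a simplified stand-in and not the code from the PR: the DiscoveryApi and FetchLike types are minimal local definitions so the example is self-contained, the injected fetchFn parameter and the trimmed Incident shape are assumptions, and the /v1/incidents path and the incidents key in the response are assumed from the Splunk On-Call public API.

```typescript
// Minimal local stand-ins; in a real plugin DiscoveryApi comes from
// Backstage and fetch from the browser.
type DiscoveryApi = { getBaseUrl(pluginId: string): Promise<string> };
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Assumed, heavily trimmed incident shape.
type Incident = { incidentNumber: string; currentPhase: string };

type ClientConfig = { discoveryApi: DiscoveryApi; fetchFn: FetchLike };

class SplunkOnCallClient {
  private readonly config: ClientConfig;

  // Factory method: callers pass in the APIs the client depends on.
  static fromConfig(discoveryApi: DiscoveryApi, fetchFn: FetchLike): SplunkOnCallClient {
    return new SplunkOnCallClient({ discoveryApi, fetchFn });
  }

  private constructor(config: ClientConfig) {
    this.config = config;
  }

  async getIncidents(): Promise<Incident[]> {
    // Resolve the proxy base URL, then call the proxied Splunk On-Call route.
    const proxyUrl = await this.config.discoveryApi.getBaseUrl('proxy');
    const response = await this.config.fetchFn(`${proxyUrl}/splunk-on-call/v1/incidents`);
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    const body = (await response.json()) as { incidents: Incident[] };
    return body.incidents;
  }
}
```

Injecting the fetch function is only a convenience for this sketch: it lets us exercise the client with a fake in tests without touching the network.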
The components folder contains all of our components.
This is the main root component, the one that includes the rest of the child components.
Splunk On-Call Card component
This component is used to display the list of the different incidents with their associated information (creator name, creation date, etc).
An incident can have several statuses:
It also has an action section where the user can resolve the incident.
Incident list component
Here is the code of the main
This component is used to display the list of the people on call.
The logic of the component is broadly the same as for the incident list.
We will see here how to retrieve and transform the list of users returned by the GET /api-public/v1/oncall/current call.
Here is what the data returned by the Splunk On-Call API looks like:
As we have to return a list with the users, we will filter our teamsOnCall to keep only those that match our current team, and we will then retrieve the users within that team:
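Here is a minimal sketch of that filtering step. The OnCall type below is an assumed, trimmed version of the response (including the onCalluser field name), and getOnCallUsernames is a hypothetical helper, so treat it as an illustration and not as the plugin's actual code.

```typescript
// Assumed shape of one entry of teamsOnCall, reduced to the fields we use.
type OnCall = {
  team: { name: string; slug: string };
  oncallNow: { users: { onCalluser: { username: string } }[] }[];
};

// Keeps only the entries for the given team and collects the usernames of
// everyone currently on call, without duplicates.
function getOnCallUsernames(teamsOnCall: OnCall[], teamName: string): string[] {
  const usernames: string[] = [];
  for (const entry of teamsOnCall) {
    if (entry.team.name !== teamName) {
      continue; // not our current team
    }
    for (const shift of entry.oncallNow) {
      for (const user of shift.users) {
        const username = user.onCalluser.username;
        // The same person can be on call through several policies.
        if (usernames.indexOf(username) === -1) {
          usernames.push(username);
        }
      }
    }
  }
  return usernames;
}

const sampleTeamsOnCall: OnCall[] = [
  {
    team: { name: 'team-a', slug: 'team-a-slug' },
    oncallNow: [
      { users: [{ onCalluser: { username: 'alice' } }] },
      { users: [{ onCalluser: { username: 'alice' } }, { onCalluser: { username: 'bob' } }] },
    ],
  },
  {
    team: { name: 'team-b', slug: 'team-b-slug' },
    oncallNow: [{ users: [{ onCalluser: { username: 'carol' } }] }],
  },
];

console.log(getOnCallUsernames(sampleTeamsOnCall, 'team-a'));
```

The usernames collected this way can then be matched against the full user list to display names and contact details.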
This component is used to trigger a new incident to specific users and/or teams.
This component is used to display various errors to the end user.
An error can appear if:
- the Splunk On-Call API_KEY and/or API_ID are not provided
- the Splunk On-Call username is not provided or invalid
- the Splunk On-Call team is not provided
Let's take the example of the
Before writing our tests, we need to set up the different mocked APIs:
alertApiRef: Core Utility API which is used to report alerts.
splunkOnCallApiRef: Splunk On-Call plugin API
We will pass these values to the ApiProvider (a higher-order component), which will provide a React Context.Provider with our APIs.
We test the case where the list of incidents is empty and we have to display the
Empty incident list
We test the case where we have an incident list and the incidents are correctly displayed.
Valid incident list
We test the case where we have an error while fetching the
This will trigger the alert API.
Incidents fetching error
The final step is to add a changeset, which will contain the list of our file changes.
It lets us declare how our changes should be released.
In our case we only have
Here is the final result with a sample workflow:
- Creation of a new incident to the current team
- Acknowledgement of the incident
- Resolution of the incident
- Incident display on the Splunk On-Call dashboard
I found some inconsistencies in the Splunk On-Call API documentation, especially in the models.
As a result, I had to revisit my TypeScript models several times to fix them.
This contribution allowed me to use an incident management tool (Splunk On-Call) and to become familiar with creating plugins for Backstage.
It allowed me to interact with parts of Backstage that I had never contributed to before.