OLX Group Standardizes Incident Management with Customized Tools

About OLX Group

OLX Group operates one of the fastest-growing networks of trading platforms globally. Serving 322 million people every month in 30+ countries around the world, OLX Group helps users buy and sell cars, find housing, get jobs, buy and sell household goods, and much more. With more than 20 well-loved local brands including Avito, OLX, OLX Autos, Otomoto, and Property24, its solutions are built to be safe, smart, and convenient for its customers.

OLX Group is powered by a team of 11,000+ people around the world. OLX Group is a division of Prosus, a global consumer internet group and one of the largest technology investors in the world.

The Challenge

OLX had previously built a Slack bot called IMP (Incident Management Process) to manage incidents across its various transaction networks. The organization needed to revamp IMP to centralize incident reporting, but the SRE team had no dedicated developers. OLX also needed to quickly implement the incident management process it had already defined.

Difficulties with Incident Management

“Worldwide, we have many different teams, countries, cultures, and even different technical capabilities. Teams were taking different approaches to solving technical issues. There were teams that had multiple incidents every day, and there were others with less than a few incidents in a month, but no one had that information organized. We wanted a way to keep all the teams in sync,” explained Paulo Oliveira, SRE Manager, OLX Group.

“Nearly every month, an urgent incident was occurring and woke up a team member in the middle of the night. There was a lack of insight into how issues were progressing, and an overall lack of visibility and standardization of incidents. Additionally, there was unclear ownership of documentation, which led to teams tackling issues in different ways, as well as insufficient documentation.

“We needed to standardize the incident response and management process. We had to ensure each incident was reported appropriately and followed up on using standard procedures, as well as be able to analyze metrics generated by the IMP tool based on the incidents. For making informed decisions, accurate metrics are essential,” said Mr. Oliveira.

The Solution

OLX had worked with Netrix in the past and, based on that success, chose Netrix again to build the new solution.

The Results

Netrix moved quickly and developed a new version of the IMP, an application that integrates with numerous other systems. After that success, Netrix supported OLX by building two additional apps.

Automatically Generate Incident Tickets & Kick Off Resolution

When an incident occurs, the on-call tech is alerted and must confirm that the incident is valid. IMP will then create a Jira ticket automatically with the known information and a channel in Slack for the relevant team members.
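The flow described above can be pictured with a minimal sketch. This is illustrative only and is not IMP's code: the `Alert` fields, the `InMemoryTracker` and `InMemoryChat` stand-ins, the `INC` project key, and the channel naming scheme are all assumptions made for the example. No real Jira or Slack API is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    # Hypothetical fields; the real alert payload is not described in the source.
    service: str
    summary: str
    team: str


@dataclass
class Incident:
    ticket_key: str
    channel: str
    alert: Alert


class InMemoryTracker:
    """Stand-in for a ticketing system (not a real Jira client)."""

    def __init__(self, project: str = "INC") -> None:
        self.project = project
        self.tickets: dict = {}

    def create_ticket(self, alert: Alert) -> str:
        # Pre-fill the ticket with the information already known from the alert.
        key = f"{self.project}-{len(self.tickets) + 1}"
        self.tickets[key] = {
            "summary": alert.summary,
            "service": alert.service,
            "team": alert.team,
        }
        return key


class InMemoryChat:
    """Stand-in for a chat workspace (not a real Slack client)."""

    def __init__(self) -> None:
        self.channels: list = []

    def create_channel(self, name: str) -> str:
        self.channels.append(name)
        return name


def handle_alert(alert: Alert, confirmed_by_on_call: bool,
                 tracker: InMemoryTracker, chat: InMemoryChat) -> Optional[Incident]:
    """Create a ticket and a channel only after the on-call tech confirms."""
    if not confirmed_by_on_call:
        return None  # Invalid alert: nothing is created.
    key = tracker.create_ticket(alert)
    channel = chat.create_channel(f"incident-{key.lower()}")
    return Incident(ticket_key=key, channel=channel, alert=alert)
```

The point of the sketch is the ordering: nothing is created until the on-call confirmation arrives, and the ticket and channel are then created together from the same alert data.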

Using Jira to store incident information also eliminates the need to manage an additional database. IMP integrates with OpsLevel, Jira, and Statuspage, and will soon also integrate with Zoom.

A significant value of IMP is that, beyond generating the ticket, it allows users to clearly and intuitively follow the workflow of an incident. Each status transition is accompanied by the correct buttons, helpful suggestions, and relevant references. IMP supports the entire incident workflow from start to finish, making every step easily accessible, thus simplifying tracking and enabling clear visualization of statistics.
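One way to picture a guided workflow like this is as a small state machine in which each status exposes only the transitions that are valid from it. The sketch below is an assumption-laden illustration, not IMP's implementation: the status names (`open`, `investigating`, `mitigated`, `resolved`) and the `TRANSITIONS` table are hypothetical.

```python
# Hypothetical incident statuses and the transitions allowed from each one.
TRANSITIONS = {
    "open": ["investigating"],
    "investigating": ["mitigated", "resolved"],
    "mitigated": ["resolved"],
    "resolved": [],
}


def available_actions(status: str) -> list:
    """Return the next statuses a user could be offered (e.g. as buttons)."""
    return list(TRANSITIONS.get(status, []))


def transition(status: str, target: str) -> str:
    """Move to the target status, rejecting transitions the workflow forbids."""
    if target not in TRANSITIONS.get(status, []):
        raise ValueError(f"Cannot move from {status!r} to {target!r}")
    return target
```

Offering only the actions returned by `available_actions` is what would keep every team on the same path through an incident, and recording each accepted transition is what would make per-status statistics possible.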

Organization-Wide Visibility

Engineering leadership can now view detailed analytics for each team in Europe. IMP features dashboards in OpsLevel with charts and graphs showing metrics such as the total number of incidents for each team.

“The tool is heavily used. In the absence of the Netrix team, we wouldn’t have had the resources to do it in such a small time frame. Netrix’s expertise complements our skills, for instance knowledge in UX/UI and front-end development, which the SRE team lacked at the time. There is a great deal of speed in the development process of Netrix. As a result of their work along with the effort our developer teams have put into improving the code, we have had fewer and fewer incidents per team over time. The ideas Netrix contributed are well thought out, and we have productive discussions together. Even though Netrix is an external service provider, we treat them like a member of our team,” said Mr. Oliveira.

Standardized Tools & Processes

“Netrix also helped the SRE team develop the front end of the Service Shaper tool, which is currently in production and is one of the most important projects on the team right now. We have a lot of internal procedures to create services. Every time we implement a new service, such as cloud procedures, databases, and security, there is a set of procedures and documentation that must be done. With Service Shaper, developers can focus on their specific programming tasks, while ensuring they’re compliant with numerous requirements. The developer selects the language template, answers a few questions, and Service Shaper creates a GIT project and all the required infrastructure. The aim is to minimize the developer’s repetitive tasks,” explained Mr. Oliveira.

This project has begun to unify work processes. With the three applications Netrix has helped develop, OLX teams in Europe now use the same standard tools and processes.

Helpful Documentation and Training

Netrix provided documentation that now serves as the OLX guidelines for onboarding and training. It clarifies standards and processes across geographies and teams, with detailed information such as how to create an incident.

Experience The Impact

No matter what challenge you’re facing today, our team of technical experts can get you started on a path to a better solution. We’ll partner with you to:

  • Understand your current technology environment
  • Interview key stakeholders to understand the root of the business issue(s)
  • Propose a solution with projected timelines, budget, and dependencies