Scaling Risk Mitigation through Machine Learning at Inter-American Development Bank

At a Glance

Decision-makers at the Inter-American Development Bank were looking to create an effective way to mitigate risks in their infrastructure investment portfolio. To achieve this, the bank partnered with Data Society to create an innovative, machine-learning-based model that would enable decision-makers to proactively identify projects with significant risk factors and take preemptive actions to save time and budget and to secure project success.

The Challenge

Historically, conventional statistical models have given bankers risk-related insights but could not predict the overall success or failure of infrastructure projects. Studies of infrastructure projects around the world find that 9 out of 10 experience cost overruns, which vary by sector and average between 20% and 45% of baseline costs (Flyvbjerg, 2007). The Inter-American Development Bank (IDB) wanted to test its hypothesis that subtle factors can affect the delivery and budget of an infrastructure effort.

The Solution

Data Society used machine learning to develop a model that predicts which projects are likely to exceed their original contracting limits, enabling decision-makers to proactively identify projects with significant risk factors and take preemptive actions to mitigate risks. The new solution was used to assess the likely success or failure of IDB’s ongoing public construction projects in Paraguay.

User-Friendly Application

To make the analysis simple for a non-technical audience to digest, Data Society developed an easy-to-use application that allows users to browse IDB contracts and investigate those that present the most risk.

Leveraging the Benefits of Open Sourcing

Open-source data enabled Data Society to scale the modeling framework efficiently. The Open Contracting Partnership, a consortium of hundreds of stakeholders across government, business, and civil society with support from The World Bank, developed the Open Contracting Data Standard (OCDS), an internationally accepted set of guidelines for contracting data. The standard provides a common format for publishing data about the ~USD 9.5 trillion in government contracts awarded globally each year.

Data Society leveraged the OCDS’s standard data model, which reflects the way governments structure and award contracts. The Item Classification Scheme (ICS) within the OCDS includes fields that describe the specific items included in a procurement. Because each item is categorized under a unique ID, the classification provides a consistent basis for contract risk analysis. The application allows users to browse IDB contracts, investigate those flagged as high risk, and explore patterns in the raw data.
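To illustrate how item classification supports categorization, here is a minimal Python sketch. It assumes the OCDS release layout in which each entry of `tender.items` carries a `classification` object with `scheme`, `id`, and `description` fields. The sample release, its placeholder values, and the helper name `items_by_classification` are hypothetical and are not taken from the IDB project.

```python
from collections import defaultdict

# Hypothetical OCDS-style release; all identifiers and values are placeholders.
sample_release = {
    "ocid": "example-0001",
    "tender": {
        "items": [
            {"id": "1", "description": "Road paving",
             "classification": {"scheme": "EXAMPLE", "id": "A-100",
                                "description": "Road works"}},
            {"id": "2", "description": "Road resurfacing",
             "classification": {"scheme": "EXAMPLE", "id": "A-100",
                                "description": "Road works"}},
            {"id": "3", "description": "Bridge inspection",
             "classification": {"scheme": "EXAMPLE", "id": "B-200",
                                "description": "Inspection services"}},
        ]
    },
}


def items_by_classification(release):
    """Group item descriptions by their (scheme, id) classification key."""
    groups = defaultdict(list)
    for item in release.get("tender", {}).get("items", []):
        cls = item.get("classification") or {}
        key = (cls.get("scheme"), cls.get("id"))
        groups[key].append(item.get("description"))
    return dict(groups)


groups = items_by_classification(sample_release)
for key, descriptions in groups.items():
    print(key, descriptions)
```

Grouping on the classification key rather than on free-text descriptions lets items from different contracts be compared on the same basis.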

Modeling Approach

When first examining IDB’s infrastructure project contracts, Data Society recognized a class imbalance that had to be corrected in order to produce accurate forecasts. Research found that 80-85% of the contracts in the data set were unmodified and 15-20% were modified. As a result, early models achieved high accuracy mostly by predicting that a contract would not be modified. Data Society therefore focused further training on the few contracts that did have modifications, using ROSE and SMOTE oversampling techniques to bring the share of modified contracts up to ~30-40%. Data Society then used the re-balanced data to develop the final model.
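The imbalance problem and the oversampling idea can be sketched in a few lines of Python. This is a simplified, SMOTE-style interpolation written with the standard library only; it is not the team's implementation and not the ROSE or SMOTE packages themselves. The data, the 85/15 split, the 35% target share, and the function name `smote_like` are assumptions chosen for illustration.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical two-feature data: 85 unmodified and 15 modified contracts.
rng = random.Random(1)
unmodified = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(85)]
modified = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(15)]
total = len(unmodified) + len(modified)

# A model that always predicts "unmodified" already scores this accuracy,
# even though it never identifies a single modified contract.
baseline_accuracy = len(unmodified) / total

# Number of synthetic modified contracts needed to reach the target share.
target_share = 0.35
n_new = math.ceil((target_share * total - len(modified)) / (1 - target_share))

synthetic = smote_like(modified, n_new)
balanced_modified = modified + synthetic
new_share = len(balanced_modified) / (len(unmodified) + len(balanced_modified))

print(f"baseline accuracy: {baseline_accuracy:.2f}")
print(f"synthetic points added: {n_new}, modified share: {new_share:.2f}")
```

Because each synthetic point lies on a segment between two real modified contracts, the new samples stay within the region the minority class already occupies instead of duplicating existing rows.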

After testing numerous machine learning techniques, Data Society selected a gradient boosting model optimized via cross-validation. The algorithm builds an ensemble of weak learners sequentially to obtain a strong learner. Data Society then applied cross-validation to minimize the effects of randomness. This gave analysts and IDB stakeholders a more reliable picture of how well the model captured the real-world phenomena driving the outcomes of infrastructure projects, which was especially useful given the imbalanced data sets on which the model was built.
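As a rough illustration of both ideas, the sketch below builds a small gradient boosting classifier from decision stumps and evaluates it with k-fold cross-validation, using only the Python standard library. It makes several simplifying assumptions that are not from the source: squared-error loss on 0/1 labels instead of a classification loss, single-split stumps as the weak learners, synthetic two-feature data, and arbitrary settings for the number of rounds, learning rate, and folds. It is not the model Data Society deployed.

```python
import random


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]  # (feature, threshold, left_value, right_value)


def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_boosting(X, y, n_rounds=15, learning_rate=0.3):
    """Fit stumps one after another, each to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the negative gradient of the squared-error loss.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, lr = model
    score = base + lr * sum(stump_predict(s, row) for s in stumps)
    return 1 if score >= 0.5 else 0


def cross_val_accuracy(X, y, k=5, seed=0):
    """Return the held-out accuracy of each of k shuffled folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_boosting([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return scores


# Hypothetical data: 40 unmodified (0) and 20 modified (1) contracts.
rng = random.Random(42)
X = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
     + [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(20)])
y = [0] * 40 + [1] * 20

scores = cross_val_accuracy(X, y)
mean_accuracy = sum(scores) / len(scores)
print("fold accuracies:", [round(s, 2) for s in scores])
print("mean accuracy:", round(mean_accuracy, 2))
```

Averaging over several folds, rather than trusting one train/test split, is what reduces the influence of a lucky or unlucky split on the reported performance.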

The Results

Data Society created a scalable, machine-learning-driven modeling framework to identify projects with higher-than-average risk levels. What made the approach scalable was the use of the Open Contracting Data Standard (OCDS) when pulling the data.

Data Society created a comprehensive database and data schema to gather the data necessary to operate at a larger scale. For the IDB and its client countries, adherence to a data standard is an essential policy and data governance prerequisite. Machine learning was applied to risk management and contract structuring, where it is most useful when starting from a comprehensive data set.

To deploy at scale, a standardized pipeline was designed for the extract, transform, load (ETL) process, which proceeds in four steps:

Web scraped or API data collection
Transform data into usable format
Load data into scalable cloud system
Run model to inform underwriting
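The four steps above can be sketched as a small Python pipeline. Everything specific here is an assumption made for illustration: the sample releases are invented, an in-memory SQLite database stands in for the scalable cloud system, and the scoring step is a placeholder threshold rule rather than the trained model. The field names (`ocid`, `contracts`, `value.amount`, `amendments`) follow the OCDS release layout.

```python
import json
import sqlite3

# Hypothetical OCDS-style releases standing in for scraped or API data.
RAW = json.dumps([
    {"ocid": "example-001", "contracts": [
        {"id": "C-1", "value": {"amount": 250000}, "amendments": []}]},
    {"ocid": "example-002", "contracts": [
        {"id": "C-2", "value": {"amount": 4800000},
         "amendments": [{"id": "A-1"}]},
        {"id": "C-3", "value": {"amount": 900000}}]},
])


def extract():
    """Step 1: collect data. A real pipeline would call an API or scraper."""
    return json.loads(RAW)


def transform(releases):
    """Step 2: flatten each contract into one row of model-ready fields."""
    rows = []
    for release in releases:
        for contract in release.get("contracts", []):
            rows.append((
                release["ocid"],
                contract["id"],
                contract["value"]["amount"],
                len(contract.get("amendments", [])),
            ))
    return rows


def load(rows, conn):
    """Step 3: load rows into a database (SQLite stands in for the cloud)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contracts "
        "(ocid TEXT, contract_id TEXT, amount REAL, n_amendments INTEGER)"
    )
    conn.executemany("INSERT INTO contracts VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_model(conn, amount_threshold=1_000_000):
    """Step 4: flag contracts. Placeholder rule, NOT the trained model."""
    cursor = conn.execute(
        "SELECT contract_id FROM contracts WHERE amount >= ? "
        "ORDER BY contract_id",
        (amount_threshold,),
    )
    return [row[0] for row in cursor]


releases = extract()
rows = transform(releases)
conn = sqlite3.connect(":memory:")
load(rows, conn)
flagged = run_model(conn)
print("flagged contracts:", flagged)
```

Keeping each step as its own function means the same transform and scoring code can be reused when a new country's OCDS feed is added, with only the extract step changing.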

Ultimately, by applying this consistent data governance framework and reporting format and standardizing processes across the 26 countries it serves, the IDB is able to leverage machine learning capabilities to automatically flag infrastructure projects that are at risk of not meeting expectations.
