Most business-to-business transactions are based on credit terms, so credit risk management is a priority for companies. Assessing the likelihood of bad debts efficiently is hard, and companies rely on credit rating and insurance services to manage that risk.
In this case we will build a Machine Learning model for predicting bankruptcy among Danish companies. The model will be an additional risk management instrument for companies, used both for the initial credit rating and ongoing assessment of debtors.
To build a machine learning model we need data, and Denmark offers a good starting point: financial, people and master data from CVR, the Danish Central Business Register.
Academia and financial market players have researched machine learning for credit risk management intensively. A recent study from Moody’s indicates that adding transactional data, social media data and other varied data sources will improve prediction accuracy.
However, few companies hold enough transactional data, and this type of data (e.g. payment patterns) is sensitive and only shared with highly trusted parties.
In this machine learning model, we will combine data from CVR with transactional payment data from debtors. The model could be developed by several companies in cooperation, but in this fictional case we assume that a financial application (ERP) provider will chair the development.
- The first step is to determine the variables (features) that we believe will best answer our question: which companies will go bankrupt within the next 12 months? The variables come from CVR data, and payment data will be added from the ERP provider's installed base of customers using its financial application.
- With our platform for Decentralized Machine Learning, the model is trained locally with payment data from each customer.
- The platform ensures full privacy: only model parameters are shared, and no information about any participant's payment transactions is revealed.
- CVR data are added and the model is trained with both payment patterns and all relevant variables from the CVR database.
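The steps above can be illustrated with a minimal sketch. The text does not name a specific algorithm or model, so this example assumes federated averaging over a logistic regression, which is one common way to train on local data while sharing only parameters. The data is synthetic, and the feature columns (a CVR-style financial ratio and a payment delay measure) as well as the function names `local_update`, `federated_average` and `make_customer_data` are hypothetical choices made for illustration.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=20):
    """Run a few gradient-descent steps of logistic regression on one customer's data."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted bankruptcy probability
        grad = X.T @ (p - y) / len(y)        # gradient of the mean log loss
        w -= lr * grad
    return w

def federated_average(local_weights, sample_counts):
    """Average parameters, weighted by each participant's number of samples."""
    counts = np.asarray(sample_counts, dtype=float)
    stacked = np.stack(local_weights)
    return (stacked * counts[:, None]).sum(axis=0) / counts.sum()

def make_customer_data(rng, n, true_w):
    """Synthetic stand-in for one customer's debtors (assumption, not real data).

    Columns: bias term, a CVR-style financial ratio, a payment delay measure.
    """
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    p = 1.0 / (1.0 + np.exp(-(X @ true_w)))
    y = (rng.random(n) < p).astype(float)    # 1.0 = bankrupt within 12 months
    return X, y

rng = np.random.default_rng(0)
true_w = np.array([-1.0, 1.5, 2.0])
customers = [make_customer_data(rng, n, true_w) for n in (200, 500, 300)]

global_w = np.zeros(3)
for _ in range(30):
    # Each customer trains locally; only the resulting parameters leave the site.
    local_ws = [local_update(global_w, X, y) for X, y in customers]
    global_w = federated_average(local_ws, [len(y) for _, y in customers])

print("global model parameters:", global_w)
```

In this sketch the raw payment rows never leave `local_update`; the coordinator only sees the weight vectors it averages. A real deployment would likely need further protection for the shared parameters, which this example does not attempt.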
Finally, the ERP provider turns the trained model into a new service. The model is integrated into its portfolio of financial applications, giving customers a valuable new credit risk management tool.
What are the alternatives to distributed machine learning? We can think of three other options:
- Very large enterprises with sufficient payment data (and history) can train a model in-house with traditional machine learning. They have no need for distributed machine learning.
- Several companies form a collaboration as trusted parties. One party collects and stores data from all parties, and the model is trained with traditional machine learning.
- If a cloud financial provider is allowed to collect and combine user data, it can collect payment data across its customer base and train a model with traditional machine learning.