Dirty Data: How incorrect and false data leads to losses for companies
Published on 6 February 2020
Open Data Wenalyze

Dirty data is data that is outdated, incomplete, erroneous or false. It has many causes: changes in a customer's circumstances, data entry errors, or even a client who has lied.

The presence of ‘dirty data’ in companies translates into economic losses. The insurance and banking sectors are particularly sensitive to this problem, as we know well at Wenalyze. Insurers accept that they must underwrite risks with incorrect or missing data, while banks are affected mainly in credit and regulatory matters.

Financial institutions offer standard prices because inaccurate and scarce data do not allow them to adjust the price to each customer. As a result, some customers pay more than their risk warrants and others pay less. This increases the likelihood both of losing customers from the portfolio (low-risk customers who pay too much) and of suffering losses (high-risk customers who pay too little).
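The effect of a standard price can be illustrated with a minimal sketch. The figures, the `Customer` class and the idea of pricing at the portfolio's average expected loss are all assumptions made for illustration; they are not Wenalyze's pricing model.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    expected_annual_loss: float  # hypothetical cost of the customer's real risk


def standard_price(customers):
    """One flat price: the average expected loss across the portfolio."""
    return sum(c.expected_annual_loss for c in customers) / len(customers)


def price_gaps(customers):
    """Flat price minus real risk cost, per customer.

    Positive gap: the customer overpays (risk of leaving the portfolio).
    Negative gap: the customer underpays (risk of losses for the institution).
    """
    flat = standard_price(customers)
    return {c.name: flat - c.expected_annual_loss for c in customers}


# Invented two-customer portfolio
portfolio = [
    Customer("low-risk SME", 400.0),
    Customer("high-risk SME", 1600.0),
]
gaps = price_gaps(portfolio)
for name, gap in gaps.items():
    print(f"{name}: gap of {gap:+.2f} against the standard price")
```

In this invented portfolio both customers are charged the same amount, so the low-risk SME overpays by exactly what the high-risk SME underpays.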

With SMEs, the problem is compounded. These companies change constantly, and so do their risks, especially in the fields that most directly affect price: activity, sector, number of employees, online products, international expansion, etc.

However, the policy or financial product is underwritten, and its data therefore reviewed, once a year at best.

At Wenalyze we have built our tool to solve this problem in three simple steps: review, enrichment and automation.



In the review step, the data held by the insurance company or bank are checked and verified. The name of the SME and its office address are retrieved and cross-checked against open data sources. The financial institution is then informed (on a visual platform or via an API connection) which data have been identified as erroneous and should be updated. The platform can also be set up to update them automatically.
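As a rough sketch of what such a cross-check could look like, the example below compares the name and address held by the institution with those found in an open data source and flags the fields that disagree. The record layout, the function names, the sample records and the 0.85 similarity threshold are all hypothetical; the article does not describe how Wenalyze implements this step.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase, drop commas and full stops, and collapse whitespace."""
    cleaned = value.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity between two normalized strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def review_record(internal: dict, open_data: dict, threshold: float = 0.85) -> list:
    """Return the fields whose internal value disagrees with the open data value."""
    flagged = []
    for field in ("name", "address"):
        if similarity(internal[field], open_data[field]) < threshold:
            flagged.append(field)
    return flagged


# Invented records: same company name, but the address has changed
internal = {"name": "Acme Bakery Ltd.", "address": "1 Old Street, Valencia"}
open_data = {"name": "ACME Bakery Ltd", "address": "25 New Avenue, Madrid"}

print(review_record(internal, open_data))
```

Fuzzy matching is used here instead of strict equality so that differences in capitalization or punctuation are not reported as errors.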


In the enrichment step, the data are extended and enriched. This process achieves two objectives: broadening the view of risk and creating business opportunities. The Wenalyze tool provides financial institutions with up-to-date traditional data fields such as number of employees, industry and sector. This ensures that, before policies or other products are renewed, the data that affect the price are up to date and the price reflects reality, which reduces the gap between the price offered and the real risk.

The tool also provides non-traditional data such as the type of food served in a restaurant, opening hours, customer ratings, the level of cyber security in SMEs with online stores, etc. Although these are not considered in price calculation models (such as actuarial models), they have a direct impact on the viability of the SME’s business.

In this process, the tool identifies which products are most likely to be of interest to the SME. Examples include offering cyber-risk insurance to an SME that sells products online, or directors’ insurance when the executive team expands significantly. The sales team can use these suggestions as levers to differentiate itself from the competition and to make its sales campaigns more effective.
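The two suggestions mentioned above can be written as simple rules. This is only an illustrative sketch: the profile fields (`sells_online`, `executives_before`, `executives_now`) and the 1.5 growth ratio used to define a "significant" expansion are assumptions, not part of the Wenalyze tool.

```python
def suggest_products(profile: dict, expansion_ratio: float = 1.5) -> list:
    """Return product suggestions for an SME profile using two simple rules."""
    suggestions = []

    # Rule 1: an SME selling products online may be interested in cyber-risk cover
    if profile.get("sells_online"):
        suggestions.append("cyber-risk insurance")

    # Rule 2: a significantly larger executive team may call for directors' insurance
    before = profile.get("executives_before", 0)
    now = profile.get("executives_now", 0)
    if before > 0 and now >= before * expansion_ratio:
        suggestions.append("directors' insurance")

    return suggestions


# Invented profile: online shop whose executive team grew from 2 to 4 people
print(suggest_products(
    {"sells_online": True, "executives_before": 2, "executives_now": 4}
))
```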


The third step, automation, is a by-product of the previous two. Aggregating, sorting and processing large amounts of data substantially reduces manual checking tasks, which allows insurance companies and banks to make their decisions more quickly.

Automating this process achieves two important goals: responding more quickly to customers (and therefore reducing the chances of them signing with competitors) and increasing the company’s underwriting capacity. Automation increases the number of decisions made, while the breadth of data provided makes those decisions more robust.