We present a technology transfer project in which Cloud Computing and Open Data play a crucial role. Our aim is to model data from the Spanish car insurance sector accurately and efficiently. Given the vast amount of data and the complexity of the models, Cloud Computing is needed to make the implementation not only efficient but feasible at all. The system was deployed on the OpenStack cloud platform of our Institute and is portable to other cloud services such as Amazon Web Services. In addition to cloud technologies, we also benefit from Big Data tools such as TensorFlow, ElasticSearch, Kibana and Spark.
The insurance sector is an important and growing part of the Spanish economy, accounting for 5.5% of GDP in 2017. Our data comes primarily from Avant2, the quote calculator of the software company Codeoscopic. This calculator allows insurance agents to evaluate a specific risk (vehicle, driver, ...) with many insurance companies and obtain quotes for different modalities. However, the companies' quoting criteria are a black box. Uncovering information about this underlying process could shed light on the differences between companies or regions and, ultimately, improve the Avant2 platform. Nonetheless, the companies' quotes were not completely explained by the variables directly associated with the risk. To overcome this hurdle, we also enrich our model with geographical data such as climate conditions, traffic accidents and socio-economic variables. This information was collected from several open data portals. Once we incorporated the open data component, we found a significant improvement in the model's accuracy compared to using internal data only.
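The enrichment step described above can be pictured as a join between quote records and region-level open data. The following is only a minimal sketch under stated assumptions: the field names (`province`, `rain_days`, `accidents_per_1000`), the join key, and all values are hypothetical placeholders, not the actual Avant2 schema or the project's pipeline.

```python
from typing import Any, Dict, List

# Hypothetical internal quote records (field names are illustrative only).
quotes: List[Dict[str, Any]] = [
    {"quote_id": 1, "province": "Madrid", "driver_age": 34, "premium": 410.0},
    {"quote_id": 2, "province": "Sevilla", "driver_age": 52, "premium": 365.0},
    {"quote_id": 3, "province": "Lugo", "driver_age": 27, "premium": 480.0},
]

# Hypothetical open-data indicators per province (made-up placeholder values).
open_data: Dict[str, Dict[str, float]] = {
    "Madrid": {"rain_days": 60, "accidents_per_1000": 3.1},
    "Sevilla": {"rain_days": 50, "accidents_per_1000": 2.7},
}


def enrich_quotes(
    quotes: List[Dict[str, Any]],
    open_data: Dict[str, Dict[str, float]],
) -> List[Dict[str, Any]]:
    """Left-join each quote with the open-data row of its province."""
    # Collect every open-data feature so all output rows share the same keys.
    feature_names = sorted({name for row in open_data.values() for name in row})
    enriched = []
    for quote in quotes:
        extra = open_data.get(quote["province"], {})
        row = dict(quote)  # copy, so the input records are left untouched
        for name in feature_names:
            row[name] = extra.get(name)  # None when the province has no data
        enriched.append(row)
    return enriched


enriched = enrich_quotes(quotes, open_data)
```

A left join is assumed here so that quotes from regions without open data are kept, with missing indicators marked as `None`, rather than silently dropped before modelling.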