Data science lets the insurance industry analyze a wider variety of risk factors for risk mitigation and pricing. A one-size-fits-all approach to insurance only works when the pooled risk is constrained, as in the case of employer-provided insurance.
Data science helps insurance companies put their data to efficient use to drive more business and refine their product offerings. It can enable insurers to develop effective strategies to acquire new customers, develop personalized products, analyze risks, assist underwriters, implement fraud detection systems, and much more.
Data Science Opportunities in Insurance
The Promise of Data Science
Where it was once difficult to gather data about potential risks, today’s insurers have lots of data to work with.
On any given day, insurance data scientists may gather data from:
- Telematics devices
- Smartphones
- Social media
- CCTV footage
- Electoral rolls
- Credit reports
- Website analytics
- Government statistics
- Satellite data
What’s more, the advent of cloud computing helps companies to aggregate and store it all.
These sources tell insurers far more than the historical data from policy administration systems, claims management applications and billing systems, or the mortality reports of yesteryear. Through judicious use of data science, insurers can improve their pricing accuracy, create customized products and services, forge stronger customer relationships and facilitate more effective loss prevention.
To match that level of knowledge in the age of decentralization and the Internet, the insurance industry has turned to data science. Insurance data scientists combine analytical applications – e.g., behavioral models based on customer profile data – with a continuous stream of real-time data – e.g., satellite data, weather reports, vehicle sensors – to create detailed and personalized assessments of risk.
Picture a world in which wireless “telematics” devices transmit real-time driving data back to an insurance company.
Telematics-based insurance products have been around since the 1990s, when Progressive first launched them. But technology has come a long way in the intervening years. Telematics devices currently include embedded navigation systems (e.g., GM’s OnStar), on-board diagnostics (e.g., Progressive’s Snapshot) and smartphones.
These can be used to create personalized plans, which typically fall under one of two options:
- PAYD: Pay-As-You-Drive
- PHYD: Pay-How-You-Drive
PAYD is straightforward. It charges customers based on the number of miles or kilometers driven. Hollard Insurance, a South African insurer, has six mileage options.
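As a rough illustration of mileage-based pricing, the sketch below looks up a premium from annual distance bands. The band limits, the premiums and the `payd_premium` function are hypothetical; they are not taken from Hollard or any other insurer.

```python
from bisect import bisect_left

# Hypothetical mileage bands: (upper limit in km per year, annual premium).
# These figures are illustrative only, not any insurer's real tariff.
MILEAGE_BANDS = [
    (5_000, 300.0),
    (10_000, 420.0),
    (15_000, 540.0),
    (20_000, 650.0),
    (30_000, 800.0),
]
UNLIMITED_PREMIUM = 950.0  # charged above the highest band


def payd_premium(km_per_year: float) -> float:
    """Return the annual premium for the band containing km_per_year."""
    if km_per_year < 0:
        raise ValueError("distance cannot be negative")
    limits = [limit for limit, _ in MILEAGE_BANDS]
    # bisect_left finds the first band whose upper limit is >= the distance.
    index = bisect_left(limits, km_per_year)
    if index == len(MILEAGE_BANDS):
        return UNLIMITED_PREMIUM
    return MILEAGE_BANDS[index][1]
```

A driver who covers 8,000 km a year would fall into the second band under these made-up figures.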
But PAYD does not take into account driving habits. PHYD plans use telematics to monitor a wide variety of factors – speed, acceleration, cornering, braking, lane changing, fuel consumption – as well as geo-location, date and time. If an accident occurs, the insurance company has the ability to recreate the situation.
Auto insurers can then provide customers with driving scores, ideas for improvement and individual pricing.
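To make the idea of a driving score concrete, here is a minimal sketch that turns per-trip telematics counts into a score and an adjusted premium. The `Trip` fields, the penalty weights and the 30% discount or surcharge range are all assumptions made for illustration; real PHYD models are proprietary and far more detailed.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    distance_km: float
    harsh_brakes: int      # count of hard-braking events
    harsh_accels: int      # count of rapid-acceleration events
    speeding_seconds: int  # time spent above the posted limit
    night_driving: bool    # trip took place late at night


# Hypothetical penalty weights, chosen for illustration only.
WEIGHTS = {"brake": 2.0, "accel": 1.5, "speeding": 0.05, "night": 3.0}


def driving_score(trips: list[Trip]) -> float:
    """Score from 0 (risky) to 100 (safe), based on penalties per 100 km."""
    total_km = sum(t.distance_km for t in trips)
    if total_km <= 0:
        raise ValueError("no distance recorded")
    penalty = sum(
        WEIGHTS["brake"] * t.harsh_brakes
        + WEIGHTS["accel"] * t.harsh_accels
        + WEIGHTS["speeding"] * t.speeding_seconds
        + (WEIGHTS["night"] if t.night_driving else 0.0)
        for t in trips
    )
    penalty_per_100km = penalty / total_km * 100
    return max(0.0, 100.0 - penalty_per_100km)


def phyd_premium(base_premium: float, score: float) -> float:
    """Scale the base premium: score 100 earns 30% off, score 0 adds 30%."""
    adjustment = 0.3 - 0.6 * (score / 100.0)
    return round(base_premium * (1 + adjustment), 2)
```

Normalizing penalties by distance keeps the score from punishing people simply for driving more, which is the job of a PAYD component rather than a PHYD one.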
Following the lead of auto insurers, property insurance companies are assessing how they can use telematics to create usage-based home insurance. Possible data sources include:
- Moisture sensors that detect flooding or leaks
- Utility and appliance usage records
- Security cameras
- Sensors that track occupancy
Combine this with information from outside sources (e.g., local crime reports and traffic) and you can arrive at a multi-faceted, comprehensive assessment of one person’s property claim risk.
Going a step further, these sources can be used to protect a customer. For example, with predictive analytics, insurers can calculate the likelihood of an event such as theft or a hurricane and take steps to avoid pain and suffering – as well as, of course, big claims.
Life and Health Insurance
We live in a monitored world. Life and health insurance companies know this more than anybody. To create profiles of customer health and develop individual “well-being” scores, insurers are now casting the information net very wide indeed. They can collect:
- Transactional data – e.g., where and what (junk food?) customers buy
- Body sensors – i.e., devices that monitor consumption or alert the wearer to early signs of illness
- Exterior monitors – e.g., data from workout machines
- Social media – e.g., tweets about one’s personal health or state of mind
For more details on big-data applications in this area, see our related profile of the Health Care Industry.
360-Degree Customer Profiles
The insurance industry aims to improve customer satisfaction, and it is using big data to do so. The more an insurer knows about its customers’ quirks, the theory goes, the easier it is to keep them happy – and paying premiums.
Companies are combining all their direct customer connections – e.g., email, call center, adjuster reports, etc. – with indirect sources – e.g., social media, blog comments, website and clickstream data – to create a 360-degree profile of each individual.
With a 360-degree profile in hand, insurers have the means to refine their approach to sales, marketing and existing customer service.
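At its simplest, building such a profile means grouping every interaction by customer while remembering which channel it came from. The sketch below does this for a few made-up records; the source names, field names and the shared `customer_id` key are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical records; field names and sources are illustrative only.
direct_sources = {
    "email": [{"customer_id": "C1", "topic": "renewal question"}],
    "call_center": [
        {"customer_id": "C1", "topic": "claim status"},
        {"customer_id": "C2", "topic": "billing"},
    ],
}
indirect_sources = {
    "clickstream": [{"customer_id": "C2", "topic": "viewed home cover"}],
}


def build_profiles(*source_groups):
    """Group every interaction by customer, remembering where it came from."""
    profiles = defaultdict(list)
    for group in source_groups:
        for source_name, records in group.items():
            for record in records:
                profiles[record["customer_id"]].append(
                    {"source": source_name, "topic": record["topic"]}
                )
    return dict(profiles)


profiles = build_profiles(direct_sources, indirect_sources)
```

In practice, indirect sources rarely carry a clean customer identifier, so matching records to the right person is usually the hard part of the exercise.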
Call Center Optimization
A call center is a huge cauldron of data. For insurance data scientists, it’s also a golden opportunity. These folks are investigating ways to:
- Combine claims data with telecom data from call detail records (CDRs) to analyze call center activities and refine training guidelines.
- Analyze raw telecom data, model temporal call patterns, and create a plan for staffing optimization.
- Use sentiment analysis – e.g., speech analytics on call center conversations or Natural Language Processing (NLP) and text analytics on social media – to improve customer service.
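The staffing idea in the second bullet can be sketched in a few lines: average the call volume for each hour of the day, then convert that volume into a headcount. The sample timestamps, the six-minute handle time and the 80% occupancy target are assumptions, and the headcount formula is a simple workload rule rather than a full queueing model.

```python
import math
from collections import Counter
from datetime import datetime

# Hypothetical call start times, as might be extracted from call records.
call_starts = [
    datetime(2024, 1, 8, 9, 5), datetime(2024, 1, 8, 9, 40),
    datetime(2024, 1, 8, 9, 55), datetime(2024, 1, 8, 14, 10),
    datetime(2024, 1, 9, 9, 15), datetime(2024, 1, 9, 14, 30),
]


def average_calls_per_hour(starts):
    """Average number of calls in each hour of the day across observed days."""
    days = {s.date() for s in starts}
    counts = Counter(s.hour for s in starts)
    return {hour: n / len(days) for hour, n in sorted(counts.items())}


def agents_needed(calls_per_hour, handle_minutes=6.0, occupancy=0.8):
    """Simple workload rule: hours of talk time divided by target occupancy."""
    workload_hours = calls_per_hour * handle_minutes / 60.0
    return math.ceil(workload_hours / occupancy)


pattern = average_calls_per_hour(call_starts)
staffing = {hour: agents_needed(rate) for hour, rate in pattern.items()}
```

With realistic volumes the same two functions would show the morning peak demanding more agents than the afternoon; a production model would also account for waiting-time targets and day-of-week effects.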
Call-center employees are also in an ideal situation to sell customers additional products. One use of a 360-degree profile is to give that friendly voice on the phone the means to offer you the most relevant product for your particular needs.
Fraud costs insurance companies tens of millions each year. In response, insurers are assembling their data resources and creating a multi-channel approach to fraud detection. They are taking a very close look at both traditional structured data (such as claims and policy data) and textual data (such as adjuster notes, police reports and social media).
Armed with:
- Text analytics
- Predictive analytics
- Behavioral analytics
- Pattern, graph and link analysis techniques
… not to mention a host of other handy tools, data scientists are cracking down on suspicious claims.
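As one small example of link analysis, the sketch below groups claims that share a phone number or address, directly or through a chain of other claims, and flags unusually large groups for review. The claim records, the choice of linking attributes and the cluster-size threshold of three are assumptions; a shared detail is a reason to look closer, not proof of fraud.

```python
from collections import defaultdict

# Hypothetical claims; identifiers are made up for illustration.
claims = [
    {"id": "A", "phone": "555-0101", "address": "1 Oak St"},
    {"id": "B", "phone": "555-0101", "address": "9 Elm St"},
    {"id": "C", "phone": "555-0199", "address": "9 Elm St"},
    {"id": "D", "phone": "555-0142", "address": "3 Pine Rd"},
]


def linked_clusters(claims, keys=("phone", "address")):
    """Group claims connected, directly or indirectly, by a shared attribute."""
    parent = {c["id"]: c["id"] for c in claims}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # (key, value) -> first claim id carrying that value
    for claim in claims:
        for key in keys:
            token = (key, claim[key])
            if token in first_seen:
                # Merge this claim's cluster with the earlier claim's cluster.
                parent[find(claim["id"])] = find(first_seen[token])
            else:
                first_seen[token] = claim["id"]

    clusters = defaultdict(list)
    for claim in claims:
        clusters[find(claim["id"])].append(claim["id"])
    return sorted(clusters.values(), key=len, reverse=True)


suspicious = [c for c in linked_clusters(claims) if len(c) >= 3]
```

Here claims A and C never share a detail with each other, yet both end up in the same cluster because each is linked to claim B.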
Data Risks and Regulations
The Challenges Ahead
Insurance companies still have a few hurdles to clear before they can become fully data-driven. Some of those hurdles are already apparent to the industry. They include:
- Isolated (siloed) data that is challenging to synthesize
- Unstructured data
- Outdated fraud detection technology that cannot keep pace with today’s level and type of fraud
Big companies have their own issues. Some deal with creaky IT infrastructures that are not equipped to handle the volume, velocity or variety of data that are streaming through their doors.
Data science can be used to solve many problems, but only if you have employees who are trained to ask the right questions.
And many insurance companies don’t. Still, the insurance industry is well supplied with statistical ability, so it is likely only a matter of time before the supply of analytics skills catches up to the demand.
But perhaps the most complicated issue centers on a customer’s right to privacy. The finance industry in general is subject to a host of federal and state regulations that were enacted to protect consumer privacy and avoid discriminatory practices. These have been joined by a series of strict rules on data collection, all of which an insurer’s legal department must be aware of.
Just as importantly, insurance companies may need to think about how they treat customer information. It’s all very well to imagine a world run by telematics, but many consumers are rightly afraid of giving up their personal data to a private company. Even the lure of more affordable premiums may not be enough to change their minds.
Insurance data scientists also have to be very careful they’re not mistakenly assuming the role of Big Brother – whether benevolent or not. Despite the hype, not even data science can tell you everything about a person.
History of Data Analysis and Insurance
Insurance has always been a numbers game. What are the odds of a ship sinking? Of the head of the household dying prematurely? Of a wooden house burning down? Since the third millennium B.C., humans have been trying to protect themselves from the risks of living.
Keeping track of risks means knowing the numbers – the data. Increasingly sophisticated techniques were added over time to better calculate the odds. Three and a half centuries ago, “knowing the numbers” was maturing into the mathematics of risk – actuarial science – one of the foundations of modern data analysis.
The Birth of Actuarial Science
In the late 17th century, demand for long-term insurance (e.g., burial, life and annuities) was becoming hard to ignore.
Insurance companies were happy to offer citizens these products, but they were faced with a variety of statistical conundrums in understanding their data:
- What was the likelihood of an insurance-holder dying within a certain time frame?
- How should insurers price their products?
- What percentage of premiums should they set aside to pay for future benefits (e.g., annuities)?
- How much could they afford to invest elsewhere? What would the rate of interest be?
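The first three questions lend themselves to a small worked example. The sketch below uses a made-up table of one-year death probabilities to compute the chance of dying within a given period and the single premium that would, on average, cover a death benefit. The probabilities, the function names and the convention that the benefit is paid at the end of the year of death are all assumptions for illustration.

```python
# Hypothetical one-year death probabilities by age (not a real mortality table).
Q = {60: 0.010, 61: 0.011, 62: 0.012, 63: 0.013, 64: 0.014}


def prob_death_within(age: int, years: int) -> float:
    """Probability that a person aged `age` dies within `years` years."""
    survival = 1.0
    for x in range(age, age + years):
        survival *= 1.0 - Q[x]
    return 1.0 - survival


def net_single_premium(age: int, years: int, benefit: float, interest: float) -> float:
    """Expected present value of a benefit paid at the end of the year of death."""
    v = 1.0 / (1.0 + interest)  # one-year discount factor
    survival = 1.0              # probability of being alive at the start of the year
    value = 0.0
    for k, x in enumerate(range(age, age + years)):
        # Alive at the start of year k, dies during it, paid k + 1 years from now.
        value += benefit * v ** (k + 1) * survival * Q[x]
        survival *= 1.0 - Q[x]
    return value
```

With zero interest the premium is simply the benefit multiplied by the probability of death in the period; any positive interest rate lowers it, because the insurer can invest the premium until the benefit falls due.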
The Father of the Computer and His Descendants
Over the next few centuries, actuarial science kept pace with the data, growing both in popularity and in the complexity of its calculations. It’s no surprise that Charles Babbage, father of the computer, found time to dabble in it.
During the 1820s, he created actuarial tables from Equitable Society mortality data and published a handy guide to the life insurance industry titled A Comparative View of the Various Institutions for the Assurance of Lives.
But it was with the adoption of punch-card tabulating machines and, subsequently, early computer technology that the insurance industry began the march towards data dominance.
During the late 1930s, Edmund Berkeley of the Prudential Insurance Company began to investigate the potential of shifting work to calculating machines, and, later, computers.
The next big shift came in the late 1960s and 1970s. More powerful machines and better software were coming into play. Online systems allowed workers to share information freely and conduct inquiries in real time. Investment in technology increased steadily.
By the 1980s, the insurance industry was on top of IT trends.
The Industry Goes Ballistic
The arrival of the Internet in the 1990s gave insurance data science a boost in several ways:
- Individuals were able to bypass intermediaries and shop for coverage on their own terms.
- Company and consumer websites sprang up to satisfy demand.
- Banks seized the opportunity to expand into the industry.
As a consequence, the amount of customer data being gathered and exchanged exploded.
At the same time, the costs of data processing and storage were dropping rapidly. Instead of the mass modeling of the past, insurers were gaining the capabilities (and the technical tools) to calculate risk on an individual level. The era of data science was just around the corner.