Forecasting is central to the insurance industry, and predictive data analytics, which uses models built on complex data to predict future events, has become a key tool in how insurers decide what a policy will cost. So when Allstate Insurance, recognized by U.S. News as a Most Connected Company, wanted to improve the model it uses to forecast bodily injury claims based on the types of vehicles involved in accidents, it looked to crowdsourcing for a solution.
In practice, crowdsourcing involves distributing a description of a specific problem along with relevant data to an unknown group. The crowd—typically a community of industry experts—submits solutions online.
In July 2011, Allstate announced its "Claim Prediction Challenge," a contest run by Kaggle, a San Francisco-based firm that specializes in statistical analysis and predictive modeling competitions. Kaggle's clients post problems and data sets, and from there the competition begins. Allstate started from an existing model and three years of data, from 2005 to 2007, on drivers' vehicles and their injury claims. The contestants on Kaggle's platform then analyzed the correlation between vehicles and injury payments, and their results were used to build a model that would predict 2009 claims.
Contest participants used vehicle characteristics such as car length, horsepower, and number of cylinders to estimate the likelihood that a customer insured with Allstate would be injured in a car accident. No personal information about individual customers was included.
The contest was designed to solve a very specific piece of a highly complex puzzle.
"Allstate considers a number of factors when determining bodily injury rates," said Eric Huls, vice president of quantitative research at Allstate. "Included are things like a driver's age, past accident history, and where they live. These and other known effects were accounted for in the data before it was released to the contestants, allowing them to focus solely on the relationship between vehicle characteristics and bodily injury claims."
The competition ran for three months. Kaggle publicized the contest on its website and through its email list, and anyone who registered for a free membership on its site was eligible to participate. In all, 202 contestants competing on 107 teams submitted 290 entries. Allstate received submissions right up until the October 12 deadline.
Prize money ($6,000 for the winner, $3,000 for second place, and $1,000 for third) was minimal given the expertise required for the project, but the three winners were more interested in seeing how their abilities stacked up against those of their peers. A public leaderboard spurred improvements to the models and drew new entries.
First-place winner Matthew Carle, an actuarial consultant with Sydney-based Quantium, developed an entry that was 340 percent more accurate than Allstate's existing methodology. Huls put no particular emphasis on the winner's statistical performance, but said the techniques acquired through the contest would increase the accuracy of future predictions. "The winning models were those that did the best job of predicting which cars would have claims and how much those claims would cost," Huls said.
"Continuing to improve our ability to predict which customers have losses and how much it will cost is critical to our success in the industry," he said, adding, "If you predict too high, you lose business to a competitor. If you predict too low, you win the business but lose money."
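For readers who want a concrete sense of what "predicting which cars would have claims and how much those claims would cost" can mean in practice, the sketch below scores a set of predictions with a normalized Gini coefficient, one common way to measure how well a model ranks policies by loss. It is an illustration under stated assumptions, not a description of Allstate's methodology or of the contest's scoring; the function names and the sample figures are hypothetical.

```python
def gini(actual, predicted):
    """Unnormalized Gini: how strongly sorting by `predicted`
    concentrates the `actual` losses at the top of the list."""
    n = len(actual)
    total = sum(actual)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero actual loss")
    # Rank policies from highest to lowest predicted loss (stable for ties).
    order = sorted(range(n), key=lambda i: -predicted[i])
    cumulative = 0.0
    area = 0.0
    for i in order:
        cumulative += actual[i]
        area += cumulative / total
    # Subtract the area expected from a random ordering.
    return area / n - (n + 1) / (2 * n)


def normalized_gini(actual, predicted):
    """Scale so that a perfect ranking scores 1.0."""
    return gini(actual, predicted) / gini(actual, actual)


# Hypothetical claim payments for five vehicles (illustrative only).
actual_losses = [0, 0, 100, 0, 50]
good_model = [1, 2, 9, 3, 8]   # ranks the two claim-bearing vehicles first
poor_model = [5, 4, 1, 3, 2]   # ranks them last

print(normalized_gini(actual_losses, good_model))
print(normalized_gini(actual_losses, poor_model))
```

A score near 1.0 means the model places the costliest vehicles at the top of its ranking, while a score near zero means it does no better than a random ordering.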
"What Allstate got was a solid, high-quality underwriting solution, and they got it practically for free through crowdsourcing," said Stephen Applebaum, senior property and casualty insurance analyst at Boston-based consultancy Aite Group. "This competition was made up of contestants that were experts in the field who have invaluable expertise. What they paid the winners was a fraction of what it would have cost to do this project internally."