
Objective: I need to find a way to consolidate the multiple home and auto insurance policies purchased by an Insured (a.k.a. policyholder) and assign those policies to a single customer account for that Insured.

Problem: I have more than 100K policy records. An Insured can purchase one or more policies, but each policy is currently assigned its own customer account number, even when several policies belong to the same Insured. Furthermore, because of the way the information was entered into the database, I can't simply group the policies by the Insured's name, address, or some other field without heavy manual intervention: the same Insured's information may be entered differently from record to record (e.g., "Smith, J" vs. "Smith, James", or "1000 E highland, Sac, CA" vs. "East Highland, Sacramento, CA").

Question: Does anyone know of a tool or utility (preferably free) that can examine the records and, using some "fuzzy" matching algorithm, sort those 100K+ policy records into groups, each associated with a single Insured?
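
To make the kind of "fuzzy" grouping I have in mind concrete, here is a minimal sketch using only Python's standard library (`difflib.SequenceMatcher`). It is not an existing tool, only an illustration. The record layout `(policy_id, name, address)`, the `ABBREVIATIONS` table, the function names, and the 0.75 threshold are all assumptions made for the example and would need tuning against real data.

```python
from difflib import SequenceMatcher

# Hypothetical abbreviation table (an assumption); a real one would be far larger.
ABBREVIATIONS = {"e": "east", "w": "west", "n": "north", "s": "south",
                 "sac": "sacramento", "st": "street", "ave": "avenue"}


def normalize(text):
    """Lowercase, replace punctuation with spaces, expand known abbreviations."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in cleaned.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] from difflib; 1.0 means identical strings."""
    return SequenceMatcher(None, a, b).ratio()


def group_policies(records, threshold=0.75):
    """Group records whose normalized name AND address are both similar.

    records: list of (policy_id, name, address) tuples (assumed layout).
    Returns a list of lists of policy ids, one list per presumed Insured.
    """
    parent = list(range(len(records)))  # union-find forest over record indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    keys = [(normalize(name), normalize(addr)) for _, name, addr in records]
    # Pairwise comparison: fine for a small sample, too slow for 100K+ records.
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if (similarity(keys[i][0], keys[j][0]) >= threshold
                    and similarity(keys[i][1], keys[j][1]) >= threshold):
                parent[find(j)] = find(i)

    groups = {}
    for i, (policy_id, _, _) in enumerate(records):
        groups.setdefault(find(i), []).append(policy_id)
    return list(groups.values())
```

The all-pairs loop is quadratic, so on the full data set the records would first have to be split into smaller candidate blocks (for example by ZIP code or the first letters of the surname) and compared only within each block. Any group this produces is only a candidate match and would still need review.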

CJ Lee
  • How much smaller does the data set get if you make a first pass, consolidating on exact matches? – ernie Nov 06 '13 at 22:17
  • It depends on how accurate your fuzzy needs to be, and also how poorly or irregularly the data was entered. Say father and son live at the same address, they are different people with different insurance, both are named "Walter White", but the son is "Jr." -- should that extra string be enough to keep them separate, or do you merge them? Also, what about people getting name changes; moving address; exact name collisions? This really seems like it needs to be a manual process. Social Security Numbers exist for a reason, you know -- they uniquely identify one human organism. – allquixotic Nov 06 '13 at 22:23
  • Also, what if the data entry clerk mistakenly forgot to enter the "Jr.", so that their name, address and home phone exactly match to the character in your database, but they are still in fact different people (different SSN, and you could stand them side by side and confirm that they are in fact distinct human beings)? If you aren't recording SSNs, you're really SOL trying to fuzzy match names and addresses to determine who's a unique person and who's not. – allquixotic Nov 06 '13 at 22:26
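
The first pass on exact matches that ernie suggests, and the "Jr." caveat raised by allquixotic, can be sketched roughly as follows. This is only an illustration: the `(policy_id, name, address)` record layout, the function names, and the sample data are assumptions, and "exact" here means exact after ignoring case and punctuation.

```python
from collections import defaultdict


def exact_key(name, address):
    """Build a case- and punctuation-insensitive key.

    Suffixes such as 'Jr' are kept, so they keep otherwise identical
    records apart.
    """
    def squash(text):
        cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
        return " ".join(cleaned.split())
    return (squash(name), squash(address))


def first_pass(records):
    """Map each exact key to the list of policy ids that share it."""
    buckets = defaultdict(list)
    for policy_id, name, address in records:
        buckets[exact_key(name, address)].append(policy_id)
    return dict(buckets)


# Hypothetical sample data, for illustration only.
sample = [
    ("P1", "White, Walter", "12 Main St, Sacramento, CA"),
    ("P2", "WHITE, WALTER", "12 Main St., Sacramento CA"),
    ("P3", "White, Walter Jr.", "12 Main St, Sacramento, CA"),
]
buckets = first_pass(sample)
# Number of distinct keys left after the first pass.
remaining = len(buckets)
```

Comparing `remaining` with the original record count shows how much the data set shrinks before any fuzzy matching is attempted. If the "Jr." had been left out at data entry, P3 would land in the same bucket as P1 and P2, which is exactly the collision described above and cannot be detected from name and address alone.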

0 Answers