A recent article in a Dutch newspaper describes the success the Dutch police force is realizing with data mining products. Policemen are using data mining software to predict time and place of potential criminal activities, such as burglary and robbery, and direct extra police attention to these hotspots at those hours.
As with any data mining project, the quality of the analyses depends heavily on the quality of the data entered in the data warehouse.
Every statement entered in the system, every location, every description of a person, and every relevant object needs to be comparable.
Address standardization products can help to enter locations precisely and right the first time. Other data quality solutions are available for entering names and other data of people: suspects, victims, and witnesses.
But what about the other aspects of a statement? Was the crime the theft of a car, a vehicle, a van, a pick-up, or something else? Did the villain pick a purse or a wallet? A bicycle or a bike? The list of synonyms for objects of crime is endless.
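To make the synonym problem concrete, here is a minimal sketch of how free-text object descriptions could be mapped onto a controlled vocabulary before analysis. The mapping table, the category names, and the `normalize_object` function are illustrative assumptions, not part of any police system or data quality product mentioned here.

```python
from collections import Counter

# Hypothetical controlled vocabulary: each free-text term maps to one
# canonical category. A real list would be far longer and maintained by
# domain experts.
CANONICAL_OBJECTS = {
    "car": "vehicle",
    "van": "vehicle",
    "pick-up": "vehicle",
    "pickup": "vehicle",
    "vehicle": "vehicle",
    "purse": "wallet",
    "wallet": "wallet",
    "bike": "bicycle",
    "bicycle": "bicycle",
}


def normalize_object(term: str) -> str:
    """Return the canonical category for a term, or 'unclassified'."""
    key = term.strip().lower()
    return CANONICAL_OBJECTS.get(key, "unclassified")


# Example: the same kinds of theft, described in different words.
statements = ["Car", "van", " Bike ", "bicycle", "purse", "laptop"]
counts = Counter(normalize_object(s) for s in statements)
```

Without the mapping, "Bike" and "bicycle" would be counted as different objects. Terms that fall outside the vocabulary are flagged as "unclassified" rather than guessed, so someone can review them and extend the list.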
I think the criminal community should come to an agreement and decide on standards to make analyses of these data mining projects even more successful. Now that Christmas is nearing, we all want a better world, don't we?
As I was sitting on a terrace in Barcelona during my recent holiday, I found a copy of the Independent, the well-known British newspaper. Having all the time in the world, I started reading and came across an article about the North Yorkshire Police storing data on more than 180,000 people, including their date of birth and ethnicity. The vast majority of these people had given this information voluntarily and had not committed any crime.
When privacy campaigners questioned the need for compiling such a database, a police spokesperson answered: “The system is used by many police forces in the UK and internationally to record all information relevant to policing, everything from details of arrested individuals, suspects, victims, witnesses and sources of information as well as addresses, phone numbers and vehicles. The information logged and cross-referenced in the system is absolutely vital to allow us to provide the effective policing service that the people of North Yorkshire and the City of York demand.”
I think that this is a very dangerous comment. What about the possibility of mixing up the data of witnesses and criminals? How do the police forces create a unique view of their “customers”? What will be the consequences of so-called “database errors”?
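One way to picture the risk: if the role is stored per incident rather than as a label on the person, a witness in one case is harder to confuse with a suspect in another. The sketch below is only an illustration under simplifying assumptions. The field names, the sample records, and the idea of matching on normalized name plus date of birth are all hypothetical, and real identity matching needs far more than this.

```python
from collections import defaultdict


def person_key(name: str, date_of_birth: str) -> tuple[str, str]:
    """Build a simple matching key (assumption: name + date of birth)."""
    normalized_name = " ".join(name.lower().split())
    return (normalized_name, date_of_birth)


def build_person_view(records):
    """Group records per person, keeping the role tied to each incident."""
    view = defaultdict(set)
    for rec in records:
        key = person_key(rec["name"], rec["date_of_birth"])
        view[key].add((rec["incident_id"], rec["role"]))
    return dict(view)


# Hypothetical sample data: two spellings of one person, plus a namesake.
records = [
    {"name": "Jan  Jansen", "date_of_birth": "1980-05-01",
     "incident_id": "A1", "role": "witness"},
    {"name": "jan jansen", "date_of_birth": "1980-05-01",
     "incident_id": "B7", "role": "victim"},
    {"name": "Jan Jansen", "date_of_birth": "1975-11-23",
     "incident_id": "C3", "role": "suspect"},
]
view = build_person_view(records)
```

Here the two spellings of the first Jan Jansen collapse into one person with two roles, while the namesake with a different date of birth stays separate. Match on the name alone and the witness inherits the suspect's record, which is exactly the kind of "database error" the questions above are about.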
Of course I understand that police forces all over the world need information to do their work properly and to prevent crime and other undesirable behaviour. But reading a comment like the one above, I wonder whether law enforcement agencies are really aware of the essential role of data quality in modern police work.