How-to create the Golden Record


The term Golden Record is closely related to Customer Data Integration or MDM for Customer data. It refers to the “single truth” which has been created or calculated from all those duplicate customer records from different systems. This post is not about finding or tagging all those duplicate records. There are all kinds of ways to find them, using advanced statistical methods, fuzzy matching etc.

But what do you do once you have found the duplicates? How do you create the best possible customer data out of all gathered elements?
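As an illustration only, here is a minimal Python sketch of one common way to combine duplicates into a golden record: for every field, take the non-empty value from the most recently updated record. The field names, the `updated` date and the `build_golden_record` helper are assumptions made for this example, not something prescribed by the post.

```python
from datetime import date


def build_golden_record(duplicates, fields):
    """Pick, per field, the non-empty value from the most recently updated record."""
    # Newest records first, so the first non-empty value found wins.
    ordered = sorted(duplicates, key=lambda r: r["updated"], reverse=True)
    golden = {}
    for field in fields:
        # Fall back to None when no duplicate has a value for this field.
        golden[field] = next((r[field] for r in ordered if r.get(field)), None)
    return golden


# Hypothetical duplicates of the same customer from two systems.
duplicates = [
    {"name": "J. Smith", "email": "", "phone": "555-0100",
     "updated": date(2008, 3, 1)},
    {"name": "John Smith", "email": "john@example.com", "phone": "",
     "updated": date(2009, 1, 15)},
]

golden = build_golden_record(duplicates, ["name", "email", "phone"])
print(golden)
```

"Most recent non-empty value wins" is only one possible rule. Others, such as preferring the most trusted source system or the most complete value, fit into the same structure by changing the sort key.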

Major challenge, deduplication of India!


Last week I travelled to India and just at that time one of the largest deduplication projects in the world had been accepted. The project is to provide every Indian citizen with a Unique Identification Number.

The main goals are (amongst others):

  • Make life easier for citizens by reducing the number of ID documents they have now
  • Minimize the possibilities for fraud in several projects and welfare schemes
  • Make it possible to share information between different disciplines and organizations

Anyone who has ever been in India knows that you absolutely need to take into account the variety of cultural aspects in that huge country. In Western Europe it is already very difficult to deduplicate all kinds of citizen data, given all the languages and cultural aspects. I think, however, that the degree of difficulty is even higher in India. Not all citizens have a registered birth certificate, and most will have their first official registration from school. Some do not have a last name, and addresses are not always that trivial (a euphemism). The whole country is also used to the fact that typos are allowed in names, because in one area Shrivastava is actually the same as Srivastava (without the ‘h’).
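To make the Shrivastava/Srivastava example concrete, here is a minimal sketch of a tolerant name comparison using Python's standard `difflib` module. The `same_name` helper and the 0.9 threshold are assumptions chosen for this illustration. This is not how the Indian project matches names, and a real solution would need culture-specific rules on top.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_name(a, b, threshold=0.9):
    """Treat two spellings as the same name when they are similar enough.

    The threshold is an illustrative assumption and must be tuned on real data.
    """
    return name_similarity(a, b) >= threshold


print(same_name("Shrivastava", "Srivastava"))  # spelling variants of one name
print(same_name("Shrivastava", "Sharma"))      # two different names
```

A plain string-similarity threshold like this is deliberately simple. It catches single-letter variants but knows nothing about missing last names or regional spelling habits.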

How green is your data value?

Number 4 in Gartner’s top 10 list of Strategic Technologies is Green IT. David Cearley’s take on this is quite straightforward: regulations and more efficient equipment will force or help us to reduce unwanted emissions. For our discussions, talking about data value, I see several angles:

  • Having the right contact details reduces the waste of natural resources because deliveries arrive at the right place the first time. It is not only that deliveries can be optimized: we can also avoid deliveries getting lost, with natural resources effectively piped to /dev/null!
  • By valuing our data through deduplication we can in general avoid wasting energy, both human energy and other resources, and spend the scarce energy only on those who actually need it. Here I share David’s remark in his blog: there will come a moment in the near future, with rising energy prices and increasing emission penalties, when that aspect outweighs the actual waste of goods and human effort in the equation.
  • Resources are now also saved by concentrating or centralizing services, optimizing the service delivered per unit of energy. For data we see this happening in data virtualization and Master Data Management technologies. Data quality will play a strong role in your centralizing strategy; that is what will bring your real value.

I encourage you all to think outside the box about how data value can help make the world a better place in the future. But I’m afraid that in this economic climate the short term is ruling and not the long(er) term.

High Precision Matching at the heart of your Single Customer View Solution

There are many different purposes for creating a single customer view. All those different purposes also require different technical architectures, and each architectural design is capable of delivering its own value to the company. An analytical single customer view delivers value by supporting company decision making via analytics and reporting. For instance: “How many customers do I really have in my focus market segments, and what is the age distribution?” An operational single customer view supports the primary business processes like sales and customer service. For instance, an outbound call center employee can deliver additional value to the company if an integrated view of the customers shows which products and services from different business lines have already been sold to those customers and which customer support issues are still pending.


Virtualization: It’s the data! – not the hardware

The first Strategic Technology to watch according to Gartner is Virtualization. And I do like their twist in the whole virtualization debate: a focus on data. The whole world links the word virtualization with optimizing your hardware assets by putting a virtual layer on top of your hardware. By optimizing the usage of your assets in this virtual way you can significantly reduce the total cost of ownership (TCO).

David Cearley at Gartner comes with a fascinating different angle. Basically he also sees virtualization as a strategic technology to virtualize the data. And with that twist, data quality and data governance appear annoyingly in the middle of your radar screen. To use this strategy for operational excellence, to reduce the amount of redundant data on your real storage devices and to put a virtual layer between your applications and this virtual data storage, you need to be sure that all your applications can work seamlessly with that virtual data.
