<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Value Talk &#187; matching</title>
	<atom:link href="http://datavaluetalk.com/tag/matching/feed/" rel="self" type="application/rss+xml" />
	<link>http://datavaluetalk.com</link>
	<description>Customer data is a valuable asset. Why not treat it that way?</description>
	<lastBuildDate>Thu, 10 May 2012 14:49:53 +0000</lastBuildDate>
	<language>nl</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Standardizing crime</title>
		<link>http://datavaluetalk.com/data-quality/standardizing-crime/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=standardizing-crime</link>
		<comments>http://datavaluetalk.com/data-quality/standardizing-crime/#comments</comments>
		<pubDate>Wed, 15 Dec 2010 10:07:22 +0000</pubDate>
		<dc:creator>Paul Drenth</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[crime]]></category>
		<category><![CDATA[crime prevention]]></category>
		<category><![CDATA[criminal activities]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[police]]></category>
		<category><![CDATA[synonyms]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1606</guid>
		<description><![CDATA[A recent article in a Dutch newspaper describes the success the Dutch police force is realizing with data mining products. Policemen are using data mining software to predict time and place of potential criminal activities, such as burglary and robbery, and direct extra police attention to these hotspots at those hours. As with any data [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1607" title="fingerprint_large" src="http://datavaluetalk.com/cms/wp-content/uploads/2010/12/fingerprint_large-150x150.jpg" alt="fingerprint_large" width="150" height="150" /></p>
<p>A recent <a href="http://digitaleeditie.nrc.nl/NH/2010/11/20101210___/1_05/article5.html" target="_blank">article in a Dutch newspaper</a> describes the success the Dutch police force is achieving with data mining products. Policemen are using data mining software to predict the time and place of potential criminal activities, such as burglary and robbery, and to direct extra police attention to these hotspots at those hours.<br />
As with any data mining project, the quality of the analyses depends heavily on the quality of the data entered in the data warehouse.<br />
Every statement entered in the system, every location, description of people, every relevant object needs to be comparable.<br />
Address standardization products can help enter locations precisely and first time right in the system. Other <a title="data quality" href="http://www.humaninference.com" target="_blank">data quality</a> solutions are available for entering names and other data of people – suspects, victims, and witnesses.<br />
But what about the other aspects of a statement? Was the crime the theft of a car, a vehicle, a van, a pick-up, etc? Did the villain pick a purse or a wallet? A bicycle or a bike? The list of synonyms for objects of crime is endless.<br />
I think the criminal community should come to an agreement and decide on standards to make analyses of these data mining projects even more successful. Now that Christmas is nearing, we all want a better world, don’t we?</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/standardizing-crime/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>High precision matching &#8211; apples, oranges or fruit salad?</title>
		<link>http://datavaluetalk.com/data-quality/high-precision-matching-apples-oranges-or-fruit-salad/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=high-precision-matching-apples-oranges-or-fruit-salad</link>
		<comments>http://datavaluetalk.com/data-quality/high-precision-matching-apples-oranges-or-fruit-salad/#comments</comments>
		<pubDate>Thu, 21 Oct 2010 14:09:43 +0000</pubDate>
		<dc:creator>Holger Wandt</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[apples and oranges]]></category>
		<category><![CDATA[customer data]]></category>
		<category><![CDATA[customer data matching]]></category>
		<category><![CDATA[deterministic]]></category>
		<category><![CDATA[high precison matching]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[probabilistic]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1525</guid>
		<description><![CDATA[In his excellent post &#8220;New matching engines go beyond apples and oranges&#8221;, Winfried van Holland states that traditional matching engines are based on atomic string comparison functions, like match-codes, phonetic comparison, Levenshtein string distance and n-gram comparisons. He further argues that the drawback of these functions is that it’s not always clear for what purpose [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;"><img class="size-thumbnail wp-image-1533 alignleft" title="apples-oranges" src="http://datavaluetalk.com/cms/wp-content/uploads/2010/10/apples-oranges-150x150.jpg" alt="apples-oranges" width="150" height="150" /> In his excellent post <a href="http://datavaluetalk.com/2010/02/11/new-matching-engines-go-beyond-apples-and-oranges/" target="_blank">&#8220;New matching engines go beyond apples and oranges&#8221;, </a>Winfried van Holland states that traditional matching engines are based on atomic string comparison functions, like match-codes, phonetic comparison, Levenshtein string distance and n-gram comparisons. He further argues that the drawback of these functions is that it’s not always clear for what purpose one needs to utilize a particular function, and that these low-level DQ functions cannot distinguish between apples and oranges – you end up comparing family names with street names.</p>
<p style="text-align: left;">Good point! In essence, this is the basis of the discussion on the matching approach within customer data management: intelligent automated matching of records distributed over various heterogeneous data sources is an essential prerequisite for correct and adequate customer data integration, and there are many opinions on how to achieve it.</p>
<p style="text-align: left;">In theories on <a title="data matching" href="http://www.humaninference.com/products/data-matching" target="_blank">data matching</a>, there are in general two methods that prevail when customer data management is concerned: deterministic and probabilistic matching.<span id="more-1525"></span></p>
<ul>
<li>Deterministic matching uses, among other things, country- and subject-specific knowledge, linguistic rules (such as phonetic conversion and comparison), business rules, and algorithms (such as letter transposition or contextual acronym resolution) to determine the degree of similarity between database records.</li>
</ul>
<ul>
<li>Probabilistic matching uses statistical and mathematical algorithms, fuzzy logic and contextual frequency rules to assign the degree of similarity between database records. Here, fault-tolerance patterns play an important role (the matching method can take into account that humans make specific errors). Probabilistic matching methods usually express the probability of a match as a percentage.</li>
</ul>
<p>Both methods have advantages and disadvantages, but I believe (following the train of thought in &#8220;New matching engines go beyond apples and oranges&#8221;) that the two methods should always be combined. The reason for this is actually quite simple: <em><strong>the better the matching engine is able to determine what is what in a particular context, the better the probability calculation of a certain match or a certain non-match.</strong></em> This is, in essence, what humans do. We determine what we know and then use contextual probability and pattern recognition to assign meaning to the words we come across.</p>
<p>Combining deterministic and probabilistic matching will yield more precise matching, with fewer mismatches and fewer missed matches. Probabilistic matching often uses weighting schemes that consider the frequency of information to calculate a score and/or ranking. The more common a particular data element is, the lighter the weight that should be used in a comparison. That is a sound and robust approach. However, assigning weighting factors to data that have been interpreted <strong><em>and</em></strong> enhanced with statistical information will raise the matching results to a high level of precision.</p>
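<p>To make the frequency idea concrete, here is a minimal sketch in Python. It is not how any particular matching engine works: the sample names, the function names and the logarithmic weight are all assumptions chosen for illustration. It only shows that agreement on a rare value can count for more than agreement on a common one.</p>

```python
import math
from collections import Counter

# Hypothetical sample of family names already present in a customer database.
FAMILY_NAMES = ["Jansen", "Jansen", "Jansen", "Jansen", "de Vries", "de Vries", "Wandt"]


def frequency_weights(values):
    """Give each value a weight that shrinks as the value becomes more common."""
    counts = Counter(values)
    total = len(values)
    # log(total / count) is an IDF-style weight: rare values score higher.
    return {value: math.log(total / count) for value, count in counts.items()}


def agreement_score(field_a, field_b, weights):
    """Return the weight of the shared value if both fields agree, else 0.0."""
    if field_a != field_b:
        return 0.0
    return weights.get(field_a, 0.0)


weights = frequency_weights(FAMILY_NAMES)
```

<p>With this sample, two records that both say Wandt receive a higher score than two records that both say Jansen. The deterministic step would come first: only once the engine knows that a field really is a family name does it make sense to look it up in a family-name frequency table.</p>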
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/high-precision-matching-apples-oranges-or-fruit-salad/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Chinglish &#8211; the most delightful side-effect of internationalization</title>
		<link>http://datavaluetalk.com/data-quality/chinglish-the-most-delightful-side-effect-of-internationalization/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=chinglish-the-most-delightful-side-effect-of-internationalization</link>
		<comments>http://datavaluetalk.com/data-quality/chinglish-the-most-delightful-side-effect-of-internationalization/#comments</comments>
		<pubDate>Fri, 09 Apr 2010 12:18:00 +0000</pubDate>
		<dc:creator>Holger Wandt</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Chinese characters]]></category>
		<category><![CDATA[fault-tolerance]]></category>
		<category><![CDATA[internationalisation]]></category>
		<category><![CDATA[internationalization]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[matching]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1439</guid>
		<description><![CDATA[An increasing number of companies have to deal with data from the world’s fastest emerging economy: China. And the big question in this issue is of course: How can we compare these “strange” Chinese characters with our own writing set? Grammar and character set of our Western alphabet-languages (such as English, French, Dutch or German) [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-medium wp-image-1440" title="little grass has life" src="http://datavaluetalk.com/cms/wp-content/uploads/2010/04/little-grass-has-life-300x198.jpg" alt="little grass has life" width="287" height="192" /></p>
<p>An increasing number of companies have to deal with data from the world’s fastest emerging economy: China. And the big question in this issue is of course: How can we compare these “strange” Chinese characters with our own writing set?</p>
<p>Grammar and character set of our Western alphabet-languages (such as English, French, Dutch or German) differ tremendously from Mandarin Chinese (the language spoken by most people in the People’s Republic of China and abroad). Mandarin is a tonal language with an ideographic character set. Almost all characters have a semantic and a phonetic component. The pitch of the pronunciation ultimately determines the meaning.</p>
<p>Complicated? Definitely. But what about the other way around? Have you ever thought about the difficulties the Chinese have to face when trying to convert their language into meaningful English?</p>
<p>This phenomenon is sometimes hilariously illustrated by the many public signs in China used to inform foreign visitors or to help them find their way around.</p>
<p>This is truly a delightful side-effect of internationalization. <span id="more-1439"></span></p>
<p>The German sinologist Oliver Lutz Radtke christened these linguistic attempts “Chinglish” and collected many examples, which can be found virtually everywhere: on hotel room doors, on road signs along the highways, shampoo bottles and t-shirts. A small anthology:</p>
<ul>
<li>A warning sign for a steep slope: <strong><em>“Please, watch your slip”</em></strong></li>
<li>To avoid all misunderstandings, on the inside of a taxi door: <strong><em>“Don’t forget to carry your thing”</em></strong></li>
<li>A sign above a store entrance, to let our fantasy run free: <strong><em>“Welcome to presence”</em></strong></li>
</ul>
<p>Although this is all very funny, from a <a title="data quality" href="http://www.humaninference.com" target="_blank">data quality</a> point of view, this definitely leaves a thing or two to consider. For example: What should we think of fault-tolerance with regard to typos when we think of entering Chinese customer data into a database? What is the influence of typos in an ideographic writing set on searching, matching, enriching and correcting customer data?</p>
<p>There is still a lot of work to be done in international data quality. For more information, check out the Human Inference website on <a href="http://www.humaninference.com/Our%20products/HIquality%20Name%20Worldwide.aspx" target="_blank">HIquality Name Worldwide</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/chinglish-the-most-delightful-side-effect-of-internationalization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Matching persons with different official names</title>
		<link>http://datavaluetalk.com/data-quality/matching-persons-with-different-official-names/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=matching-persons-with-different-official-names</link>
		<comments>http://datavaluetalk.com/data-quality/matching-persons-with-different-official-names/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 15:32:59 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[cultural differences]]></category>
		<category><![CDATA[fault-tolerant matching]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[names]]></category>
		<category><![CDATA[naming confusion]]></category>
		<category><![CDATA[nicknames]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1269</guid>
		<description><![CDATA[When matching persons, or contact data in general, we are all aware that individuals may use abbreviations or nicknames as a kind of synonym for their name. Classic examples are the use of the name Bill for the actual name William, or my own father using the name Mans while [...]]]></description>
			<content:encoded><![CDATA[<p class="mceTemp"><img class="alignnone" title="what is the what?" src="http://img1.fantasticfiction.co.uk/images/n37/n185744.jpg" alt="" width="107" height="137" />When matching persons, or contact data in general, we are all aware that individuals may use abbreviations or nicknames as a kind of synonym for their name. Classic examples are the use of the name <em>Bill </em>for the actual name <em>William</em>, or my own father using the name <em>Mans </em>while officially his name is <em>Hermanus</em>. Most <a title="data matching" href="http://www.humaninference.com/products/data-matching" target="_blank">data matching</a> engines use a kind of synonym table to take care of this. That works because within a culture or region nicknames are quite often linked to the same names, and people do not tend to use completely different officially registered names.</p>
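<p>A synonym table of this kind can be sketched in a few lines of Python. The table contents and function names below are illustrative assumptions only; a real engine would hold far larger, region-specific tables.</p>

```python
# Hypothetical synonym table: each nickname maps to its officially registered name.
NICKNAME_TO_OFFICIAL = {
    "bill": "william",
    "mans": "hermanus",
}


def canonical_first_name(name):
    """Normalize a first name and resolve a known nickname to its official form."""
    key = name.strip().lower()
    return NICKNAME_TO_OFFICIAL.get(key, key)


def same_first_name(name_a, name_b):
    """Treat two first names as equal when they resolve to the same official name."""
    return canonical_first_name(name_a) == canonical_first_name(name_b)
```

<p>Here <code>same_first_name("Bill", "William")</code> is true because both resolve to the same entry. A lookup like this only works for pairs that someone has put in the table.</p>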
<p>It becomes more challenging if there is no longer a link between nickname and official name. That may happen, for example, if people move from one cultural region to another where other writing systems are also used. Take for example my Chinese friend 高为民, whose Latin name would be Gao Weimin (family name first), but when he works in Europe or the US he uses the Latin variant William Gao. There is no common relation between the names William and Weimin, in either Latin or Chinese script, and they are not phonetic variants of each other. <span id="more-1269"></span></p>
<p>Recently, I read a very impressive book by Dave Eggers, called &#8216;What is the What&#8217;. It gives you a good insight into one of the current problem areas of the world and how people try to survive there. Achak Deng is one of the so-called <a title="Valentino Achak Deng organization" href="http://www.valentinoachakdeng.org/" target="_blank">Lost Boys from Sudan</a>. During his life in Sudan, in refugee camps and finally in the US, he officially used different names. That has nothing to do with purposely trying to obscure his identity, but more with receiving an identity from your environment &#8211; at that time and place. He was born as Achak, baptized as Valentino, and later used the name Dominic or Dominic Arou and Marialdit. Of course there are people calling him nicknames such as &#8216;Sleeper&#8217; or &#8216;Gone Far&#8217;, but at certain periods in his life he officially used completely different names. This makes automatic matching of persons, or even manual matching, challenging and keeps it interesting.</p>
<p>I would recommend the book to everyone who wants to learn about what is happening in our world, and especially to those interested in names (don&#8217;t forget to study all the names in the last section of the book).</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/matching-persons-with-different-official-names/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>International domain names &#8211; there goes the ASCIIhood&#8230;.</title>
		<link>http://datavaluetalk.com/data-quality/international-domain-names-here-goes-the-asciihood/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=international-domain-names-here-goes-the-asciihood</link>
		<comments>http://datavaluetalk.com/data-quality/international-domain-names-here-goes-the-asciihood/#comments</comments>
		<pubDate>Wed, 28 Oct 2009 14:37:31 +0000</pubDate>
		<dc:creator>Holger Wandt</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[ICANN]]></category>
		<category><![CDATA[international domain names]]></category>
		<category><![CDATA[internet address]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[Seoul]]></category>
		<category><![CDATA[transliteration]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1236</guid>
		<description><![CDATA[The internet is on the verge of one of the most fundamental changes in its history. The Internet Corporation for Assigned Names and Numbers (ICANN) is expected to agree on the use of internet addresses in non-Latin characters during this week&#8217;s ICANN convention in Seoul. If all goes according to plan, it will be possible [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1239" title="sel-logo-155x82" src="http://datavaluetalk.com/cms/wp-content/uploads/2009/10/sel-logo-155x821-150x82.png" alt="sel-logo-155x82" width="150" height="82" /></p>
<p>The internet is on the verge of one of the most fundamental changes in its history. The Internet Corporation for Assigned Names and Numbers (ICANN) is expected to agree on the use of internet addresses in non-Latin characters during this week&#8217;s ICANN convention in Seoul. If all goes according to plan, it will be possible to use Greek, Cyrillic, Arabic, Chinese, Korean and many other characters in the internet browser&#8217;s address bar. More than half of the 1.6 billion internet users in the world are using a character set which is not Latin. Therefore, ICANN expects that the number of non-Latin domain names, and thus the number of new internet users, will increase rapidly.</p>
<p>This far-reaching change in the use of the internet is based on a system that can &#8220;translate&#8221; or &#8220;convert&#8221; different writing systems (sometimes with different writing directions, e.g. Arabic and Hebrew). On a high level, it would look a little like this, I would imagine:</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td>
<p align="center">عربي</p>
</td>
<td>
<p align="center">中文</p>
</td>
<td>
<p align="center">English</p>
</td>
<td>
<p align="center">日本語</p>
</td>
<td>
<p align="center">Deutsch</p>
</td>
<td>
<p align="center">Français</p>
</td>
<td>
<p align="center">Español</p>
</td>
<td>
<p align="center">Русский</p>
</td>
<td>
<p align="center">Português</p>
</td>
<td>
<p align="center">한국어</p>
</td>
<td>
<p align="center">Italiano</p>
</td>
</tr>
<tr>
<td>
<p align="center">AR</p>
</td>
<td>
<p align="center">ZH</p>
</td>
<td>
<p align="center">EN</p>
</td>
<td>
<p align="center">JA</p>
</td>
<td>
<p align="center">DE</p>
</td>
<td>
<p align="center">FR</p>
</td>
<td>
<p align="center">ES</p>
</td>
<td>
<p align="center">RU</p>
</td>
<td>
<p align="center">PT</p>
</td>
<td>
<p align="center">KO</p>
</td>
<td>
<p align="center">IT</p>
</td>
</tr>
</tbody>
</table>
<p>Naturally, this phenomenon raises questions concerning the matching of internet addresses. Is <span style="color: #0000ff;"><span style="text-decoration: underline;"><strong>ووو.هُمَنِنفِرِرِنسِ.كُم </strong></span></span>the same as <a href="http://www.humaninference.com">www.humaninference.com</a>? It appears that generic multilingual <a title="data matching" href="http://www.humaninference.com/products/data-matching" target="_blank">data matching</a> issues also apply in this particular case.</p>
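<p>Part of this question has a mechanical answer. Internationalized domain names are converted to an ASCII form (Punycode, recognizable by the <code>xn--</code> prefix) before they are looked up. The sketch below uses the <code>idna</code> codec that ships with Python, which implements the older IDNA 2003 rules; the function names and the example domain are my own assumptions.</p>

```python
def to_ascii_hostname(hostname):
    """Convert a Unicode hostname to its ASCII (xn--) form with Python's idna codec."""
    return hostname.encode("idna").decode("ascii")


def same_hostname(host_a, host_b):
    """Compare two hostnames after converting both to lower-case ASCII form."""
    return to_ascii_hostname(host_a).lower() == to_ascii_hostname(host_b).lower()
```

<p>Such a comparison only tells us whether two spellings encode to the same registered name. Whether an Arabic transliteration and a Latin name refer to the same organization remains a matching problem that no encoding step can settle.</p>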
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/international-domain-names-here-goes-the-asciihood/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Deduplication, first time wrong?</title>
		<link>http://datavaluetalk.com/data-quality/deduplication-first-time-wrong/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=deduplication-first-time-wrong</link>
		<comments>http://datavaluetalk.com/data-quality/deduplication-first-time-wrong/#comments</comments>
		<pubDate>Tue, 31 Mar 2009 13:25:28 +0000</pubDate>
		<dc:creator>Paul Tours</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[deduplication]]></category>
		<category><![CDATA[duplicate records]]></category>
		<category><![CDATA[duplicates]]></category>
		<category><![CDATA[match records]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[merge records]]></category>
		<category><![CDATA[SAP]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=856</guid>
		<description><![CDATA[One of my current projects has been to take an intelligent approach to the removal of duplicates already on an existing system (SAP). The client has already successfully used our software in their IT environment to effectively stop all new duplicates being entered into SAP. They now want to use the same technology to remove [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-863" title="twins" src="http://datavaluetalk.com/cms/wp-content/uploads/2009/03/twins.gif" alt="twins" width="248" height="260" /></p>
<p>One of my current projects has been to take an intelligent approach to the removal of duplicates already on an existing system (SAP).</p>
<p>The client has already successfully used our software in their IT environment to effectively stop all new duplicates being entered into SAP. They now want to use the same technology to remove all existing duplicates. Their idea is so simple I am amazed that I have not heard of it being done elsewhere before.</p>
<p>Every evening the client&#8217;s whole SAP database will be searched for duplicates in their Companies and Contacts (&gt; 3 million records deduplicated in less than an hour!). The results are stored in a master result table that SAP has been given access to. Depending on the likelihood of the match, the duplicates fall into one of three categories: automatic merging, manual merging or no merge. If the score for the whole duplicate group is above the threshold for automatic merging, then the automatic merging process is started. <span id="more-856"></span></p>
<p>This merge process has been created by an external SAP consultancy group that does a lot of clever stuff in giving each record a score depending on its financial relevance, e.g. open payments, current order status, payment reminders etc. (Hey, it&#8217;s SAP and in the world according to SAP only financial dealings have a value!) In the end the one record with the highest score is set to be the lead duplicate. All information from the other records in the duplicate group is placed onto the leading record to create a unique (&#8216;Golden&#8217;) record. All duplicate records with the exception of the lead duplicate are then removed from the system; in the case of SAP, these records are given a &#8216;set for deletion&#8217; flag and subsequently archived.</p>
<p>The &#8216;Non merges&#8217;, i.e. where the match score is below the accepted threshold level, are discarded and all remaining records are sent to a separate SAP mask for manual inspection the following day. All that is required is to identify whether the records shown belong in a duplicate group or not. After this decision has been made, each duplicate group goes to the &#8216;merging&#8217; process, just the same as in the automatic merge process.</p>
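<p>The three-way split described above boils down to two thresholds on the group score. The following Python sketch shows the idea; the threshold values, the 0 to 100 scale and the function name are assumptions for illustration, not the client&#8217;s actual settings.</p>

```python
# Hypothetical thresholds on a 0-100 match score; real values need tuning per data set.
AUTO_MERGE_THRESHOLD = 90
MANUAL_REVIEW_THRESHOLD = 70


def classify_duplicate_group(group_score):
    """Map the score of a whole duplicate group to one of the three categories."""
    if group_score >= AUTO_MERGE_THRESHOLD:
        return "automatic merging"
    if group_score >= MANUAL_REVIEW_THRESHOLD:
        return "manual merging"
    return "no merge"
```

<p>Raising the upper threshold sends more groups to manual inspection. That is the safer choice when a wrong automatic merge is expensive to undo.</p>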
<p>At the end of the day the whole process starts again. Wash, rinse, repeat! Simple! The first thing to happen is that over a short period of time all the secure duplicates disappear as they are merged automatically. This is highly visible: no more multiple identical records that pop up whenever a new record has been entered. The impact on the quality of the surrounding systems is just as direct. No sending out bills or marketing mails x times to the same person (having worked in Marketing before, I know the problem and it always leaves such a professional impression with the customer!) So it&#8217;s already something easy to sell to your managers and so far you have not had to lift a finger. Great!</p>
<p>The brilliance of the <a title="SAP data quality" href="http://www.humaninference.com/solutions/first-time-right/data-quality-for-sap" target="_blank">SAP data quality</a> solution though lies elsewhere. The simple fact is that it really does not matter whether the rest of the results are worked through in 1 day, 1 month or a year &#8211; as they are always captured, every day anew. The net result is that the total level of duplicates is constantly decreasing. Where the merge process has taken place, the duplicates will disappear. Only a change to the record will force it to be rechecked in the next round of deduplication. This means that apart from the costs of enhancing the current system, the client has an effective DQ firewall that not only protects them from duplicate data being entered onto their IT systems, but will also over time cleanse the system from within, even if it means assigning an employee to sporadically make a decision on the manual matches. It is something that the company/department can concentrate on when they have time/resources available. (That should be easy after showing what success you have had with it already!)</p>
<p>How about if the process could be easily and readily monitored? Say by using Excel or a similar product. Bar graphs and pie charts always tell way more than actual figures! Then the impact of what is happening is all the more visible and easy to sell (a good budget retainer!)</p>
<p>Good luck in dealing with your duplicates.</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/deduplication-first-time-wrong/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The added value of an integrated customer view</title>
		<link>http://datavaluetalk.com/mdm/the-added-value-of-an-integrated-customer-view/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-added-value-of-an-integrated-customer-view</link>
		<comments>http://datavaluetalk.com/mdm/the-added-value-of-an-integrated-customer-view/#comments</comments>
		<pubDate>Mon, 08 Dec 2008 14:44:56 +0000</pubDate>
		<dc:creator>Emile van de Klok</dc:creator>
				<category><![CDATA[MDM for customer data]]></category>
		<category><![CDATA[cdi]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[single customer view]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=227</guid>
		<description><![CDATA[The added value of an integrated customer view depends strongly on the quality of that integrated customer view. Every organization that is seriously planning to create a single customer view should ask itself the following question: &#8220;What determines the quality of my customer view and so the accompanying level of added value?&#8221; Prior to answering [...]]]></description>
			<content:encoded><![CDATA[<div class="mceTemp">
<div style="text-align: auto;"><a href="http://datavaluetalk.com/mdmdemo/"><img src="http://www.watweetikvanmijnklant.nl/wp-content/uploads/2008/12/mdmdemoss-249x300.jpg" alt="MDM Demo" width="149" height="180" /></a></div>
</div>
<p>The added value of an integrated customer view depends strongly on the quality of that integrated customer view. Every organization that is seriously planning to create a single customer view should ask itself the following question: &#8220;What determines the quality of my customer view and so the accompanying level of added value?&#8221;</p>
<p>Prior to answering this question we need to take one step back. Why doesn&#8217;t every organization have a <a title="single customer view" href="http://www.humaninference.com/solutions/single-customer-view" target="_blank">single customer view</a>? The cause lies in the fact that many organizations have their customer data spread across multiple systems, all facilitating separate business processes. Additionally, customer data is often highly polluted, fragmented and incomplete.</p>
<p><span id="more-227"></span></p>
<p>So it appears that the data itself plays a crucial role in the lack of an integrated customer view. Or more accurately, the better the data &#8211; the better the customer view. And the better the <a title="data matching" href="http://www.humaninference.com/products/data-matching" target="_blank">data matching</a> of customer records across separate systems, the better the integrated customer view.</p>
<p>So Data Quality and Matching (Identity Resolution) determine in large part the quality of the integrated customer view and the added value that it delivers. <a title="MDM Demo" href="http://datavaluetalk.com/mdmdemo/" target="_blank">Take a look at this demo</a> showing a step-by-step approach to building a single customer view and get a better idea of the role of Data Quality and Matching within this process.</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/mdm/the-added-value-of-an-integrated-customer-view/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

