<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Value Talk &#187; fault-tolerant matching</title>
	<atom:link href="http://datavaluetalk.com/tag/fault-tolerant-matching/feed/" rel="self" type="application/rss+xml" />
	<link>http://datavaluetalk.com</link>
	<description>Customer data is a valuable asset. Why not treat it that way?</description>
	<lastBuildDate>Mon, 09 Jan 2012 11:38:42 +0000</lastBuildDate>
	<language>nl</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Matching persons with different official names</title>
		<link>http://datavaluetalk.com/data-quality/matching-persons-with-different-official-names/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=matching-persons-with-different-official-names</link>
		<comments>http://datavaluetalk.com/data-quality/matching-persons-with-different-official-names/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 15:32:59 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[cultural differences]]></category>
		<category><![CDATA[fault-tolerant matching]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[names]]></category>
		<category><![CDATA[naming confusion]]></category>
		<category><![CDATA[nicknames]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1269</guid>
		<description><![CDATA[Anyone who deals with matching persons, or contact data in general, knows that individuals may use abbreviations or nicknames as synonyms for their name. Classic examples are Bill for William, or my own father, who goes by Mans while [...]]]></description>
			<content:encoded><![CDATA[<p class="mceTemp"><img class="alignnone" title="what is the what?" src="http://img1.fantasticfiction.co.uk/images/n37/n185744.jpg" alt="" width="107" height="137" />Anyone who deals with matching persons, or contact data in general, knows that individuals may use abbreviations or nicknames as synonyms for their name. Classic examples are <em>Bill</em> for <em>William</em>, or my own father, who goes by <em>Mans</em> while his official name is <em>Hermanus</em>. Most matching engines handle this with a synonym table. This works because, within a culture or region, a given nickname is usually linked to the same official names, and people do not tend to use completely different officially registered names.</p>
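<p>A synonym table of this kind can be sketched in a few lines of Python. This is only an illustration, not how any particular matching engine works: the table <code>NICKNAMES</code> and the helper names are hypothetical, and a real table would hold many more entries per culture or region.</p>

```python
# Hypothetical synonym table: nickname -> official names it may stand for.
NICKNAMES = {
    "bill": {"william"},
    "mans": {"hermanus"},
}


def candidate_names(given_name):
    """Return the name itself plus every official name it may stand for."""
    key = given_name.strip().lower()
    return {key} | NICKNAMES.get(key, set())


def same_given_name(a, b):
    """Two given names match when their sets of candidate names overlap."""
    return bool(candidate_names(a) & candidate_names(b))
```

<p>With this table, <code>same_given_name("Bill", "William")</code> holds, while any pair that is not linked through the table is simply treated as different.</p>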
<p>It becomes more challenging when there is no longer any link between the nickname and the official name. That may happen, for example, when people move from one cultural region to another where a different writing system is used. Take my Chinese friend 高为民, whose name in Latin script would be Gao Weimin (family name first); when he works in Europe or the US he uses the variant William Gao. There is no common relation between the names William and Weimin, either in Latin script or in Chinese, and they are not phonetic variants of each other. <span id="more-1269"></span></p>
<p>I recently read a very impressive book by Dave Eggers, &#8216;What is the What&#8217;. It gives good insight into one of the world&#8217;s current problem areas and how people try to survive there. Achak Deng is one of the so-called <a title="Valentino Achak Deng organization" href="http://www.valentinoachakdeng.org/" target="_blank">Lost Boys from Sudan</a>. During his life in Sudan, in refugee camps and finally in the US, he officially uses different names. This has nothing to do with deliberately obscuring his identity; it has more to do with receiving an identity from his environment at a given time and place. He is born as Achak, baptized as Valentino, and later uses the name Dominic or Dominic Arou, and Marialdit. Of course there are people who call him by nicknames such as &#8216;Sleeper&#8217; or &#8216;Gone Far&#8217;, but at certain periods in his life he officially uses completely different names. This makes automatic matching of persons, and even manual matching, challenging, and keeps it interesting.</p>
<p>I would recommend the book to everyone who wants to learn about what is happening in our world, and especially to those interested in names (don&#8217;t forget to study all the names in the last section of the book).</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/matching-persons-with-different-official-names/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mandatory name-number-check at money transfer?</title>
		<link>http://datavaluetalk.com/data-quality/mandatory-name-number-check-at-money-transfer/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=mandatory-name-number-check-at-money-transfer</link>
		<comments>http://datavaluetalk.com/data-quality/mandatory-name-number-check-at-money-transfer/#comments</comments>
		<pubDate>Tue, 10 Nov 2009 10:08:12 +0000</pubDate>
		<dc:creator>Ron Mulderij</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[fault-tolerant matching]]></category>
		<category><![CDATA[money transfer]]></category>
		<category><![CDATA[name-number-check]]></category>
		<category><![CDATA[risk management]]></category>
		<category><![CDATA[typing errors]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1245</guid>
		<description><![CDATA[With the rise of modern technology, more and more of our payments are processed electronically. Banks try to reduce costs and make their customers carry out the payments themselves. Internet banking has become the standard. Customers can no longer hand in written transfer orders at their bank, but have to book the transfers using internet banking [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1249" title="NameNumberMeaning" src="http://datavaluetalk.com/cms/wp-content/uploads/2009/11/NameNumberMeaning-150x150.jpg" alt="NameNumberMeaning" width="150" height="150" /></p>
<p>With the rise of modern technology, more and more of our payments are processed electronically. Banks try to reduce costs and make their customers carry out the payments themselves. Internet banking has become the standard. Customers can no longer hand in written transfer orders at their bank, but have to book the transfers using internet banking facilities. People can easily make a typing error in an account number that still results in an existing account number. The risk lies entirely with the customer. Although banks are always willing to help customers get their money back, it’s better to avoid these errors in the first place.</p>
<p>In my opinion, banks should be obliged to perform a name-number-check for every payment or at least for every larger amount. <span id="more-1245"></span></p>
<p>Recently in the Netherlands someone intended to transfer 43,000 euros to his son, but made a typing error in the account number and transferred the money to someone else (<a href="http://www.nu.nl/algemeen/2088460/typefout-kost-man-43000-euro.html" target="_blank">see this article, in Dutch</a>). This caused a lot of problems, since the unintended recipient spent the money immediately and was not willing to pay it back.</p>
<p>The best solution would be to check name and number before the money is transferred. However, only the receiver&#8217;s bank knows these details, so the easiest solution is for that bank to perform the check before the money is credited to the account. The technology for the name-number-check is on-the-fly matching, based on fault-tolerant identification. The rules should not be too strict, since it may happen that, for example, the initials are not completely the same.</p>
<p>The following examples could be the result of fault-tolerant matching technology: </p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="259" valign="top"><strong>Transfer information</strong></td>
<td width="240" valign="top"><strong>Account information</strong></td>
<td width="97" valign="top"><strong>Result</strong></td>
</tr>
<tr>
<td width="259" valign="top">J.H. Johnson<br />11 High Street<br />London</td>
<td width="240" valign="top">J. Johnson<br />11 High Street<br />London</td>
<td width="97" valign="top">Correct</td>
</tr>
<tr>
<td width="259" valign="top">J.H. Johnson<br />11 High Street<br />London</td>
<td width="240" valign="top">L. Johnson<br />11 High Street<br />London</td>
<td width="97" valign="top">Not correct</td>
</tr>
<tr>
<td width="259" valign="top">J.H. Johnson<br />11 High Street<br />London</td>
<td width="240" valign="top">J. Johnson<br />13 High Street<br />London</td>
<td width="97" valign="top">Not correct</td>
</tr>
</tbody>
</table>
<p>The rules should be configurable to meet every bank’s requirements. In the examples above, a different first initial and a different house number are treated as significant differences, whereas a missing second initial is tolerated. This may not be the required setting in all cases.</p>
<p>Finally, on-the-fly matching for a name-number-check doesn’t require a lot of processing time if the right technology is used, and it avoids a lot of problems.</p>
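<p>To make the idea of configurable rules concrete, here is a minimal Python sketch that reproduces the three results in the table. It is an illustration under assumptions, not a description of any bank&#8217;s or vendor&#8217;s matching engine: the <code>Party</code> and <code>Rules</code> classes, the field names and the similarity threshold are all hypothetical, and the typo tolerance simply uses <code>difflib.SequenceMatcher</code> from the Python standard library.</p>

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Party:
    initials: str       # e.g. "J.H."
    surname: str
    house_number: str
    street: str
    city: str


@dataclass
class Rules:
    # Hypothetical settings; each bank would tune these to its own requirements.
    min_similarity: float = 0.85       # tolerance for typos in surname, street, city
    check_first_initial: bool = True
    check_house_number: bool = True


def similar(a, b, threshold):
    """Tolerate small typing errors by comparing normalised strings."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


def first_initial(initials):
    """Return the first letter of an initials string such as 'J.H.'."""
    letters = [c for c in initials if c.isalpha()]
    return letters[0].upper() if letters else ""


def name_number_check(transfer, account, rules=None):
    """Return True when the transfer details plausibly belong to the account holder."""
    rules = rules or Rules()
    if rules.check_first_initial and first_initial(transfer.initials) != first_initial(account.initials):
        return False
    if rules.check_house_number and transfer.house_number.strip() != account.house_number.strip():
        return False
    # The remaining fields are compared with typo tolerance.
    return all(
        similar(getattr(transfer, field), getattr(account, field), rules.min_similarity)
        for field in ("surname", "street", "city")
    )
```

<p>Passing a different <code>Rules</code> object, for example one with <code>check_house_number=False</code>, relaxes the check without touching the matching code, which is the kind of per-bank configuration the text asks for.</p>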
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/mandatory-name-number-check-at-money-transfer/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data Cleansing with intelligent identification</title>
		<link>http://datavaluetalk.com/data-quality/data-cleansing-with-intelligent-identification/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=data-cleansing-with-intelligent-identification</link>
		<comments>http://datavaluetalk.com/data-quality/data-cleansing-with-intelligent-identification/#comments</comments>
		<pubDate>Thu, 16 Apr 2009 14:02:55 +0000</pubDate>
		<dc:creator>Hicham Kardouna</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Data Services]]></category>
		<category><![CDATA[BaFin]]></category>
		<category><![CDATA[Bundesanstalt für Finanzdienstleistungsaufsicht]]></category>
		<category><![CDATA[compliance]]></category>
		<category><![CDATA[compliant]]></category>
		<category><![CDATA[fault-tolerant matching]]></category>
		<category><![CDATA[legal form]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=888</guid>
		<description><![CDATA[In many cases an inductive method of data cleansing is the way to go. With the right tools and expertise you can inspect, transform and cleanse entities in a database and reach high levels of data quality without the need to use external reference data. In some cases, however, only working with the internal data [...]]]></description>
			<content:encoded><![CDATA[<p><span style="font-size: x-small; font-family: Tahoma;"><img class="alignleft size-full wp-image-889" title="clean-data" src="http://datavaluetalk.com/cms/wp-content/uploads/2009/04/clean-data.gif" alt="clean-data" width="200" height="200" /></span></p>
<p>In many cases an inductive method of data cleansing is the way to go. With the right tools and expertise you can inspect, transform and cleanse entities in a database and reach high levels of data quality without the need for external reference data. In some cases, however, working only with the internal data and inductively identifying and fixing data patterns is not sufficient. Let&#8217;s take a practical example: a bank needs to report on a particular segment of its clients to the German bank supervisor BaFin &#8211; the Federal Financial Supervisory Authority, aka Bundesanstalt für Finanzdienstleistungsaufsicht. The bank has apparently done its homework and created a central database containing all entities needed for the compliance check. Moreover, the bank has worked out a rather complex set of rules for how data must be processed and corrected. One of the most important anchor points in this framework is the separation between B2C and B2B entities and, for the latter, the exact identification of the correct legal form. But what if you cannot trust this identification?<span id="more-888"></span></p>
<p>After profiling the data I quickly found thousands of conflicts, e.g. records with the legal form GmbH (limited company) that were at the same time tagged as B2C customers. So, which entry is correct? And if it is a B2B entity, can we be sure that the legal form GmbH is correct? Especially as the bank has set different rules for processing GbR, GmbH, KG, GmbH &amp; Co. KG, AG and KGaA, just to name a few legal forms&#8230;</p>
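<p>A conflict check of this kind can be sketched in Python as follows. This is only an assumed shape for the data: the field names <code>segment</code> and <code>legal_form</code> are hypothetical, and the set of legal forms contains just the ones named above rather than a complete list.</p>

```python
# Legal forms named in the text; a real list would be longer.
COMPANY_LEGAL_FORMS = {"GbR", "GmbH", "KG", "GmbH & Co. KG", "AG", "KGaA"}


def find_conflicts(records):
    """Return records tagged as B2C that nevertheless carry a company legal form."""
    return [
        record
        for record in records
        if record.get("segment") == "B2C"
        and record.get("legal_form") in COMPANY_LEGAL_FORMS
    ]
```

<p>Such a check only shows that the two fields disagree. It cannot tell which of them is wrong, which is exactly the open question here.</p>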
<p>This is when cooperation with a specialised data provider is needed. And it is actually an advantage if this is not the first time you work with this partner&#8217;s external reference data. Knowing the providers and the specifics of the various data models in use is a prerequisite for successfully enriching data. Not to forget having an excellent, fault-tolerant matching engine that helps you to link corrupted or wrong name entities to the reference database! In the case of the bank, using the right data provider and matching its content to the customers in the compliance database resolved upwards of 90% of the conflicts related to a wrong legal form. And this of course is the cornerstone for being compliant!</p>
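<p>As a rough illustration of linking a misspelled company name to reference data, the Python sketch below uses <code>difflib.get_close_matches</code> from the standard library. It stands in for a real fault-tolerant matching engine, which would be far more sophisticated. The function name, the cutoff value and the shape of the reference data (a mapping from company name to legal form) are assumptions made for this example.</p>

```python
from difflib import get_close_matches


def lookup_legal_form(name, reference, cutoff=0.8):
    """Return the legal form of the closest reference name, or None if nothing is close.

    `reference` maps company names from the data provider to their legal form.
    """
    # Compare case-insensitively, but remember the original reference keys.
    index = {ref_name.lower(): ref_name for ref_name in reference}
    matches = get_close_matches(name.strip().lower(), list(index), n=1, cutoff=cutoff)
    if not matches:
        return None
    return reference[index[matches[0]]]
```

<p>A record whose name finds no sufficiently close reference entry returns <code>None</code> and would be left for manual review rather than corrected automatically.</p>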
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/data-cleansing-with-intelligent-identification/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

