<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Value Talk &#187; compliance</title>
	<atom:link href="http://datavaluetalk.com/tag/compliance/feed/" rel="self" type="application/rss+xml" />
	<link>http://datavaluetalk.com</link>
	<description>Customer data is a valuable asset. Why not treat it that way?</description>
	<lastBuildDate>Thu, 10 May 2012 14:49:53 +0000</lastBuildDate>
	<language>nl</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Any close encounters with the FBI terrorist watchlist?</title>
		<link>http://datavaluetalk.com/data-governance/any-close-encounters-with-the-fbi-terrorist-watchlist/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=any-close-encounters-with-the-fbi-terrorist-watchlist</link>
		<comments>http://datavaluetalk.com/data-governance/any-close-encounters-with-the-fbi-terrorist-watchlist/#comments</comments>
		<pubDate>Mon, 17 Aug 2009 09:14:34 +0000</pubDate>
		<dc:creator>Ramon de Noronha</dc:creator>
				<category><![CDATA[Data Governance]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[compliance]]></category>
		<category><![CDATA[identification]]></category>
		<category><![CDATA[identity]]></category>
		<category><![CDATA[interpretation]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[persistent identification]]></category>
		<category><![CDATA[processes]]></category>
		<category><![CDATA[suspect list matching]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1125</guid>
		<description><![CDATA[Just before this summer the U.S. Department of Justice filed a report about the FBI Terrorist Watchlist. This watchlist is a critical tool for screening and law enforcement personnel: it alerts them when they come across a known or suspected terrorist. It is used by personnel at airports, harbours and borders. Also when [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1127" src="http://datavaluetalk.com/cms/wp-content/uploads/2009/08/tsc080105a.jpg" alt="tsc080105a" width="160" height="152" />Just before this summer the U.S. Department of Justice filed a report about the FBI Terrorist Watchlist. This watchlist is a critical tool for screening and law enforcement personnel: it alerts them when they come across a known or suspected terrorist. It is used by personnel at airports, harbours and borders. When you apply for a visa, you are also matched against this watchlist. The Terrorist Screening Center (TSC), which is part of the FBI, is responsible for maintaining the watchlist.</p>
<p>This watchlist was created in 2004 from several other lists, and at that time it consisted of about 68,000 entries. I use the word entries because in the years that followed it became unclear whether one record corresponds to one individual. By the end of 2008 the list had grown to over 1.1 million entries. In 2008, after the American Civil Liberties Union (ACLU) pointed out that the list had <a title="Numbers don't add up" href="http://www.aclu.org/privacy/gen/36064res20080721.html" target="_blank">passed the 1 million mark</a>, the government offered an explanation. <em>Although we have recorded over 1 million entries in the database, the net result is that these records correspond to about 400,000 individuals. </em>Terrorists often use several different identities, several (falsified) passports, and so on. But adding entries with only first initials and a last name, while an entry with the full first names and last name already exists, has unwanted side effects.<span id="more-1125"></span></p>
<p>As people interested in data quality and identity resolution, we all know that J. Robinson will produce many more matches (hits) than James Robinson. Indeed, the number of matches will skyrocket, and each of them has to be evaluated manually. Might this be the reason that we see more and more security personnel at airports?</p>
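<p>The effect is easy to reproduce with a toy example. The sketch below is only an illustration under stated assumptions: the traveller names are made up, and the <code>matches</code> and <code>hits</code> functions are hypothetical helpers, not the matching logic actually used for the watchlist. An initial-only entry is treated as compatible with every first name that starts with that letter, while a full first name must be equal.</p>

```python
# Hypothetical traveller names; not taken from the watchlist or the audit report.
TRAVELLERS = [
    "James Robinson",
    "John Robinson",
    "Julia Robinson",
    "Jack Robinson",
    "Mary Robinson",
]


def matches(entry, traveller):
    """Return True if a traveller's name is compatible with a watchlist entry.

    An entry whose first name is only an initial (e.g. "J.") matches every
    first name starting with that letter; a full first name must be equal.
    """
    entry_first, entry_last = entry.lower().rsplit(" ", 1)
    first, last = traveller.lower().rsplit(" ", 1)
    if entry_last != last:
        return False
    initial = entry_first.rstrip(".")
    if len(initial) == 1:
        # Initial-only entry: any first name with this first letter is a hit.
        return first.startswith(initial)
    return entry_first == first


def hits(entry):
    """Return every traveller that would have to be reviewed for this entry."""
    return [t for t in TRAVELLERS if matches(entry, t)]


print(len(hits("James Robinson")))  # 1 hit
print(len(hits("J. Robinson")))     # 4 hits
```

<p>In this small sample the initial-only entry already produces four times as many hits as the full name, and every one of them would need manual review.</p>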
<p>In the <a href="http://www.usdoj.gov/oig/reports/FBI/a0925/final.pdf" target="_blank">latest audit report</a> of the U.S. Department of Justice about this watchlist, one other problem was analyzed. While extensive procedures exist for nominating and adding suspects to the watchlist, there is no procedure for removing people from the list. Based on a sample of almost 70,000 entries and an investigation of the individuals behind them, an astounding 35% of the entries turned out to be ones that should have been removed: people who had died, people who were no longer under investigation, cases that had been closed, and so on. So this watchlist keeps <a href="http://www.aclu.org/privacy/spying/watchlistcounter.html" target="_blank">growing and growing</a>. As a result, screening personnel ensnare many innocent travelers as suspected terrorists, which wastes their time and diverts their energy from looking for true terrorists. It seems to me that the FBI and TSC could benefit from better Data Governance. What do you think?</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-governance/any-close-encounters-with-the-fbi-terrorist-watchlist/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Cleansing with intelligent identification</title>
		<link>http://datavaluetalk.com/data-quality/data-cleansing-with-intelligent-identification/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=data-cleansing-with-intelligent-identification</link>
		<comments>http://datavaluetalk.com/data-quality/data-cleansing-with-intelligent-identification/#comments</comments>
		<pubDate>Thu, 16 Apr 2009 14:02:55 +0000</pubDate>
		<dc:creator>Hicham Kardouna</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Data Services]]></category>
		<category><![CDATA[BaFin]]></category>
		<category><![CDATA[Bundesanstalt für Finanzdienstleistungsaufsicht]]></category>
		<category><![CDATA[compliance]]></category>
		<category><![CDATA[compliant]]></category>
		<category><![CDATA[fault-tolerant matching]]></category>
		<category><![CDATA[legal form]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=888</guid>
		<description><![CDATA[In many cases an inductive method of data cleansing is the way to go. With the right tools and expertise you can inspect, transform and cleanse entities in a database and reach high levels of data quality without the need to use external reference data. In some cases, however, only working with the internal data [...]]]></description>
			<content:encoded><![CDATA[<p><span style="font-size: x-small; font-family: Tahoma;"><img class="alignleft size-full wp-image-889" title="clean-data" src="http://datavaluetalk.com/cms/wp-content/uploads/2009/04/clean-data.gif" alt="clean-data" width="200" height="200" /></span></p>
<p>In many cases an inductive method of <a title="data cleansing" href="http://www.humaninference.com/products/data-cleansing" target="_blank">data cleansing</a> is the way to go. With the right tools and expertise you can inspect, transform and cleanse entities in a database and reach high levels of data quality without the need to use external reference data. In some cases, however, working only with the internal data and inductively identifying and fixing data patterns is not sufficient. Let&#8217;s take a practical example: a bank needs to report on a particular segment of its clients to the German bank supervisor BaFin &#8211; the Federal Financial Supervisory Authority, also known as Bundesanstalt für Finanzdienstleistungsaufsicht. The bank has apparently done its homework and has created a central database containing all entities needed for the compliance check. Moreover, the bank has worked out a rather complex set of rules for how data must be processed and corrected. One of the most important anchor points in this specific framework is the separation between B2C and B2B entities and, for the latter, the exact identification of the correct legal form. But what if you cannot trust this identification?<span id="more-888"></span></p>
<p>After profiling the data I quickly found thousands of conflicts, e.g. records with the legal form GmbH (limited company) that were at the same time tagged as B2C customers. So, which entry is correct? And if it is a B2B entity, can we be sure that the legal form GmbH is correct? This matters all the more because the bank has set different rules for processing GbR, GmbH, KG, GmbH &amp; Co. KG, AG and KGaA, just to name a few legal forms&#8230;</p>
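<p>A profiling check of this kind can be sketched in a few lines. This is a minimal illustration, not the tooling used in the project: the records, the field names (<code>legal_form</code>, <code>segment</code>) and the <code>find_conflicts</code> function are hypothetical, the set of legal forms is only a subset of those named above, and the second rule (a B2B record without a known legal form) is an added assumption.</p>

```python
# A subset of the legal forms named in the post; each denotes a company.
BUSINESS_LEGAL_FORMS = {"GbR", "GmbH", "KG", "AG", "KGaA"}

# Hypothetical sample records; field names are assumptions for illustration.
RECORDS = [
    {"id": 1, "name": "Muster GmbH", "legal_form": "GmbH", "segment": "B2C"},
    {"id": 2, "name": "Erika Mustermann", "legal_form": None, "segment": "B2C"},
    {"id": 3, "name": "Beispiel AG", "legal_form": "AG", "segment": "B2B"},
    {"id": 4, "name": "Max Mustermann", "legal_form": None, "segment": "B2B"},
]


def find_conflicts(records):
    """Return (id, reason) pairs for records whose segment and legal form disagree."""
    conflicts = []
    for record in records:
        has_business_form = record["legal_form"] in BUSINESS_LEGAL_FORMS
        if has_business_form and record["segment"] == "B2C":
            # The conflict described in the post: a company tagged as a consumer.
            conflicts.append((record["id"], "company legal form on a B2C record"))
        elif not has_business_form and record["segment"] == "B2B":
            # Assumed extra rule: a business record should carry a legal form.
            conflicts.append((record["id"], "B2B record without a known legal form"))
    return conflicts


for record_id, reason in find_conflicts(RECORDS):
    print(record_id, reason)
```

<p>A check like this only shows that two fields disagree. It cannot tell which of them is wrong, which is exactly why external reference data comes into play.</p>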
<p>This is when cooperation with a specialised data provider is needed. It is actually an advantage if this is not the first time you work with external reference data from this partner: knowing the providers and the specifics of the various data models in use is a prerequisite for successfully enriching data. Just as important is an excellent, fault-tolerant matching engine that helps you link corrupted or wrong name entities to the reference database! In the case of the bank, using the right data provider and matching its content to the customers in the compliance database resolved roughly 90% of the conflicts related to the wrong legal form. And this, of course, is the cornerstone of being compliant!</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/data-cleansing-with-intelligent-identification/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

