<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Value Talk &#187; Data Quality</title>
	<atom:link href="http://datavaluetalk.com/tag/data-quality/feed/" rel="self" type="application/rss+xml" />
	<link>http://datavaluetalk.com</link>
	<description>Customer data is a valuable asset. Why not treat it that way?</description>
	<lastBuildDate>Thu, 10 May 2012 14:49:53 +0000</lastBuildDate>
	<language>nl</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Ask Me is linked with Any Body and relates with Walther Von Stolzing</title>
		<link>http://datavaluetalk.com/data-quality/ask-me-is-linked-with-any-body-and-relates-with-walther-von-stolzing/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=ask-me-is-linked-with-any-body-and-relates-with-walther-von-stolzing</link>
		<comments>http://datavaluetalk.com/data-quality/ask-me-is-linked-with-any-body-and-relates-with-walther-von-stolzing/#comments</comments>
		<pubDate>Wed, 12 Oct 2011 08:51:26 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Names]]></category>
		<category><![CDATA[cleansing]]></category>
		<category><![CDATA[identity]]></category>
		<category><![CDATA[interpretation]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[name]]></category>
		<category><![CDATA[names]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1991</guid>
		<description><![CDATA[Weird subject, isn&#8217;t it? As is obvious to everybody, &#8216;Ask Me&#8217; and &#8216;Any Body&#8217; are artificial names. They will never belong to a real person. How they relate to &#8216;Walther von Stolzing&#8217; will follow. For over 25 years Human Inference has collected reference data, for instance on persons. Because of our reference set we immediately recognize [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://datavaluetalk.com/cms/wp-content/uploads/2011/10/Obama.png"><img class="alignleft size-thumbnail wp-image-2022" title="I'm Obama" src="http://datavaluetalk.com/cms/wp-content/uploads/2011/10/Obama-150x150.png" alt="" width="150" height="150" /></a>Weird subject, isn&#8217;t it? As is obvious to everybody, &#8216;Ask Me&#8217; and &#8216;Any Body&#8217; are artificial names. They will never belong to a real person. How they relate to &#8216;Walther von Stolzing&#8217; will follow.</p>
<p>For over 25 years Human Inference has collected reference data, for instance on persons. Because of our reference set we immediately recognize that &#8216;Ask Me&#8217; and &#8216;Any Body&#8217; are fake names. People use these either in test situations or to hide their actual names.</p>
<p>In the old days we only needed to test for &#8216;Test Test&#8217;; in more recent years we have seen great inventiveness in these fake names. A brief sample is shown in the following list.</p>
<div align="center">
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top" width="137">Alpha Beta</td>
<td valign="top" width="137">Any Body</td>
</tr>
<tr>
<td valign="top" width="137">Ask Me</td>
<td valign="top" width="137">Best Friend</td>
</tr>
<tr>
<td valign="top" width="137">Blue Sky</td>
<td valign="top" width="137">Cool Dude</td>
</tr>
<tr>
<td valign="top" width="137">Dress Code</td>
<td valign="top" width="137">El Comandante</td>
</tr>
<tr>
<td valign="top" width="137">Guess Who</td>
<td valign="top" width="137">In Cognito</td>
</tr>
</tbody>
</table>
</div>
<p>If you cannot rely on reference data and interpretation, you need to provide a checklist. Providing it is one thing, but since users tend to be really creative, maintaining it is essential.<span id="more-1991"></span></p>
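<p>A minimal sketch of such a check list, assuming nothing more than a set of normalised names. The names <code>FAKE_NAMES</code>, <code>normalise</code>, <code>is_listed_fake</code> and <code>add_fake_name</code> are made up for this illustration and are not part of any Human Inference product; the list is seeded with the sample names shown above.</p>

```python
# Hypothetical check list, seeded with the sample names from the table above.
FAKE_NAMES = {
    "test test", "alpha beta", "any body", "ask me", "best friend", "blue sky",
    "cool dude", "dress code", "el comandante", "guess who", "in cognito",
}


def normalise(full_name):
    """Lower-case the name and collapse repeated whitespace."""
    return " ".join(full_name.lower().split())


def is_listed_fake(full_name):
    """Return True when the normalised name appears on the check list."""
    return normalise(full_name) in FAKE_NAMES


def add_fake_name(full_name):
    """Maintenance hook: register a newly spotted fake name."""
    FAKE_NAMES.add(normalise(full_name))


print(is_listed_fake("  Ask   ME "))
print(is_listed_fake("Winfried van Holland"))
```

<p>The lookup itself is trivial; the real work sits in <code>add_fake_name</code>, because a list like this is only as good as its maintenance.</p>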
<p>In these 25 years we have identified a shift from &#8216;real fake names&#8217; towards &#8216;real names used in a fake way&#8217;. In the USA, for example, we identified popular Hollywood names and names of politicians being used as fake names. Currently the usage of the name &#8216;George Bush&#8217; is decreasing, whereas &#8216;Barack Obama&#8217; is increasingly used. We recognize the false usage of these names because of the change in frequency figures of the given name and family name as well as the usage of the combination itself. Remarkably, &#8216;Abraham Lincoln&#8217; and &#8216;George Washington&#8217; are quite steady.</p>
<p>Back to &#8216;Walther von Stolzing&#8217;. By now you might have guessed what is happening here. We noticed that in German-speaking areas this name is also crossing our validity threshold. By <a href="http://en.wikipedia.org/wiki/Die_Meistersinger_von_N%C3%BCrnberg" rel="nofollow">googling</a> the name you can see that Walther is actually a character in Wagner’s opera &#8216;Die Meistersinger von Nürnberg&#8217;, dating back to 1868!</p>
<p>Let’s see if in 100 years’ time people are still using &#8216;Darth Vader&#8217;, &#8216;Lord Rings&#8217; or &#8216;Snoop Dogg&#8217;!</p>
<p>All the names used in this blog are ‘real’ names coming from a popular social media site. Please check our <a href="http://www.humaninference.nl/producten/data-cleansing">data cleansing</a> products in case you need cleansing solutions.</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/ask-me-is-linked-with-any-body-and-relates-with-walther-von-stolzing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is equal? &#8211; challenges with sound and synonyms</title>
		<link>http://datavaluetalk.com/data-quality/what-is-equal-challenges-with-sound-and-synonyms/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=what-is-equal-challenges-with-sound-and-synonyms</link>
		<comments>http://datavaluetalk.com/data-quality/what-is-equal-challenges-with-sound-and-synonyms/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 13:43:45 +0000</pubDate>
		<dc:creator>Winfried van Holland</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[apples and oranges]]></category>
		<category><![CDATA[fuzzy matching]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[String comparison]]></category>
		<category><![CDATA[synonyms]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1882</guid>
		<description><![CDATA[What to do when basic string comparison (fuzzy search) techniques won&#8217;t give the right results? Fuzzy search helps to find matches in situations where people make typos (e.g. compare Human Inference with Human Inverence) or make up abbreviations (King str. with King street) or ignore diacritics (Sørensen and Soerensen). In case the &#8216;wrong word&#8217; is [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://datavaluetalk.com/data-quality/what-is-equal-challenges-with-sound-and-synonyms/attachment/phonology-2/" rel="attachment wp-att-1934"><img class="alignleft size-thumbnail wp-image-1934" title="Phonology" src="http://datavaluetalk.com/cms/wp-content/uploads/2011/08/Phonology1-150x150.png" alt="" width="150" height="150" /></a>What to do when basic string comparison (fuzzy search) techniques won&#8217;t give the right results? Fuzzy search helps to find matches in situations where people make typos (e.g. compare Human Inference with Human In<strong>v</strong>erence) or make up abbreviations (King str. with King street) or ignore diacritics (Sørensen and Soerensen). If the &#8216;wrong word&#8217; is not an existing word, it is obvious that we have a match once the typo is corrected.</p>
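<p>To make the idea of basic string comparison concrete, here is a small sketch using the classic Levenshtein edit distance. This is just one common fuzzy measure chosen for illustration, not necessarily what any particular matching product uses, and the function name <code>edit_distance</code> is my own.</p>

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


for left, right in [
    ("Human Inference", "Human Inverence"),
    ("King str.", "King street"),
    ("Sørensen", "Soerensen"),
]:
    print(left, "|", right, "|", edit_distance(left, right))
```

<p>A small distance relative to the length of the strings suggests a typo, an abbreviation or a dropped diacritic; where to put the cut-off is a tuning decision.</p>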
<p>More challenges appear if the typo produces another existing word; now we need to decide how equal the two entries are. If you have some knowledge of how frequently words are used, you can use that in the equation. How to get the usage frequency of words is another ballgame &#8211; at least you can assume that a &#8216;wrong word&#8217; is never used (a bit of a paradox).<span id="more-1882"></span></p>
<p>A large group of possible matches that are not found (i.e. missed matches) by fuzzy search methods are the ones that sound the same but are written rather differently. Often a call centre agent types the name exactly as he hears it. An example would be the family names ‘Farren’ and ‘Pharan’. They already differ so much that it becomes rather hard for a string comparison to treat both entries as equal. Phonetic search would definitely help here. The drawback of relying on phonetics alone is that you can now combine entries that are certainly not matches (i.e. mismatches), e.g.:</p>
<ol>
<li>René Meierhofer and</li>
<li>Renée Mayrhofer</li>
</ol>
<p>Two valid family names, but the given names show both a male and a female entry.</p>
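<p>Both effects can be reproduced with a deliberately crude phonetic key. The sketch below is an assumption of mine, not a standard algorithm: it strips diacritics, treats &#8216;ph&#8217; as &#8216;f&#8217;, drops vowels (including &#8216;y&#8217;) and collapses repeated consonants. Classic Soundex, by comparison, keeps the first letter and would therefore still separate Farren from Pharan.</p>

```python
import unicodedata

VOWELS = set("aeiouy")


def phonetic_key(name):
    """Very rough phonetic key, for illustration only."""
    # Decompose accented letters (e.g. é) and keep only the base letters.
    decomposed = unicodedata.normalize("NFKD", name.lower())
    letters = "".join(ch for ch in decomposed if ch.isalpha())
    letters = letters.replace("ph", "f")
    key = []
    for ch in letters:
        if ch in VOWELS:
            continue  # vowels carry most of the spelling variation
        if key and key[-1] == ch:
            continue  # collapse repeated consonants
        key.append(ch)
    return "".join(key)


# Missed by plain string comparison, found by the phonetic key.
print(phonetic_key("Farren") == phonetic_key("Pharan"))

# The drawback: a male and a female entry collapse onto the same key.
print(phonetic_key("René Meierhofer") == phonetic_key("Renée Mayrhofer"))
```

<p>Both comparisons come out equal, which is exactly the point: the key rescues the missed match and at the same time produces the mismatch.</p>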
<p>In a real-life example, we would expect a complete name with titles, and we&#8217;d still need to match it correctly. Take, for example,</p>
<ol>
<li>Dr. John J. Farren jr.</li>
<li>John J. Pharan jr. PhD</li>
</ol>
<p>Searches based purely on string comparison won’t work in this case. The complete entry could be matched by combining some smart academic synonyms with an n-gram or matrix comparison of the individual elements.</p>
<p>Introducing synonyms immediately generates new types of challenges. In address matching you will go a long way when you take into account the abbreviations for street types (Avenue for Av., Street for Str. etc). For company names it definitely helps to have a synonym table on legal forms (Limited for Ltd, Incorporated for Inc., etc). With the actual company name itself it becomes more challenging. A German example might look like:</p>
<ol>
<li>Fahrrad-Handel Anna Cintula and</li>
<li>Zweirad-Shop Anna Cintula,</li>
</ol>
<p>These are two synonyms for bike shop. Quite often people think in such situations that by adding a synonym table the challenge is gone. They are absolutely right for part of the problem, but there is still a large set of words that get their specific meaning from their context &#8211; and thereby point to a particular synonym. If we take for example the following three entries, it seems evident that we cannot replace the word &#8216;art&#8217; with one single synonym here:</p>
<ol>
<li><strong>Art</strong> Gallery Garfunkel</li>
<li><strong>ART</strong> Auto Rendition Technology</li>
<li>Paul Simon &amp; <strong>Art</strong> Garfunkel</li>
</ol>
<p>String comparison is a fine starting point for <a title="Data Matching" href="http://www.humaninference.com/products/data-matching" target="_blank">data matching</a> problems. To avoid a large number of mismatches or missed matches, and with them a large amount of manual work, you need to know what you’re dealing with. You need to compare <a title="High precision matching – apples, oranges or fruit salad?" href="http://datavaluetalk.com/2010/10/21/high-precision-matching-apples-oranges-or-fruit-salad/" target="_blank">apples with apples, oranges with oranges</a>. What would really help here is a bit of natural language processing ;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/what-is-equal-challenges-with-sound-and-synonyms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Has your name ever hurt you? &#8211; when nomen becomes omen</title>
		<link>http://datavaluetalk.com/data-quality/when-nomen-becomes-omen/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=when-nomen-becomes-omen</link>
		<comments>http://datavaluetalk.com/data-quality/when-nomen-becomes-omen/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 12:46:30 +0000</pubDate>
		<dc:creator>Esther Labrie</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Names]]></category>
		<category><![CDATA[customer data]]></category>
		<category><![CDATA[customer view]]></category>
		<category><![CDATA[first name]]></category>
		<category><![CDATA[identity]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[names]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1887</guid>
		<description><![CDATA[Addressing clients with the right data often means the difference between making a profit and not making a profit. Working with data quality experts has made me ever more conscious of the value personal data represents for people. In this respect names are especially intriguing to me, as owners appear to identify with their name [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://datavaluetalk.com/data-quality/when-nomen-becomes-omen/attachment/baby-baby-names-3/" rel="attachment wp-att-1899"><img class="alignleft size-thumbnail wp-image-1899" title="bad baby names" src="http://datavaluetalk.com/cms/wp-content/uploads/2011/08/baby-baby-names2-150x150.jpg" alt="" width="150" height="150" /></a>Addressing clients with the right data often means the difference between making a profit and not making a profit. Working with <a title="Data Quality" href="http://www.humaninference.com" target="_blank">data quality</a> experts has made me ever more conscious of the value personal data represents for people. In this respect names are especially intriguing to me, as owners appear to identify with their name <em>a lot</em>. So I decided to do a little research and determine if people really are what their name tells you. Can <em>nomen</em> indeed become <em>omen</em>?</p>
<p>Your parents probably gave a lot of thought to the name they once gave you, and as it turns out they were right to do so! Research tells us a name can do wonders for its owner, as well as a lot of damage for that matter. Let’s have a look at some remarkable results.</p>
<p><strong>Peter for President!<br />
</strong>Recent studies show that in the US a student called Fred is more likely to fail his exam than a student who happens to be named Andrew: people tend to identify with their name and, in general, have a positive feeling about letters that correspond with their initials. Consequently Fred is far more likely to settle for a meager F, while Andrew will have an extra motive to strive for an A. <span id="more-1887"></span>It also explains why in choosing a partner we show a slight preference for someone whose name resembles our own, or why Mary will prefer to live in Maryland, while Monica is more inclined to settle in Santa Monica. Most of these preferences only show themselves through our subliminal selves, so we are not actually aware of the motivation for some of our choices. Another US study endorses these findings: inspired by the results mentioned above, researchers decided to investigate another letter. They came up with the letter K, which in baseball stands for strikeout. The study showed once again that there is a connection between a letter and its bearer: batters whose names began with a K struck out more often than other batters.</p>
<p><strong>Ominous names<br />
</strong>A UK study tells us that as many as one in five parents regret how they named their child. The novelty might have worn off after a few years, but can there be any real objections to a certain name? Apparently, there are plenty! Ironically it’s not the parents who’ll have to carry this burden for the rest of their lives…</p>
<p><strong>“Hi, I’m Antwan, but you can call me Antoine…”<br />
</strong>It seems that even children’s language skills are influenced by their name. This has to do with the effect negative emotions can have on a child’s performance. If for example you decided to name your son ‘Gene’ but spell it ‘Jene’, he is very likely to be confronted with disbelief from his teachers. “Are you sure your name isn’t spelled with a ‘G’?” This can severely undermine Jene’s confidence. That explains why children with an unusual name or an unusually spelled name are generally weaker spellers and readers.</p>
<p><strong>“But Sissi is a Royal name, dear!”<br />
</strong>When a girl is called Frankie we think it’s a fun name, a cool and robust statement to fit a strong personality. Yet when a boy is called Mckenzie (yes, some parents think it’s cute to give their boy a name that has a feminine touch to it) we see a similar effect, but with a different outcome. This is something his parents obviously had not foreseen: their son will constantly be shaking off his girly image. The effect is striking: boys with an androgynous name misbehave more often than their unambiguously named peers, especially when they reach puberty. A boy called Mckenzie or Aubrey is even more likely to display bad behaviour when there is a girl with the same name among his peers. One more reason for parents to stick to conventions when choosing a name for their newborn.</p>
<p><strong>Want to produce the new Einstein? Call her Kate!<a href="http://datavaluetalk.com/data-quality/when-nomen-becomes-omen/attachment/einstein/" rel="attachment wp-att-1911"><img class="alignright size-thumbnail wp-image-1911" title="The new Einstein? Kate!" src="http://datavaluetalk.com/cms/wp-content/uploads/2011/08/einstein-150x150.jpg" alt="The new Einstein? Kate!" width="150" height="150" /></a><br />
</strong>A name can be a burden, but if you use this knowledge wisely, you might just turn it into an advantage. What happens to a girl when she has finished school and needs to choose what subject to study? Well, according to a US study, her choice depends on her name. As it turns out, girls with a very feminine name like Julietta or Isabella are more likely to study humanities, while those whose name is less obviously feminine are more partial towards science. The question is: who’s aspiring to whom? Could it be that parents would treat Kate in a different way than Barbara? Or did the parents subconsciously decide they wanted to raise a scientist when they decided to call their daughter Kate?</p>
<p><strong>Would you rather hire Vanity or Grace?<br />
</strong>Of course it’s not just letters or gender that determines how we feel about a name. In fact, how other people perceive us very much depends on the meaning of our name. For example: when looking for a new member on your marketing team, would you rather hire Vanity or Grace? In spite of what her name tells us, Grace might be a job jumper who doesn’t know how to work in unison with her colleagues. Vanity on the other hand could just be a daughter of a well-read mother who had just finished her latest Thackeray when she gave birth. Still, both women will either meet a lot of prejudice or feel the need to live up to a very high standard because of their name.</p>
<p>It all goes to show that a name definitely possesses some self-fulfilling qualities. The fact that so many parents regret their choice of name afterwards makes me think that the owners of those names might share these sentiments. So what does that mean when looking at it from a data quality point of view? Unisex names, for example, are responsible for a lot of data quality issues. As the borders between male and female names are fading we’ll need to update our knowledge continually. The human in Human Inference will definitely take care of that. After all, we wouldn’t want you to put off Mrs Clinton when sending her a petition to take pity on the Syrian citizens starting: &#8220;<em>Dear Mr. Clinton</em>…”.</p>
<p>Source: Livescience.com &amp; Babynames.com</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/when-nomen-becomes-omen/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>First Time Right  &#8211; The customer perspective</title>
		<link>http://datavaluetalk.com/data-quality/first-time-right-the-customer-perspective/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=first-time-right-the-customer-perspective</link>
		<comments>http://datavaluetalk.com/data-quality/first-time-right-the-customer-perspective/#comments</comments>
		<pubDate>Tue, 11 Jan 2011 09:15:26 +0000</pubDate>
		<dc:creator>Holger Wandt</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[business value]]></category>
		<category><![CDATA[customer perspective]]></category>
		<category><![CDATA[customer view]]></category>
		<category><![CDATA[data correction]]></category>
		<category><![CDATA[data matching]]></category>
		<category><![CDATA[data standardization]]></category>
		<category><![CDATA[data validation]]></category>
		<category><![CDATA[ease of use]]></category>
		<category><![CDATA[first time right]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1620</guid>
		<description><![CDATA[One of the first things I will start working on this year is a paper on First Time Right. Naturally, my colleagues and I had discussed the content of such a paper before, but during my Christmas holiday I figured out what the line of thought for the paper should be. Next to the definition [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://datavaluetalk.com/data-quality/first-time-right-the-customer-perspective/attachment/man-with-binoculars-2/" rel="attachment wp-att-1631"><img class="alignleft size-thumbnail wp-image-1631" title="man with binoculars" src="http://datavaluetalk.com/cms/wp-content/uploads/2011/01/man-with-binoculars-150x150.jpg" alt="" width="150" height="150" /></a></p>
<p>One of the first things I will start working on this year is a paper on First Time Right. Naturally, my colleagues and I had discussed the content of such a paper before, but during my Christmas holiday I figured out what the line of thought for the paper should be. Next to the definition and the importance of the principle and the approach in data quality solutions, I think that First Time Right is definitely about the business value and the advantages for the customers.</p>
<p>Let me give you a short preview:</p>
<p>Customer data plays a crucial role in the value chain of any business infrastructure. Whether purchasing, production, distribution, marketing, sales or service is concerned, the availability and the quality of your customer data is of great importance to these processes. A few examples?<span id="more-1620"></span></p>
<p>- A customer calls his insurance company in order to find an answer to a question he has about his fire insurance. <em>How fast</em> will the operator in the customer service department find <em>the right customer</em>?</p>
<p>- A large software company is running a report from its CRM system in order to invite a selection of its international customers to an event it is organizing. How does it know that it is indeed selecting the intended customers? And how does it know it hasn’t selected the same customer twice?</p>
<p>- In the self-service portal of a large retailer, customers are allowed to enter and alter personal information. How is the retailer going to prevent data pollution?</p>
<p>I think that organizations need a guiding principle to automatically, quickly and reliably check if the customer data already exists in the database(s). In addition, the data must be validated and, if necessary, corrected, completed and standardized.</p>
<p>This principle is called <strong><em><a title="First Time Right" href="http://www.humaninference.com/solutions/first-time-right" target="_blank">First Time Right</a></em></strong>. Consistent application of the First Time Right principle leads to an increase in customer satisfaction, a boost in productivity and higher revenue.</p>
<p>I expect to finish the white paper in about two weeks. Any thoughts on this preview? I&#8217;ll send out another post when I&#8217;m done&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/first-time-right-the-customer-perspective/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How being focused can blur your vision</title>
		<link>http://datavaluetalk.com/data-quality/how-being-focused-can-blur-your-vision/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-being-focused-can-blur-your-vision</link>
		<comments>http://datavaluetalk.com/data-quality/how-being-focused-can-blur-your-vision/#comments</comments>
		<pubDate>Mon, 01 Nov 2010 08:50:02 +0000</pubDate>
		<dc:creator>Paul Drenth</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[address standardization]]></category>
		<category><![CDATA[naming confusion]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1567</guid>
		<description><![CDATA[In our company we are all very dedicated to serving our customers with their business problems with bad quality customer master data. Aren’t we all? A few days ago, one of our customer support desk engineers sought an answer to what happens with the addresses on the islands of the former Netherlands Antilles. See also [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1575" title="sintmaarten 2" src="http://datavaluetalk.com/cms/wp-content/uploads/2010/11/sintmaarten-2-150x150.jpg" alt="sintmaarten 2" width="150" height="150" /></p>
<p>In our company we are all very dedicated to helping our customers solve the business problems caused by poor-quality customer master data. Aren’t we all?</p>
<p>A few days ago, one of our customer support desk engineers sought an answer to what happens with the addresses on the islands of the former Netherlands Antilles. See also my previous post <a href="http://datavaluetalk.com/2010/09/14/the-dissolution-of-a-nation/" target="_blank">The dissolution of a nation</a>. Kids of my generation had to memorize the names of these islands at primary school: the ABC islands – Aruba, Bonaire, and Curaçao – and the three islands with an “S”: Saba, Sint Eustatius and Sint Maarten. My colleague, now fully internet savvy, wanted to look up an address on Sint Maarten. Why not use an internet map and type “sint maarten”?</p>
<p>Yes, here it is! They even have a Spar supermarket there (just like home), and the address of this supermarket shows a postal code! A postal code with the same structure as in the Netherlands (NNNN AA). Pleased with this catch, he started to compose an answer to the customer.</p>
<p>Just before he sent it, I passed his desk and we started talking about this (the topic has my attention, you know). He showed me the map supporting his argument: the coastline was near. But when we zoomed out, the picture became clearer: tunnel vision had obscured the fact that he was looking at the Sint Maarten near the Dutch coast!</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/how-being-focused-can-blur-your-vision/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Changing trend U.S. immigrants: sticking to their name is custom</title>
		<link>http://datavaluetalk.com/data-quality/changing-trend-u-s-immigrants-sticking-to-their-name-is-custom/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=changing-trend-u-s-immigrants-sticking-to-their-name-is-custom</link>
		<comments>http://datavaluetalk.com/data-quality/changing-trend-u-s-immigrants-sticking-to-their-name-is-custom/#comments</comments>
		<pubDate>Wed, 08 Sep 2010 08:58:00 +0000</pubDate>
		<dc:creator>Vincent van Hunnik</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[international]]></category>
		<category><![CDATA[names]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/2010/09/08/changing-trend-u-s-immigrants-sticking-to-their-name-is-custom/</guid>
		<description><![CDATA[“New Life in U.S. No Longer Means New Name” That’s the title of an article published in The New York Times this week. In short it shows evidence of a declining need to fit in with Western standards. “For the most part, nobody changes to American names any more at all,” said Cheryl R. David, [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1477" title="steinway" src="http://datavaluetalk.com/cms/wp-content/uploads/2010/09/steinway-150x150.jpg" alt="steinway" width="150" height="150" /><br />
“New Life in U.S. No Longer Means New Name”<br />
That’s the title of an article published in The New York Times this week. In short it shows evidence of a declining need to fit in with Western standards.<br />
“For the most part, nobody changes to American names any more at all,” said Cheryl R. David, former chairwoman of the New York chapter of the American Immigration.<br />
(Source: The New York Times)<br />
Mr. Steinway (the famous German-born piano maker who abandoned the name Steinweg in pursuit of economic success) is a perfect example of the 19th and 20th century convention of immigrants adopting Anglicized names.<br />
What used to be needed to blend in and speed assimilation is no longer required. Economic powers are changing, as shown in this article in The Financial Times: “Indian economy shows 8.8% growth.” The world’s population is moving around more than ever, settling temporarily or permanently in other regions and countries.<br />
So what does this mean for people in the <a title="data quality" href="http://www.humaninference.com" target="_blank">data quality</a> playing field?<span id="more-1475"></span><br />
Most people used to live and die in the same country, but trends are changing fast. It already used to be hard to comprehend and spell the names of people in your neighbouring countries, but mastering just that is no longer enough. Italians for example used to live in their own neighbourhood in Boston, just like the Chinese population did. Nowadays you can find anyone living anywhere.<br />
A call centre agent in Austin, Texas is probably familiar with Mexican names, but how about the name Muthukumara? And would you know if Jyoti Thakur is male or female? Well, Jyoti does, obviously, and what’s more: she expects you to know the same.<br />
The world might be changing, but the personal wish of each of us to be seen for who we are stays the same. You might say that who we are is reflected in our name. This need for individuality is ongoing and will probably even increase, as will our urge to settle elsewhere.<br />
It is time for marketers and organizations around the world to make sure that they can treat each (potential) customer as if they were their neighbour. With Anglicized name or without. Luckily there is help out there.</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/changing-trend-u-s-immigrants-sticking-to-their-name-is-custom/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Your name is too &#8220;common&#8221;&#8230;.</title>
		<link>http://datavaluetalk.com/data-governance/your-name-is-too-common/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=your-name-is-too-common</link>
		<comments>http://datavaluetalk.com/data-governance/your-name-is-too-common/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 13:14:24 +0000</pubDate>
		<dc:creator>Holger Wandt</dc:creator>
				<category><![CDATA[Data Governance]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Banks]]></category>
		<category><![CDATA[Chinese characters]]></category>
		<category><![CDATA[customer view]]></category>
		<category><![CDATA[deduplication]]></category>
		<category><![CDATA[interpretation]]></category>
		<category><![CDATA[knowledge]]></category>
		<category><![CDATA[single customer view]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=1207</guid>
		<description><![CDATA[A major bank in Dongguan (China) refused a potential customer because his name is Li Jun. Apparently, there were already over 300 bank accounts assigned to the name Li Jun. Not that this particular Li Jun was responsible for opening all these accounts; there were simply too many men with exactly the same name. The [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-thumbnail wp-image-1209" title="chinese-characters" src="http://datavaluetalk.com/cms/wp-content/uploads/2009/09/chinese-characters-150x150.jpg" alt="chinese-characters" width="150" height="150" /></p>
<p>A major bank in Dongguan (China) refused a potential customer because his name is Li Jun. Apparently, there were already over 300 bank accounts assigned to the name Li Jun. Not that this particular Li Jun was responsible for opening all these accounts; there were simply too many men with exactly the same name. The bank states that the refusal is nothing personal, since nobody with the name Li Jun will be accepted as a customer in the near future&#8230; In the meantime, Li Jun is taking legal action against the bank.<span id="more-1207"></span></p>
<p>When I read this news article this morning, my first thought was that it was perhaps a hoax. It turns out, however, that the story is true. From a data quality point of view this strikes me as really strange. How does this particular bank manage its customer data? Are there no additional identifiers (address, date of birth, etc.) to determine that you are actually dealing with the customer you think you are dealing with? Imagine that every John Smith had a hard time opening a bank account, applying for a job or buying a product via the web. Or Jenny Jones? Bob Johnson? When is a name too &#8220;common&#8221;? It is a common misbelief that the complexity of ideographic characters, such as those used to write Mandarin Chinese, makes names harder to identify. At Human Inference we carried out some pretty serious dedups of Chinese files and, taking into account that Mandarin Chinese is a tonal language and that other principles of fault tolerance apply, the duplicate identification was rather accurate.</p>
<p>It is all a matter of using an intelligent <a title="data matching" href="http://www.humaninference.com/products/data-matching" target="_blank">data matching</a> method and knowing what kind of data one is working on. Every name can be identified; even &#8220;common&#8221; names.</p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-governance/your-name-is-too-common/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A question of quality?</title>
		<link>http://datavaluetalk.com/data-quality/a-question-of-quality/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=a-question-of-quality</link>
		<comments>http://datavaluetalk.com/data-quality/a-question-of-quality/#comments</comments>
		<pubDate>Fri, 13 Mar 2009 08:35:33 +0000</pubDate>
		<dc:creator>Holger Wandt</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[definition]]></category>
		<category><![CDATA[definition data quality]]></category>
		<category><![CDATA[definition quality]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=813</guid>
		<description><![CDATA[Yesterday I gave a lecture for management information systems (MIS) students. We were looking into definitions of data quality linked to the natural language processing approach of the Human Inference software. As discussions developed, the students could not easily agree on criteria for quality in general. In an exercise, we talked about &#8220;good&#8221; and &#8220;bad&#8221; [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday I gave a lecture for management information systems (MIS) students. We were looking into definitions of <a title="data quality" href="http://www.humaninference.com" target="_blank">data quality</a> linked to the natural language processing approach of the Human Inference software. As discussions developed, the students could not easily agree on criteria for quality in general. In an exercise, we talked about &#8220;good&#8221; and &#8220;bad&#8221; service. It appeared that, besides differences in taste, good service had a lot to do with expectation and the fulfillment of that expectation. Of course, there were also a lot of other &#8220;requirements&#8221; for good service, but the discussion made me think of a YouTube movie I had recently seen. Seeing this movie made the jump to a solid and generic data quality definition easy: data has quality if it satisfies the requirements of its intended use&#8230; Enjoy the movie!<br />
<object width="425" height="344" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/XvX7ovvf-LI&amp;hl=nl&amp;fs=1" /><param name="allowfullscreen" value="true" /><embed width="425" height="344" type="application/x-shockwave-flash" src="http://www.youtube.com/v/XvX7ovvf-LI&amp;hl=nl&amp;fs=1" allowFullScreen="true" allowscriptaccess="always" allowfullscreen="true" /></object></p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/a-question-of-quality/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data Quality in Outlook?</title>
		<link>http://datavaluetalk.com/data-quality/data-quality-in-outlook/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=data-quality-in-outlook</link>
		<comments>http://datavaluetalk.com/data-quality/data-quality-in-outlook/#comments</comments>
		<pubDate>Tue, 18 Nov 2008 15:33:13 +0000</pubDate>
		<dc:creator>Admin</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[contact cleanse]]></category>
		<category><![CDATA[Data Quality on Demand]]></category>
		<category><![CDATA[outlook]]></category>

		<guid isPermaLink="false">http://datavaluetalk.com/?p=177</guid>
		<description><![CDATA[Microsoft Outlook must be the most used CRM application in the world, be it on the desktop or on a smartphone. A common problem with Outlook contacts is that information is often incomplete, incorrect or badly formatted. Telephone numbers in particular are often formatted in such a way that they won&#8217;t be accepted by your [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://datavaluetalk.com/2008/11/18/data-quality-in-outlook/"><img class="  alignleft" style="border: 0px initial initial;" title="Contact Cleanse Logo" src="http://www.contactcleanse.com/wp-content/themes/corporate/images/people.jpg" alt="HIquality Contact Cleanse" width="124" height="91" /></a></p>
<p>Microsoft Outlook must be the most used CRM application in the world, be it on the desktop or on a smartphone. A common problem with Outlook contacts is that information is often incomplete, incorrect or badly formatted. Telephone numbers in particular are often formatted in such a way that they won&#8217;t be accepted by your mobile phone. </p>
<p>A new service launched by Human Inference remedies this problem. The service, called HIquality <a title="HIquality Contact Cleanse" href="http://www.contactcleanse.com/">Contact Cleanse</a>, allows users to simply email a vCard to <a href="mailto:contactcleanse@humaninference.com">contactcleanse@humaninference.com</a> or transmit the contact from a <a href="http://www.contactcleanse.com/how-to-smartphone">Windows Smartphone using a downloadable application</a>. The Contact Cleanse service then simply responds by email with the cleansed vCard as an attachment. Give it a try!</p>
<p><span id="more-177"></span></p>
<p>A demo of Contact Cleanse from Outlook can be viewed below.<br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/NHog36HosTY&amp;hl=nl&amp;fs=1&amp;ap=%2526fmt%3D18" /><embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/NHog36HosTY&amp;hl=nl&amp;fs=1&amp;ap=%2526fmt%3D18" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/data-quality-in-outlook/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What your data tells you about your processes</title>
		<link>http://datavaluetalk.com/data-quality/what-your-data-tells-you-about-your-processes/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=what-your-data-tells-you-about-your-processes</link>
		<comments>http://datavaluetalk.com/data-quality/what-your-data-tells-you-about-your-processes/#comments</comments>
		<pubDate>Thu, 16 Oct 2008 12:41:58 +0000</pubDate>
		<dc:creator>Ralph de Graaf</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[11-proof]]></category>
		<category><![CDATA[data processes]]></category>
		<category><![CDATA[datakwaliteit]]></category>
		<category><![CDATA[dataprocessen]]></category>
		<category><![CDATA[modulo 11 check digit]]></category>

		<guid isPermaLink="false">http://datavaluetalk.wordpress.com/?p=116</guid>
		<description><![CDATA[Do you know exactly what the data processes in your company look like? Many people have a general overview of the processes. Others have a detailed view of some of these processes. Few people know all data processes in detail. Looking at the data can give you deeper insight into your data processes. Not only [...]]]></description>
			<content:encoded><![CDATA[<div><span>Do you know exactly what the data processes in your company look like? Many people have a general overview of the processes. Others have a detailed view of some of these processes. Few people know all data processes in detail.</span></div>
<div id="attachment_75" class="wp-caption aligncenter" style="width: 310px"><a href="http://datavaluetalk.files.wordpress.com/2008/10/survey-challenge-dq.jpg"><img class="size-medium wp-image-75" title="survey-challenge-dq" src="http://datavaluetalk.files.wordpress.com/2008/10/survey-challenge-dq.jpg?w=300" alt="HI Survey Results" width="300" height="188" /></a><p class="wp-caption-text">HI Survey Results</p></div>
<p>Looking at the data can give you deeper insight into your data processes. Not only the processes in theory, but in practice too. How? By focusing on the exceptions!<span id="more-17"></span></p>
<p>For example, do you think the field &#8216;social security number&#8217; is reliable because it is mandatory and uses a <a href="http://www.pgrocer.net/Cis51/mod11.html">modulo 11 check digit</a>? Maybe this field is filled with a fake value like &#8216;111222333&#8217;, or with the social security number of the agent who entered the data. A frequency check on the values in this field will identify values that occur more frequently than others.</p>
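<p>A minimal sketch of such a frequency check, in Python. It assumes the nine-digit Dutch-style 11-test (weights 9 down to 2, then -1, with the weighted sum divisible by 11); the linked page describes the general modulo 11 idea, and your own scheme may use different weights. The function names, the threshold and the sample values are hypothetical.</p>

```python
from collections import Counter


def passes_eleven_test(number):
    """Assumed variant: weights 9..2 for the first eight digits, -1 for the last."""
    if len(number) != 9 or not number.isdigit():
        return False
    weights = [9, 8, 7, 6, 5, 4, 3, 2, -1]
    total = sum(w * int(d) for w, d in zip(weights, number))
    return total % 11 == 0


def frequent_values(values, min_count=3):
    """Return (value, count) pairs occurring at least min_count times, most frequent first."""
    counts = Counter(values)
    return [(v, n) for v, n in counts.most_common() if n >= min_count]


# Hypothetical field contents: one placeholder entered again and again.
sample = ["111222333", "123456782", "111222333", "111222333", "111222334"]

for value, count in frequent_values(sample):
    # A value can be formally valid and still be a placeholder.
    print(value, count, passes_eleven_test(value))
```

<p>Under this assumed weighting the placeholder 111222333 passes the check digit test, so the test alone will not expose it; the repeated occurrences are what give it away.</p>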
<p>Why do people cheat like that? One reason may be that staff at the front desk want to help their customer, or sell something, and the customer doesn&#8217;t know their social security number at that moment.</p>
<p>The same applies to all other fields. Address fields are mandatory, but the customer doesn&#8217;t want to give their address? Fill in the address of the office. Customer deceased? Put the text (deceased) after their surname. Ex-directory phone number? Use the email field for entering the text &#8216;ex-directory&#8217;.</p>
<p><strong>Get to know your processes by analyzing your data!</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://datavaluetalk.com/data-quality/what-your-data-tells-you-about-your-processes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

