Weird subject, isn’t it? It is obvious to everybody that ‘Ask Me’ and ‘Any Body’ are artificial names that will never belong to a real person. How they relate to ‘Walter von Stolzing’ will follow.
For over 25 years Human Inference has collected reference data, for instance on persons. Because of this reference set we immediately recognize that ‘Ask Me’ and ‘Any Body’ are fake names. People use them either in test situations or to hide their actual names.
In the old days we only needed to test for ‘Test Test’; in more recent years we have seen great inventiveness in these fake names. A brief example can be seen in the following list.
If you cannot rely on reference data and interpretation, you need to provide a check list. Providing it is one thing, but since users tend to be really creative, maintaining it is essential. Continue reading ‘Ask Me is linked with Any Body and relates with Walther Von Stolzing’
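The check-list approach can be sketched in a few lines of Python. This is only a minimal illustration, not how Human Inference's reference data works: the set `PLACEHOLDER_NAMES` and the helpers `normalize` and `is_placeholder_name` are hypothetical names, and the list holds just the three fake names mentioned in this post.

```python
import re

# Hypothetical check list, seeded only with the fake names mentioned in the post.
# In practice this list has to be maintained as users invent new ones.
PLACEHOLDER_NAMES = {"test test", "ask me", "any body"}


def normalize(name: str) -> str:
    """Lowercase the name, turn punctuation into spaces and collapse whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", " ", name.casefold())
    return " ".join(without_punctuation.split())


def is_placeholder_name(name: str, check_list=PLACEHOLDER_NAMES) -> bool:
    """Return True when the normalized name appears on the check list."""
    return normalize(name) in check_list


if __name__ == "__main__":
    for candidate in ["Ask Me", "  TEST   test ", "Any-Body", "Maria Jansen"]:
        print(candidate, "->", is_placeholder_name(candidate))
```

A plain lookup like this only catches names that are already on the list, which is exactly why maintaining the list matters.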
In late 2009, in their report Who’s Who in Open-Source Data Quality, Andreas Bitterer and Ted Friedman of Gartner already pointed to DataCleaner as a promising tool, one that, in their opinion, could certainly improve by offering more high-end cleansing functions and by improving the rather basic user experience.
Since then, a lot has happened in the DataCleaner space and in the profiling market. Before the launch of version 2 we announced the acquisition of eobjects.org, or DataCleaner, by Human Inference. Some of you may have been curious about what would happen to the functionality; as we stated at the time, we would continue with the community and further participate in and expand it. Under the flag of Human Inference we launched the renewed DataCleaner 2.0, in which we definitely improved the customer experience with an enhanced user interface and the possibility to define filters or filter flows. Filter flows show their benefit when you analyze a data source and want to create new (temporary) data sources based on matching criteria. You can do that either manually or in a completely automated way to monitor your data.
With Open Source in general, and with DataCleaner in particular, we want the community to participate in the functionality of the product. DataCleaner has long included the RegexSwap, a community where you can share regular expressions. Why should everybody reinvent the wheel to build a regular expression for credit card checks, email addresses, etc.?
Besides regular expressions that can be used to profile data, there is a need for data cleansing functions that contain much more business logic than can be covered in a regular expression. For example, validating that the syntax of an email address is correct is something different from validating that a running mail server is attached to the domain. Cleansing functions are already part of DataCleaner, but there is always a need for other or more advanced functional extensions. So that you do not have to build them the ‘DataCleaner’ way yourself, we have created an easy mechanism for sharing extensions. Continue reading ‘DataCleaner adds expert cleansing functions- added value in Open Source’
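The difference between a syntax check and a mail server check can be illustrated with a small Python sketch. It is not DataCleaner code and does not use the DataCleaner extension API: the pattern `EMAIL_SYNTAX` is a deliberately simple assumption, and `lookup_mx` is a hypothetical callable that a real implementation would back with a DNS lookup. Here it is replaced by an in-memory table so the example runs without network access.

```python
import re
from typing import Callable, List

# Deliberately simple pattern (an assumption, not a full address grammar):
# something@label.label with no spaces or extra '@' signs.
EMAIL_SYNTAX = re.compile(r"[^@\s]+@([^@\s.]+\.)+[^@\s.]+")


def has_valid_syntax(address: str) -> bool:
    """The part a regular expression can cover: the shape of the address."""
    return EMAIL_SYNTAX.fullmatch(address) is not None


def has_mail_server(address: str, lookup_mx: Callable[[str], List[str]]) -> bool:
    """The part a regular expression cannot cover: is a mail server known for the domain?

    lookup_mx is a hypothetical callable returning the mail hosts for a domain.
    """
    domain = address.rsplit("@", 1)[-1].lower()
    return bool(lookup_mx(domain))


def check_email(address: str, lookup_mx: Callable[[str], List[str]]) -> str:
    """Run the cheap syntax check first, then the more expensive server check."""
    if not has_valid_syntax(address):
        return "invalid syntax"
    if not has_mail_server(address, lookup_mx):
        return "no mail server"
    return "ok"


# Stand-in for a DNS lookup so the sketch is self-contained.
FAKE_MX_TABLE = {"example.com": ["mx.example.com"]}


def fake_lookup_mx(domain: str) -> List[str]:
    return FAKE_MX_TABLE.get(domain, [])


if __name__ == "__main__":
    for address in ["info@example.com", "info@nowhere.invalid", "not-an-email"]:
        print(address, "->", check_email(address, fake_lookup_mx))
```

Passing the lookup in as a parameter keeps the business logic testable and leaves the choice of DNS library open.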