What do YOU think of the quality of your data?

The quantitative assessment of data quality holds many challenges. Definitions such as

“The effect of the information system is determined by the quality of the information system multiplied by the degree of acceptance of the user organization, E = Q * A”

or

“Quality is defined as the degree to which a product meets the requirements of clients”

or

“Defining quality means destroying quality”

are merely intuitive statements that express, fully or in part, the different points of view of data users. In most cases, neither the purpose for which the data are used nor the underlying requirements the data must meet are clearly defined. A clear definition is a prerequisite for the assessment of data quality. Requirements should be formulated so that the compliance of the data with them can be verified objectively and automatically.

Furthermore, individual opinions and expectations play an important role in data quality perception. Whereas the data users want their requirements to be met, the data suppliers attempt to comply with specifications. These preconceived points of view are often the initial reason for dissatisfaction and data quality disappointment.

Statements such as “We make no data-entry mistakes, therefore our data quality is 100%” or “The data we receive from department X is absolutely useless” are illustrations of such individual opinions.

If measurement or assessment of data quality in an organization is to be a genuine tool of control, it must be defined and executed in a reliable, understandable and reproducible manner. Data quality is a continuous process and therefore the management of data quality (i.e. the decisions deriving from the measurement of data) is a continuous process as well.

Business rules are usually formulated by data specialists (list managers, for example) who assess the data under different and changing circumstances. Here lies the key to usable and reproducible follow-up. Rules on completeness, accuracy, format, range, and frequency can be used to generate numeric measurement results and, if desired, error lists.
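As an illustration of how such rules can produce both numeric results and error lists, here is a minimal sketch in Python. The sample records, the field names (`email`, `age`), the rule names and the 0 to 120 age range are all assumptions made up for this example; real rules would come from the data specialists themselves.

```python
import re

# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"id": 1, "email": "ann@example.com", "age": 34},
    {"id": 2, "email": "", "age": 29},
    {"id": 3, "email": "bob(at)example.com", "age": 151},
    {"id": 4, "email": "cy@example.org", "age": 47},
]

# Deliberately simple pattern: something@something.something, no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Each rule maps a name to a predicate that returns True when the record passes.
rules = {
    # Completeness: the field must be filled in.
    "email_complete": lambda r: bool(r.get("email")),
    # Format: only judged when a value is present, so an empty field
    # is counted once (under completeness) rather than twice.
    "email_format": lambda r: not r.get("email") or bool(EMAIL_RE.match(r["email"])),
    # Range: the bounds 0..120 are an assumed business requirement.
    "age_range": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
}

def run_rules(records, rules):
    """Return (scores, errors): pass rate per rule and a list of (record id, rule name) failures."""
    scores, errors = {}, []
    for name, check in rules.items():
        failed = [r["id"] for r in records if not check(r)]
        scores[name] = 1 - len(failed) / len(records) if records else None
        errors.extend((rid, name) for rid in failed)
    return scores, errors

scores, errors = run_rules(records, rules)
print(scores)
print(errors)
```

Because the rules are plain named predicates, the same set can be rerun on every data delivery, which is what makes the measurement reproducible.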

The results from a pre-defined set of rules will function as indicative input for policy decisions. Aggregated reports on these results will permit evaluation on different hierarchical levels of an organization.
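Aggregation across hierarchical levels could look roughly like the following Python sketch. The division and department names and all counts are invented for the example, and the three levels (`department`, `division`, `organization`) are an assumed hierarchy, not one prescribed here.

```python
from collections import defaultdict

# Hypothetical rule results: (division, department, records_checked, records_passed).
results = [
    ("Sales", "Inside Sales", 200, 190),
    ("Sales", "Field Sales", 100, 80),
    ("Finance", "Accounting", 50, 50),
]

def pass_rates(results, level):
    """Aggregate pass rates at one level: 'department', 'division' or 'organization'."""
    totals = defaultdict(lambda: [0, 0])  # key -> [checked, passed]
    for division, department, checked, passed in results:
        key = {
            "department": (division, department),
            "division": division,
            "organization": "all",
        }[level]
        totals[key][0] += checked
        totals[key][1] += passed
    return {key: passed / checked for key, (checked, passed) in totals.items()}

for level in ("department", "division", "organization"):
    print(level, pass_rates(results, level))
```

The sketch sums record counts before dividing, rather than averaging the departmental percentages, so that a small department does not weigh as heavily as a large one in the higher-level figure.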

Decision themes will therefore cover a wide variety of topics. Below is a list (in no particular order) of some of the topics that can be used to specify an organizational data quality decision repository:

· Basic data quality (DQ) criteria

· Specific or situational DQ criteria

· Requirements for the DQ system

· Requirements for the DQ documentation

· Privacy dimensions

· DQ monitoring criteria

· Satisfaction of the participants in the DQ process

· Data access requirements

· Data access permissions

· User profiles

· Interface requirements

· Correction measures

· Prevention measures

· Risk management

· Economic issues (for example ROI on DQ)

· Data cleansing measures

· Data enrichment measures

· General data processing topics (storage and exchange)

· Education/training of employees

So the answer to the question “What do you think of your data?” should be a bit more considered than a generic “good”, “bad” or “ok”. Having a closer look might actually prove to be of value…

1 Response to “What do YOU think of the quality of your data?”


  • Several activities are under way on new standards for data quality. On March 6th 2009, news about the upcoming EIDIQ-Standard for Data Quality will be presented by Prof. Dr. Jens Lüssem at CeBIT Hannover, Hall 4, B64. His keynote starts at 2:30 PM at the Forum BI + BPM.
    This upcoming EIDIQ-Standard is intended to become a substantial contribution to ISO 8000. You are welcome to support the EIDIQ members. See http://www.eidiq.org
