Health | August 16, 2017

Pregnant men and smoking babies? 4 reasons dirty data destroys healthcare analytics

A goldmine of data exists across the healthcare continuum. Hospitals and health systems, by and large, recognize the opportunity. Many just don’t know how to overcome the complexities of data management to make it work for them.

Consider the challenges: Staggering amounts of patient information are regularly accumulated from disparate systems, many with their own terminology frameworks. Even when data is converted to industry standards like SNOMED CT or LOINC, many healthcare organizations find that the tools they use produce duplicate records or were not set up to capture data correctly.

Addressing healthcare data challenges

Given the complexity and variety of information systems needed to manage a healthcare enterprise, it’s understandable that data degenerates over time. Unfortunately, faulty data produces inaccurate analytics and negative downstream effects on everything from patient care and regulatory reporting to the revenue cycle and the bottom line.

Amid so many competing priorities in healthcare today, many executives may be tempted to put a band-aid on issues rather than dedicate the resources needed to lay the right foundation for accurate, clean, and complete data. That response only leads to bigger problems down the road.

Whether you’re using analytics to inform patient care directly or to supplement care with innovative care management plans, quality analytics requires the cleanest data possible. Consider the following four reasons why dirty data must be eliminated.

1. False assumptions skew healthcare delivery and quality

How often have you received an inaccurate, poorly targeted ad from a social media site? The experience is ubiquitous in today’s marketing climate, and it results from poorly managed data. An algorithm draws on faulty information, or dirty data, and makes assumptions about your interests and buying habits.

The same problems with dirty data exist in healthcare, but the stakes are much higher. If the assumptions about a patient’s health or regional health trends are based on bad information, the consequences directly impact a patient’s outcomes or the health of a community.
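To make the headline's examples concrete, here is a minimal Python sketch of a plausibility check that flags records like a pregnant male patient or an infant smoker. The `PatientRecord` fields, the `"M"`/`"F"` coding, and the age thresholds are all hypothetical, chosen only for illustration; they do not reflect any particular system or product. Flags are meant to route a record to human review, not to correct it automatically.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical, simplified patient record used only for illustration."""
    patient_id: str
    sex: str          # assumed coding: "M" or "F"
    age_years: int
    pregnant: bool
    smoker: bool


def plausibility_flags(rec: PatientRecord) -> list[str]:
    """Return human-readable flags for combinations that deserve review."""
    flags = []
    if rec.pregnant and rec.sex == "M":
        flags.append("pregnancy recorded for male patient")
    if rec.smoker and rec.age_years < 1:
        flags.append("smoking status recorded for infant")
    if not 0 <= rec.age_years <= 120:  # illustrative bounds, not a clinical rule
        flags.append("age outside plausible range")
    return flags


if __name__ == "__main__":
    suspect = PatientRecord("p1", "M", 34, pregnant=True, smoker=False)
    print(plausibility_flags(suspect))
```

A flag here usually points to a capture or mapping problem upstream, so the useful follow-up is to trace where the value came from rather than to overwrite it.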

2. The clearer the picture, the better the care

Current health information exchange movements aim primarily to bring together a more complete patient picture and to enable greater stakeholder collaboration around that information. Providers who have access to a more accurate and complete patient picture are much better positioned to make informed, optimal care decisions.

Inaccurate, dirty data can cost lives, and it can also cost you money. You risk leaving money on the table through incorrect diagnosis-related group (DRG) shifts or incomplete quality management reporting that affects Hierarchical Condition Categories (HCCs) and, ultimately, your Risk Adjustment Factor (RAF) scores.

3. Knowledge of trends is only as good as the data

One of the big promises of healthcare analytics is that it can allow the broad recognition of health trends in a given region, or across a community. If data is sullied with inaccuracies, health trends can be overstated or understated, undermining providers’ attempts to react to those trends and provide appropriate care.
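As a small worked example of how a trend gets overstated, the Python sketch below computes the share of records carrying a condition before and after removing duplicate records. The data is invented, and the sketch assumes duplicates already share a `patient_id`; in practice, recognizing that two records describe the same patient is often the hard part.

```python
def prevalence(records, condition):
    """Share of records that carry the given condition."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if condition in r["conditions"])
    return hits / len(records)


def deduplicate(records):
    """Keep the first record seen for each patient_id (assumed reliable here)."""
    first_seen = {}
    for r in records:
        first_seen.setdefault(r["patient_id"], r)
    return list(first_seen.values())


# Invented data: patient A was loaded three times from different systems.
records = [
    {"patient_id": "A", "conditions": {"diabetes"}},
    {"patient_id": "A", "conditions": {"diabetes"}},
    {"patient_id": "A", "conditions": {"diabetes"}},
    {"patient_id": "B", "conditions": set()},
    {"patient_id": "C", "conditions": set()},
]

raw_rate = prevalence(records, "diabetes")                 # 3 of 5 records
clean_rate = prevalence(deduplicate(records), "diabetes")  # 1 of 3 patients

if __name__ == "__main__":
    print(f"raw: {raw_rate:.2f}, deduplicated: {clean_rate:.2f}")
```

With the duplicates in place, the condition appears in 60 percent of records; counted per patient, it is one in three. The same mechanism can understate a trend when the duplicated patients do not have the condition.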

The ability of enterprises to break out metrics by zip code, or even neighborhood, provides a wellspring of opportunity for identifying and treating public health concerns and connecting the dots between diseases and broader social factors. But when dirty data reigns, real trends go unidentified and the true innovative potential of health IT remains a pipe dream.

4. Faulty healthcare data means inaccurate analytics and negative downstream impacts

The old saying “what you put in is what you get out” is certainly accurate in healthcare analytics. To truly leverage the potential of big data, disparate data must be clean. What errors have you seen in your healthcare data, and how have you tried to clean the data?

At Health Language, we make optimal health data management our business. Speak to an expert to learn how the right infrastructure can help you improve the quality of your data, and achieve robust, accurate analytics that represent the future of healthcare.

LOINC® is a registered trademark of Regenstrief Institute, Inc.

Brian Diaz
Director of Strategy and Business Development, Clinical Surveillance and Compliance

As the Director of Strategy and Business Development, Clinical Surveillance and Compliance, Brian is responsible for business strategy, research, and business development.
