April 03, 2015

The Perils of Dirty Data in Healthcare

Perhaps in no other industry is dirty data more of a risk than in healthcare. Dirty data can cost big bucks, or worse, lead to liability charges.

Jamie Bolseth

“Dirty data” isn’t a good thing in any field, but, for obvious reasons, it can be especially dangerous in healthcare. In a field where data provides information that could mean the difference between life and death, one would hope data is checked and double-checked. Dirty data can be a huge problem, but before we go further, we have to answer one question: what is “dirty data”?

Often, inaccurate data is the result of poor communication and human error. Data management departments also blame poor data strategies that end up producing erroneous data. Insufficient budgets that don’t allow data to be collected completely and properly can play a major role as well.

Although many organizations claim they have proper systems in place to check accuracy, fewer than half of them use any specialized software to help them do so. The situation worsens when we look at how data is cleaned up and secured after collection. The resulting inaccuracies, corruptions, and errors all contribute to what is commonly referred to as “dirty data.”

Many organizations still use manual checks as their primary way of ensuring their data’s accuracy. Considering that one of the major causes of dirty data is human error, this is obviously a bad move.

We can’t be sure that we will ever be able to clean up dirty data entirely. Chances are there will always be some human error, or other problems with the way data is collected, that result in less than 100 percent accurate findings. This brings up other questions: What types of dirty data are common to the healthcare industry? What steps are being taken to improve collection methods? What is the cost of dirty data?
Duplicate Medical Record Numbers, assigned to more than one patient, are one of the most common forms of dirty data. Besides duplicates, we also consider overlaps, overlays, and invalid, erroneous, and default data to be dirty data.

We can’t pinpoint one reason for duplicate records. We do know why some of them are created, though, including:

- Missing data for a specific patient.
- Misspellings, typos, or incorrect data entered.
- Use of nicknames.
- Identity fraud.
- Duplicate patient data resulting from entering different key data at separate care locations.
- No data validation controls triggered by entering redundant data at the same location.

The last three are often caused by incorrectly entering the same data for different people or having multiple records for a single person. Such erroneous data can adversely affect the transaction processing of a patient. It surely affects the quality of care, and may even result in serious health issues for the patient.

To improve collection methods, healthcare professionals and institutions need to make efforts toward:

- Detecting and eliminating all invalid data.
- Including all patient-related data in records, irrespective of its validity.
- Estimating correct values.

While it is easier to implement the first two options, estimating correct values is understandably challenging for many organizations. Keep in mind that this option changes the source data, after which there is no way to identify the actual value. It is a crucial requirement, however, in order to provide proper healthcare. Remember, just like any business process, healthcare cannot be 100 percent accurate without 100 percent accurate source data.

Keep in mind, too, that the cost of data integration increases when data quality measurements are added to the process. No matter what cleaner data costs, we still need to remember that the cost of dirty data is significantly greater.
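To make the duplicate checks discussed above a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular records system works: the record layout (MRN, name, date of birth), the sample patients, and the function names are all hypothetical. It flags two situations: one MRN attached to more than one patient, and one patient holding more than one MRN.

```python
from collections import defaultdict

# Hypothetical record layout for illustration: (mrn, full_name, date_of_birth).


def normalize_name(name):
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(name.lower().split())


def find_mrn_conflicts(records):
    """Return MRNs that are attached to more than one distinct patient."""
    patients_by_mrn = defaultdict(set)
    for mrn, name, dob in records:
        patients_by_mrn[mrn].add((normalize_name(name), dob))
    return {
        mrn: sorted(patients)
        for mrn, patients in patients_by_mrn.items()
        if len(patients) > 1
    }


def find_possible_duplicates(records):
    """Return patients (same normalized name and birth date) with several MRNs."""
    mrns_by_patient = defaultdict(set)
    for mrn, name, dob in records:
        mrns_by_patient[(normalize_name(name), dob)].add(mrn)
    return {
        patient: sorted(mrns)
        for patient, mrns in mrns_by_patient.items()
        if len(mrns) > 1
    }


# Made-up sample data, not real patients.
records = [
    ("1001", "Maria Lopez", "1980-02-14"),
    ("1001", "John Smith", "1975-09-30"),  # same MRN, different patient
    ("2002", "Ann  Lee", "1990-06-01"),
    ("2003", "ann lee", "1990-06-01"),     # same patient, second MRN
    ("3004", "Omar Haddad", "1968-11-22"),
]

print(find_mrn_conflicts(records))
print(find_possible_duplicates(records))
```

An exact match on a normalized name and birth date will not catch nicknames, misspellings, or identity fraud. Those cases would need fuzzier matching and, most likely, human review before any records are merged.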
Duplicate or erroneous data can cost an institution anywhere from twenty to a few hundred dollars per piece of data. Institutions facing significant liability exposure, whether because an adverse event took place or because the quality of care was compromised by the institution’s inability to find a patient’s proper record, will tell you that making adjustments to ensure accurate data collection costs far less than ignoring the issue.

In conclusion, data cleansing is a must for all healthcare institutions. Control measures need to be implemented for all source data. Proper record keeping and data management will help to curb the perils of dirty data, if not eliminate them altogether.

If you would like to learn more about this and other digital health topics, register for MobCon today.