
20% of Your Data Could Be Bad: Here’s How to Fix It

When you think about how much bad data can really cost you, it’s important to make sure that your info is deliverable, callable, and emailable.

Okay, so, you’ve got a lot of data—and who doesn’t, these days?

How can you be sure that the data you’re using is high quality without spending a ton of resources?

Bud Walker, the Vice President of Strategy at Melissa Data, appears on The Corporate Data Show today to explain why focusing on active data quality can save time and make money (plus he gives tips on how to convince management that this is worth their attention).

He answers five essential questions about data quality that can transform the way you think about big data.

1. How can organizations avoid drinking data from a firehose?

Well, it’s not the data itself you want to avoid, Walker says. It’s bad or unhygienic data that is so damaging across the industry.

Lately, organizations have ramped up the amount of data they’re dealing with. Corporations continue to produce data and try to accumulate it from pretty much everywhere, which means a lot of big data initiatives are going crazy right now. While customer-related data is Melissa Data’s specialty, using data efficiently really comes down to quality, not quantity.


2. What are the most problematic errors in data quality right now?

Different kinds of businesses will have different problems, but the point of entry is a significant area where organizations should look for data quality errors.

For example, call centers and sales teams enter the wrong contact data into forms, customers mis-key data about themselves, and data aggregations purchased en masse are riddled with more holes than Swiss cheese.

And then sometimes, data just gets old. “Data decay is a gigantic problem: 44 million people move every year, so data is always going through attrition,” Walker points out. And the downstream effects of bad data quality are severe.

“Garbage in, garbage out,” Walker says, applies here. According to studies, between 7 and 20 percent of incoming data could be flawed. What Walker wants his clients to avoid is being sure they have the right data, only to find it doesn’t pan out that way.



3. What should organizations do about data that looks “right” but isn’t?

Not all data quality errors come from spelling or formatting, unfortunately. Data based on IP addresses or website ID products, for instance, can come prepopulated with errors, such as a correctly spelled but outdated record of a person’s previous job title, employer, and contact information.

With so many people moving or changing jobs (2 million Americans quit per month, according to Forbes), data has a definite half-life. Active data quality focuses on products that help uplift the data and move everyone forward by ensuring phone numbers still work and emails don’t bounce.

“Data quality” refers, of course, to the usability of the data, and “active data” refers to data you’re actually using.


4. How do you deal with validating a huge data file without a huge cost?

Okay, say you have a huge database, either one you’ve acquired from a merger or one you’ve bought from another company. You know you need to clean it up, but the problem is you won’t be drawing from every part of the file all the time.

Some of it is just going to sit there for a while, so you don’t want to invest in validating the whole thing yet, because by the time you get to some parts of it, the data may have aged again. The issue here is whether bad hygiene is acceptable if the data is not continually in use.

And the short answer is, it doesn’t have to be.

So if you’re actively campaigning to a section of a database, there are definitely low-cost ways to ensure the quality of the part that’s being actively used. Syntax correction and formatting can certainly be performed at low cost, but even beyond that, there are low-cost ways to determine whether addresses, phone numbers, and emails work, too.

You might be one of those SMBs that store data in Excel. A lot of sales reps work out of Excel, but a sales team of five is not going to go through tens of thousands of lines of data in a month. Happily, Walker has a plugin that lets you select rows or columns of data in Excel and run them through an active data campaign wizard to ensure it’s all deliverable, callable, and emailable, exactly when you need it.

Like combing your hair right before you walk out the door.
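To make the low-cost end of this concrete, here is a minimal Python sketch of syntax checking and formatting applied only to the slice of a file being campaigned. It is not Melissa Data’s plugin or method. The record layout (dicts with "email", "phone", and "segment" keys), the function names, and the US-style phone format are all assumptions made for illustration. A syntax check like this only says a value is well formed; it cannot tell you whether an email is actually deliverable or a phone number still works.

```python
import re

# Hypothetical record layout: each contact is a dict with
# "email", "phone", and "segment" keys (an assumption for this sketch).

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_phone(raw):
    """Return a 10-digit US number as NNN-NNN-NNNN, or None if it cannot be one."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the US country code
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def check_active_segment(records, segment):
    """Syntax-check only the records in the segment being campaigned."""
    results = []
    for rec in records:
        if rec.get("segment") != segment:
            continue  # leave inactive data alone until it is needed
        email = (rec.get("email") or "").strip().lower()
        results.append({
            "email": email,
            "email_ok": bool(EMAIL_RE.match(email)),
            "phone": normalize_phone(rec.get("phone")),
        })
    return results
```

Calling `check_active_segment(records, "q3")` would touch only the rows tagged with the hypothetical "q3" segment, which mirrors the idea of cleaning data right before you use it rather than validating the whole file up front.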





5. What are some tips for convincing management they want active data quality as much as we do?

The most common mistakes executives make with an active data management plan are biting off more than they can chew, creating a project that has no measurable deliverables, and pursuing corporate initiatives that are too broad. “Organizations should focus on one critical piece of data, and in small measured steps try to complete a series of objectives,” Walker says.

With 14 years of experience in the data quality business, Walker is rarely surprised by common mistakes. His take on the best way to convince executives that data quality is important is to demonstrate measurable progress. Do one small project and show the executives the results to win them over.

Take finite numbers and demonstrate how the project is progressing. (Be sure to point out that if they spend the money, they will see more results just like this.)
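One simple way to put a finite number on a small project is to track the share of flawed records in a single field before and after cleanup. The Python sketch below assumes you already have a true/false "flawed" flag per record from whatever check you run; the function names and the report keys are hypothetical, not something described in the interview.

```python
def flawed_rate(flags):
    """Share of records flagged as flawed, as a percentage."""
    flags = list(flags)
    if not flags:
        return 0.0
    return 100.0 * sum(1 for f in flags if f) / len(flags)


def progress_report(before_flags, after_flags):
    """Compare the flawed rate of one field before and after a cleanup pass."""
    before = flawed_rate(before_flags)
    after = flawed_rate(after_flags)
    return {
        "before_pct": before,
        "after_pct": after,
        "points_improved": before - after,
    }
```

A report like this for one critical field, repeated after each small step, gives executives a number they can follow from one objective to the next.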

Essentially, it’s important to demonstrate measurable progress that is connected to return on investment. Data quality is not just a nice thing to have; it’s also directly related to income.

When you think about how much bad data can really cost you, it’s important to make sure that your info is deliverable, callable, and emailable.

If you’ve got a lot of data—and who doesn’t, these days?—it becomes vital to spend the resources to ensure that the data you’re using is accurate.


This episode is based on an interview with Bud Walker from Melissa Data. To hear this episode, and many more like it, you can subscribe to The Corporate Data Show.