Managing your data well is crucial to any business in this day and age. The topic has become increasingly important to corporations as more people recognize how good data can help companies make smart decisions. In fact, I have seen it popping up on more and more conference panels in the past few years, including one I was privileged to take part in last October at the Perrin Asbestos conference. But what about data management is so important? And if it’s so easy, why don’t all companies do it? Well, the answer is that it can be easy if you know what pitfalls to look out for and if you do it correctly from the start.
What is bad data?
First, bad data can be wrong data. If the data you capture isn’t correct, it isn’t useful, and it can actually be harmful, because wrong data can lead a company to make poor decisions. Not all bad data is necessarily “wrong” data, though; it can also be data that isn’t usable. So, how do you ensure you aren’t collecting bad data?
For starters, you need a quality control process. Most of the time, data is manually entered into a system, and humans are not infallible. So having a way to review the data entry, even if you only review part of it, is better than no review at all. There are also automated checks that can help ensure the data makes sense without anyone having to review every piece of information. For example, when entering a person’s date of birth and date of death, a check that the person died after they were born is a pretty simple way to ensure that the data entered isn’t nonsensical.
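To make the date-of-birth example concrete, here is a minimal sketch of that kind of sanity check in Python. The function name `check_life_dates` and its messages are hypothetical, chosen only for illustration; a real system would likely run many such rules at entry time.

```python
from datetime import date
from typing import List, Optional


def check_life_dates(date_of_birth: date, date_of_death: Optional[date]) -> List[str]:
    """Return a list of problems found; an empty list means the dates make sense."""
    problems = []
    # A person cannot die before they were born. Dying on the day of birth
    # is possible, so only a strictly earlier date of death is flagged.
    if date_of_death is not None and date_of_death < date_of_birth:
        problems.append("date of death is before date of birth")
    return problems


# A plausible record passes; a transposed pair of dates is flagged for review.
good = check_life_dates(date(1950, 1, 31), date(2010, 6, 15))
bad = check_life_dates(date(2010, 6, 15), date(1950, 1, 31))
print(good)
print(bad)
```

Returning a list rather than a single true/false value is just one design choice: it leaves room to add further rules later without changing how the result is consumed.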
Second, you need a structure for your data collection that will help you avoid data management pitfalls. Storing data properly is one of the most fundamental issues in data collection, and it is the first place that data collection can go awry. Why? Because data collection can be a pain to do right. Not everyone wants to take the time up front to get all those rules set up correctly, or you may not have the experience or foresight to know what you will need in the future. Therefore, when starting data collection, think about how your data might grow. If today you are collecting information about an injured party in a mass tort lawsuit, what other information might you need later? Will you need to capture information about who filed the lawsuit as well? What about the allegations the person has made? Are there any additional statistics to capture about exposures related to your company’s product? These are just a few questions to consider when you start capturing information, so that your dataset is scalable for the future.
Related to that, picking the correct medium to house your data is very important as well. While the dataset is small, spreadsheets like MS Excel can be very useful, but this type of storage will show its shortcomings as your data grows, as the relationships within your data multiply and as more users need access to your information. Databases were built for just this type of management, manipulation and storage. They can have a very positive effect on a company’s ability to store and maintain its data, and eventually to report on it too, but that will be covered in another blog another time. Another nice aspect of using a database is that it can protect you from collecting bad data in three ways: by enforcing data types on entry for certain fields, by enforcing referential integrity between data in linked tables, and by eliminating duplicate data, since you no longer need to repeat a “row” for each person whenever another field needs multiple values.
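As a rough illustration of those three protections, here is a small sketch using Python’s built-in `sqlite3` module with an in-memory database. The table and column names (`injured_party`, `exposure`, `site_name`, `start_year`) and the sample values are hypothetical, not taken from any real system. One caveat: SQLite is more relaxed about column types than most databases, so this sketch uses `CHECK` constraints to stand in for the type enforcement other database engines apply automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE injured_party (
    party_id      INTEGER PRIMARY KEY,
    full_name     TEXT NOT NULL,
    date_of_birth TEXT NOT NULL,   -- ISO 8601 text, e.g. '1950-01-31'
    date_of_death TEXT,
    CHECK (date_of_death IS NULL OR date_of_death >= date_of_birth)
);
-- One row per exposure, linked back to the person, so the person's
-- details are stored once instead of being repeated on every row.
CREATE TABLE exposure (
    exposure_id INTEGER PRIMARY KEY,
    party_id    INTEGER NOT NULL REFERENCES injured_party(party_id),
    site_name   TEXT NOT NULL,
    start_year  INTEGER NOT NULL CHECK (typeof(start_year) = 'integer')
);
""")


def try_insert(sql, params):
    """Run one insert; return None on success or the database's error message."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


party_sql = ("INSERT INTO injured_party (party_id, full_name, date_of_birth, "
             "date_of_death) VALUES (?, ?, ?, ?)")
exposure_sql = "INSERT INTO exposure (party_id, site_name, start_year) VALUES (?, ?, ?)"

# One person with two exposures: the person is stored exactly once.
person_ok = try_insert(party_sql, (1, "Jane Example", "1950-01-31", "2010-06-15"))
first_ok = try_insert(exposure_sql, (1, "Plant A", 1972))
second_ok = try_insert(exposure_sql, (1, "Plant B", 1980))

# Each of these is rejected by the database itself.
orphan = try_insert(exposure_sql, (99, "Plant C", 1975))         # no such person
bad_year = try_insert(exposure_sql, (1, "Plant D", "unknown"))   # not a number
bad_dates = try_insert(party_sql, (2, "John Example", "1960-05-01", "1950-05-01"))

print(orphan, bad_year, bad_dates, sep="\n")
```

The point of the sketch is that the rules live in the database rather than in the habits of whoever is typing, so every entry path is held to the same standard.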
Overall, data is a very powerful tool that all companies should harness. Taking a practiced approach to setting up your data collection correctly will save a lot of time later, because you won’t have to redo the work. In the next blog post, I will dive a little more into some additional tools that can help you set up your data collection correctly from the beginning, and hopefully avoid both the extra expense of rework and the collection of “bad data”.
Carrie Scott is KCIC’s technology lead, both in operations/infrastructure and for development. “I work with a talented group of people to make sure our technology stays innovative and top of the line to support our client’s needs,” she says. “I also focus on the Consulting side of our practice, leading many clients through their day-to-day and long-term strategic goals.”