Mining the New Gold – the Smart Way
Data is everywhere – and today, it is everything, the new gold upon which fortunes are made and lost. And inside an organization’s collection of data – analytics, social media discussions, maps and images, and almost anything else – lies the raw material for the information that can unlock big profits for organizations.
By mining their data sources, companies can develop more precise messages and create products that give customers more value – making them more satisfied, and more likely to remain customers.
For example, an insurance company could find itself selling more policies if it matches car sale data with demographics, offering drivers of different ages and backgrounds the products most likely to appeal to them. The same principle applies to any other business where data is collected about people, products, prices, places, or anything else; the more an organization knows about its customers and markets, the better it can reach its customers with clearer messages and innovative products or services.
There are several challenges to doing that, however – and in order to overcome those challenges, organizations may need some “outside” help, in the form of automated big data search and analysis systems that can sweep their many data sources for the information relevant to their needs. Tracking down data and metadata is a job for a machine, not for a person.
One challenge organizations have in utilizing data is simply knowing where it is, and how to get at it – and for many organizations, that’s not a simple proposition. That data is likely spread across a wide variety of sources – databases, both current and backed up, as well as social media accounts, logistics data, product or service updates, customer phone call transcripts, and much more.
But an organization is extremely unlikely to have a single protocol through which all of that data was labeled and categorized, given that the data was collected through different protocols, on different platforms, and under the administration of different CIOs, who may have had different ideas on how to categorize or store data.
A second challenge is probably even more daunting. Even if an organization were able to develop a system to search out the relevant data, it would likely run up against a wall of metadata misinformation. The same data in the many sources being drawn upon is likely to carry different labels and terms, among other issues.
For example, even if our insurance friends develop a search script to find females between the ages of 25 and 40 who purchased extended repair policies, the search system would likely be unable to get all the data – because the metadata labels for age (Age, DOB, Date of Birth, Year of Birth, YoB, YOB, etc.), gender (gender, sex, M/F, etc.), policy name, and so on are likely to differ in at least some of the data sources.
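To make the labeling problem concrete, here is a minimal Python sketch of the kind of label normalization such a search would need before it could query every source consistently. The synonym table, the canonical field names, and the function names are all illustrative assumptions, not part of any particular product; a real system would have to discover these variants rather than hard-code them.

```python
import re

# Hypothetical synonym table: canonical field name -> label variants
# that different data sources might use for the same information.
CANONICAL_LABELS = {
    "age": {"age"},
    "date_of_birth": {"dob", "date of birth"},
    "year_of_birth": {"yob", "year of birth"},
    "gender": {"gender", "sex", "m/f"},
}

# Invert the table so each variant points at its canonical name.
LOOKUP = {
    variant: canonical
    for canonical, variants in CANONICAL_LABELS.items()
    for variant in variants
}


def normalize_label(label):
    """Return the canonical name for a source label, or None if unknown."""
    # Lowercase, trim, and treat runs of spaces/underscores as one space.
    key = re.sub(r"[\s_]+", " ", label.strip().lower())
    return LOOKUP.get(key)


def normalize_record(record):
    """Relabel a record's fields; return (normalized fields, unmapped labels)."""
    normalized = {}
    unmapped = []
    for label, value in record.items():
        canonical = normalize_label(label)
        if canonical is None:
            unmapped.append(label)  # keep track of labels nobody has mapped yet
        else:
            normalized[canonical] = value
    return normalized, unmapped
```

Calling `normalize_record({"YOB": 1985, "M/F": "F", "Policy": "Extended Repair"})` would relabel the first two fields and report `Policy` as unmapped, which is exactly the kind of gap that causes a search to miss data silently.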
The metadata issue is actually even more serious than that; besides preventing organizations from progressing, it could also be hurting the bottom line. Metadata errors tend to propagate themselves throughout an organization, as reports cite previous uses of mislabeled or incorrectly classified data. If the original data was inaccurate, all the subsequent reports based on that data will be incorrect as well. The problem may be in the data itself, or in the metadata containers that hold the data. Either way, that lack of control could be costing the organization money.
For organizations, this can be extremely frustrating. Within reach of their fingertips is data that can help increase sales, reduce costs, grow their customer base, and much more – but the data “mistakes” of the past prevent them from reaching that “treasure.” In order to make data work for it, an organization first has to take control of that data – and the first step toward control is taking an inventory of what it has.
That job generally falls to an organization’s Business Intelligence (BI) team, which specializes in tracking down data. But building a map of the location and relationships of that data, or examining the metadata structure of the organization, costs money too; it’s a labor-intensive activity that requires many hands-on hours by BI professionals, who are not cheap to employ. In addition, the work is slow and painstaking.
A better, cheaper, and much faster way to track down data relationships, problems, and locations is with an automated metadata management platform that can parse through systems, building an index of the location, relationships, and dependencies of data. The system can then be queried for the information needed for projects and reports – and the results will be free of the errors that creep into manual tracking. Once the data is “liberated,” it can be used to develop products and services, save money, make organizations more efficient, or anything else companies need it to do.
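As a rough illustration of what such an index might hold, the Python sketch below records where each data asset lives and which assets are derived from it, then answers the question that matters when a source turns out to be mislabeled: which reports downstream are affected? The class, method names, asset names, and location strings are all hypothetical; this is a toy model of the idea, not a description of how any real platform works.

```python
from collections import defaultdict, deque


class MetadataIndex:
    """Toy index of data asset locations and dependencies (illustrative only)."""

    def __init__(self):
        self.locations = {}                  # asset name -> where it lives
        self.downstream = defaultdict(set)   # asset -> assets derived from it

    def register(self, asset, location, derived_from=()):
        """Record an asset's location and the assets it was built from."""
        self.locations[asset] = location
        for source in derived_from:
            self.downstream[source].add(asset)

    def impacted_by(self, asset):
        """Return every asset that depends, directly or indirectly, on `asset`."""
        seen = set()
        queue = deque([asset])
        while queue:
            current = queue.popleft()
            # .get avoids creating empty entries in the defaultdict
            for child in self.downstream.get(current, ()):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
        return sorted(seen)


# Hypothetical assets: a source table, a summary built on it, and a report.
index = MetadataIndex()
index.register("crm_customers", "crm database, customers table")
index.register("sales_summary", "data warehouse", derived_from=["crm_customers"])
index.register("q3_report", "reporting share", derived_from=["sales_summary"])

print(index.impacted_by("crm_customers"))
```

If the customer table carried a bad label, the query above would list both the summary and the report built on it, the same chain along which metadata errors propagate.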
Data provenance and remediation aren’t usually what executives think about, but if they thought about data as money, they would probably think about them a great deal. And today, data is money. The more control organizations have over data, the more money they can save – or make.