QueBIT Blog: Making Sense of Unstructured Data

Posted by: Jennifer Field

Jan 8, 2016 8:00:00 AM

If I were to tell you that companies around the world are prioritizing structured data analytics initiatives, you wouldn’t think twice about it. Given the current climate of big data and analytics, that seems like a fair statement to make, especially as many businesses continue to count on their relational databases, ERPs, CRMs, and other data management systems to organize, store, and define their structured data sets and run more meaningful analytics.

For example, according to the 2015 Big Data and Analytics survey by IDG Enterprises, organizations place greater priority on structured data initiatives, with 32% of organizations stating that managing unstructured data is not on their to-do list.

But is ignoring unstructured data the right way to go?

People usually associate unstructured data with images, videos, and social media. This may be the reason unstructured data is perceived as less meaningful. However, emails, presentations, spreadsheets, Word docs, and many other files used to share data are all part of the larger unstructured data web.

Just because all of this information may be structured in a dedicated OLAP database or data warehouse doesn’t mean that it won’t, at some point, become unstructured in nature.

As Terry Boedeker (CISSP-certified Varonis Systems Engineer) explained, “While data may be structured somewhere, any time an individual interacts with the data in the environment, they’re often doing so in an unstructured format. The data may be in a customer management system, or in a SQL database, but they’re pulling data and putting it on spreadsheet documents, PowerPoints, and emails. Humans are interacting with data at an unstructured level all the time.”

Forbes went further to point out the importance of prioritizing unstructured data management alongside structured data, referring to the practice of favoring one over the other as being “half blind.”

Unstructured data is the rearview camera organizations need to catch blind spots in their data analysis.

For instance, retailers today need unstructured data from mobile devices, call center notes, or social media to make sense of all the elements of a transaction. They can better answer why a customer made a purchasing decision and which promotional source led them to the product they purchased.

In the case of predictive analytics specifically, being able to integrate that unstructured information with defined data variables (like customer demographics) is important for forecasting future sales scenarios and for identifying correlations between sales trends.

Not only is unstructured data needed to complete the analytics picture, it’s also the fastest-growing form of data. As Forbes pointed out, nearly 80% of new data is unstructured.

In addition to social media, other external data sources are coming together through the Internet of Things. There is a wealth of knowledge in this data that will lead us to discoveries we couldn’t have imagined years ago.

But to get there, unstructured data management and analysis will play a pivotal role. Unstructured data may be messy to deal with at first, which is why big data processing and management platforms like Hadoop are going to be so important for tackling it and adding value to analytics.

Topics: Big Data Analytics

