For readers new to the subject, a brief primer on data science is in order. If you’ve only recently heard the term “data scientist,” you’re in good company. According to Google Trends, the term was relatively obscure just a few years ago.
Lately, we’ve heard it mentioned in the news, seen it become a de rigueur position on election campaign teams, and watched it enter the business vernacular. Harvard Business Review even published an article calling it the “sexiest job of the 21st century”.
What is a Data Scientist?
Although the title is relatively new, the practice has been around for quite some time. You knew its practitioners as statisticians, data analysts, data miners, predictive analysts, knowledge discovery professionals and applied mathematicians (to name just a few). While many of these individuals began their careers in those roles, advances in data storage and other technology, along with modern business drivers, have prompted them to adopt skills and roles well beyond those of their original field. This amalgamation of business acumen, data skills, analytic prowess and a healthy dose of creative thinking needed a new label, which is what we have today. To put it simply, a data scientist is an analytical data expert who explores and solves complex business problems.
What does a Data Scientist do?
A common thread in the field of data science is that there really isn’t a succinct job description for the role. Sometimes it’s best to describe the typical things a data scientist does:
- Communicate and understand which problems management is trying to solve
- Collect large amounts of unruly data from various sources and transform it into a usable format
- Rapidly identify a simple, robust, and scalable solution to a problem
- Work with a wide variety of databases, programming languages and analytic tools
- Collaborate effectively with business, analyst and IT resources
- Evaluate predictive models and data patterns
- Perform knowledge transfer and training
Data scientists will spend time in a number of areas depending on what is needed once the business problem is understood. The typical tasks involved in any project are:
Data gathering & preparation: Finding all the data that might be necessary to support an analytic solution. As mentioned before, this can sometimes be an unruly process. It may also involve extract, transform, load (ETL), a method for moving data into a data warehouse. Preparation of the data can involve summarizing raw data, correcting or removing faulty data, and deriving new attributes from the data.
Basic exploration and analysis: A cursory look at the data available in light of the problem being solved. At this point, a data scientist evaluates whether there is enough volume, veracity and variety to achieve a viable solution.
Machine Learning & Statistics: Machine learning is a type of artificial intelligence (AI) that gives computers the ability to learn from data and build a program or predictive model using mathematical algorithms and automation. Data mining and pattern recognition are also included here, and the terms are sometimes used interchangeably with machine learning.
Text Analytics: The process of examining unstructured text data to derive key concepts and business insights.
Visualization & Presentation: The presentation of data in a graphical and/or report format so it can be easily analyzed and understood.
Decision Management: The combination of predictive analytics and business rules to optimize and automate the millions of decisions that are made every day within the enterprise.
Automation & Integration: Insights from data science can help a business on its way to understanding a particular problem. But until the work is deployed into a prescriptive system that can rapidly and regularly put the information into the hands of decision makers, or into the systems they use, it’s typically just research. Automation and integration with other systems put the data science into action and make an immediate positive impact on your business.
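To make a few of the tasks above more concrete, here is a minimal sketch in Python of data preparation, a simple predictive model, and a decision rule applied to the model's score. Everything in it is an assumption made for illustration: the customer records, the field names, the thresholds, and the helper functions `prepare`, `fit_line` and `decide` are all hypothetical, and a straight-line fit on a yes/no outcome is only a toy stand-in for the models a real project would use.

```python
from statistics import mean

# Hypothetical raw records: (customer_id, monthly_spend, support_calls, churned)
RAW_ROWS = [
    ("c1", "120.0", "1", "0"),
    ("c2", "80.0", "4", "1"),
    ("c3", "", "2", "0"),       # faulty record: spend is missing
    ("c4", "150.0", "5", "1"),
    ("c5", "200.0", "0", "0"),
    ("c6", "60.0", "6", "1"),
]


def prepare(rows):
    """Data preparation: drop faulty rows, convert types, derive an attribute."""
    cleaned = []
    for cust_id, spend, calls, churned in rows:
        if not spend:           # remove records with missing spend
            continue
        spend_f, calls_i = float(spend), int(calls)
        cleaned.append({
            "id": cust_id,
            "spend": spend_f,
            "calls": calls_i,
            # derived attribute: support calls per 100 units of spend
            "calls_per_100": 100.0 * calls_i / spend_f,
            "churned": int(churned),
        })
    return cleaned


def fit_line(xs, ys):
    """Toy predictive model: least-squares fit of y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def decide(score, spend, threshold=0.5, high_value=100.0):
    """Decision management: combine a model score with a business rule."""
    if score < threshold:
        return "no action"
    return "call customer" if spend >= high_value else "send email offer"


records = prepare(RAW_ROWS)
a, b = fit_line([r["calls"] for r in records],
                [r["churned"] for r in records])

for r in records:
    score = a + b * r["calls"]
    print(r["id"], round(score, 2), decide(score, r["spend"]))
```

The point of the sketch is the hand-off between steps: cleaned data feeds the model, and the model's score feeds a rule that the business controls. In practice each step would be far more involved, and the rule would usually run inside the systems that decision makers already use.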
Why might your business need a Data Scientist?
It’s often been said that a company’s data is its greatest corporate asset. Knowing what you have and how it can be leveraged can be daunting and sometimes even elusive. Do you have reams of reports that give you a lot of information, but very little understanding of why things are going wrong or how to correct them? Or of what is driving the things that go right and how to leverage them? A data scientist can create and implement a strategy to gain these insights.
There can be a lot of pitfalls when taking on a data science endeavor. Framing a business problem in terms of what is achievable with the data assets available avoids wasting time on what may be a dead end. Knowing what data to leverage, what to keep and what to let go is important. Using data incorrectly and relying on poorly validated models are common mistakes that can lead to very costly results. Hiring a data scientist can help mitigate these risks as well.
Perhaps you’d like to build a center of excellence or data science competency within your enterprise. An experienced data scientist can help kick-start your program by providing training, conveying best practices developed over years of experience, and sharing knowledge first-hand using your data. I have found this to be the most rapid and impactful way to gain competency in this area.
The QueBIT difference
Experience – Collectively, the data scientists on our advanced analytics team have decades of experience beyond most consulting practices. We have built creative and complex solutions in a variety of applications across a vast number of industries and business channels. Our team can relate to your key stakeholders and explain model results and data science methodology in plain English. Because of this deep experience, we can lead your team in a data science endeavor or augment it as you see fit.
Attention to your business – While we work on a project, we keep your business in mind. As much as it pains me to admit it, sometimes there is just no way to solve a particular problem with data (or the data that is available). This is usually discovered early in a project and if we see a dead end, we’ll let you know so there isn’t any wasted time. We will also discuss any alternatives available and even come up with a strategy to capture what might be needed for future evaluation if it is warranted. Because of the proximity and depth of evaluation that takes place in a project, we will call out areas that can be leveraged for business gain or identify suspicious activity even if it has nothing to do with the project at hand.
Knowledge transfer – It is our goal to establish long-term relationships with clients, but it is not our goal to establish a relationship of dependency. We have built a reputation centered on excellence and expertise, and the transfer of this knowledge to clients empowers them to achieve their goals.
How do I get started?