Business Analytics, Big Data, Data Science, Data Analytics and Data Mining are some of the biggest buzzwords in the tech world today. Data scientist has even been termed the sexiest job of the 21st century.
The field is expected to be one of the biggest job creators over the next couple of decades, and those jobs are likely to come with lucrative pay packages and benefits.
As a result, many people are trying to enter this field.
But because the field is new and so much has been written about it, information is scattered all over the place and it is hard to know where to look.
One of the first steps for working professionals entering such a field is to learn the subject and acquire relevant certifications like “SAS Certified Business Analytics”.
In the next few posts, I will introduce you to the concepts of data science, and you will learn the topics you need to get your feet wet in data analytics. If you need to, you can then go for certifications and the required training.
But it is better to first have a basic understanding of the subject; otherwise you will simply end up losing a few hundred dollars on training and certification costs without much gain.
Data Science Or Big Data
First, you need to understand the terminology.
Wikipedia describes Big Data as follows:
Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications.
The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to “spot business trends, prevent diseases, combat crime and so on.”
Now look at how Wikipedia describes Data Science:
Data science is, in general terms, the extraction of knowledge from data. The key word in this job title is “science,” with the main goals being to extract meaning from data and to produce data products. It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, and information technology, including signal processing, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high performance computing. The discipline is not restricted only to so-called big data, although an important aspect of data science is its ability to easily cope with large amounts of data. The development of machine learning, a branch of artificial intelligence used to uncover patterns in data from which practical and usable predictive models can be developed, has enhanced the growth and importance of data science.
There is a lot of confusion between Data Science and Big Data, and people often use the terms interchangeably. But as you can see, the two are related but not exactly the same: Big Data refers to data sets too large and complex for traditional data processing applications, while data science is the extraction of knowledge from data and is not restricted to big data.
Now look at Business Analytics and Business Intelligence, again as per Wikipedia:
Business analytics (BA) refers to the skills, technologies, practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning. Business analytics focuses on developing new insights and understanding of business performance based on data and statistical methods. In contrast, business intelligence traditionally focuses on using a consistent set of metrics to both measure past performance and guide business planning, which is also based on data and statistical methods.
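To make the contrast in this definition a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the monthly figures and the function names `average_revenue` and `pearson` are invented for this post and do not come from Wikipedia or from any particular tool. The first function is in the business intelligence spirit (one fixed metric, reported the same way every period), and the second is in the business analytics spirit (a statistical method used to explore a relationship in the data).

```python
from statistics import mean

# Hypothetical monthly figures, invented purely for illustration.
AD_SPEND = [10.0, 20.0, 30.0, 40.0]
REVENUE = [110.0, 135.0, 160.0, 185.0]


def average_revenue(revenue):
    """BI-style: a consistent metric that measures past performance."""
    return mean(revenue)


def pearson(xs, ys):
    """BA-style: Pearson correlation, used here to explore a relationship."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


avg = average_revenue(REVENUE)
corr = pearson(AD_SPEND, REVENUE)
print("Average monthly revenue:", avg)
print("Correlation between ad spend and revenue:", corr)
```

In this toy data the revenue rises in a straight line with ad spend, so the correlation comes out very close to 1. Real business data will rarely be this tidy, and a strong correlation on its own does not prove that one thing causes the other.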
Similarly, let’s look at the concept of data mining as per Wikipedia:
Data mining (the analysis step of the “Knowledge Discovery in Databases” process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
Data Science - Moving Forward
I have discussed these concepts here because newcomers often confuse these terms and spend time learning something irrelevant. It is therefore important to first understand your own requirements.
All of the fields mentioned above are interrelated and primarily deal with:
- Data collection
- Data warehousing
- Data mining
- Identifying patterns and knowledge in data
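The four activities above can be sketched end to end in a few lines of Python. This is a toy sketch under stated assumptions, not how a real project is built: the sales records are made up, an in-memory SQLite table stands in for a data warehouse, and the function names `load_warehouse`, `total_by_region` and `growing_regions` are hypothetical names chosen for this post.

```python
import sqlite3

# Data collection: hypothetical sales records, invented for illustration.
RAW_SALES = [
    ("north", "2024-01", 120.0),
    ("north", "2024-02", 150.0),
    ("south", "2024-01", 200.0),
    ("south", "2024-02", 180.0),
]


def load_warehouse(rows):
    """Data warehousing: store the rows in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, month TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def total_by_region(conn):
    """Data mining (very simplified): aggregate sales per region."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


def growing_regions(conn):
    """Pattern identification: regions whose last month beat their first."""
    growing = []
    for region in sorted(total_by_region(conn)):
        amounts = [
            row[0]
            for row in conn.execute(
                "SELECT amount FROM sales WHERE region = ? ORDER BY month",
                (region,),
            )
        ]
        if amounts[-1] > amounts[0]:
            growing.append(region)
    return growing


conn = load_warehouse(RAW_SALES)
totals = total_by_region(conn)
trend = growing_regions(conn)
print("Total sales by region:", totals)
print("Regions with growing sales:", trend)
```

With this made-up data, the south region has the larger total but only the north region is growing. Spotting that kind of difference is the "pattern" step, just on a very small scale.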
There are two sides to it:
- Technical side, which is all about computer science: servers, database systems, and algorithms to read and extract data.
- Functional/business side, where professionals use computer applications to generate reports, charts, etc. to identify existing patterns and predict future trends.
But don’t be surprised if you get a call about a Data Scientist position that involves working on Big Data.
In the next few posts, I will take you through the functional and business side, where you will learn the concepts and theories of data science and how you can become a data scientist.