Data is playing an increasingly important role in the
business world and even in our daily lives. Almost everyone is generating data
at an explosive rate. From the moment you wake up, your alarm clock application is
recording your wake-up time; when you have your breakfast while
reading the news on your iPad, your news application is recording how many
articles you click on and other related information about you. This goes on and on.
Companies have collected massive amounts of data, and their data sets are exploding as well. So
the big problem companies face now is how to deal with all this data.
A typical reason companies are interested in this
data is that they want to learn more about their customers and thus
be able to provide better service and reduce cost. Better service can mean
better content, better timing, fully stocked racks; reduced cost can come from lower
inventory due to more accurate forecasts, or from better targeted marketing initiatives
resulting in better ROI, conversion rates and so on. But for many companies this is still
only a beautiful dream and vision. Let’s take a look at why.
First, data collection. Companies try very hard to collect
data from various sources. Let’s take a CPG company as our model. It collects
most of its customer and POS data from retailers. Additionally, it purchases data
from the market. There are many different sources, such as Nielsen data, social
media data and so on. In the data collection process, there is a high
possibility that the data is not ready to use. By that, I mean there are
missing values, wrong values, and different types of data mixed together. Companies
aren’t going to gain any insights from data in that state.
To give an example, I have done a predictive modelling
project with one of the major CPG companies. The data they provided was of very
low quality: there was no clear documentation of the data; there were lots of
missing values; lots of related data was stored separately, and so
on. Processing such data carries the risk of losing information.
Sometimes analysts have to replace missing values with the mean value of the
data set; sometimes analysts have to remove whole sets of data because of obvious
outliers or missing key values. So the first problem comes from the
data collection process itself.
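The cleaning steps described above can be sketched roughly as follows. This is only a minimal illustration, not the code used in the project: the field names `store_id` and `units_sold`, the function name `clean`, and the outlier rule (dropping values more than `k` median absolute deviations from the median) are all assumptions made for the example.

```python
from statistics import mean, median


def clean(records, key="store_id", field="units_sold", k=5.0):
    """Drop rows missing the key, drop outliers, then mean-impute the field.

    `key`, `field` and the cutoff `k` are hypothetical choices for this sketch.
    """
    # 1. Remove records whose key value is missing.
    rows = [r for r in records if r.get(key) is not None]

    # 2. Flag outliers with a median-based rule (an assumed rule, one of many).
    observed = [r[field] for r in rows if r.get(field) is not None]
    if not observed:
        return rows  # nothing to measure against, so nothing to impute
    med = median(observed)
    mad = median([abs(v - med) for v in observed])

    def is_outlier(value):
        # If mad is 0, at least half the values are identical; keep everything.
        return mad > 0 and abs(value - med) > k * mad

    rows = [r for r in rows
            if r.get(field) is None or not is_outlier(r[field])]

    # 3. Replace remaining missing values with the mean of the kept values.
    fill = mean(r[field] for r in rows if r.get(field) is not None)
    return [{**r, field: fill if r.get(field) is None else r[field]}
            for r in rows]
```

Outliers are removed before the mean is computed so that they do not distort the imputed value. Every one of these steps discards or invents information, which is exactly the risk mentioned above.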
Second, data utilization. In many cases, companies do not
trust the data when making big decisions. This is a result of bad data quality,
but it is also a reason why data quality stays poor. Awareness of and
emphasis on data analytics from high-level management is a must for success in
this field. At the company mentioned above, senior management has only just green-lighted the
experimental predictive modelling project with the CMU team. That is good news,
but many other companies took this step long ago.
Third, the integration of the supply chain system. Take that CPG
company as an example: if the data analysis and visibility exist only at the corporate
level, rather than being integrated with all parts of the supply chain, the impact of
the analysis will be weakened. Bringing visibility to all supply chain
stakeholders requires huge investment and effort.
My question is: who will be the biggest driving force behind
this revolution? Is it the companies in the industry, the data analysis
experts, or the customers?