

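“By 2018, there will be 140,000 to 190,000 unfilled data scientist positions” - McKinsey
“A shortage of more than 100,000 data scientists is projected by 2020” - Gartner
Data Science and Data Scientist are the glamorous new buzzwords that many articles are abuzz with today. So what exactly is the hype around Data Science and the Data Scientist?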
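Data science today is often equated with software engineering, mainly because it is written in code. However, the two are far from the same. Methodologies such as Agile, Waterfall and Scrum cannot simply be bolted onto data science. Data science is more science and less engineering, so it calls for a more scientific approach. Statisticians, data miners and data analysts are also routinely lumped together with data scientists. Let me clarify the basic differences in how data scientists operate.
Modelers typically have a defined scope and defined data that they are expected to acquire, analyze and work on. The terms typically associated with them are linear regression, logistic regression, known distributions, confidence intervals, predictor variables and goodness of fit. Data scientists, by contrast, are driven largely by natural human inclination, seemingly boundless curiosity and the need to find answers to the hardest questions. They are inquisitive, have a knack for asking questions that may not seem intuitive at first, run extensive "what if" analyses, and challenge existing assumptions and business-as-usual processes. Once armed with data and analytical results, top data scientists communicate informed conclusions and recommendations across the organization's leadership structure. To them, the world generates data inside a black box (usually associated with algorithmic modeling), and their vocabulary tends to include machine learning, artificial intelligence, neural networks, random forests, support vector machines, unknown multivariate distributions, iterative analysis, predictive accuracy and so on.
A data analyst may look at data from a defined source (for example, CRM or surveys). Data scientists usually operate differently and mostly examine data from many disparate sources. They want to sift through all incoming data with the aim of uncovering hidden insights that could add significant value to the business. Data scientists look at data from many angles, go beyond routine reporting, and present their insights to the business users to whom they apply. In short, they have strong business acumen, communicate well with both IT and business teams, simplify complex concepts into digestible pieces of information, understand analytics and modeling, are skilled at working with data, and can be seen as part analyst, part artist.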
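A typical day in the life of a data scientist includes reviewing and preparing historical data (missing value estimation, outlier detection, descriptive statistics), followed by data partitioning (training and validation sets) and variable selection (checking for multicollinearity, selecting the significant variables). After this data massaging, the next key step is building the predictive algorithm (logistic regression / random forest / decision trees / K-means clustering / sequence mining / text analytics) and reviewing the results to refine the model (model diagnostics review). Model finalization is the last piece of the puzzle, at which point common questions around propensity to buy, customer churn, channel optimization, customer lifetime value and the like can be answered.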
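The preparation steps described above (missing value estimation, outlier detection, a training/validation split and a multicollinearity check) can be sketched in plain Python. This is only a minimal illustration, not the author's actual workflow: the `monthly_spend` and `visits` figures are made up, median imputation and the 1.5 x IQR rule are just one common choice for each step, and the model-building stage is left out entirely.

```python
import random
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < low or v > high]


def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into training and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def pearson(xs, ys):
    """Pearson correlation; values near +1 or -1 hint at multicollinearity."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: monthly spend per customer, with one gap and one extreme value.
monthly_spend = [120.0, None, 95.0, 110.0, 105.0, 2500.0, 98.0, 115.0]
# Hypothetical second predictor: site visits for the same customers.
visits = [12, 11, 9, 11, 10, 40, 9, 11]

imputed = impute_median(monthly_spend)
outliers = iqr_outliers(imputed)
clean = [v for i, v in enumerate(imputed) if i not in outliers]
train, validation = train_validation_split(clean)

print("outlier indices:", outliers)
print("train/validation sizes:", len(train), len(validation))
print("spend vs. visits correlation:", round(pearson(imputed, visits), 3))
```

In practice these steps would more likely be done with a data-analysis library; the standard library is used here only to keep the sketch self-contained.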
Let me take a quick example of how a data scientist could add value beyond the visible boundaries and across the value chain of any business. Imagine an online retailer that wants to build a recommendation engine that delivers a whole new customer experience and promotes specific products based on current trends, browsing behavior, past purchase history and sentiment analysis. A typical solution is expected to increase both the conversion ratio and the average basket size. In most cases, a data scientist would go beyond this usual role, first by exploring data available in the public domain as well, and may also deliver insights on the supply or procurement side: how to avoid inventory stockouts, what the right pricing strategies are for a certain segment of customers, where to place certain SKUs, and so on. He may come up with specifics on how the retailer can identify previously undiscovered products for cross-selling opportunities, or where to find new revenue streams to offset declining revenues from certain channels. From a vendor management standpoint, the question might be which vendors to reach out to in order to service a given online order with minimal turnaround time and optimal cost, based on the delivery window committed to the customer (the next 3 hours, 1-2 days, or 3-4 days).
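To make the recommendation idea concrete, here is a minimal sketch of one simple technique, item co-occurrence counting over past purchase history. It is an illustrative assumption, not the engine described above: it ignores trends, browsing behavior and sentiment, and the product names and baskets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of products appears in the same basket."""
    co = defaultdict(Counter)
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, purchased, top_n=2):
    """Rank products not yet purchased by co-purchase counts with the history."""
    scores = Counter()
    for item in purchased:
        for other, count in co[item].items():
            if other not in purchased:
                scores[other] += count
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical past orders for an online retailer.
baskets = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["laptop", "mouse"],
    ["phone", "charger"],
    ["laptop", "mouse", "bag"],
]

co = build_cooccurrence(baskets)
print(recommend(co, {"phone"}))
print(recommend(co, {"laptop"}, top_n=1))
```

Pairs that are frequently bought together are also natural cross-selling candidates, which is why a count this simple is often a reasonable first baseline before anything more elaborate is tried.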
Data science is already making an impact on every aspect of our lives, from preventive healthcare management, to reworking internal business processes, to mitigating risks effectively, and even to the convenience of highly relevant, hyper-personalized experiences on e-commerce websites. Hopefully this article has offered some perspective on how these extraordinary thinkers can truly impact businesses across industries by asking the right questions and gleaning insights from not-so-ordinary data across different realms of business. The waves of change have only just begun. Data science has a lot more to offer than we can imagine at this point in time, and as we move forward in 2015, far more exciting applications of data science will surely be uncovered. Do share your thoughts, comments and experiences on how data science has added value, or can add value, to your business.
