Big Data Implementation Guide
I’ve been fortunate to complete some big data implementation projects in conjunction with ERP and CRM deployments that have achieved remarkable ROI. Interestingly, my results with Big Data make it all the more frustrating when I see business leaders sit on the sidelines and somehow wait for this opportunity to make itself happen.
In many talks with many smart folks, it's apparent that most everybody gets the fact that better information leads to better decisions, and that alone can and will grow businesses. But it's also become apparent that these business leaders need more information support before they can consider such a move, and compare it to other projects vying for their attention. To that end, I've put together sort of an executive's guide for the business case and deployment approach for Big Data.
The Business Problem
The transformation of business data into business intelligence is a costly and technical process, and consequently limited in scope and beneficiaries. In almost all businesses, data is stored in many data silos, and getting it through the ETL (Extract, Transform, Load) process and into data visualization tools is slow, costly and limited to a few decision makers in the company.
An IBM research study found that over half of the business leaders surveyed say they don’t have access to the insights they need to do their jobs. Clearly, this problem isn’t a lack of data availability, but an inability to transform data into the intelligence and insights needed by decision makers.
The challenge is further exacerbated as data increasingly no longer resides in nicely formatted relational databases on company servers. Now to outperform competitors and achieve the intelligence needed to meet revenue, customer affinity and other business goals, business leaders and line staff must tap into data that resides outside the company and in unstructured mediums, such as online social media, email, audio, video, images, streaming data and more. Data is further being multiplied by machine automation such as sensors, RFID, cameras, programs, phones and other smart devices.
Consider these data facts.
- 80 percent of the world’s information is unstructured.*
- Unstructured data is growing 15 times faster than structured data.*
- Raw computational power is growing so rapidly that a person with a PC has the power of a supercomputer from about a decade ago—and when combined with the new democratization of freely available data this power is available to anybody that chooses to leverage it for business or other opportunities.
* Source: Understanding Big Data, McGraw Hill
Increasing data volumes, velocities and varieties have defined the concept of Big Data, and more importantly a new opportunity to better transform data from raw form into business intelligence.
The Solution
Big Data has emerged as a response to the challenge of accessing and synchronizing more and different types of data across disparate sources to achieve holistic views that deliver insights that solve problems.
Unlike traditional business intelligence or analytics solutions which convert data into a common format for subsequent analysis, Big Data normally leaves the data in its native form and instead provides flexible access or synchronization tools to bring data types together for analysis when needed.
At a basic level, Big Data tools empower data access, sync, search, visualization, analytics and mining.
Consider Big Data when:
- Data obtained from structured, semi-structured and non-structured sources will contribute to better decision making.
- Data helpful for decision making resides in locations outside the company servers.
- Data is needed for early exploration analysis and before parameters which define decision making boundaries are known; essentially to discover and then refine data sets later.
- Solving problems that occur infrequently and thereby benefit from larger volumes of data to determine patterns, correlative or causal relationships, or historical occurrences or trends that are otherwise difficult to detect.
- Aiding new decision making where the organization lacks internal data needed to make first time business decisions within a reasonable confidence level.
- Existing data is unusable by inflexible analytics tools because the source data is in a raw, unstructured or denormalized format.
- Searching for anomalies, or things that didn't happen as opposed to searching for known events or things that did happen.
Business leaders know the competitive value of information and many are now tapping into this information explosion for profit-driven objectives. Big data is about to become a new normal with regard to decision support. Your choice of when and how to leverage this asset and transform it into intelligence for specific business objectives will determine how and when you are more or less successful than your competitors.
A Big Data Implementation Approach
There is no singular method to deploy a business intelligence solution to answer unique company questions, but there is an approach to take advantage of Big Data which minimizes risk and increases the likelihood of a successful outcome.
- Begin with Stakeholders — No project should begin before identifying your stakeholders and their success criteria, and while the C-suite is part of this crowd, Big Data stakeholders are also the knowledge workers and decision makers. Hmm … that actually means pretty much everybody in the company, which means you’ll need to categorize stakeholders by role, prioritize their information making value and pursue a sequential and progressive road map.
- Consider Culture — For many organizations, better decision making requires a cultural shift which expects data-driven, fact-based decisions, and does not accept unsupported or gut-feel conclusions. Business leaders need to champion an internal emphasis on optimizing business performance through quantitative measurements. “Show me the data” is the mandate when executives or managers are being asked to approve decisions or recommendations.
- Find Your Data Stewards — Finding the right people to define data governance and implement the data management processes can be tough. Complex analytics have historically been relegated to statisticians, analysts, data scientists or other highly cerebral thinkers. But such titles are not within the org charts of most companies, and integrating these roles with line of business managers to solve business problems can be a challenge. New technologies and a new breed of data stewards are finding that a mix of technical and business skills, whether from a single person or members of a tightly aligned team, is producing successful results. And interestingly, these roles may or may not report to IT. In fact, with increasing frequency, analysts are operating in more decentralized environments closely aligned with departmental functions. According to Forrester’s James Kobielus, data analytics teams are usually organized by business function or placed directly within a business unit. Kobielus shares that developing, testing and maintaining complex analytical data models requires strong domain and business knowledge, a requirement that doesn't easily lend itself to centrally controlled analytics.
- Set Clear Goals — Big Data projects are hard, so don’t try to boil the ocean. Instead start small, show a win and grow incrementally. Catalogue use cases and decisions which benefit from Big Data attributes, weight each decision’s impact, and then make the goals use case driven. Scope the entire information management landscape, but pick out the low hanging fruit. And remember that goals are not complete until they are SMART (specific, measurable, actionable, realistic and time-bound).
- Create the Plan — When developing your plan, link the goals to the constructs that define Big Data (volume, velocity and variety). Also recognize Big Data is a complement, not a replacement, to your existing analytics such as data warehouses, OLAP, and decision support systems (DSS). And of course no plan is complete without an ROI projection, but don’t try to create an overarching “Big Data ROI” forecast. Instead develop ROI forecasts for each of the use cases. For example, if using customer sentiment social analytics permits changes to product offerings or customer support which in turn lowers customer churn by 2% annually, what's that use case worth to the company?
- Establish Metrics — In my experience in helping clients deploy Big Data solutions, I have found it helpful to limit the number of metrics to only a few high priority measures, rather than a more exhaustive list. The two I generally favor out of the gate are Time to Decision and Decision Impact. When flattening the Time to Decision, the entire use case cycle should be measured. Taking this data in motion metric a step further can tap into the velocity of data, or in this case, the complete cycle time from when data is sourced to when it is consumed. Most data has a limited shelf life, and data management isn’t free, so this measurement can aid data management effectiveness — and especially cost — in a big way. There are many ways to calculate Decision Impact, such as reduced risk, increased confidence levels or Quality of Decision, but I prefer to translate Decision Impact results to financial metrics either in the form of cost avoidance or incremental revenues.
- Deploy the Technology — There's a conundrum that Big Data technology can help resolve. Most companies are both drowning in data and starved for information. Most companies have a tough time getting value from the data they already have because that data is unstructured, unclassified or in an otherwise raw form. Further, while data volumes are growing exponentially, companies recognize they don't have the best sources of data and among the data they do have, their ability to process multiple data types is limited at best. And businesses that are unable to manage their data become overwhelmed by it, fail to harness it for any material benefit and are left to make serious business decisions in the dark.
Big Data by definition applies to information that cannot be leveraged using traditional processes and tools and can resolve many of these all too common challenges. A common technology starting point is the open source Big Data engine, Hadoop. This tool is particularly well suited for loosely structured or unstructured data as well as high volume search and discovery. I should point out that it can also be used with structured data, but in my experience information systems plans seldom use this tool for this purpose. Also in the context of information systems plans, there is a data management mind shift that often needs to occur. We've grown to accept and even demand that in a data warehouse, DSS or structured data analytics tool, data must traverse a data management path whereby it is sourced, cleansed, normalized, tagged with metadata and made compliant with the Master Data Management (MDM) strategy, which collectively leads to an expensive and time consuming process for information with limited scope and definitive parameters.
Big Data is different. It normally stays in its native object format. Because the unstructured data incurs less processing and the fidelity of the data remains intact, the labor and data management costs of Big Data are lower than traditional structured data analytics. Big Data is available for the questions not yet asked, or the questions that will be posed from successive learning that leads to more questions. Big Data repositories with intelligent search tools permit decision makers to sift through large volumes of data in order to discover the data nuggets that are valuable. Sometimes, once those valued data sources are identified they sync with or migrate into more structured tools such as data warehouses and DSS.
Due to the exponential rise in data volume, velocity and variety, this approach of managing big data in a relatively low cost model, until such information is gleaned and warrants further processing, makes the best financial sense. A slew of start-ups and large IT vendors now offer tools that complement and extend Hadoop for various purposes. And while these tools are powerful, they are not required to tap into the benefits of Big Data. In my most recent project, we did a quick and dirty deployment of Radian6 to collect customer sentiment, and then used Hadoop as a data management platform and Cognos Consumer Insights as a rich data visualization engine to discover new data points, data snapshots and trends in order to bring data to what was otherwise guesswork. The project achieved a 5-month payback and a very impressive ROI.
- Make Big Data Little — Delivering little data in context with business use cases to decision makers in a way that insights are easily consumed and acted upon represents the last mile in making Big Data useful. To aid the challenge, data insights should be tailored by role and included in the applications, devices and channels where decision makers spend their time, and in a way where the insights are aligned or joined with their existing presentation technologies. For example, rather than requiring a separate application for big data presentations, it’s far more effective to include big data insights within existing decision support systems, or within existing business apps such as CRM, ERP and HCM software applications.
- Design for Continuous Process Improvement (CPI) — Making better business decisions is not a one-time activity, so integrating a CPI methodology with the plan will achieve learning, improve performance and earn increasingly higher ROI over time. There are plenty of good CPI methods to test, learn and improve over successive iterative loops; however, my experience in deploying Big Data is that simplicity is a big benefit, and for that reason I recommend a CPI method such as Deming's Plan-Do-Check-Act (PDCA) cycle. It's a KISS method that works well in simplifying complex projects to the maximum extent. I've spoken with Big Data consultants who opt for Six Sigma, but except for Government, Healthcare and some Financial Services companies, I find this methodology to be more investment than it's worth.
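To make the Create the Plan and Establish Metrics steps concrete, here is a minimal Python sketch of a per-use-case forecast that tracks Time to Decision and expresses Decision Impact in financial terms. Only the 2% churn reduction comes from the example above; the customer count, revenue per customer, project cost, dates and all class and function names are hypothetical placeholders, not figures from any real project.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class UseCase:
    """One Big Data use case; every value used below is an illustrative placeholder."""
    name: str
    data_sourced: datetime    # when the raw data was captured
    decision_made: datetime   # when a decision maker acted on it
    annual_benefit: float     # cost avoidance or incremental revenue per year
    total_cost: float         # labor plus technology cost for this use case

    def time_to_decision_days(self) -> float:
        """Full cycle time from data sourcing to decision, in days."""
        elapsed = self.decision_made - self.data_sourced
        return elapsed.total_seconds() / 86_400

    def roi_percent(self) -> float:
        """First-year ROI: (benefit - cost) / cost, as a percentage."""
        return (self.annual_benefit - self.total_cost) / self.total_cost * 100


def churn_reduction_benefit(customers: int, revenue_per_customer: float,
                            churn_reduction: float) -> float:
    """Annual revenue retained when churn falls by `churn_reduction` (0.02 = 2 points)."""
    return customers * churn_reduction * revenue_per_customer


# Hypothetical inputs: 10,000 customers worth $1,200 a year each, churn down 2%.
sentiment = UseCase(
    name="Customer sentiment",
    data_sourced=datetime(2024, 3, 1, 9, 0),
    decision_made=datetime(2024, 3, 4, 9, 0),
    annual_benefit=churn_reduction_benefit(10_000, 1_200.0, 0.02),
    total_cost=120_000.0,
)

print(f"{sentiment.name}: time to decision {sentiment.time_to_decision_days():.1f} days, "
      f"decision impact ${sentiment.annual_benefit:,.0f}, ROI {sentiment.roi_percent():.0f}%")
```

With these placeholder inputs the use case retains $240,000 a year and shows a 3-day time to decision and a 100% first-year ROI. Building one such record per use case keeps the forecasts separate, in line with the advice against a single overarching Big Data ROI number.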
Big Data Use Cases
Sometimes use cases help to drive points home. In reality, Big Data use cases are as varied as big data itself. Nonetheless, here are some use cases I’ve seen or implemented which may stimulate your creative thinking and better enable you to apply these concepts to your own business.
Customer Sentiment — Customer sentiment is being expressed for every company, product and service in existence over multiple social channels at an increasing rate. Using social monitoring and text mining tools, there’s a compelling opportunity to analyze what prospects and customers think about each of your products or services, as well as what they think about each of your competitors’ products or services, and correlate this sentiment analysis to sales efforts, product mix, marketing spend, advertising expense, loyalty programs, market share, customer share, competitor programs and specific cost and profit measures. This type of correlation is powerful in manipulating company operating decisions to influence customer behaviors with predictive responses. Taking this a step further, there’s also an opportunity to correlate customer sentiment analysis with broad economic factors, specific market indicators, competitor moves or other factors that may uncover patterns that permit companies to model changes for improved customer consumption and company performance. It's become abundantly clear that companies which do not track customer sentiment are losing customers to companies that do, and who are manipulating their business models or product offerings to capitalize on that customer sentiment.
Customer Experience (CX) — Big Data offers an opportunity to tap into the many internal and external customer interaction and behavioral data points to detect, measure and improve the desired but elusive objective of consistent and rewarding Customer Experience success. Big Data can access and bring together what are normally siloed data repositories housing many types of semi-structured and unstructured data in order to capture the information necessary to achieve a complete view of the customer experience, link this information to demographic, cultural and other preferences, and leverage the information for improved customer services delivery (order processing, product delivery, invoicing, customer support, renewals, etc.), increased revenue objectives (i.e. up-sell, cross-sell and customer share) and decreased customer churn.
Predictive analytics — IMHO, predictive analytics driven by Big Data may quite possibly be the single biggest opportunity for business growth initiatives. Leveraging broader data sets to improve visibility and confidence for strategy development, new product introductions, increased geo presence and other business development decisions which incur investment, long periods of roll-out, risk and predicted payback can aid decisions which directly impact top line revenues.
There's no business segment that can't benefit from improved decision making, but here are some line of business examples for business managers to consider.
Marketers are using Big Data to better forecast what products to sell to what customers and when, and how to bundle products to increase sales. Marketers are sifting through external data to determine what products correlate to different customer segments for increased sales conversions. They are also using big data such as comparable and competitor price lists and market acceptance rates to calculate pricing elasticity for their own products and services. Marketers are using social graph analysis to implement influencer marketing campaigns with some big results. In this scenario, social media data is mined to show which customers or even non-customers have the most influence over others inside social circles. This helps marketers leverage those influencers in a unique way and with results that generally cost less and perform much better than traditional advertising methods.
Sales managers are analyzing website and social media data to identify products and services frequently viewed (i.e. measured by volume of page reads, long page durations and low exit points) but not as frequently purchased in order to uncover the barriers standing in the way of purchase, and unlock otherwise hidden sales opportunities. They are also using Big Data to reduce customer churn by combining internal data from customer service systems or help desks with external behavioral data from social streams in public or semi-public venues to detect customer dissatisfaction, predict customer churn and implement remediation measures.
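The viewed-but-not-purchased analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product names, counts, thresholds and the function name find_stalled_products are all hypothetical, and a real analysis would pull the figures from web analytics and order systems.

```python
def find_stalled_products(views, purchases, min_views=1000, max_conversion=0.01):
    """Return products that attract many page views but convert poorly.

    `views` and `purchases` map product name -> count. Both thresholds are
    illustrative assumptions and should be tuned to the business.
    """
    stalled = []
    for product, view_count in views.items():
        if view_count < min_views:
            continue  # too little traffic to draw a conclusion
        conversion = purchases.get(product, 0) / view_count
        if conversion <= max_conversion:
            stalled.append((product, view_count, conversion))
    # Most-viewed products first, since they hide the largest opportunity.
    return sorted(stalled, key=lambda row: row[1], reverse=True)


# Hypothetical monthly figures.
page_views = {"kayak": 5000, "tent": 3000, "stove": 400, "lantern": 2000}
orders = {"kayak": 20, "tent": 150, "stove": 1, "lantern": 10}

for product, view_count, conversion in find_stalled_products(page_views, orders):
    print(f"{product}: {view_count} views, {conversion:.1%} conversion")
```

Here the kayak and lantern are flagged, while the tent converts well and the stove has too little traffic to judge. The flagged list is a starting point for asking what barrier (price, availability, missing information) stands in the way of purchase.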
Customer service managers are extracting unstructured data from social networks and social streams to better predict product defects based on consumption rates, usage patterns and geographies, and how product defects occur or accelerate when used with other products. They’re also using this information to take proactive actions and implement remediation measures, sometimes in advance of the defects occurring. By analyzing customer complaints (tweets, SNS/SMS, etc.) along with the volume, trending and responses to those complaints, extracting anomalies or patterns, and comparing the data to defect signs or product complications, proactive action can be taken while costs to repair both products and reputation are low. Some Call Center managers are even tapping into customer service call recordings (using customer profiles, keywords and sentiment analysis) to detect product deficiencies early, not when the period-end reports are compiled and read two weeks later.
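One simple way to surface anomalies in complaint volume is to compare each day's count against a trailing baseline. The sketch below uses a basic z-score test, which is my choice for illustration rather than a method prescribed by any particular tool; the daily counts, the 7-day window and the threshold of 3 standard deviations are hypothetical.

```python
from statistics import mean, stdev


def complaint_spikes(daily_counts, window=7, threshold=3.0):
    """Return indexes of days whose complaint count is far above the trailing baseline.

    A day is flagged when it exceeds the mean of the previous `window` days
    by more than `threshold` sample standard deviations.
    """
    spikes = []
    for day in range(window, len(daily_counts)):
        history = daily_counts[day - window:day]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            spread = 1.0  # assumption: treat a flat history as one complaint of spread
        if (daily_counts[day] - baseline) / spread > threshold:
            spikes.append(day)
    return spikes


# Hypothetical complaints per day for one product; index 9 is an obvious surge.
counts = [12, 15, 11, 14, 13, 12, 16, 14, 13, 41, 15]
print(complaint_spikes(counts))
```

With these counts only day 9 is flagged. In practice the same check would run per product and per geography, so that a surge prompts action while repair costs are still low.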
Human Capital Management professionals are better optimizing human resource assignments based on environmental factors, customer market trends and their employees’ online social profiles. In addition to cost optimization, many are also seeing results which include increased staff engagement, productivity and performance. In my discussion with the President of SHL, she describes how companies can apply big data to permit business leaders to compare and benchmark talent in their companies across staff, groups and even other companies.
Retail organizations are using store security videos to understand in-store customer traffic patterns (which can also be done with IC/RFID tags on carts) and determine how changes to in-store configuration can impact revenues. They are also correlating this information with Point of Sale (POS) and weather data to understand how environmental conditions impact product positioning, promotions and sales.
Risk managers and similar risk management professionals are using more external data sources to uncover correlations and patterns which lead to intelligence that reduces risk and moves management closer toward continuous risk management.
Finance is becoming a more recognized data and decision making enabler to the rest of the business. Finance staff are generally analytical thinkers and financial applications are often the only information systems that tie all other data to profit and loss measures. Finance has the ability to link all other Big Data elements to measures such as customer profitability, Customer Lifetime Value (CLV), product margins and other data sets which link to financial outcomes and therefore must be a part of any Big Data initiative.
Online recommendation engines, fraud detection, risk modeling, research and development and so on: the possibilities of applying more data for improved decision making are endless.
Big Data Risks and Challenges
Like all disruptive technologies, Big Data isn't without its risks.
A recurring problem I see is companies that put the technology ahead of the processes, people and specific outcomes. They essentially work forward from technology, instead of backwards from business outcomes. While many understand the value of information harnessed from a myriad of internal and external sources, fewer understand how to make that information accessible and actionable at the exact point where it can be used by knowledge workers across the organization.
These challenges to leveraging Big Data for SMART business objectives are no different than other information analysis methods and must start with an integrated people, process and technology plan which includes the processes to identify and capture the data, the tools to manage (access, sync, merge, store, tag, annotate, etc.) the data, and the right-time distribution of that data to the person or interaction where it can be applied for specific purposes and consistent results.
Another pervasive challenge with big data is data relevancy. With more data comes more noise. Business analysts will need to classify data into a spectrum running from noise to signals in large part based on the use case and weighted results of the data.
Other challenges such as data privacy, information security, information distribution, data presentation and even data overload (aka analysis paralysis) are not unique to Big Data, and the risks and resolutions can be learned from the lessons and best practices of other business analytics solutions.
Big Data ROI
Big Data technology investment can be relatively low due to underlying open source tools. The bigger cost is the labor needed for planning, cultural alignment, process definitions and deployment. And when labor is the largest cost involved in a technology deployment, that cost can be reduced or multiplied based on the specific resources allocated. Clearly, untrained resources apply a trial and error approach while leveraging expert resources streamlines the effort and generally gets it right the first time.
Also, I've found when working with clients that initially struggled in Big Data projects, they often lacked creative thinkers, which then slowed or stalled their project. The likelihood of project success will increase dramatically if you allocate one or more creative thinkers or innovators, who can collaborate with executives and line of business managers to flesh out decisions that will benefit from increased data, and quickly hypothesize the types and sources of information that will aid those decisions. Huge sets of external data are available from Government and NGO bodies, social media and commercial services, and those people versed in these and other data sources will of course accelerate the process.
While deployment can be challenging, the really encouraging news about Big Data projects is that they generally deliver big paybacks. A recent Nucleus Research report titled The Big Returns from Big Data found that big data projects which connected internal data sets with social media earned, on average, 241 percent ROI. For example, one ROI analysis demonstrated how a vacation resort company drastically cut labor costs by syncing its scheduling process with data available from the National Weather Service. In the research survey, if the big data deployment was successful, it likely earned a significant ROI well in excess of 100 percent.
Seize the Data
Big Data has not yet crossed the chasm to mainstream adoption, but is clearly delivering success for early adopters and is now at an inflexion point. For most businesses, Big Data methods are as unique as their corporate cultures and business processes, and the technologies are more bespoke than packaged.
This leaves Big Data success to those business champions that can act as change agents, rally staff around high payback projects and innovate new processes merged with enabling technology. For these reasons, Big Data will provide big payback for adopters, but it's not for everybody just yet. Business innovation and technology laggards will only acquire Big Data solutions when they are more packaged, easily deployable and no longer offer competitive advantage.
The ability for businesses to glean valuable nuggets of information from near limitless sources and apply that knowledge at just the right time to achieve tactical goals will clearly elevate those businesses over their competitors who operate without such knowledge. This is the power of Big Data and it will separate competitors in terms of business success.