Integrating data science and business planning



Thomas leads a team at Google known as “Operations Data Science” that helps Google scale its infrastructure capacity optimally. In this post he describes where and how having “humans in the loop” in forecasting makes sense, and reflects on past failures and successes that have led him to this perspective.

Our team does a lot of forecasting. It also owns Google’s internal time series forecasting platform described in an earlier blog post. I’m sometimes asked whether there should be any role at all for “humans-in-the-loop” in forecasting. For high stakes, strategic forecasts, my answer is: yes! But this doesn’t have to be an either-or choice, as I explain below.

Forecasting at the “push of a button”?

In conferences and research publications, there is a lot of excitement these days about machine learning methods and forecast automation that can scale across many time series. My team and I are excited by this too (see [1] for reflections on the recent M4 forecasting competition by my colleagues). But looking through the blogosphere, some go further and posit that “platformization” of forecasting and “forecasting as a service” can turn anyone into a data scientist at the push of a button. Others argue that there will still be a unique role for the data scientist to deal with ambiguous objectives, messy data, and understanding the limits of any given model. These views can be helpful if seen as part of a spectrum of forecasting problems, each calling for different approaches. But what is missing from this discussion is that the range of the role of humans in the loop is wider than just that of the data scientist. There are some problems where not only should the data scientist be heavily involved, but the data scientist should also involve non-data scientist stakeholders in the forecasting process.

Tactical vs. strategic forecasts

Forecasting problems may be usefully characterized on a continuum between tactical on the one hand, and strategic on the other. This classification is based on the purpose, horizon, update frequency, and uncertainty of the forecast. These characteristics of the problem drive the forecasting approaches.

The table below summarizes different forecasting problems as tactical and strategic:

Strategic forecasts

Problem characteristics:

  • Objective: an input to medium- to long-term planning in order to guide product, investment, or high stakes capacity planning decisions
  • Horizon: months to years
  • Update frequency: monthly or less
  • Uncertainty: difficult to quantify solely based on historical data due to the long horizon, non-stationarity, and possibly censored data

Forecasting approaches:

  • Methods: triangulation between alternate modeling methods and what-if analysis
  • Key metrics: summaries of forecast changes and drivers; thresholds to flag significant gaps between alternate forecasts
  • Humans in the loop: a data scientist to propose different forecast methods and generate or collect those forecasts; stakeholders to review differences and approve a “consensus” forecast

Tactical forecasts

Problem characteristics:

  • Objective: an input to short-term and largely automated planning processes like inventory replenishment, workforce planning, production scheduling, etc.
  • Horizon: days to weeks
  • Update frequency: weekly or more
  • Uncertainty: quantifiable by model fitting or backtesting on historical data

Forecasting approaches:

  • Methods: an automated pipeline of time series forecasts; if many related series are available, then global, ML, and/or hierarchical models may be appropriate
  • Key metrics: point forecast and prediction interval accuracy metrics for model evaluation
  • Humans in the loop: a data scientist to build and maintain models; may include judgmental adjustments on model output, but used sparingly

Table 1: Strategic and tactical forecasts.

Some also distinguish further between tactical and operational forecasts [2]. The latter involve updates at least daily. In this case there is no time at all for human review, and forecast automation is essential.

In choosing the appropriate methodology, a key distinction lies in the business stakes associated with a given forecast publication cycle. Based on the decisions being made and how quickly plans can adjust to new forecast updates, what is the cost of forecasting too high or too low? If the costs of prediction error are asymmetric (e.g. predicting too low is more costly than predicting too high), decisions should plan to a certain quantile forecast (e.g. 95th percentile). This can be true for both strategic and tactical forecasts. For example, long-term capacity or short-term inventory may be planned to a high quantile forecast, if the cost of a shortage is much greater than the cost of holding excess.
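As a concrete illustration of planning to a quantile under asymmetric costs, the newsvendor-style “critical fractile” gives the quantile at which the expected marginal costs of under- and over-forecasting balance. The sketch below is illustrative only; the cost numbers and demand distribution are made up, not Google’s.

```python
import numpy as np

def planning_quantile(shortage_cost: float, excess_cost: float) -> float:
    """Newsvendor critical fractile: the demand quantile to plan against."""
    return shortage_cost / (shortage_cost + excess_cost)

# Illustrative costs: a shortage is 9x more costly than holding excess,
# so we plan to the 90th percentile of demand.
q = planning_quantile(shortage_cost=9.0, excess_cost=1.0)  # 0.9

# Plan capacity to that quantile of a simulated demand distribution.
rng = np.random.default_rng(0)
demand_samples = rng.lognormal(mean=10.0, sigma=0.3, size=100_000)
plan_level = np.quantile(demand_samples, q)
```

The higher the shortage-to-excess cost ratio, the further into the upper tail the plan moves, which is exactly why wide prediction intervals become expensive at high quantiles.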

The ROI of human involvement

When it comes to human involvement, the key distinction is in the magnitude of costs associated with any one forecast cycle. What is the reduction in cost of a forecast that was improved by human intervention? This defines the ROI on the investment of human time.

Tactical forecasts have a higher frequency of updates and a shorter forecast horizon. Thus, there is both less time to make adjustments and less return on human time in doing so. If the number of forecasts is not too unwieldy and the forecasts not too frequent, there may be some room for what Hyndman calls “judgmental adjustments” to the model output, as a kind of lightweight version of strategic forecasting. Hyndman cautions that adjustments should be made sparingly using a structured and systematic approach and are “most effective when there is significant additional information at hand or strong evidence of the need for an adjustment.” [3]

In contrast, strategic forecasts benefit from a higher level of human review and a more formal process for triangulating between different forecast methods, some of which may rely entirely on judgment and forward-looking information.

Figure 1: A Google data center

As an example, consider Google’s forecasting and planning for data center capacity. This capacity is planned years in advance due to long lead times for land, utility infrastructure, and construction of physical buildings with cooling infrastructure. Once built, the data centers can be populated with compute and storage servers at much shorter lead times. Future demand for servers is uncertain, but the cost of empty data center space is much lower than the shortage cost of not being able to deploy compute and storage when needed to support product growth. We therefore plan capacity to a high quantile demand forecast.

Prediction intervals are essential for our quantile forecast. But unlike in the tactical case, we have a limited time series history available for backtesting. Nor can we learn prediction intervals across a large set of parallel time series, since we are trying to generate intervals for a single global time series. With these stakes and the long forecast horizon, we don’t rely on a single statistical model based on historical trends.

I sometimes see the misguided application of a tactical approach to strategic forecasting problems. Strategic forecasts drive high stakes decisions at longer horizons, so they shouldn’t be approached simply as a black box forecasting service, divorced from decision-making. Done right, strategic forecasts can provide insights to decision makers on trends, incorporate forward-looking knowledge of product plans and technology roadmaps when relevant, expose the risks and biases of relying on any one forecasting methodology, and invite input from stakeholders on the uncertainty ranges. In this case, there is a high return on investment (ROI) of human time to triangulate among different forecasts to arrive at a consensus forecast.

This focus on the ROI of human time turns on its head the conventional wisdom from 50 years ago, when the essence of choosing the right forecasting approach was how much computational time to invest to arrive at “good enough” forecast accuracy [4]. Today, as computation has become cheap, the key tradeoff is between the human time invested vs. “good enough” forecast accuracy. For tactical forecasts of many parallel time series, computational time may still be a consideration. But even here, a bigger concern is the time invested by data scientists (mostly at the development stage) in data analysis and cleaning, feature engineering, and model development.

A forecast triangulation framework

As stated earlier, strategic forecasts should triangulate between a variety of methodologies. But this doesn’t mean simply presenting a menu of forecasts from which decision makers can choose. Consider again the example of long-term forecasts for data center capacity planning. We might generate at least three types of forecasts with fundamentally different world views on which factors really drive growth: an “affordability” forecast based on forecasted revenue growth and the relationship between data center capacity and revenue growth; a “resource momentum” forecast based on historical trends for compute and storage usage, translated into data center capacity needs using technology roadmaps; or a “power momentum” time series forecast based on historical consumption of usable data center capacity (measured in watts of usable capacity). Each has some merit, but simply presenting all three as a choice shirks responsibility for actually arriving at the “best” forecast.

The data scientist could try to build a single model that integrates all the signals together, but doing so typically relies on historical data to determine which features have the most predictive value. Boiling all the information down to a single model doesn’t help us challenge to what degree we think the future will differ from the past. A single model might not clarify the uncertainty range we actually face. For example, we may prefer one model to generate a range, but use a second scenario-based model to “stress test” the range. If the alternate model is plausible with a small probability, then we would want to see that the “stress test” forecast scenario still falls inside the prediction interval generated from our preferred model.
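A minimal sketch of such a “stress test”, with made-up numbers: take the preferred model’s point forecast and prediction interval over the horizon, then check whether an alternate scenario forecast stays inside the band.

```python
import numpy as np

# Preferred model: point forecast with a 90% prediction interval per quarter
# over a 3-year horizon (all numbers are illustrative).
horizon = 12
point = np.linspace(100.0, 160.0, horizon)  # preferred point forecast
lower = point * 0.85                        # 5th percentile
upper = point * 1.25                        # 95th percentile

# Scenario-based "stress test" forecast from an alternate model,
# e.g. an aggressive product-launch scenario.
stress = point * 1.15

# Quarters where the scenario escapes the preferred model's band
# deserve a closer look at the preferred model's uncertainty range.
breaches = np.flatnonzero((stress < lower) | (stress > upper))
if breaches.size:
    print(f"stress scenario breaches the interval at quarters {breaches}")
else:
    print("stress scenario is contained in the preferred model's interval")
```

If a plausible scenario lands outside the band, that is evidence the preferred model’s interval is too narrow, not that the scenario should be discarded.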

Rather than providing a menu of models, or a single model, the data scientist needs to play a bigger role in reviewing and comparing forecasts. In particular, the data scientist must take responsibility for stakeholders approving the “best” forecast from all available information sources. By “best forecast”, we mean the most accurate forecasts and prediction intervals. Using multiple forecasts forces a conversation about the drivers, a revisitation of the input assumptions. It provides the occasion for deeper exploration of which inputs can be influenced and which risks can be proactively managed.

Over the lifetime of the forecast, the data scientist will publish historical accuracy metrics. But because of the long time lag between forecasts and actuals, these metrics alone are insufficient. The data scientist will conduct post-mortem analyses and adjustments when actual demand deviates significantly from the forecast. Each forecast update will include metrics to provide insight on change drivers, and will flag significant gaps between different model forecasts.
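For the accuracy metrics mentioned above, one hedged sketch is mean percent error (MPE) as a measure of bias and mean absolute percent error (MAPE) as a measure of overall accuracy, computed once actuals arrive. The function name and numbers are illustrative, not the actual metrics pipeline.

```python
import numpy as np

def forecast_accuracy(forecast: np.ndarray, actual: np.ndarray) -> dict:
    """Simple post-hoc accuracy metrics for a published forecast.

    MPE (mean percent error) measures bias: positive means we
    over-forecasted on average. MAPE measures overall accuracy.
    """
    pct_err = (forecast - actual) / actual
    return {"mpe": float(np.mean(pct_err)),
            "mape": float(np.mean(np.abs(pct_err)))}

# Illustrative: the forecast of record at lead time vs. actual demand.
forecast = np.array([100.0, 110.0, 125.0, 140.0])
actual = np.array([95.0, 112.0, 118.0, 130.0])
metrics = forecast_accuracy(forecast, actual)
```

Tracking MPE separately from MAPE matters here because a biased forecast (consistently high or low) has a different root cause, and a different fix, than one that is merely noisy.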

Note that this approach assumes that the forecasts directly drive high stakes decisions, and are judged accordingly. If forecasts are merely used as a baseline to detect trend changes, then other approaches and less investment may be appropriate. But even in these cases, it may be worth the data scientist’s time to understand which decisions are in fact being made based on the trends detected by the forecast. (If the forecast cannot influence decisions, it doesn’t merit a data scientist.) Finally, data scientists must recognize that their forecasts may be used more broadly than first anticipated, and broad communications may have more value than first realized. It is therefore good discipline to provide forecast release notes explaining key risks and drivers of change.

The diagram below shows how we approach strategic forecasting for high stakes infrastructure capacity planning decisions at Google. The data scientist surfaces differences between a proposed forecast and one or more benchmark forecasts. The proposed forecast is the forecast believed to require the fewest subsequent adjustments. This proposed forecast may still have other shortcomings, such as being prone to biases of human judgment, or lacking a robust prediction interval. The benchmark forecasts are used as cross-checks, and to gain insight into how the future may differ from the past.

The data scientist advocates for methods to include as benchmarks, as well as the method used as the proposed forecast. Some but not necessarily all of these forecasts may be generated by the data scientist directly. For example, proposed forecasts may come from customers, if their forecasts are based on forward-looking information about product and technology plans that would be difficult for the data scientist to extract as inputs into a predictive model. At least one of the forecast methods will have a quantitative prediction interval generated by the data scientist, so that other forecasts can be considered in the context of this range.

Figure 2: Forecast triangulation
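The review step in the triangulation above can be sketched as a simple gap check between the proposed forecast and each benchmark. The function name, threshold, and numbers below are hypothetical, not Google’s actual tooling.

```python
def flag_gaps(proposed: list[float],
              benchmarks: dict[str, list[float]],
              threshold: float = 0.15) -> dict[str, list[int]]:
    """Return, per benchmark, the periods where the relative gap between
    the proposed forecast and the benchmark exceeds the threshold."""
    flags: dict[str, list[int]] = {}
    for name, series in benchmarks.items():
        flags[name] = [t for t, (p, b) in enumerate(zip(proposed, series))
                       if abs(p - b) / b > threshold]
    return flags

# Illustrative quarterly forecasts (arbitrary capacity units).
proposed = [100.0, 115.0, 140.0, 170.0]
benchmarks = {
    "affordability":  [ 98.0, 110.0, 124.0, 139.0],
    "power_momentum": [102.0, 112.0, 130.0, 150.0],
}
gaps = flag_gaps(proposed, benchmarks)
# Each flagged period needs a story (e.g. a planned launch) that stakeholders
# accept before the proposed forecast can be approved as the consensus.
```

The point of the flags is not to auto-correct the proposed forecast, but to force the human conversation about drivers described above.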

Integrating customer forecasts with statistical forecasts

In strategic forecasting, the proposed forecast may depend in part on forecasts or assumptions not owned by the data scientist. In the supply chain context, forecast and data sharing between buyers and suppliers is known as “collaborative forecasting and planning.” This collaboration can also be between internal customers and an internal supplier. Using customer forecasts as the proposed forecast can capture valuable information about future inorganic growth events or trends that are difficult to extract as features for a predictive model. On the other hand, these customer forecasts can be aspirational and often lack high quality prediction intervals. Customer forecasts may further suffer from what Kahneman calls the “inside view”, where a forecaster (a customer in this case) may extrapolate from a narrow set of personal experiences and specific circumstances without the benefit of an “outside view” that can learn from a much larger set of analogous experiences.

So what to do? An operations team in need of a forecast to plan against may poorly frame this as an either-or proposition: either they accept the customer forecast (perhaps interpreting it as a high quantile forecast scenario), or they discard it in favor of a statistical time series forecast with a quantitative prediction interval. The alternative we use is the forecast triangulation framework described above. We collect base and high scenario customer forecasts and generate statistical forecasts, and we build a process to approve a “consensus” forecast using as inputs the proposed customer forecasts and one or more benchmark time series forecasts. This allows us to capture forward-looking information signaled in the customer forecast, while checking for bias and adding prediction intervals from the time series forecasts. A variant of this method would be to provide the customer with a baseline statistical forecast and allow them to make adjustments to it. Either can work well, as long as the difference between the statistical forecast and the approved consensus forecast is reviewed, understood, and approved by a set of decision makers who are accountable for the costs of both under-forecasting and over-forecasting. Where there are significant gaps between a customer forecast and a statistical forecast, the process requires a good story to explain the gap, one that everyone understands, before it is approved as the consensus forecast. It may take several forecasting cycles to resolve, but typically we see one of the following:

  • an agreement by all parties that the gap is legitimate due to forward-looking factors
  • removal of outlier events from history, or model adjustments to improve the accuracy of statistical forecasts
  • convergence between the customer and statistical forecast

In an internal customer-supplier setting, we have found it useful to require “consensus” to mean alignment between customers and other stakeholders, since the customers are the ones who most acutely feel the pain of shortages due to under-forecasting. Also included in the approver group are Finance (who are particularly concerned about the costs of excess capacity) and operations teams (who are accountable for executing on the plans and help mediate between customers and Finance to drive forecast alignment).

There are more sophisticated solutions, such as contracts and risk-sharing agreements between customers and suppliers. Indeed, there is a body of literature on optimal contracting structures between buyers and suppliers [5]. Unfortunately, formal risk-sharing agreements can be cumbersome and difficult to put in place. This is particularly true in operational planning domains where contracting around risk is not common, as it is in financial markets. We have found that a simple but effective approach is to help customers forecast by providing useful statistical benchmark forecasts, while also inviting their input on what “inorganic” events may require adjustment of the statistical forecasts. Not only does this improve forecast quality and build a common understanding of forecast drivers, it also creates a shared fate. As the old adage goes, all forecasts are wrong. It is easy in hindsight for stakeholders to second-guess the data scientist’s statistical forecast if the data scientist did not make concerted efforts to consult them about forward-looking information they may have had.

Case study: machine demand planning

Below is an example of the evolution of an important strategic forecast process at Google. It illustrates the benefits and pitfalls of automation, and follows a thesis-antithesis-synthesis narrative.

Original process: customer-driven forecast

In our supply chain planning for new machines (storage and compute servers), we would stock inventory of component parts based on forecasts so that we could quickly build and ship machines to fulfill demand from internal customers such as Search, Ads, Cloud, and YouTube. The operations teams would plan with only a high level “machine count” forecast based on input from our internal customers. There was no accounting for error in the customer forecasts and no credible benchmark time series forecast. There was little internal alignment between product and finance functions on the machine count forecasts; it was not uncommon to see a 2x difference in the machine count forecasts during annual planning discussions. These were effectively competing forecasts, and there was no clear process for reconciling them, nor for documenting a single plan-of-record consensus forecast visible to all.

This disconnect between customer and finance forecasts often was not resolved outside of supply chain lead time (over 6 months for some parts). The teams planning component inventory were therefore left having to judge how much to discount the customer forecast (and risk being responsible for a shortage) or uplift the finance guidance (and risk being responsible for excess inventory). Even when the operations team brought materials in early as a safety stock hedge, the forecasted mix of parts would often be wrong. The result was both poor on-time delivery due to shortages of specific parts and excess inventory due to overall machine count forecasts being too high. In the face of poor on-time delivery and long lead times for new machines, internal customers needed to hold large reserves to buffer against unpredictable machine deliveries. We had the worst possible outcomes: high supply chain inventory and a poor customer experience that led to high idle deployed inventory in the fleet.

First attempt: stats forecast

Our first attempt at fixing this problem was to remove all humans from the forecasting loop: we placed all our bets on the statistical forecasts for new component demand. Our data scientists developed statistical forecasts for each machine component category and determined the forecast quantile we needed to plan against for each component type to meet on-time delivery targets for machines. We invested several quarters in building and tuning the forecasting models to reduce the error as much as possible. But this was never implemented, because the predictions required us to triple our safety stock inventory in order to meet our on-time delivery targets. The forecast ranges were too wide, and so the solution was simply too expensive. Component inventory is planned to high quantile forecasts, and the high quantile forecast was driven by outlier behavior in the past, such as inorganic jumps in demand due to new product launches or specific machine configuration changes our customers requested. Our customers often knew when these changes were coming. Trying to forecast based on history alone, our fully-automated approach was ignoring this forward-looking information.

Forecast triangulation

We needed to find a more efficient way to buffer for customer demand uncertainty that didn’t allow for unbounded customer forecast error based on past inorganic events or machine configuration changes. We therefore invested in processes and tools that would consider customer forecasts as proposed forecasts, compare them with benchmark forecasts, and arrive at consensus high scenario forecasts that supply chain teams could plan materials against. This required investment from software engineering teams to capture forecasts from our customers in machine-readable form and convert those forecasts into units relevant to component capacity planning. It required investments from our data science team to re-think our statistical forecasting approach to make it easier to compare against customer forecasts. Instead of forecasting machine components as we had first tried, we forecasted closer to the true source of demand: customer compute and storage load at the data center level. We forecasted load at a finer granularity that allowed us to compare customer and statistical capacity forecasts directly, netting load growth against existing capacity already deployed in the fleet. Our prediction intervals were also shared with our customers as a tool to rightsize the amount of deployed inventory they needed to hold in the fleet. With this change in focus, while leaving the specific mix of machine configs to those customers who required special machine types, our statistical forecasts were more stable, with narrower ranges and a much more credible benchmark.

The data science team also defined metrics to drive forecast accountability for both the operations teams and customer teams. Any significant shortage or excess in the consensus forecast could be traced back to its root cause based on the consensus forecast of record at lead time. The operations team facilitated a monthly process to drive a rolling 4-quarter alignment between customers and Finance on the consensus “base” and “high” forecasts, so that all downstream teams could confidently execute according to the high scenario forecast.

This consensus building process helped us shape the future in a way that neither the customer-driven nor the stats forecast alone could do. By combining data science with process rigor, we could expose key risks and better manage them, and expose disagreements between stakeholders and negotiate to resolve them. This process reduced the mean percent error (forecast bias) of the consensus forecast. We also learned that most of our customers, most of the time, weren’t actually asking us to cover nearly as large an uncertainty range as estimated by our first attempt at an automated component forecast. The results are summarized in the table below.


Compared with a purely algorithmic forecast, including humans in the loop certainly adds ambiguity, complexity and effort. This can be uncomfortable for data scientists, and can make us vulnerable to feeling insufficiently technical or scientific in our approach. To the extent possible, we all want to take technically rigorous approaches that are free from human bias. We dream of the one model to rule them all that has access to the perfect set of useful features with a long data history from which to learn. But in strategic forecasting, the available time series is short relative to the forecast horizon, and the time series is likely to be non-stationary. Perfection is not possible. Ambiguity already exists in the business problem and in the variety of information one can bring to bear to solve it. Models that ignore key business drivers or uncertainties due to lack of hard data bring their own type of bias. It is the data scientist’s job to grapple with the ambiguity, frame the analytical problem, and establish a process in which decision makers make good decisions based on all the relevant information at hand. We believe this applies as much to forecasting as any other kind of data science. With oversight from good data scientists, there is much value in having humans in the loop of strategic forecasts.

[5] Graves, S.C. and A.G. de Kok. Supply Chain Management: Design, Coordination and Operation, 2003. See the chapters “Supply Chain Coordination with Contracts” by G. Cachon and “Information Sharing and Supply Chain Coordination” by F. Chen.

