Monday, June 23, 2008

Understanding The Analytic Spectrum

Reporting and analytic solutions have a wide footprint. A simple listing of orders due to be delivered today will pass as a reporting solution, as will a report on total corporate spend across a product category. However, that is pretty much where the similarity ends. Almost everything else about these reports is significantly different, including the business process each supports, the audience, the frequency, and the process for producing the two reports. This complexity is generally hidden from the users, and frequently produces frustration in the relationship between IT and business over the long lead-times and large budgets involved in deploying reporting solutions.

Below we present the analytic spectrum from a techno-functional point of view, to give IT-centric teams the business understanding and the business teams the technology understanding.

Operational Reporting is the lowest granularity of reporting. Its objective is to support day-to-day operations of a specified role. These reports need real-time data, and any exceptions need to be addressed immediately. These reports are frequently part of the application that supports the business function, and are directly queried from the underlying applications’ OLTP (on-line transaction processing) database or its mirror.

For example, take the Purchase Order Management function. An expediter may need a listing of POs that are late for delivery. This is simply a list of POs that fit a user-specified date filter, where the need date has passed and the status of the PO has not changed. An inventory analyst may need a list of all POs expected to be received today to make allocation decisions, or a financial analyst may need a list of all POs received the previous day for accruals. All three reports are produced immediately by the Purchase Order Management application directly from its transaction data, and no data-processing is required to create them. The target audience for operational reporting is the people who manage daily operations of supply chain functions like purchasing, receiving, shipping, etc.
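To make this concrete, here is a minimal sketch of such an operational report in Python. The PO fields, statuses, and dates are purely illustrative and not taken from any particular application; a real report would query the OLTP database directly.

```python
from datetime import date

# Hypothetical PO records as they might appear in an OLTP table;
# field names and statuses are illustrative only.
purchase_orders = [
    {"po": "PO-1001", "need_date": date(2008, 6, 20), "status": "OPEN"},
    {"po": "PO-1002", "need_date": date(2008, 6, 25), "status": "OPEN"},
    {"po": "PO-1003", "need_date": date(2008, 6, 18), "status": "RECEIVED"},
]

def late_pos(pos, as_of):
    """Operational report: POs whose need date has passed but whose status
    has not changed -- the expediter's list, straight from transaction data."""
    return [p for p in pos if p["need_date"] < as_of and p["status"] == "OPEN"]

print([p["po"] for p in late_pos(purchase_orders, date(2008, 6, 23))])
# ['PO-1001']
```

Note that no consolidation or historical processing is involved: the report is a simple filter over live transaction data.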

Process Support Analytics is the next level of reporting where the data from Operational Reporting applications is consolidated, processed, and used to create process metrics. These process metrics typically point to inefficiencies in the processes, and help the managers tune them for better performance. These reports typically lose the individual transaction character present in the operational reporting. While an expediter needs the list of PO line-items due on a given day (operational), a manager may need information on the number of items that needed to be expedited from a given vendor in a month to establish if the process is operating normally or not.

This type of analysis typically needs information over a longer time horizon to compare and establish trend lines. The individual transaction information is consolidated and processed to produce counts, summaries, cumulative values, and so on. The reports are typically produced by moving the transaction data from the application OLTP database to a process-centric database that consolidates the information. For example, you may have a purchase database where all purchase transactions from all purchasing applications are brought together. In order to bring together data from disparate systems, the data may need standardization, cleansing, and referencing. The data is not real-time, and is typically brought over after the active life of the transaction is over, for example after the POs are “closed”. Such data stores are often called operational data stores (ODS).
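As a small sketch of the consolidation step, the expediting example from above can be turned into a process metric: expedited line-items per vendor per month. The records and vendor names here are invented for illustration.

```python
from collections import Counter

# Illustrative closed-PO records as they might land in an ODS-style store
# after the active life of the transactions is over.
closed_pos = [
    {"vendor": "Acme", "month": "2008-05", "expedited": True},
    {"vendor": "Acme", "month": "2008-05", "expedited": True},
    {"vendor": "Acme", "month": "2008-05", "expedited": False},
    {"vendor": "Best", "month": "2008-05", "expedited": True},
]

# Process metric: count of expedited line-items per vendor per month.
# The individual transaction character is lost; only the summary remains.
expedite_counts = Counter(
    (p["vendor"], p["month"]) for p in closed_pos if p["expedited"]
)
print(expedite_counts[("Acme", "2008-05")])  # 2
```

A manager would compare such counts month over month to judge whether the process is operating normally.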


Decision Support Analytics, finally, not only consolidates data within a process, but actually combines it across processes. The objective of decision support analytics is to provide inputs for improving corporate efficiencies across processes through better planning and optimization. Combining data across processes typically requires companies to harmonize all master data so that the transactions from different business processes can be consolidated within the same context.

For example, producing total spend for a given product category across all vendors means that the financial and purchasing systems either share common vendor, item, currency, and item-hierarchy data, or must know the mutual references needed to produce the common context.
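A minimal sketch of that cross-referencing, with entirely hypothetical system names, vendor codes, and amounts: each application's local vendor code is mapped to a common master vendor id before the spend is rolled up.

```python
# Hypothetical cross-reference: each application's local vendor code
# mapped to a common master vendor id.
vendor_xref = {
    ("purchasing", "V-17"): "ACME",
    ("finance", "0042"): "ACME",
}

# Transactions from two different systems, referring to the same vendor
# by different local codes.
transactions = [
    {"system": "purchasing", "vendor": "V-17", "category": "packaging", "amount": 1200.0},
    {"system": "finance", "vendor": "0042", "category": "packaging", "amount": 800.0},
]

# Total spend for one category, consolidated under master vendor ids.
spend = {}
for t in transactions:
    master = vendor_xref[(t["system"], t["vendor"])]
    if t["category"] == "packaging":
        spend[master] = spend.get(master, 0.0) + t["amount"]
print(spend)  # {'ACME': 2000.0}
```

Without the cross-reference, the two transactions would be counted against two apparently different vendors and the spend figure would be wrong.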

Deploying the Analytic Spectrum

While it is quite simple to provide operational reporting from individual applications, the complexity of the analytic environment increases exponentially for Process and Decision Support analytics. The most difficult part of establishing a well-functioning analytic environment is creating common reference master and organizing data. The common master data refers to entities like items, vendors, customers, locations, time, etc. that are used by several systems. The common organizing data refers to the hierarchies for items, locations, organizations, etc. that are used to process data up or down the hierarchies, or the groups that are used to create consolidated numbers.

Creation of common master and organizing meta-data is a pre-requisite for success, and requires clear leadership from business and IT teams. Business teams need to understand the need for a common reference, and provide the rules for cleansing and harmonizing this data. The IT teams need to be able to elaborate on that need, and establish data staging areas where such cleansing and harmonization can happen with proper error- and exception-handling strategies. Without such common reference data and an active IT-business partnership, any enterprise-wide reporting and analytic initiative is bound to fail.
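A toy sketch of what such staging-area cleansing might look like: business-supplied rules map raw vendor names to canonical ids, and anything that does not match is routed to an exception queue for review. The rules and names here are invented for illustration.

```python
# Illustrative business rules for harmonizing vendor names in a staging area.
CANONICAL = {"ACME CORP": "ACME", "ACME CORPORATION": "ACME", "BEST SUPPLY": "BEST"}

def harmonize(raw_names):
    """Apply cleansing rules; route unmatched records to an exception queue
    rather than silently dropping or guessing them."""
    clean, exceptions = [], []
    for name in raw_names:
        key = " ".join(name.upper().split())  # normalize case and whitespace
        if key in CANONICAL:
            clean.append(CANONICAL[key])
        else:
            exceptions.append(name)  # held for manual review
    return clean, exceptions

clean, exceptions = harmonize(["Acme Corp", "acme   corporation", "Unknown Ltd"])
print(clean, exceptions)  # ['ACME', 'ACME'] ['Unknown Ltd']
```

The exception queue is the important part: it is where the IT-business partnership shows up in practice, since only the business can decide what "Unknown Ltd" really is.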

In a future article we will look at the above spectrum in the supply chain context to establish what a supply chain reporting and analytics environment would look like.

Friday, June 6, 2008

Supply Chain Analytics & Data Discovery

When planning supply chain analytics, plan for conventional warehouse-based reporting and analytics, and for a more sophisticated data exploration environment for data mining and statistical analysis. While conventional reporting and analysis helps track and report on the efficiency of supply chain processes, the data exploration tools can actually enhance your ability to continuously optimize the process parameters to function at their best. The good news is that you can use the same data warehouse for both purposes.

Let us expand on the two underlying concepts for clarity.

Reporting and Analytics

This is the more conventional, familiar, and easier-to-understand area of analysis. Most companies have reporting environments that allow them to generate standard reports. Some also have analytical environments that allow users to interactively generate multi-dimensional views for dissecting the data as they need. Most such environments are based on data warehouses that pull planning and transaction data from the supply chain applications, and present this data with common master data references. Some well-known characteristics of such environments are listed below.

  • Pre-defined data models and dimensions: Data models are pre-defined and rigid. The dimensions are constructed based on the original data model design and any changes need IT effort. The metrics are generally a combination of pre-defined standard metrics, as well as user created ad-hoc metrics that are based on formulas and use the existing data in the warehouse.
  • Processed and harmonized data: The contextual data such as items, locations, time, vendors, customers, etc. (also known as master data) is pre-processed and harmonized across all reporting applications. This is important because very few corporations currently have enterprise master data management systems. Any enterprise level reporting needs consistent master data to pull together the information from various applications and geographies and present an enterprise view.
  • Tactical or strategic, but repetitive and consistent metrics: This is the defining attribute of these systems. The metrics are consistent and repetitive. It is this characteristic that makes the warehouse’s rigidly defined data models possible.
  • Ad-hoc analysis components do not equate to data discovery: While ad-hoc reporting may be available, it is limited to providing user-driven metrics that are computed on demand. The analytics may provide multiple data views by user-selected dimensions, and even consolidate data/metrics as the user selects a different view. However, none of this provides a true data discovery function of the kind we will discuss in the following section on data exploration.

The conventional reporting and analytics provides a great way to build and report metrics on several supply chain areas, such as inventory, supplier compliance and sales analysis.
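The ad-hoc slicing described above can be sketched very simply: the same fact data is consolidated along whichever dimension the user selects. The dimensions, facts, and figures here are illustrative only.

```python
# Hypothetical fact rows from a warehouse, with master-data dimensions attached.
facts = [
    {"region": "East", "category": "toys", "month": "2008-05", "sales": 100},
    {"region": "East", "category": "games", "month": "2008-05", "sales": 50},
    {"region": "West", "category": "toys", "month": "2008-05", "sales": 70},
]

def slice_by(dimension, rows):
    """Consolidate the sales metric along a user-selected dimension --
    the essence of an ad-hoc multi-dimensional view."""
    out = {}
    for r in rows:
        out[r[dimension]] = out.get(r[dimension], 0) + r["sales"]
    return out

print(slice_by("region", facts))    # {'East': 150, 'West': 70}
print(slice_by("category", facts))  # {'toys': 170, 'games': 50}
```

Note what this does not do: the user must already know which dimension to slice by. Nothing here discovers which dimension matters, which is the gap data discovery fills.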

Data Exploration & Discovery

Many supply chain applications leverage data patterns. As these patterns change and new ones emerge, such applications can produce solutions that are less than optimal. To keep the solutions at their most optimal levels, a data exploration and discovery environment should be created alongside the supply chain data warehouse. Such capability can provide clues to power users when the underlying data patterns change, and allow them to change application configurations proactively rather than react to such changes using a conventional reporting environment.

A data exploration/discovery environment has the following characteristics.

  • No pre-defined data models or dimensions: Data discovery environments typically have no pre-defined models. They thrive on raw data. The models for discovery are created by power users with a specific problem in mind. These models may be retained for future simulations, or thrown away after the target problem has been resolved or the underlying reasons have been discovered.

As an example, consider a product profitability profiling model using data mining techniques. Once defined using historical profitability data, this model can be reused to “predict” the profitability profile of new merchandise before the new products are introduced.

On the other hand, an inventory profiling model may change from one season to another as the underlying parameters that drive such profiles change with time.

  • Involves data discovery: These models use raw data and discovery algorithms to find new and hidden patterns in the data. The user does not have to “know” the data before using it; rather, the system “discovers” the relationships, similarities, profiles, etc. that exist in the data and provides the output of such “discovery” to the user for review and decision support.

For example, to create a profile of poorly performing stores, the user does not need to know what parameters to look for. Rather, the discovery algorithms can group the poorly performing stores by exploring the historical data, and “discover” the parameters that are most relevant for such profiling.

  • Uses raw data: Unlike the reporting environments, the data discovery algorithms need raw, unprocessed data to be most effective.
  • Helps in setting up decision parameters for downstream processes: The data discovery and exploration tools are decision support applications that help power users analyze data and decide how best to run other related processes. For example, demand forecasting, seasonal planning, and strategic sourcing processes can all benefit from such analysis by detecting changes in data patterns through discovery.
  • Relatively long term and strategic in nature.
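The store-profiling example above can be sketched with a tiny clustering routine. Everything here, the store names, the two features, and the toy k-means implementation, is invented for illustration; a real environment would use a proper data mining toolkit. The point is that the user supplies no grouping rules: the algorithm discovers the split from raw data.

```python
import random

# Hypothetical store features: (weekly sales index, shrinkage rate).
stores = {
    "S1": (0.9, 0.08), "S2": (0.85, 0.09), "S3": (2.1, 0.01),
    "S4": (2.0, 0.02), "S5": (0.8, 0.10),
}

def kmeans(points, k, iters=20, seed=1):
    """Tiny k-means sketch: discovers groupings in raw data without the
    user pre-specifying what distinguishes the groups."""
    random.seed(seed)
    centers = random.sample(list(points.values()), k)
    for _ in range(iters):
        clusters = {i: [] for i in range(k)}
        for name, p in points.items():
            # assign each store to its nearest center (squared distance)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(name)
        for i, members in clusters.items():
            if members:  # recompute each center as the mean of its members
                pts = [points[m] for m in members]
                centers[i] = tuple(sum(d) / len(d) for d in zip(*pts))
    return clusters

clusters = kmeans(stores, 2)
print(sorted(sorted(v) for v in clusters.values()))
# [['S1', 'S2', 'S5'], ['S3', 'S4']]
```

The low-sales, high-shrinkage stores fall out as one group without anyone having told the system that those are the parameters that matter.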

There are more supply chain processes that can benefit from data exploration and discovery, such as flow-path optimization, store-cluster optimization for determining merchandise assortments, and targeted marketing. In fact, any process that can benefit from the following functions is a good candidate for leveraging data exploration and discovery.

  • Processes that depend on statistical analysis such as inventory planning, demand planning, supply planning
  • Processes that can leverage data mining and clustering techniques such as creating inventory groups for maintaining inventory policies, store groups for assortments, merchandise groups for profitability
  • Processes that can leverage simulation such as inventory planning, allocations, etc.
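As one hedged illustration of the simulation point above, here is a toy Monte Carlo evaluation of an inventory policy. The policy, parameters, and demand model are all invented for the sketch; real inventory planning would use far richer demand and lead-time models.

```python
import random

def simulate_service_level(reorder_point, order_up_to, mean_demand, days, seed=7):
    """Monte Carlo sketch: fraction of demand filled from stock under a
    simple reorder-point policy with random daily demand and instant
    replenishment. All assumptions are deliberately crude."""
    random.seed(seed)
    on_hand, filled, demanded = order_up_to, 0, 0
    for _ in range(days):
        demand = random.randint(0, 2 * mean_demand)  # crude demand draw
        filled += min(demand, on_hand)
        demanded += demand
        on_hand = max(on_hand - demand, 0)
        if on_hand <= reorder_point:  # review and replenish to the target
            on_hand = order_up_to
    return filled / demanded if demanded else 1.0

# Compare two candidate reorder points by simulation rather than by formula.
print(round(simulate_service_level(5, 40, 10, 1000), 3))
print(round(simulate_service_level(15, 40, 10, 1000), 3))
```

Running many such simulations over candidate parameter settings is how simulation-based exploration supports inventory planning and allocation decisions.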

While most corporations plan and invest large amounts of capital in conventional reporting solutions, the high-end data exploration and discovery solutions are comparatively rare. Part of the reason is a lack of understanding of how these tools can be used, as well as a shortage of people who are skilled in such tools and also understand the business of supply chains.

Most of the techniques mentioned above in the context of data discovery and exploration are part of a larger discipline known as predictive modeling and analysis. Predictive modeling techniques are used for projecting and managing risks and are quite well adopted in the financial and insurance industries. However, their use for modeling and managing supply chains is still emerging.