
Starschema Inverba NLG™

Transform dashboards into easy-to-read text

From Dashboards to Words

As humans, we think and communicate in narratives. Numbers and visualizations help, but taken on their own, they often don’t tell the full story. A good dashboard may be useful for showing trends and conveying metrics, but it’s words that resonate with decision-makers and provide them with important context and situational awareness — especially where complex, multi-source intelligence needs to be condensed to a concise and accessible narrative.

In many cases, this condensing is done manually, at great expense of time, and is subject to personal bias in deciding what goes into the verbal summary. Starschema’s Inverba NLG™ solution combines natural language generation (NLG) technology, saliency detection and advanced NLG realization (composing grammatically and syntactically correct sentences from a knowledge graph) to support the creation of actionable, well-written text-form reports directly from dashboards. Whether the purpose is creating verbal summaries or presenting dashboard data with a verbal explanation that puts matters into context, Inverba NLG™ is designed to convert enterprise dashboards into well-composed, informative prose.

How Inverba NLG™ works

Unlike other natural language report generators, Inverba NLG™ isn’t merely concerned with generating natural language reflections of data. Rather, using a combination of anomaly detection, pre-set values and a dynamic template, it creates a summary of data that points out what matters most. Instead of converting the data through a rigid template to a static, repetitive text output, a saliency model identifies what the most important results are, and adjusts syntactic, grammatical and compositional structure accordingly.

Templates provide overall sources as well as ‘saliency priors’, that is, initial values that describe the saliency of a particular KPI. These priors are updated by the quantitative saliency metric of the KPI in question (for example, a rapid spike in a KPI makes that metric more salient, while no change makes it less salient). Inverba NLG™ then fills the dynamic template with data, prioritizing high-saliency data (e.g. unexpected or anomalous data, data that has exhibited a significant change or data meeting certain pre-defined critical thresholds). The result is a natural language summary of one or multiple dashboards that reflects the most important insights, does so concisely and puts data into context, thus guiding decision-makers’ attention to the most important facts.
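The idea of a saliency prior being adjusted by observed change can be illustrated with a minimal Python sketch. This is not Inverba NLG™’s actual model: the function names, the 5% `neutral_change` baseline and the linear scaling rule are all assumptions chosen purely for illustration.

```python
def relative_change(previous: float, current: float) -> float:
    """Absolute relative change between two readings of a KPI."""
    if previous == 0:
        # Assumption: treat any move away from zero as a 100% change.
        return 0.0 if current == 0 else 1.0
    return abs(current - previous) / abs(previous)


def updated_saliency(prior: float, previous: float, current: float,
                     neutral_change: float = 0.05) -> float:
    """Scale a template-supplied prior by how much the KPI moved.

    Hypothetical rule: a change equal to `neutral_change` leaves the prior
    untouched, no change halves it, and larger changes raise it linearly.
    """
    factor = 0.5 + 0.5 * relative_change(previous, current) / neutral_change
    return prior * factor


if __name__ == "__main__":
    # A flat KPI becomes less salient, a spiking KPI more salient.
    print(updated_saliency(0.4, previous=100, current=100))
    print(updated_saliency(0.4, previous=100, current=150))
```

Under this assumed rule, a KPI with a modest prior can still outrank a nominally more important KPI if it moves sharply, which is the behavior the paragraph above describes.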

Technology

Inverba NLG™ is designed from the ground up to transform dashboards into coherent, easy-to-read text that maintains sufficient consistency to support standardized reporting needs but offers enough flexibility to emphasize what matters most. Unlike purely template-driven approaches, Inverba NLG™ does not need a rigid template definition, and can guide the reader to the most important items first. A saliency model, driven by both pre-set business rules (such as reporting requirements) and data-inherent saliency detection (anomaly detection, trends, differentials), evaluates each metric and composes the output text in a manner that guides business users’ attention to the most important KPIs. Each of these saliency metrics can be provided by an external system through standardized APIs (REST, SOAP, RPC). Composition and sequential order are used to highlight the most salient changes, ensuring that even readers under time pressure get the ‘bottom line up front’.
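As a rough sketch of how business rules and data-driven saliency might jointly determine what is reported and in what order, consider the following Python example. The `ScoredSnippet` class, the `required` flag and the `max_items` limit are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class ScoredSnippet:
    text: str
    data_saliency: float    # e.g. supplied by anomaly or trend detection
    required: bool = False  # hypothetical business rule: always report this


def compose(snippets: list[ScoredSnippet], max_items: int = 3) -> str:
    """Keep required snippets, fill remaining slots by saliency,
    then order everything most-salient-first ('bottom line up front')."""
    required = [s for s in snippets if s.required]
    optional = sorted((s for s in snippets if not s.required),
                      key=lambda s: s.data_saliency, reverse=True)
    chosen = required + optional[:max(0, max_items - len(required))]
    chosen.sort(key=lambda s: s.data_saliency, reverse=True)
    return " ".join(s.text for s in chosen)


if __name__ == "__main__":
    report = compose([
        ScoredSnippet("Revenue was flat.", 0.1, required=True),
        ScoredSnippet("Churn spiked by 40%.", 0.9),
        ScoredSnippet("Support tickets fell slightly.", 0.2),
    ], max_items=2)
    print(report)  # prints "Churn spiked by 40%. Revenue was flat."
```

In this sketch the business rule decides inclusion, while the data-driven score decides ordering; other ways of combining the two are equally plausible.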


Figure 1. Starschema Inverba NLG™ generates a text output from a source dashboard by isolating individual metrics or KPIs, then evaluating their saliency based on pre-set business rules, trends, anomalies and differences.

Extensively customizable, Inverba NLG™ can accommodate a wide range of KPIs, including quantitative metrics (values and changes), qualitative metrics (names, locations, products) and categories. The underlying semantic model supports commonly used notions in business intelligence, including Semantic Drill-Down, which refers to using downstream data to explain an upstream KPI. This is especially useful for explaining the drivers of a particular KPI: Semantic Drill-Down would be used, for instance, to accompany a KPI showing a rise in overall sales with a list of the top products that account for most of that increase.
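The sales example can be sketched in Python as follows. This is only an illustration of the drill-down idea, not the product’s semantic model: the function names, the 80% coverage threshold and the sample figures are assumptions.

```python
def top_drivers(deltas: dict[str, float], share: float = 0.8) -> list[str]:
    """Return the largest positive contributors that together cover
    at least `share` of the total increase."""
    total_increase = sum(d for d in deltas.values() if d > 0)
    drivers: list[str] = []
    covered = 0.0
    for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
        if delta <= 0 or covered >= share * total_increase:
            break
        drivers.append(name)
        covered += delta
    return drivers


def explain(kpi_name: str, deltas: dict[str, float], share: float = 0.8) -> str:
    """Pair the upstream KPI change with its downstream drivers."""
    total = sum(deltas.values())
    drivers = top_drivers(deltas, share)
    if total <= 0 or not drivers:
        return f"{kpi_name} changed by {total}."
    return f"{kpi_name} rose by {total}, driven mainly by {', '.join(drivers)}."


if __name__ == "__main__":
    # Hypothetical per-product changes in sales.
    product_deltas = {"Widget": 50, "Gadget": 30, "Gizmo": 15, "Doohickey": 5}
    print(explain("Overall sales", product_deltas))
    # prints "Overall sales rose by 100, driven mainly by Widget, Gadget."
```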

In addition, Inverba NLG™ can factor external anomaly and trend detection metrics into the saliency of a metric, seamlessly integrating the business’s own tried and tested anomaly and trend detection algorithms into the NLG workflow. Because such algorithms are often domain-specific, businesses get the best of two worlds: a high-quality anomaly and/or trend detection framework without additional investment, alongside Starschema’s powerful Inverba NLG™ core language output generation capabilities.

Architecture

Using state-of-the-art containerization practices, Inverba NLG™ operates as a fully isolated appliance with complete security against data leakage. REST/SOAP API interconnectivity is provided to pull in data from external data sources where required, and connectors to a range of data sources (RDBMSs, NoSQL databases, graph databases, BI tools) are provided to integrate with enterprise systems. The built-in scheduler can maintain a flexible operating schedule, and reports are versioned automatically.


Figure 2. Designed for fully on-premises operation, Inverba NLG™ comes with ‘batteries included’, such as connectors to a wide range of BI tools and data sources, its own scheduler and output connectivity.

Written in Python, Inverba NLG™ is highly maintainable and offers enterprise-grade reliability, with integrated error reporting. Advanced safety features, such as no-cache on-the-fly generation of reports, are available where safeguarding commercially sensitive information is paramount. Snippets, each corresponding to a KPI or group of KPIs, are defined using a convenient and feature-rich templating environment that automatically caters for plain text and rich text recipients alike (including conditional formatting), and are composed into a coherent text output according to the saliency metrics. The output can be made available via a REST API, integrated into websites or automatically sent to e-mail addresses or distribution lists (outgoing e-mail server required).
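To make the notion of a snippet that serves both plain text and rich text recipients more concrete, here is a minimal Python sketch using only the standard library. It does not reflect the actual templating environment; `render_snippet`, its parameters and the choice of `<strong>` for conditional formatting are illustrative assumptions.

```python
from html import escape


def render_snippet(template: str, values: dict, rich: bool = False,
                   highlight: bool = False) -> str:
    """Fill a snippet template and render it as plain text or HTML.

    `highlight` stands in for conditional formatting: when set, the
    rich-text version is emphasized; plain text is left unchanged.
    """
    text = template.format(**values)
    if not rich:
        return text
    body = escape(text)  # make the filled-in text safe for HTML output
    return f"<strong>{body}</strong>" if highlight else body


if __name__ == "__main__":
    template = "Sales rose {pct}% in {region}."
    values = {"pct": 12, "region": "EMEA"}
    print(render_snippet(template, values))
    print(render_snippet(template, values, rich=True, highlight=values["pct"] > 10))
```

The same snippet definition yields both renderings, so the composition step can stay independent of the delivery channel.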

A single instance of Inverba NLG™ can schedule and manage a range of different reports from different sources – for instance, a widely distributed monthly report and a highly confidential daily update for top executives could run in parallel without interference or risk of data leakage. Release approval functionality allows designated users to receive a preview before approving and sending the final version. Approvers can append notes and remarks to the e-mail, either in designated areas or by appending to the text of the main output.

Contact an expert:

Chris von Csefalvay

Principal Data Scientist

[email protected]

About Starschema

At Starschema, we believe that data has the power to change the world and that data-driven organizations are leading the way. We help organizations use data to make better business decisions, build smarter products, and deliver more value for their customers, employees and investors. We dig into our customers’ toughest business problems, design solutions and build the technology needed to compete and profit in a data-driven world.