Unlocking Self-Serve Insights with Structured Data

Unlocking opportunity: Why structured data is the key to self-serve business insights

Not all data is created equal. Businesses of any size store, analyse and manipulate both structured and unstructured data on a daily basis. And, as every data team knows, applying business intelligence tooling to either one offers specific benefits and unique challenges. 

  

Plenty of third-party providers are building tools to enable access to unstructured data insights. Fluent’s SQL generation solution unlocks the untapped potential of structured data through natural language querying. Nothing on the market has cracked the challenge of offering insights into both formats simultaneously, and it’s particularly difficult with unstructured data. Anything other than structured data just isn’t consistent enough to train an LLM on. 

Some traditional BI tools can store the two separately, allowing very manual analysis - but only experienced analysts will be comfortable doing this. It would require huge internal culture change and extensive, ongoing upskilling to see non-technical adoption of traditional, incumbent BI tools.

Platforms like Tableau are hard to use as a beginner, with static dashboards and minimal flexibility to manipulate data. Snowflake has expanded into some self-service features, with more storage and basic query answering. Looker has tried to offer something similar, including some semi-structured data-handling capabilities. But none of them are pick-up-and-play style solutions. They’re built for technical people, to do technical tasks. 

As a data team, if you want to enable more self-service insights, you likely already see natural language querying as an effective means of achieving that goal.

Your first challenge is choosing which data store to build a solution around, so that you derive the most value from it. 

A quick recap  

  • Structured Data

Relational, discrete, smaller in size and fitted to a predetermined data model. Often tabularised.

Examples

CSVs, SQL databases, POS data. Almost always short, discrete, textual and/or numeric.

  • Unstructured Data

Larger, contextual, non-standardised and impractical to tabularise due to its inconsistent nature. 

Examples 

Audio or videos, longform documentation, images, blogs, PDFs, social media streaming, geographic data, search engine queries. 


Why we use structured data 

If you ask us, structured data is the truest way to calculate a deep, holistic view of a business. 

Many of the most requested business performance metrics are calculated with it: 

  • A marketing team can calculate the Customer Acquisition Cost of a new campaign.
  • A product team can work out the Customer Lifetime Value of their latest product release. 
  • A sales team can understand their Lead-to-Customer Ratio for a specific period of time, and make strategic decisions based on actual data. 
  • An HR team can better understand their hiring process, from time-to-hire to training cost per employee.
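
To make the first of these concrete, here is a minimal sketch of a Customer Acquisition Cost calculation over structured data. The table names, column names and figures are hypothetical and purely illustrative; a real warehouse will look different.

```python
import sqlite3

# Hypothetical tables and figures, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign_spend (campaign TEXT, spend REAL);
    CREATE TABLE customers (customer_id INTEGER, campaign TEXT);
    INSERT INTO campaign_spend VALUES ('spring_launch', 1200.0), ('autumn_promo', 900.0);
    INSERT INTO customers VALUES (1, 'spring_launch'), (2, 'spring_launch'),
                                 (3, 'spring_launch'), (4, 'autumn_promo');
""")

# Customer Acquisition Cost = campaign spend / customers acquired by that campaign.
CAC_QUERY = """
    SELECT s.campaign, s.spend / COUNT(c.customer_id) AS cac
    FROM campaign_spend AS s
    JOIN customers AS c ON c.campaign = s.campaign
    GROUP BY s.campaign, s.spend
    ORDER BY s.campaign
"""

# Each row is (campaign, cac), so the result maps campaign name to its CAC.
cac_by_campaign = dict(conn.execute(CAC_QUERY).fetchall())
print(cac_by_campaign)
```

Because the data already sits in consistent, related tables, the whole metric is a single join and a division.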

Structured data is tabularised, using shared, custom definitions and metrics provided by a data team. The uniformity and relational nature of clean, structured data are what make it so powerful when plugged into models like Fluent. When a user asks a question, text-to-SQL models can understand the shared context of the data available and clarify the query before answering. 
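
As a rough illustration of what "shared context" can mean in practice, the sketch below reads table and column names from a database and formats them as text that could accompany a user's question. This is an assumption about how such context might be assembled, not a description of how Fluent or any particular text-to-SQL model works. The tables and the `schema_context` helper are hypothetical.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (lead_id INTEGER, created_at TEXT, converted INTEGER);
    CREATE TABLE customers (customer_id INTEGER, lead_id INTEGER, signed_at TEXT);
""")

def schema_context(connection) -> str:
    """Describe every table as 'table(column TYPE, ...)', one table per line."""
    lines = []
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = connection.execute(f"PRAGMA table_info({table})").fetchall()
        column_list = ", ".join(f"{col[1]} {col[2]}" for col in columns)
        lines.append(f"{table}({column_list})")
    return "\n".join(lines)

context = schema_context(conn)
print(context)
```

The same consistency that makes structured data easy to query also makes it easy to describe, which is much harder to do for a folder of PDFs or recordings.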

Of all the applications of natural language solutions, we find the most consistency and immediate value-add using structured data. Generating insights manually is labour-intensive and time-consuming for analysts. The questions themselves, conversely, are time-sensitive and focused on gaining a quick, short-term advantage. 

TL;DR: Automating the workload of ad-hoc structured data analysis is far more valuable to a larger business model than another off-the-shelf chatbot or complex, expensive BI tool the average user can’t understand.  

  

Additional benefits of building on structured data

Less Data Preprocessing (launch faster)

  • Minimal preprocessing required because the data is already in a consistent format.
  • Data can be easily cleaned, filtered, and transformed.

Easier Data Integration (fewer development headaches)

  • Integration with BI tools is straightforward due to the standardised nature of the data.
  • Query languages like SQL can be used to retrieve and manipulate data efficiently.

Faster Model Training (launch *even* faster)

  • Machine learning models can be trained on structured datasets with clear features and labels.
  • Data teams can apply definitions and guardrails in SQL to tailor answer generation. 
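
A minimal sketch of what definitions and guardrails in SQL might look like: a view that pins down a shared definition of revenue, and a deliberately naive check that only read-only queries are run. The table, view and the `run_read_only` helper are hypothetical, and a production guardrail would need to be far more thorough.

```python
import sqlite3

# Hypothetical orders table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL, status TEXT);
    INSERT INTO orders VALUES (1, 10, 50.0, 'completed'), (2, 10, 20.0, 'refunded'),
                              (3, 11, 30.0, 'completed');
    -- Shared definition: revenue counts completed orders only.
    CREATE VIEW revenue_orders AS
        SELECT order_id, customer_id, amount FROM orders WHERE status = 'completed';
""")

def run_read_only(sql: str):
    """Naive guardrail: run the query only if it is a single SELECT statement."""
    statement = sql.strip().rstrip(";")
    if ";" in statement or not statement.lower().startswith("select"):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(statement).fetchall()

# Anyone querying the view inherits the data team's definition of revenue.
total_revenue = run_read_only("SELECT SUM(amount) FROM revenue_orders")[0][0]
print(total_revenue)
```

Putting the definition in a view means the refunded order is excluded for every question asked against it, rather than relying on each query to remember the rule.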

Higher Performance and Accuracy (use with confidence) 

  • High accuracy and performance due to the clear structure and organisation of data.
  • Easier to validate and interpret results.

How Fluent’s Query Augmentation supercharges your structured data 

“95% of questions are some version of something that has already been asked and answered in the past.”

We hear this sort of thing a lot during client calls - the above is a verbatim quote from one of our clients, a major global consultancy. Data teams are aware that they’re retracing steps when answering most ad-hoc questions. They might produce hundreds, if not thousands, of dashboards to try and reduce that workload. Fluent’s Query Augmentation, which leverages the historic SQL queries in your data warehouse, doesn’t just make answer generation faster. It makes answers richer, more insightful and more consistent, too. 

Rather than being a drag on a data team, repeated queries now become an efficiency driver, contributing to more consistent, enlightening conversations with the user. 
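
To illustrate the general idea of reusing historic queries, here is a simplified sketch that finds the previously answered question most similar to a new one, using plain word overlap (Jaccard similarity). This is not Fluent's implementation. The query log, the helper names and the choice of similarity measure are all assumptions made for the example.

```python
import re

# Hypothetical log of previously answered questions and the SQL that answered them.
HISTORIC_QUERIES = [
    ("monthly revenue by region",
     "SELECT region, SUM(amount) FROM sales GROUP BY region"),
    ("number of new customers per campaign",
     "SELECT campaign, COUNT(*) FROM customers GROUP BY campaign"),
    ("average time to hire by department",
     "SELECT department, AVG(days_to_hire) FROM hires GROUP BY department"),
]

def tokens(text: str) -> set:
    """Lower-case the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a: set, b: set) -> float:
    """Share of words two sets have in common, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def most_similar(question: str, history=HISTORIC_QUERIES, top_k: int = 1):
    """Return the top_k historic (question, sql) pairs closest to the new question."""
    q = tokens(question)
    ranked = sorted(history, key=lambda item: jaccard(q, tokens(item[0])), reverse=True)
    return ranked[:top_k]

best = most_similar("how many new customers did each campaign bring in?")
print(best[0][1])
```

A retrieved query like this could then serve as a starting point or a worked example when answering the new question, which is one way repeated questions can become an asset instead of a cost.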

For something that makes up only roughly 10-20% of the data the average business produces, structured data certainly packs a punch. 

Query Augmentation in practice

[Diagram: how historic SQL queries can expand the range of questions an LLM can answer]

The equivalent for unstructured data would be more unpredictable and lengthy, and would require a lot more development to achieve. Platforms like ChatGPT, Chatbase and Mendable’s chatbots tend to vaguely revise, summarise or classify unstructured data, instead of actually understanding its wider context and expanding on the data during conversation. 

Conclusion 

Which of your data offers the most value through BI enhancements? It’s easy to think that the largest proportion - your nuanced, unstructured data - must be. We see it differently. 


Which datasets cause your data team the most headaches? 

Which data is hardest to read without technical knowledge?

Which data is relied upon most often to answer basic, quantitative questions?

Structured data queries dominate your data team’s time, constantly being translated and manipulated to provide critical context. Fluent transforms the structured data that sits at the heart of a business model into a conversation anyone can have with minimal supervision. 

In short, we offer a market-ready, automated semantic layer for your most impactful and esoteric data. Months of development in minutes, and self-serve insights from day one. Maybe deciding where to start enhancing your data function doesn’t have to be so difficult?

If we’ve piqued your interest with the above, drop our team a line at the link below.  

Book a meeting with our team

To read, or download, part one of this content series - our research whitepaper ‘To Build or Buy’ - head here.