Three Ways Data Products Empower Internal Users

Many companies want to give their employees access to data, but are overwhelmed by its size and complexity, as well as by the security and privacy risks inherent in opening it up. One powerful way that companies are overcoming these challenges is by embracing the concept of a data product.

A data product is an application that’s created to enable users to access curated data or insights generated from data. Data products can be developed for an external audience, such as Netflix’s movie recommendation system, or they can be used internally, such as a sales data product for regional managers.

The data product is not a new concept. The idea can be traced back to 2012, when DJ Patil, the Chief Data Scientist under the Obama Administration, wrote and published “Data Jujitsu: The Art of Turning Data into Product.” Ten years later, Zhamak Dehghani was turning heads with the concept of using a data mesh to enable individual data teams to build data products.

In its Hype Cycle for Data Management 2023, Gartner said a data product is “a curated and self-contained combination of data, metadata, semantics and templates. It includes access and implementation logic certified for tackling specific business scenarios and reuse. A data product must be consumption-ready (trusted by consumers), kept up to date (by engineering teams) and approved for use (governed). Data products enable various data and analytics (D&A) use cases, such as data sharing, data monetization, domain analytics and application integration.”

As companies continue to stockpile huge amounts of data, they’re turning to data products to help them turn the deluge into insights. Here are three ways that data products can help companies empower internal users:

1. Enabling Data Exploration

The typical company stores vast amounts of data across a multitude of data silos, including databases, file systems, object stores, and even directly within applications. Knowing what’s contained in those data stores is a massive challenge in its own right, and is step one in the data product journey.

Many companies today are adopting data catalogs to help them explore structured and unstructured data in a controlled and predictable manner. Data catalog vendors like Alation and others use metadata to track data within an enterprise and use indexes and other methods to help customers find the data they need. In addition to catalogs, Alation helps control access to data through data governance, and supports the concept of a Data Products Marketplace, where users can browse a variety of data products their company exposes, including domain-specific data products created as part of a data mesh.

John Williams, the executive director of enterprise data and advanced analytics at RaceTrac, is using Alation’s software to simplify the company’s complex data environment.

“Our goal is to shift to a product mindset and treat data as a product,” he says in a quote on the Alation website. “In addition, we will align our data resources closer to the domains they serve for better knowledge sharing and faster delivery. Alation will help us unlock business value through data products extending data decision-making in our organization.”

While data silos are proliferating, the data lakehouse concept is also gaining momentum among companies that value the cost effectiveness of object storage but don’t want to give up the high-quality data that a traditional database affords. Dremio is enabling its customers to build data products atop a data lakehouse, using either a full Dremio stack or Dremio tools in combination with other vendors’ software.

“We can connect to a remote catalog, whether that’s Glue, Polaris, or Databricks Unity. They can connect to any of those as a source and then use Dremio’s” query engine, James Rowland-Jones (JRJ), Dremio’s vice president of product management, told BigDATAwire at the AWS re:Invent 2024 conference.

“A better scenario is that they are building and using more of our UI and our experience to curate data products, which basically you can think of as like descriptors like wikis, labeling, tagging, orchestration, security and everything else, and they’re building that using a catalog that Dremio has provided,” JRJ continued. “And then the best scenario is when they are then building on top of that and really prepping themselves for AI-ready data. This is the ‘good, better, best’ progression that we are seeing.”

2. Ensuring Data Quality

One important aspect of a data product is the quality assurance it affords. Raw data often contains errors or needs a certain degree of shaping and transformation before it can be used. This is particularly true for derived data sets that are used as the source for downstream data products, as well as data that’s used for training AI models.

Companies can use various techniques for ensuring high-quality data in data products. Ataccama, for instance, enables data engineers to set up and enforce data quality rules that ensure that data meets minimum standards. That’s important considering that the vendor recently found that 41% of organizations report data quality as a major challenge.
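To make the idea of a data quality rule concrete, here is a minimal sketch in plain Python. It is not Ataccama's product or API. The `QualityRule` class, the `pass_rates` and `meets_minimum` helpers, the column names, and the sample rows are all hypothetical, chosen only to show how a rule can be declared once and then enforced against a minimum standard.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class QualityRule:
    """A named check that a single row either passes or fails (hypothetical)."""
    name: str
    check: Callable[[Row], bool]


def pass_rates(rows: list[Row], rules: list[QualityRule]) -> dict[str, float]:
    """Return the fraction of rows that pass each rule."""
    if not rows:
        # No rows means no violations, so every rule trivially passes.
        return {rule.name: 1.0 for rule in rules}
    return {
        rule.name: sum(1 for row in rows if rule.check(row)) / len(rows)
        for rule in rules
    }


def meets_minimum(rows: list[Row], rules: list[QualityRule], threshold: float) -> bool:
    """True only if every rule's pass rate reaches the minimum standard."""
    return all(rate >= threshold for rate in pass_rates(rows, rules).values())


# Illustrative rules for an imaginary sales data set.
RULES = [
    QualityRule("region_present", lambda r: bool(r.get("region"))),
    QualityRule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

# Made-up sample data: one row lacks a region, one has a negative amount.
SAMPLE_ROWS = [
    {"region": "east", "amount": 120.0},
    {"region": "", "amount": 75.5},
    {"region": "west", "amount": -10},
    {"region": "north", "amount": 42},
]

if __name__ == "__main__":
    print(pass_rates(SAMPLE_ROWS, RULES))
    print(meets_minimum(SAMPLE_ROWS, RULES, threshold=0.9))
```

Keeping the rules as data, separate from the code that evaluates them, is one way a team could review and version its minimum standards alongside the data product itself.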

Data transformations developed using tools like dbt can also help to embed quality controls directly into data product pipelines. Mark Potter, the CTO of dbt Labs, says dbt plays a big role in validating data used for downstream data products.

“One of the things that dbt does is it lets you know for sure that out of those hundreds of data sources coming in, those tens of data products are up to date and valid, and kept up to date at the right pace, at minimal expense,” Potter says. “The data product…has the quality stamp of approval on it from dbt.”

3. Providing Data Governance

Another way that data products can empower data-driven decision-making is through rigorous data governance. By automating the processes that assure companies that correct procedures are being followed regarding the provenance, lineage, security, and privacy of data, companies can move more quickly with their data product rollouts without worrying whether shortcuts are being taken.

One of the vendors providing data governance capabilities for data product development is Collibra. The company, which was listed in the Leaders Quadrant in Gartner’s first-ever Magic Quadrant for data governance, is a backer of data meshes as a way to process and share data as a product.

Stijn Christiaens, Collibra’s co-founder and chief data citizen, says it’s important for data governance that every data product has an owner.

“Every data product the company builds needs to have a data product owner, because that owner is responsible for controlling access, and many other things,” Christiaens says. “You need to have roles and responsibilities. You need to have process in both data and AI governance.”
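The ownership principle can be illustrated with a short Python sketch. This is not Collibra's software or API. The `DataProduct` class, its `grant` and `can_read` methods, and the user names are hypothetical, and the model assumes the simplest possible policy: a product cannot exist without an owner, and only the owner can grant read access.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A data product with a mandatory owner who controls access (hypothetical)."""
    name: str
    owner: str
    readers: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Reject products that have no accountable owner.
        if not self.owner.strip():
            raise ValueError("every data product needs an owner")

    def grant(self, granted_by: str, user: str) -> None:
        """Give `user` read access; only the owner may do this."""
        if granted_by != self.owner:
            raise PermissionError(f"{granted_by} does not own {self.name}")
        self.readers.add(user)

    def can_read(self, user: str) -> bool:
        """The owner and any user the owner has approved can read."""
        return user == self.owner or user in self.readers


if __name__ == "__main__":
    product = DataProduct("regional_sales", owner="alice")
    product.grant("alice", "bob")
    print(product.can_read("bob"))
    print(product.can_read("carol"))
```

A real governance platform would layer roles, approval workflows, and audit trails on top of this, but the sketch shows why a named owner is the starting point: every access decision traces back to one accountable person.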

Data products have the potential to democratize access to data and accelerate adoption of analytics and AI to better position a company to compete. The best data products are custom-developed and are themselves products of various tools and techniques that companies can bring together. Specifically, the roles that data exploration, data quality, and data governance play in enabling data product development should not be overlooked by prospective data product users.

The post Three Ways Data Products Empower Internal Users appeared first on BigDATAwire.
