What is data fabric?
Data fabric is a distributed, decentralized data analytics and management framework, or data integration layer design pattern, advocated by analysts at Gartner. It is largely seen as a competing framework to data mesh, advocated by industry thought leader Zhamak Dehghani. As Gartner articulates it, data fabric is an emerging architecture and data management design pattern for attaining flexible, reusable, and augmented data integration pipelines, in support of faster and, in some cases, automated data access and sharing. A key attribute of the design pattern is active metadata, which serves to automate data management tasks.
Further, per Gartner’s definition, a data fabric supports both operational and analytics use cases delivered across multiple deployment and orchestration platforms and processes. Data fabrics support a combination of different data integration styles and leverage active metadata, knowledge graphs, semantics and ML to augment data integration design and delivery.
It is important to understand that data fabric (and the same is true of data mesh) is a design pattern that is built from multiple data platform components, not bought via a single SKU.
How does data fabric work?
Data fabric is a distributed data architecture that enables data to be stored, processed, and unified across various locations and platforms. It supports data integration, virtualization, orchestration, metadata management, governance, security, and privacy.
A data fabric utilizes continuous analytics over existing, discoverable data assets, made possible by active metadata and AI/ML techniques, to support the design, deployment and utilization of integrated and reusable data across all environments, including hybrid and multi-cloud platforms.
Data fabric also provides a data catalog (a centralized repository of metadata and information about available data assets) and data access APIs that make it easier for users and applications to interact with data seamlessly.
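To make the catalog and active metadata ideas above more concrete, the following minimal Python sketch models a catalog entry that carries both technical metadata and operational ("active") metadata that automation can act on. The `CatalogEntry` and `DataCatalog` names, fields, and thresholds are illustrative assumptions, not the API of any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional

@dataclass
class CatalogEntry:
    name: str                      # logical asset name, e.g. "sales.orders"
    location: str                  # physical location (warehouse table, lake path, API)
    schema: Dict[str, str]         # column name -> type
    tags: List[str] = field(default_factory=list)
    # Active metadata: collected continuously from pipelines and queries
    last_accessed: Optional[datetime] = None
    access_count_30d: int = 0
    quality_score: float = 1.0     # 0.0-1.0, fed by profiling jobs

class DataCatalog:
    """A minimal in-memory catalog with search and a simple automation hook."""

    def __init__(self):
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, tag: str) -> List[CatalogEntry]:
        return [e for e in self._entries.values() if tag in e.tags]

    def recommend_for_archival(self) -> List[str]:
        # Active metadata driving automation: flag cold, low-quality assets
        # for review instead of waiting for a manual audit.
        return [
            e.name for e in self._entries.values()
            if e.access_count_30d == 0 and e.quality_score < 0.5
        ]

if __name__ == "__main__":
    catalog = DataCatalog()
    catalog.register(CatalogEntry(
        name="sales.orders",
        location="s3://lake/sales/orders/",
        schema={"order_id": "string", "amount": "decimal", "ordered_at": "timestamp"},
        tags=["sales", "gold"],
        last_accessed=datetime(2024, 1, 15),
        access_count_30d=42,
        quality_score=0.93,
    ))
    print([e.name for e in catalog.search("sales")])
    print(catalog.recommend_for_archival())
```

The point is the data shape: usage and quality signals sit next to the schema and location, so a fabric can drive tasks such as archival review automatically rather than through manual audits.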
What are some technologies used to support data fabric?
Activating a data fabric involves using a combination of tools and technologies to seamlessly integrate, manage, and analyze data across an organization. A data fabric architecture enables a unified and consistent view of data across different sources and environments. Here are the categories of tools that are useful when implementing and activating a data fabric:
- Data integration / iPaaS
- Data storage and management
- Data virtualization
- Data catalog
- Data governance
- Metadata management
- Analytics and business intelligence
- Data security and compliance
- Machine learning and AI
- Workflow orchestration
Remember that the specific tools you choose will depend on your organization’s requirements, existing infrastructure, and the nature of the data you’re working with. Additionally, the deployment might involve a combination of on-premises and cloud-based solutions based on your organization’s strategy.
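As one illustration of the "unified and consistent view across different sources" idea above, here is a toy Python sketch of data virtualization. It uses an in-memory SQLite database and a pandas DataFrame as stand-ins for two separate sources and assembles a single joined view at read time; a real virtualization engine would additionally push queries down to each source and handle security, caching, and dialect translation.

```python
import sqlite3
import pandas as pd

# Source 1: an "operational" database (stand-in for an OLTP system)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "EMEA"), (2, "AMER"), (3, "APAC")],
)
customers = pd.read_sql_query("SELECT customer_id, region FROM customers", conn)

# Source 2: an "analytics" extract (stand-in for a lake or SaaS export)
orders = pd.DataFrame(
    {"customer_id": [1, 1, 2, 3], "amount": [120.0, 80.0, 200.0, 50.0]}
)

# Unified view: revenue by region, assembled across both sources on demand,
# without physically consolidating the data into one store.
unified = orders.merge(customers, on="customer_id")
print(unified.groupby("region", as_index=False)["amount"].sum())
```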
What are the benefits of data fabric?
The benefits of data fabric are multifaceted:
- Technology-enabler specialists, such as IT directors, data platform managers, and data engineers, can experience a boost in productivity through automated integration builds and automated access to data, which relieves the burden of ever-growing workloads.
- Citizen integrators, such as business users and executives, also benefit from automated access to data, as well as from ease of use that helps them get their work with data done faster.
- The enterprise benefits from faster time-to-value from data across the organization, becomes more agile and more responsive, and builds a stronger data-driven, data-literate culture.
What are the challenges with data fabric?
Because a modern data fabric is built and assembled, not bought via a single SKU, challenges arise when the necessary components are not yet available or mature enough to achieve the vision of an automated data fabric environment. Examples include:
- Lack of metadata to enable the categorization and discovery of data assets.
- Difficulty creating integrations and automations due to hard-to-use interfaces.
- Undeveloped or immature AI/ML, which can lead to inaccurate results.
- Slow performance, which is traditionally the trade-off when federating a query across distributed data systems (see the sketch after this list).
- Lack of a seamless, cross-domain data integration fabric, which means data silos will not be eliminated.
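To illustrate the federated-query trade-off noted above, the following toy Python sketch uses an in-memory SQLite table as a stand-in for a remote source (all names and row counts are illustrative). It contrasts pulling every row to the fabric layer and filtering locally with pushing the predicate down so the source returns only matching rows, which is how mature engines mitigate the performance hit. Over a real network, the gap would be much wider.

```python
import sqlite3
import time

# Stand-in "remote" source with a modest amount of data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, "EMEA" if i % 100 == 0 else "OTHER") for i in range(200_000)],
)

# Naive federation: ship every row to the fabric layer, then filter locally
start = time.perf_counter()
rows = conn.execute("SELECT event_id, region FROM events").fetchall()
emea_naive = [r for r in rows if r[1] == "EMEA"]
print(f"fetch-then-filter: {len(emea_naive)} rows in {time.perf_counter() - start:.3f}s")

# Pushdown: let the source evaluate the predicate and return only matches
start = time.perf_counter()
emea_pushdown = conn.execute(
    "SELECT event_id, region FROM events WHERE region = 'EMEA'"
).fetchall()
print(f"pushdown: {len(emea_pushdown)} rows in {time.perf_counter() - start:.3f}s")
```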