Data Meshes are the Future of Data Architecture

Data management has long been a challenge for growing businesses. For decades, businesses have relied on vast data lakes or monolithic data warehouses to manage their data. Unfortunately, the extensive investments many businesses have made in data warehousing are not paying off the way they would like. That’s because centralized data platforms are ill-suited to handle two important problems that often crop up, both of which data meshes can solve.

The first is the problem of data source proliferation: the sheer amount of data, from an ever-growing number of sources, that is available today. A centralized data structure forces multiple (and growing) data sources through a single ETL (Extract, Transform, Load) pipeline, where data engineers process and transform the data for each domain’s needs. Unfortunately, this means less control over increasing volumes of data, and it can overload your central platform, because different transformations are often required depending on how each team or consumer uses the data. A single team overseeing all data ingestion, transformation, and distribution can quickly run into problems, especially if the business needs to scale up.

The second is the problem of innovation. Traditional data lakes may fail to support a business’s innovation cycle, which usually involves testing a huge number of different use cases, each requiring its own data transformations. As a result, there is often a long delay between initial testing and evaluating the results.

Data meshes solve both of these problems by creating a separate ETL pipeline for each domain of a business. The same data and syntax standards are applied to all data assets within the data lake to maintain interoperability across domains, but each domain is responsible for managing, transforming, and using its data as needed. This way, each domain can scale and innovate on its own, without relying on a single, separate team of engineers juggling competing priorities from different departments. Data producers and consumers get to work closely together to create the best data product they can for each use case.
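
To make the idea concrete, here is a minimal sketch in Python of two domains running their own pipelines against the same shared data. Everything in it is a hypothetical illustration rather than a reference implementation: the field names in REQUIRED_FIELDS, the sample records in LAKE, the DomainPipeline class, and the two transform functions are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical shared standard: every record in the lake must carry these fields.
REQUIRED_FIELDS = {"customer_id", "amount", "timestamp"}

# Hypothetical shared data that every domain reads from.
LAKE = [
    {"customer_id": "c1", "amount": 120, "timestamp": "2024-01-15"},
    {"customer_id": "c2", "amount": 80, "timestamp": "2024-01-20"},
    {"customer_id": "c1", "amount": 50, "timestamp": "2024-02-03"},
]


def validate(records: list) -> list:
    """Reject any record that breaks the shared standard."""
    for record in records:
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            raise ValueError(f"record missing fields: {sorted(missing)}")
    return records


@dataclass
class DomainPipeline:
    """One pipeline per domain; the domain owns its own transform."""
    domain: str
    transform: Callable[[list], list]

    def run(self, lake: list) -> list:
        # Extract with the shared checks, then apply the domain's own transform.
        return self.transform(validate(lake))


def sales_transform(records: list) -> list:
    # The sales domain wants total spend per customer.
    totals = {}
    for record in records:
        key = record["customer_id"]
        totals[key] = totals.get(key, 0) + record["amount"]
    return [{"customer_id": c, "total_spend": t} for c, t in sorted(totals.items())]


def finance_transform(records: list) -> list:
    # The finance domain wants amounts labelled by month (YYYY-MM).
    return [{"month": r["timestamp"][:7], "amount": r["amount"]} for r in records]


sales = DomainPipeline("sales", sales_transform)
finance = DomainPipeline("finance", finance_transform)

sales_product = sales.run(LAKE)
finance_product = finance.run(LAKE)
```

Both domains read the same records and pass the same validation, but each produces its own data product without waiting on a central team. In a real deployment the shared standard would more likely live in a schema registry or data contract than in a Python set.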

Although data meshes are the next big thing and fix some very important problems, there are a few downsides you should consider. The first concern is data duplication. Because each domain’s ETL pipeline draws from the same data lake and transforms the data to suit its own specialized needs, duplication becomes more likely, and redundant data can increase data management costs and resource usage. The second is data quality: because each ETL pipeline is independent, quality problems can emerge, so you’ll need to invest in data identification and governance policies. Finally, if you do decide to invest in a data mesh, remember that you’re likely to face resistance to change. Data warehouses and lakes have been the de facto data architecture for 20 years now, so change management may be required.

Businesses are increasingly moving to data-driven strategies and decision-making, so having the right data architecture is incredibly important, and data meshes offer an effective new means of overcoming the problems of traditional data architectures. If you’re interested in making the shift to a data mesh for your organization, contact TRINUS today and we’ll be happy to discuss the details.

Sincerely,

The TRINUS Team
trinustech.com