Data mesh is an emerging domain-driven decentralized data architecture that
aims to minimize or avoid operational bottlenecks associated with centralized,
monolithic data architectures in enterprises. The topic has piqued
practitioners' interest, and there is considerable gray literature on it. At
the same time, we observe a lack of academic attempts at defining and building
upon the concept. Hence, in this article, we aim to start from the foundations
and characterize the data mesh architecture regarding its design principles,
architectural components, capabilities, and organizational roles. We
systematically collected, analyzed, and synthesized 114 industrial gray
literature articles. The review provides insights into practitioners'
perspectives on the four key principles of data mesh: data as a product, domain
ownership of data, self-serve data platform, and federated computational
governance. Moreover, because data mesh is comparable to service-oriented
architecture (SOA), we mapped the findings from the gray literature onto
reference architectures from the SOA academic literature to create reference
architectures that describe three key dimensions of data mesh: organization of
capabilities and roles, development, and runtime.
Finally, we discuss open research issues in data mesh, partially based on the
findings from the gray literature.