What is data sharing?
  • 17 Jul 2024


Data sharing is the process by which teams give others, inside and outside their organization, access to valuable analytical data. It is an essential part of how organizations consume, distribute, and monetize data.

Traditional forms of data sharing involve intensive replication and integration of data. The source of the data (the provider) typically replicates a dataset and shares the duplicate file through a secure intermediary such as an SFTP server. In the past decade, providers have also built bulk APIs that let consumers request specific data programmatically, but many of the same challenges remain. In either case, the consumer is responsible for extracting the data from the delivery location, loading it into their own workspace, and then transforming it into an analytics-ready format.
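To make the consumer-side burden described above concrete, here is a minimal sketch of the extract, load, and transform steps in Python. It rests on several assumptions: the delivered file is simulated by an in-memory CSV string rather than a real SFTP download, SQLite stands in for the consumer's workspace, and the column and table names (`order_id`, `raw_orders`, `daily_revenue`) are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical contents of the duplicate file a provider dropped on an SFTP server.
DELIVERED_FILE = (
    "order_id,amount_usd,order_date\n"
    "1,20.00,2024-07-01\n"
    "2,5.00,2024-07-01\n"
    "3,42.50,2024-07-02\n"
)


def extract(raw_text):
    """Extract: parse the delivered copy (stands in for the download step)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def load(conn, rows):
    """Load: land the rows unchanged in the consumer's own workspace."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, amount_usd TEXT, order_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :amount_usd, :order_date)", rows
    )


def transform(conn):
    """Transform: cast and aggregate into an analytics-ready table."""
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT order_date, SUM(CAST(amount_usd AS REAL)) AS revenue
        FROM raw_orders
        GROUP BY order_date
        """
    )
    return conn.execute(
        "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
    ).fetchall()


def run_pipeline(raw_text=DELIVERED_FILE):
    """Run all three steps against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, extract(raw_text))
        return transform(conn)
    finally:
        conn.close()
```

Calling `run_pipeline()` returns one row per order date. Every step here runs on the consumer's side and has to be repeated each time the provider delivers a fresh copy, which is the integration burden the paragraph refers to.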

Modern data sharing builds on new sharing technologies developed by major platforms such as AWS, Snowflake, and Databricks. In one version, in-place sharing, a provider uses a sharing protocol (often developed by a platform) to grant a consumer access to a data product via a public identifier. The consumer then uses their own compute to transform the data as they see fit. The data appears in a ready-to-query format within the consumer’s main analytics platform (e.g., a data lake such as AWS S3 or a data warehouse such as Snowflake).
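The in-place model described above can be illustrated with a toy sketch in Python. This is not any platform's actual sharing protocol: the `Provider` and `DataProduct` classes, the `total_revenue` helper, and the `amount_usd` field are all hypothetical, and the public identifier is assumed here to identify the consumer. The point is only that a grant records an identifier instead of sending a copy, and that the consumer runs its own computation over the provider's single copy.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A provider-owned dataset plus the identifiers allowed to read it."""
    name: str
    rows: list
    grants: set = field(default_factory=set)


class Provider:
    """Toy provider that keeps the single copy of each data product."""

    def __init__(self):
        self._products = {}

    def publish(self, name, rows):
        self._products[name] = DataProduct(name, list(rows))

    def grant(self, name, consumer_id):
        # Granting access records an identifier; no data is copied or sent.
        self._products[name].grants.add(consumer_id)

    def append(self, name, row):
        # The provider updates its own copy; nothing is re-delivered.
        self._products[name].rows.append(row)

    def read(self, name, consumer_id):
        product = self._products[name]
        if consumer_id not in product.grants:
            raise PermissionError(f"{consumer_id} has no grant on {name}")
        # Hand back an iterator over the provider's rows rather than a duplicate.
        return iter(product.rows)


def total_revenue(provider, name, consumer_id):
    """Consumer-side compute: aggregate the shared rows as the consumer sees fit."""
    return sum(row["amount_usd"] for row in provider.read(name, consumer_id))
```

In this sketch, a consumer that has been granted access sees rows the provider appends later without any new delivery, while a caller without a grant gets a `PermissionError`.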

Topics: Traditional Sharing compared with Modern Sharing

Core Technologies

  • Traditional Sharing: SFTP, APIs
  • Modern Sharing: Sharing Protocols, Cross-cloud data transfer

Supporting Technologies

  • Traditional Sharing: MFT Software, API Management
  • Modern Sharing: Data Sharing Platforms

Key Difference

  • Traditional Sharing: Data is replicated and delivered to a consumer outside of their native analytical environment.
  • Modern Sharing: Access to the data is granted to the consumer in their native analytical environment.

Typical Advantages

  • Traditional Sharing: Decades of development and product maturity
  • Modern Sharing: Limited integration burden on consumers

Typical Disadvantages

  • Traditional Sharing: Substantial “integration” burden on consumers
  • Modern Sharing: Limited by platform and often region

