- 08 Nov 2023
- 3 Minutes to read
Sharing Data with Bobsled
- Updated on 08 Nov 2023
This article describes how data sharing works in Bobsled, the secure data sharing platform that lets you share data to any supported platform without the need to manage the destination platform.
This section introduces the set of fundamental concepts used in Bobsled.
A share comprises configuration settings for a source dataset, destination platform and region, data access permissions, loading semantics, and data transfer frequency, along with an event log for activity tracking. It serves as a centralized management hub where users can perform actions such as pausing transfers, revoking access, and deleting shared data.
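As a rough mental model, the pieces of a share listed above can be pictured as a single record. The sketch below is illustrative only: the class name `ShareConfig`, its fields, and its methods are hypothetical and do not reflect Bobsled's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ShareConfig:
    """Hypothetical model of a share; names are illustrative, not Bobsled's schema."""

    source: str                      # which source dataset is shared
    destination_platform: str        # target platform for the shared data
    destination_region: str          # region within that platform
    access_grants: List[str]         # identities allowed to read the data
    loading_pattern: str             # e.g. "append" or "overwrite"
    transfer_frequency_minutes: int  # how often data is transferred
    paused: bool = False
    event_log: List[str] = field(default_factory=list)

    def pause(self) -> None:
        """Pause transfers and record the action in the event log."""
        self.paused = True
        self.event_log.append("transfers paused")

    def revoke_access(self, grantee: str) -> None:
        """Remove a grantee and record the action in the event log."""
        self.access_grants.remove(grantee)
        self.event_log.append(f"access revoked: {grantee}")
```

Keeping the event log on the same record as the settings mirrors the idea of the share as one management hub: every action taken on the share is recorded next to the configuration it changes.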
Learn how to Create and manage a share.
A provider is the organization that uses Bobsled to transfer data to a consumer organization.
A consumer is the organization that a provider transfers data to.
Learn how to Create and Manage Consumers and how consumer organizations affect billing.
In Bobsled, a source is a connection to the location of the data provider's data: a bucket containing files or a data warehouse with tables. You set up sources in Bobsled to access the data you want to send to a consumer via a share. Once a source is set up, you can configure any share to use it, so a single source can serve multiple shares and consumers.
In Bobsled, a destination is a fully managed instance of the chosen platform that provides a secure and isolated environment for your data. When you create a destination, Bobsled automatically sets up a dedicated account and the required infrastructure for the chosen platform, which is only used by your organization. This means that you don't need any pre-existing knowledge, accounts, or relationships with the target platform to share data in your consumer's chosen platform.
See currently supported Destinations.
A transfer is the process of moving data from the provider's storage to a Bobsled-managed destination, which is executed as an ongoing automated transfer.
At a high level, sharing data with consumer organizations involves the following steps:
- Connect Sources: Providers give Bobsled read-only access to the data they wish to transfer to their consumer organizations and set up a data source in Bobsled to specify the location of the data.
- Create Share: Providers create a data share in Bobsled, selecting a destination, authorizing access, and defining the following configurations:
  - Data loading patterns (e.g. append or overwrite)
  - Query optimization settings (e.g. clustering)
- Transfer Data: Bobsled transfers the data to the target platform and region specified in the share configuration. Consumers can securely access the shared data from the designated destination.
Sharing data as files
When the source data is stored in cloud object storage, providers may select an entire bucket or specific folders (prefixes) to be transferred to a specified cloud object storage technology and region.
Once the “Start automated transfer” action is taken, all of the files under the selected folders are transferred to the <share-identifier>/latest/ prefix in the destination.
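To make the destination layout concrete, here is a minimal sketch of how a source object key could map to a key under `<share-identifier>/latest/`. The function name `destination_key` is hypothetical, and the sketch assumes that the path relative to the selected folder is preserved, which the text above does not state.

```python
def destination_key(share_id: str, source_key: str, selected_prefix: str) -> str:
    """Map a source object key to a key under '<share-identifier>/latest/'.

    Assumption (not confirmed by the documentation): the part of the key
    below the selected folder is kept unchanged in the destination.
    """
    if not source_key.startswith(selected_prefix):
        raise ValueError(f"{source_key!r} is not under {selected_prefix!r}")
    # Strip the selected folder, keeping only the relative path.
    relative = source_key[len(selected_prefix):].lstrip("/")
    return f"{share_id}/latest/{relative}"
```

For example, with a hypothetical share identifier `share-123` and selected folder `exports/`, the key `exports/2023/data.csv` would map to `share-123/latest/2023/data.csv` under this assumption.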
Every five minutes, Bobsled checks the selected folder paths for changes and synchronizes any new, updated, or deleted files to the destination.
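The synchronization step can be thought of as a diff between two file listings. The sketch below is not Bobsled's implementation. It assumes each listing is a mapping from file path to a content fingerprint (for example a checksum), and the names `SyncPlan` and `plan_sync` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    to_copy: List[str]    # new or updated files to write to the destination
    to_delete: List[str]  # files removed from the source


def plan_sync(source: Dict[str, str], destination: Dict[str, str]) -> SyncPlan:
    """Compare two listings (path -> fingerprint) and decide what to change."""
    # A file is copied when it is missing from the destination or its
    # fingerprint differs from the source's.
    to_copy = sorted(
        path for path, fingerprint in source.items()
        if destination.get(path) != fingerprint
    )
    # A file is deleted when it no longer exists in the source.
    to_delete = sorted(path for path in destination if path not in source)
    return SyncPlan(to_copy, to_delete)
```

Running a comparison like this on each check keeps the destination a mirror of the selected folders: unchanged files are skipped, so only the differences are transferred.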
Sharing data as tables
When the source data is stored in cloud object storage, providers may select an entire bucket or a folder (prefix) to be transferred to a supported storage layer that uses table formats. For some destinations, a cloud platform and specified region can be selected where necessary.
Once the “Transfer now” or “Create automated transfer” action is taken, files under the selected folder path are loaded into the table named in the transfer configuration, according to the specified loading pattern. When an automated transfer is configured, Bobsled checks the source for changes every five minutes and loads any new data into the table.
Learn more about loading destination tables.
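The difference between the append and overwrite loading patterns mentioned earlier can be sketched with plain Python lists standing in for a table. This is a simplified illustration, not how Bobsled loads data. The function `load_table` and the row shape are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def load_table(existing: List[Row], incoming: List[Row], pattern: str) -> List[Row]:
    """Return the table contents after loading `incoming` with the given pattern."""
    if pattern == "append":
        # Keep the rows already in the table and add the new ones after them.
        return existing + incoming
    if pattern == "overwrite":
        # Discard the rows already in the table and keep only the new ones.
        return list(incoming)
    raise ValueError(f"unknown loading pattern: {pattern!r}")
```

With append, each load grows the table. With overwrite, the table always reflects only the most recent load.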