Setting up Cargo on Google BigQuery

In this guide, we will walk you through setting up Google BigQuery as your system of records in Cargo. This setup grants Cargo the permissions it needs in Google Cloud to read and write data efficiently.

As a result, BigQuery will serve as a persistence layer that Cargo will use when running automations.


Overview

Permissions

To prevent data loss, Cargo deliberately limits what it can do within your Google BigQuery instance.

What Cargo can do:

  • Read data from datasets and tables, even if they are spread across multiple projects

  • Create and write data into new schemas and tables

What Cargo will never do:

  • Overwrite existing datasets and tables (Cargo always creates its own datasets and tables when needed)


Prerequisites

Before you begin

To start, you need an existing Google Cloud Project with a payment method and billing enabled. Follow the official Google guide.

Once you have created the project, continue with this guide, which covers enabling and creating the necessary resources in your new GCP project:

  • BigQuery API & Cloud Resource Manager API
  • BigQuery dataset (dedicated to Cargo)
  • Object Storage Bucket (dedicated to Cargo)
  • Service Account

If some of these resources already exist in your BigQuery project and you have the technical knowledge, you may skip the corresponding steps.



Step 1: Enable the Google APIs

Cargo uses two Google APIs that must be enabled. To do so from the console (a scripted alternative is sketched after these steps):

  • Go to the Google Cloud Console.
  • Select APIs & Services.
  • Select Enabled APIs & Services.
  • Search for and enable the following APIs: BigQuery API, Cloud Resource Manager API
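
If you prefer to script this step, here is a minimal sketch that enables both APIs through the Service Usage API with the google-api-python-client library. It assumes you are authenticated with Application Default Credentials (for example via gcloud auth application-default login); the project ID is a placeholder, not a value from this guide.

    from googleapiclient import discovery

    PROJECT_ID = "my-gcp-project"  # placeholder: replace with your project ID

    # Enable each API through the Service Usage API.
    serviceusage = discovery.build("serviceusage", "v1")
    for api in ("bigquery.googleapis.com", "cloudresourcemanager.googleapis.com"):
        serviceusage.services().enable(
            name=f"projects/{PROJECT_ID}/services/{api}"
        ).execute()
        print(f"Enabled {api}")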


Step 2: Create a storage bucket

To enable Cargo to load and unload data from BigQuery, you need a storage bucket dedicated to this purpose; a scripted alternative is sketched after the steps below.

To create a new bucket:

  • Go to the Google Cloud Console.
  • Search for Cloud Storage in the search bar.
  • Create a new bucket, follow the steps, and keep a note of its name.
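
Alternatively, here is a minimal sketch of the same step with the google-cloud-storage client library; the project ID, bucket name, and location are placeholders. Pick the location you also intend to use for the BigQuery dataset in step 3, since BigQuery load and extract jobs require the bucket and the dataset to be in compatible locations.

    from google.cloud import storage

    client = storage.Client(project="my-gcp-project")  # placeholder project ID
    # Use the same location you will choose for the dataset in step 3.
    bucket = client.create_bucket("my-cargo-bucket", location="EU")
    print(f"Created bucket {bucket.name}")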


Step 3: Create a BigQuery dataset

As mentioned above, to ensure no data loss occurs, Cargo never writes anywhere other than the selected dataset. We recommend creating a new dataset dedicated to Cargo; a scripted equivalent follows the steps below.

To create a new dataset:

  • Go to the Google Cloud Console.
  • Search BigQuery in the search bar.
  • In BigQuery Studio, click the three-dot menu next to the project name.
  • Click Create dataset.
  • Follow the steps, and keep a note of the dataset name and location.
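
The scripted equivalent uses the google-cloud-bigquery client library, as sketched below; the project ID, dataset ID, and location are placeholders.

    from google.cloud import bigquery

    client = bigquery.Client(project="my-gcp-project")  # placeholder project ID
    dataset = bigquery.Dataset("my-gcp-project.cargo")  # placeholder dataset ID
    dataset.location = "EU"  # note this down: Cargo asks for it in step 6
    # exists_ok=True makes the call idempotent if the dataset already exists.
    client.create_dataset(dataset, exists_ok=True)
    print(f"Created dataset {dataset.dataset_id} in {dataset.location}")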


Step 4: Create a service account

Cargo will use this service account to access the APIs enabled in step 1. To create one from the console (a scripted alternative follows these steps):

  • Go to the Google Cloud Console.
  • Click on IAM & Admin.
  • Click on Service Accounts.
  • Click on Create service account.
  • Give the service account a name.
  • Grant the following roles: BigQuery Data Editor, BigQuery Job User, Storage Object User

  • Click on Done.
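
As a scripted alternative, the sketch below creates the service account through the IAM API and then adds the three role bindings through the Cloud Resource Manager API. The project ID and the "cargo" account ID are placeholders.

    from googleapiclient import discovery

    PROJECT_ID = "my-gcp-project"  # placeholder: your project ID

    # Create the service account via the IAM API.
    iam = discovery.build("iam", "v1")
    sa = iam.projects().serviceAccounts().create(
        name=f"projects/{PROJECT_ID}",
        body={"accountId": "cargo", "serviceAccount": {"displayName": "Cargo"}},
    ).execute()

    # Grant the three roles: read, modify, and write back the project IAM policy.
    crm = discovery.build("cloudresourcemanager", "v1")
    policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()
    for role in ("roles/bigquery.dataEditor",
                 "roles/bigquery.jobUser",
                 "roles/storage.objectUser"):
        policy.setdefault("bindings", []).append(
            {"role": role, "members": [f"serviceAccount:{sa['email']}"]}
        )
    crm.projects().setIamPolicy(
        resource=PROJECT_ID, body={"policy": policy}
    ).execute()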


Step 5: Generate a service account key

Follow the steps below to generate a key; a scripted equivalent is sketched after them:

  • In Service Accounts, click on the created service account.
  • Click on Keys.
  • Click on Add Key.
  • Choose Create new key.
  • Select JSON.
  • Click on Create.
  • This downloads a JSON key file to your computer. Treat it as a secret.
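
The key can also be created programmatically through the IAM API, as in the sketch below; the service account email and output filename are placeholders. Note that the API returns the key material base64-encoded, so it must be decoded before saving.

    import base64

    from googleapiclient import discovery

    SA_EMAIL = "cargo@my-gcp-project.iam.gserviceaccount.com"  # placeholder

    iam = discovery.build("iam", "v1")
    key = iam.projects().serviceAccounts().keys().create(
        name=f"projects/-/serviceAccounts/{SA_EMAIL}", body={}
    ).execute()

    # privateKeyData holds the JSON key file, base64-encoded.
    with open("cargo-service-account.json", "wb") as f:
        f.write(base64.b64decode(key["privateKeyData"]))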


Step 6: Configure Cargo

Now that we have all the required elements, navigate to your workspace settings in Cargo and select "System of records".

Fill in the settings form with the information gathered in the previous steps (an optional verification sketch follows):

  • Copy and paste the content of the service account key file into the field labeled Service Account

  • Select the location that was chosen during step 3
  • Fill in the name of the bucket created in step 2
  • Select Dataset as Scope
  • Fill in the name of the BigQuery dataset created in step 3
  • Click Setup
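
Before clicking Setup, you can optionally check that the key works and that the service account can reach both resources. This sketch reuses the placeholder names from the previous steps (key file, dataset ID, bucket name).

    from google.cloud import bigquery, storage
    from google.oauth2 import service_account

    creds = service_account.Credentials.from_service_account_file(
        "cargo-service-account.json"
    )

    # BigQuery: raises if the service account cannot see the dataset.
    bq = bigquery.Client(credentials=creds, project=creds.project_id)
    bq.get_dataset("my-gcp-project.cargo")

    # Cloud Storage: write and delete a test object in the bucket.
    gcs = storage.Client(credentials=creds, project=creds.project_id)
    blob = gcs.bucket("my-cargo-bucket").blob("cargo_permission_check.txt")
    blob.upload_from_string("ok")
    blob.delete()
    print("Service account can reach the dataset and the bucket")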

Outcome

Setup completed

You are now ready to use Cargo with Google BigQuery!