Push a dataframe from Vertex Notebook to BigQuery

Abhishek Sharma
2 min read · May 24, 2024

Hey guys, in today’s very short blog we will see how to push a dataframe from a Vertex Notebook to BigQuery. So without further ado, let’s do it…

Read the full article here — https://machinelearningprojects.net/push-a-dataframe-from-vertex-notebook-to-bigquery/

Steps to Push a dataframe from Vertex Notebook to BigQuery

  • Open a new Python notebook and paste the following code into it.
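  • Now replace the placeholders ‘Database_Name’ and ‘Table_Name’ with your BigQuery dataset and table names, and set ‘project_name’ to your project ID.
  • Also, you need to define a KMS key, which is used to encrypt the destination table (it is an encryption key, not a password). Ask your DevOps team for the key name.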
  • ‘df’ is the Dataframe that you need to push.
  • And finally, run the code.

Code

# Push a dataframe from Vertex Notebook to BigQuery
from google.cloud import bigquery

def push(dbname, tablename, df):
    # Full Cloud KMS key resource name, e.g.
    # projects/<project>/locations/<location>/keyRings/<ring>/cryptoKeys/<key>
    kms_key_name = 'your-kms-key'
    project_name = 'project_name'  # something like prj-abc-data-prod

    # Encrypt the destination table with the KMS key and overwrite existing data
    job_config = bigquery.LoadJobConfig(
        destination_encryption_configuration=bigquery.EncryptionConfiguration(kms_key_name=kms_key_name),
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )

    client = bigquery.Client()
    datapath = f'{project_name}.{dbname}.{tablename}'

    # Work on a string-typed copy so the caller's dataframe is left unchanged
    df = df.astype(str)
    # Remove newlines, spaces, apostrophes and hyphens from the column names
    for ch in ('\n', ' ', "'", '-'):
        df.columns = [x.replace(ch, '') for x in df.columns]

    job = client.load_table_from_dataframe(df, datapath, job_config=job_config)
    job.result()  # wait for the load job to finish; raises if the load failed

push('Database_Name', 'Table_Name', df)
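
The column-name cleanup in the code above can also live in its own small helper, which makes it easy to try on a plain list of names before touching BigQuery. This is only a sketch: clean_column_names is a hypothetical name, not part of any library, and it removes the same four characters as the code above.

```python
def clean_column_names(columns):
    """Return column names with newlines, spaces, apostrophes and hyphens removed."""
    cleaned = []
    for name in columns:
        name = str(name)  # column labels are not always strings
        for ch in ("\n", " ", "'", "-"):
            name = name.replace(ch, "")
        cleaned.append(name)
    return cleaned


print(clean_column_names(["order id", "customer's-name", "total\namount"]))
# prints ['orderid', 'customersname', 'totalamount']
```

Inside push() it would be used as df.columns = clean_column_names(df.columns). Keep in mind that two different names can collapse into the same cleaned name (for example 'a b' and 'a-b'), so it is worth checking for duplicates afterwards.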

Conclusion

In conclusion, pushing a dataframe from a Vertex Notebook to BigQuery takes only a few lines with the code above.

FAQs

How do I push a dataframe from Vertex Notebook to BigQuery?

The code above uses the google-cloud-bigquery client and its load_table_from_dataframe() method. Another option is the pandas_gbq library in Python: once your notebook is authenticated with your Google Cloud credentials, you can call the to_gbq() function from pandas_gbq to upload your dataframe to a specified BigQuery table.
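
A minimal sketch of the pandas_gbq route, assuming the library is installed in the notebook and credentials are already available. The helper names destination_table and push_with_pandas_gbq are made up for illustration, and unlike the main code this sketch does not set a KMS key.

```python
def destination_table(dataset, table):
    """Build the 'dataset.table' string that to_gbq() expects."""
    return f"{dataset}.{table}"


def push_with_pandas_gbq(df, dataset, table, project_id):
    # Imported here so the helper above works even without pandas_gbq installed
    import pandas_gbq

    pandas_gbq.to_gbq(
        df,
        destination_table(dataset, table),
        project_id=project_id,
        if_exists="replace",  # overwrite the table if it already exists
    )


# Example call with the placeholder names from this article:
# push_with_pandas_gbq(df, "Database_Name", "Table_Name", "project_name")
```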

What are the prerequisites for pushing a dataframe to BigQuery from Vertex Notebook?

Before pushing a dataframe to BigQuery, ensure that you have the necessary permissions to access BigQuery and that your notebook environment is connected to your Google Cloud project. Additionally, make sure that the dataframe you want to push is properly formatted and contains the data you want to upload.

Can I push large dataframes to BigQuery from Vertex Notebook?

Yes, you can push large dataframes to BigQuery from Vertex Notebook. BigQuery is designed to handle massive datasets, and the to_gbq() function from pandas_gbq accepts a chunksize argument, so the data can be uploaded in batches of rows rather than all at once.
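
If you would rather control the batching yourself, one possible approach is to split the dataframe into row ranges and load them one after another. The sketch below only computes the ranges; row_chunks is a hypothetical helper, and the chunk size is something you would tune for your own data.

```python
def row_chunks(n_rows, chunk_size):
    """Yield (start, stop) row ranges that cover n_rows in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


# Each range would be loaded as df.iloc[start:stop]; the first load could
# overwrite the table and the later ones append to it.
print(list(row_chunks(10, 4)))
# prints [(0, 4), (4, 8), (8, 10)]
```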

Are there any limitations or considerations to keep in mind while pushing data to BigQuery?

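When pushing data to BigQuery, be mindful of the cost implications, especially for large datasets. Also ensure that your dataframe schema matches the schema of the target BigQuery table to avoid errors during the upload. Two details of the code above matter here: WRITE_TRUNCATE replaces whatever the table already contains, and df.astype(str) converts every column to a string before the upload.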
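
A quick way to spot a mismatch before uploading is to compare the column names on both sides. This is a sketch with a hypothetical helper, compare_columns; it checks names only, not data types.

```python
def compare_columns(df_columns, table_columns):
    """Return (missing_in_table, missing_in_df) as sorted lists of names."""
    df_set, table_set = set(df_columns), set(table_columns)
    return sorted(df_set - table_set), sorted(table_set - df_set)


# With the BigQuery client, the table's column names could be read as:
#   table = client.get_table("project_name.Database_Name.Table_Name")
#   table_columns = [field.name for field in table.schema]
print(compare_columns(["id", "name", "city"], ["id", "name", "country"]))
# prints (['city'], ['country'])
```

Two empty lists mean the names line up on both sides.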

Read my last article — List all files in a GCS Bucket from Vertex Notebook

Check out my other machine learning projects, deep learning projects, computer vision projects, NLP projects, Flask projects at machinelearni
