BigQuery: Add New Column to Existing Tables Using Python BQ API
Related question: BigQuery add columns to table schema using BQ command line tools
I want to add a new column to existing tables (update the existing table's schema) in BigQuery us
Solution 1:
Based on Mikhail Berlyant's comments, I have to pass the existing table's schema, together with the new field (column), to the update() method in order to update the existing table's schema.
A Python code example is given below:
...
tbObject = bigquery_service.tables()
# get the current table resource, which includes its schema
table_data = tbObject.get(projectId=projectId, datasetId=datasetId, tableId=tableId).execute()
schema = table_data.get('schema')
new_column = {'name': 'new_column_name', 'type': 'STRING'}
# append the new field to the current table's schema
schema.get('fields').append(new_column)
# the request body must carry the whole schema, not only the new field
query_body = {'schema': schema}
# pass the same variables used in get() above, not string literals
tbObject.update(projectId=projectId, datasetId=datasetId, tableId=tableId, body=query_body).execute()
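The schema-building step can also be isolated from the API calls so it can be checked without touching BigQuery. The following is only a minimal sketch: the helper name with_new_field and the sample schema are hypothetical and not part of any BigQuery client library. It assumes the schema is the plain dict returned by tables().get(), and it relies on two BigQuery rules: column names are case-insensitive, and a column added to an existing table cannot be REQUIRED.

```python
import copy


def with_new_field(schema, name, field_type, mode="NULLABLE"):
    """Return a copy of a table schema dict with one extra field appended.

    Hypothetical helper for illustration; `schema` is assumed to have the
    shape {'fields': [{'name': ..., 'type': ..., 'mode': ...}, ...]}.
    """
    if mode not in ("NULLABLE", "REPEATED"):
        # Columns added to an existing table cannot be REQUIRED.
        raise ValueError("new columns must be NULLABLE or REPEATED")
    updated = copy.deepcopy(schema) if schema else {}
    fields = updated.setdefault("fields", [])
    # Column names are compared case-insensitively.
    if any(f["name"].lower() == name.lower() for f in fields):
        raise ValueError(f"column {name!r} already exists")
    fields.append({"name": name, "type": field_type, "mode": mode})
    return updated


# Sample schema standing in for table_data.get('schema').
current = {"fields": [{"name": "id", "type": "INTEGER", "mode": "REQUIRED"}]}
query_body = {"schema": with_new_field(current, "new_column_name", "STRING")}
```

The resulting query_body has the same shape as the one passed to update() above. Working on a copy leaves the fetched schema untouched, which makes it easier to compare the old and new schemas before sending the request.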
Also, there is no way to set the value of the new column for existing rows. Thanks to Mikhail Berlyant's suggestion, the way to provide values for existing rows is to create a separate table that holds the new column's values, and then join the existing table with that table to replace the table that has the old schema.
Solution 2:
Summary of my comments (as I've got some minutes now for this):
- The whole schema (along with the new field) needs to be supplied to the API.
- The new field will be added with NULL for existing rows. There is no way to set a value.
- You can add some logic to the queries you run against this table to compensate for this. Or you can keep a separate table with just this new field and some key, and join your existing table with that table to get this field.
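The join described in the last point could look roughly like the sketch below. It only builds the query text and does not call BigQuery. Everything named here is an assumption for illustration: the function build_join_query, the table names, the key column id, and the use of standard SQL syntax (backtick-quoted table names). It also assumes the existing table does not already contain the new column.

```python
def build_join_query(existing_table, values_table, key, new_column):
    """Build a standard SQL query joining a table with a side table.

    Hypothetical helper: every row of `existing_table` is kept, and
    `new_column` is NULL wherever `values_table` has no matching key.
    """
    return (
        f"SELECT t.*, v.{new_column}\n"
        f"FROM `{existing_table}` AS t\n"
        f"LEFT JOIN `{values_table}` AS v\n"
        f"ON t.{key} = v.{key}"
    )


# Placeholder table and column names, not taken from the question.
query = build_join_query(
    "my_project.my_dataset.events",
    "my_project.my_dataset.events_new_column",
    "id",
    "new_column_name",
)
print(query)
```

A LEFT JOIN is used so that rows without a value in the side table are kept rather than dropped. The query result could then be written to a table to replace the one with the old schema, as Solution 1 suggests.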