Skip to main content

BigQuery

Google BigQuery is a data warehouse platform.

Schema

You can define table schemas via JSON documents which get ingested at the same time as your data using the bq tool.

Data Types

Data Types List from google documentation

NameData typeDescription
IntegerINT64Numeric values without fractional components
Floating pointFLOAT64Approximate numeric values with fractional components
NumericNUMERICExact numeric values with fractional components
BigNumericBIGNUMERICExact numeric values with fractional components
BooleanBOOLTRUE or FALSE (case-insensitive)
StringSTRINGVariable-length character (Unicode) data
BytesBYTESVariable-length binary data
DateDATEA logical calendar date
Date/TimeDATETIMEA year, month, day, hour, minute, second, and subsecond
TimeTIMEA time, independent of a specific date
TimestampTIMESTAMPAn absolute point in time, with microsecond precision
Struct (Record)STRUCTContainer of ordered fields each with a type (required) and field name (optional)
GeographyGEOGRAPHYA pointset on the Earth's surface (a set of points, lines and polygons on the WGS84 reference spheroid, with geodesic edges)
JSONJSONRepresents JSON, a lightweight data-interchange format

 

Differences between JSON and Record/Struct

JSON type allows you to ingest JSON without pre-defining the schema whereas a record/struct must be pre-defined and all the fields must be known in advance.

JSON fields are more fiddly to query and work with in general. It seems like you can't do things like UNNEST them.

 

 

 

Nested/Repeated Columns

To allow a column (or object) to repeat (e.g. to have an array of values) you must use mode: NESTED in your schema.

See Nested and repeated columns documentation