Skip to main content

BigQuery

Google BigQuery is a data warehouse platform.

Schema

You can define table schemas via JSON documents which get ingested at the same time as your data using the bq tool.

Data Types

Data Types List from google documentation

Name Data type Description
Integer INT64 Numeric values without fractional components
Floating point FLOAT64 Approximate numeric values with fractional components
Numeric NUMERIC Exact numeric values with fractional components
BigNumeric BIGNUMERIC Exact numeric values with fractional components
Boolean BOOL TRUE or FALSE (case-insensitive)
String STRING Variable-length character (Unicode) data
Bytes BYTES Variable-length binary data
Date DATE A logical calendar date
Date/Time DATETIME A year, month, day, hour, minute, second, and subsecond
Time TIME A time, independent of a specific date
Timestamp TIMESTAMP An absolute point in time, with microsecond precision
Struct (Record) STRUCT Container of ordered fields each with a type (required) and field name (optional)
Geography GEOGRAPHY A pointset on the Earth's surface (a set of points, lines and polygons on the WGS84 reference spheroid, with geodesic edges)
JSON JSON Represents JSON, a lightweight data-interchange format

 

Differences between JSON and Record/Struct

JSON type allows you to ingest JSON without pre-defining the schema whereas a record/struct must be pre-defined and all the fields must be known in advance.

JSON fields are more fiddly to query and work with in general. It seems like you can't do things like UNNEST them.

 

 

 

Nested/Repeated Columns

To allow a column (or object) to repeat (e.g. to have an array of values) you must use mode: NESTED in your schema.

See Nested and repeated columns documentation