# Google Cloud Platform # BigQuery Analytics Hub Analytics Hub is a mechanism for sharing datasets between BigQuery users. Google's official product documentation for Analytics Hub is [here](https://cloud.google.com/bigquery/docs/analytics-hub-introduction#subscriber_workflow) ### Subscriber Workflow - A dataset publisher creates and shares a Linked Dataset with the subscriber - this essentially becomes a symbolic link to the original dataset that the subscriber can access. - In this model the person querying the data pays for it. # BigQuery Google BigQuery is a data warehouse platform. ### Schema You can define table schemas via JSON documents which get ingested at the same time as your data using the `bq` tool. #### Data Types Data Types List from [google documentation](https://cloud.google.com/bigquery/docs/schemas#standard_sql_data_types)
NameData typeDescription
[Integer](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#integer_type)`INT64`Numeric values without fractional components
[Floating point](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#floating_point_types)`FLOAT64`Approximate numeric values with fractional components
[Numeric](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#numeric_type)`NUMERIC`Exact numeric values with fractional components
[BigNumeric](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#bignumeric_type)`BIGNUMERIC`Exact numeric values with fractional components
[Boolean](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#boolean_type)`BOOL`TRUE or FALSE (case-insensitive)
[String](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#string_type)`STRING`Variable-length character (Unicode) data
[Bytes](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#bytes_type)`BYTES`Variable-length binary data
[Date](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#date_type)`DATE`A logical calendar date
[Date/Time](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#datetime_type)`DATETIME`A year, month, day, hour, minute, second, and subsecond
[Time](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#time_type)`TIME`A time, independent of a specific date
[Timestamp](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp_type)`TIMESTAMP`An absolute point in time, with microsecond precision
[Struct (Record)](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#struct_type)`STRUCT`Container of ordered fields each with a type (required) and field name (optional)
[Geography](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#geography_type)`GEOGRAPHY`A pointset on the Earth's surface (a set of points, lines and polygons on the [WGS84](http://earth-info.nga.mil/GandG/update/index.php) reference spheroid, with geodesic edges)
[JSON](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#json_type)`JSON`Represents JSON, a lightweight data-interchange format
#### #### Differences between JSON and Record/Struct JSON type allows you to ingest JSON without pre-defining the schema whereas a record/struct must be pre-defined and all the fields must be known in advance. JSON fields are more fiddly to query and work with in general. It seems like you can't do things like [UNNEST](https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays) them. #### Nested/Repeated Columns To allow a column (or object) to repeat (e.g. to have an array of values) you must use `mode: NESTED` in your schema. [See Nested and repeated columns documentation](https://cloud.google.com/bigquery/docs/nested-repeated) # BigQuery Working with Array Structures On a companies table where you can have multiple websites per company: ```sql SELECT * FROM project_namespace.dataset.companies as companies, UNNEST (companies.websites) as website LIMIT 1000 ``` The above would produce a flattened table structure where company id is repeated per website.