BigQuery is a fully-managed enterprise data warehouse that enables super-fast SQL queries.
The BigQuery UI helps you complete tasks like running queries, loading data, and even creating and training ML models.
Storing and querying massive datasets can be time consuming and expensive without the right hardware and infrastructure.
BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google's infrastructure.
Simply move your data into BigQuery and let us handle the hard work.
You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data.
You can access BigQuery in the Console, the command-line tool, or by making calls to the BigQuery REST API using a variety of client libraries such as Java, .NET, or Python.
There are also a variety of third-party tools that you can use to interact with BigQuery, such as visualizing the data or loading the data.
How to query public tables and load sample data into BigQuery.
- Query a public dataset
- Create a new dataset
- Load data into a new table
- Query a custom table
Task 1. Open BigQuery
The BigQuery console provides an interface to query tables, including public datasets offered by BigQuery. The query you will run accesses a table from a public dataset that BigQuery provides. It uses standard query language to search the dataset, and limits the results returned to 10.
Open the BigQuery console
In the Google Cloud Console, select Navigation menu > BigQuery.
The Welcome to BigQuery in the Cloud Console message box opens. This message box provides a link to the quickstart guide and the release notes.
Click Done.
The BigQuery console opens.
Task 2. Query a public dataset
Click + (SQL query) to create a new query. Copy and paste the following query into the BigQuery Query editor:
#standardSQL
SELECT
weight_pounds, state, year, gestation_weeks
FROM
`bigquery-public-data.samples.natality`
ORDER BY weight_pounds DESC LIMIT 10;
Create a new dataset
To load custom data into a table, you first need to create a BigQuery dataset.
Datasets help control access to tables and views in a project. This lab uses only one table, but you still need a dataset to hold the table.
In the Explorer pane, near your project id, click on View actions (view actions icon) then click Create dataset.
Load data into a new table
Next you create a table inside the babynames dataset, then load the data file from your storage bucket into the new table.
The custom data file you'll use contains approximately 7 MB of data about popular baby names, provided by the US Social Security Administration.
In the Cloud Console, select Navigation menu > BigQuery to return to the BigQuery console.
Navigate to the babynames dataset, by clicking View actions (view actions icon) near your dataset then click Create table.
In the Create table dialog, set the following fields, leave all others at the default value:
In the Create table dialog, set the following fields, leave all others at the default value:
Field Value
Create table from Google Cloud Storage
Select file from GCS bucket spls/gsp072/baby-names/yob2014.txt
File format CSV
Table names_2014
Schema > Edit as text Slide on, then add the following in the textbox: name:string,gender:string,count:integer
Click the Create table button.
Task 5. Preview the table
Check your table! View the first few rows of the data.
Click the names_2014 table in the left panel, then click Preview.
Task 6. Query a custom dataset
Running a query against custom data is identical to the querying a public dataset that you did earlier, except that now you're querying your own table instead of a public table.
In BigQuery, click the + (SQL query) icon at the top.
Paste or type the following query into the query Editor.
#standardSQL
SELECT
name, count
FROM
`babynames.names_2014`
WHERE
gender = 'M'
ORDER BY count DESC LIMIT 5;
No comments:
Post a Comment