OUR LATEST NEWS

Analytic functions in Google BigQuery – Part 1: Basics

hero

Peter Farkas | January 16, 2019

3 Minutes Read

Google BigQuery, Google’s data warehouse solution, has many functions and capabilities. In Part 1 of a series of posts, we look at the definition of analytic functions, how they differ from aggregate functions, and the various uses they have in everyday scenarios.

Examples

First of all, let’s define our example table, Table 1, on which all queries in this post and their results are based.

Table 1: Orders

Example of Google BigQuery analytic function-Table of orders

Furthermore, all queries in this post are written in the BigQuery Standard SQL dialect.

Aggregate vs Analytic function

Those familiar with traditional SQL databases already know how aggregate functions work, but for the sake of completeness, I’ll include it here.

An aggregate function is a function that computes a single aggregate value over a group of rows. The syntax of this function is simple enough:

 

 

Each column that is not aggregated has to be present in the GROUP BY clause:

 

 

In contrast, an analytic function is a function that computes an aggregate value for each row over a group of rows. The syntax of this function is as follows:

 

 

Because grouping is done through the analytic function’s own condition block, the GROUP BY clause is not necessary.

Clauses

PARTITION BY

PARTITION BY is arguably the most important clause of an analytic function. Similar to the aggregate function’s GROUP BY clause, it limits the number of rows the function computes merely on a subset of all rows. While all non-aggregate columns must be included in a GROUP BY clause, there is no such restriction here.

Example 1

 

 

Example of Google BigQuery analytic function-PARTITION BY

This query computes the difference between the actual price and the store’s average price. Since we’re using an analytic function, we can include the OrderId and Date columns in our results without grouping by them. If we wanted to get the same result using aggregate functions, the query would look like this:

 

 

ORDER BY

Next, we have the ORDER BY clause. Similar to the ORDER BY at the (near) end of a query, its purpose is to order each partition prior to evaluating the analytic function. While functions such as SUM or AVG do not benefit from an ordered partition, there are certain functions where it is a must. One such function is LAG. This function has one argument (a column name), and its return value is the value of the column from the previous row.

Example 2

 

 

Example of Google BigQuery analytic function-ORDER BY

This query returns the IDs of all orders that were processed first on any given date for each store. (We assume the Order with a lesser ID came in before an order with a greater ID.) The same problem can be solved with aggregate functions, but this method (again) joins a subset of the table with the table itself, which is much less performant than the analytic function we used.

 

 

Conclusion

BigQuery’s analytic functions are powerful tools that can reduce difficult and expensive-to-compute queries to more simple solutions. Though we looked at the definition and semantics of analytic functions, we have yet to cover windowing and advanced use cases. Stay tuned for Part 2 of this blog series.

Interested in more about Analytic?

How to Power Banking Services with Google Cloud

clock

June 23, 2022

4 Minutes Read

Here’s why your company should be using the Google Cloud features to power banking services and how it makes things easy for financial service organizations.

Google Cloud Infrastructure Modernization - Stay Agile With An Open Architecture

clock

February 17, 2022

3 Minutes Read

Learn more about how Google Cloud infrastructure modernization solutions can help your business to become more competitive.

Google BigQuery materialized view test drive

clock

April 7, 2022

3 Minutes Read

I have tested the BigQuery materialized views against the documentation. While most of the functionality and limitations are accurate, there are a few gotchas you need to be aware of.

Ready for the future? Let’s talk!

Reach out, and let’s take your business to the next level.

By clicking submit below, you consent to allow Aliz.ai to store and process the personal information submitted above and share information about our products and services, as well as other content that may be of interest to you. For more information, please review our Privacy Policy. You may unsubscribe at any time. Your data will not be passed on to third parties.

I agree to receive other communications from Aliz.ai.

Istvan Boscha

CEO DACH

Balazs Molnar

CEO APAC

badge

New opportunities with cloud solutions!

Aliz is a proud Google Cloud Partner with specializations in Infrastructure, Data Analytics, Cloud Migration and Machine Learning. We deliver data analytics, machine learning, and infrastructure solutions, off the shelf, or custom-built on GCP using an agile, holistic approach.

logo

© Copyright 2022 Aliz Tech Kft.

logologologologo