OUR LATEST NEWS

How to Take Control of Your BigQuery Costs

hero

Csaba Kassai | November 13, 2016

3 Minutes Read

As we prepare for our workshop “A Big Data Adventure on the Google Cloud Platform” at Crunch Data Engineering and Analytics Conference, we have started to collect some tips and tricks when it comes to using Google’s Big Data platform. This first collection of advice shows how to control your costs when using BigQuery (BQ), discusses the kinds of built-in tools you have access to, and explains the related best practices. In the second blog post, we show how you can boost your BigQuery performance. Read this blog post to find out how to take control of your Bigquery costs.

What is the difficulty in BigQuery costs?

The BigQuery pricing is usage-based, you pay for the amount of data you store, query, and insert by streaming. This seems fair, although it has some consequences. When you have practically unlimited resources — as in the case of BigQuery — this pricing policy becomes a double-edged sword: it is easy to spend a lot if you are not cautious enough. When it comes to budget planning, you should know how much data you will store and process in the future and this can make it challenging to estimate your costs. Fortunately, there are some tools and best practices that can help you to control your costs.

Getting started

Probably the most difficult is to plan the costs if you’ve never actually used BQ. Google provides a pricing calculator tool that helps in these situations. It estimates your future costs based on the parameters we’ve already mentioned: storage, streaming inserts, queries. Of course, to get a result, you will still have to estimate the size of the datasets involved.

How to prevent overspending

BQ provides a couple of built-in tools to continuously control your spending and stop you going over budget.

  1. Query-level cost control

Check the amount of data BQ will process during the query before actually running that query. Simply click on the green exclamation mark (Validator) in the right bottom of the window. If you also add the BQ Mate extension to your Chrome browser, you will see how much the query will cost, in USD.

Query-level cost control

2. Project-level cost control

Set the project-level soft limit by requesting a billing alert. When you exceed the threshold you have set for the current month, the billing administrators will receive an email notification. Set the project-level hard limit by maximizing the number of bytes processed per day within the project. If you enable the BQ custom quotas feature, you can specify this limit in 10 TB increments. To activate this feature or to modify the value, submit the BQ Custom Quota Request form. When you exceed the quota, queries will return errors.

3. User-level cost control

Set the user-level hard limit within a project by maximizing the number of bytes processed per day by a given user. The method is the same as in case of the project-level hard limit, except that the error message is different when you exceed the quota.

4. Billing-account level cost control

If you have multiple projects, and you want to monitor the overall budget usage, set billing alerts for your entire billing account. The process is the same as point 2a.

How to monitor the costs

  1. There are also options to monitor your spending:
  2. Follow your monthly data usage on your project’s Cloud Console dashboard on the Billing panel. Be aware though, that the costs will appear only within a couple of days of usage, not in real time.
  3. Under the Billing/History menu on the Cloud Console, you can get a more detailed view of your spending.
  4. Export your detailed BigQuery audit logs, and visualize your spendings in Data Studio as Mike and Ryan suggest in this great post.

Is that all?

Not really. Beyond these built-in features, there are several best practices that can help to reduce your costs.

  1. If possible, always consider partitioning your data and storing it in sharded or partitioned tables. If you have chosen the right partitioning column, usually just part of the shards or partitions has to be read by a query, so the costs will significantly decrease. This way, you can save money on the storage side, too, with the built-in feature in BQ that puts those tables you haven’t changed in the last 90 days into a cheaper storage category.
  2. Try to avoid SELECT
    • option on big datasets. Don’t be lazy to write down the name of the columns you really need. BQ is a columnar storage. You pay for the amount of data your query needs to read. Do the math: fewer columns selected means fewer columns need to be read, which means lower costs.

Set the data expiration date on datasets. After a given time, all the tables in the dataset will be deleted so you won’t have to pay for storage. This is especially useful if you are in an experimentation phase with BQ. Another benefit of this feature is that you can prevent the overgrowth of tables and datasets in your projects.

How to Power Banking Services with Google Cloud

clock

2022-06-23

4 Minutes Read

Here’s why your company should be using the Google Cloud features to power banking services and how it makes things easy for financial service organizations.

Google Cloud Infrastructure Modernization - Stay Agile With An Open Architecture

clock

2022-02-17

3 Minutes Read

Learn more about how Google Cloud infrastructure modernization solutions can help your business to become more competitive.

Google BigQuery materialized view test drive

clock

2022-04-07

3 Minutes Read

I have tested the BigQuery materialized views against the documentation. While most of the functionality and limitations are accurate, there are a few gotchas you need to be aware of.

hero
logo

Ready for the future? Let’s talk!

Reach out, and let’s take your business to the next level.

image

By clicking submit below, you consent to allow Aliz.ai to store and process the personal information submitted above and share information about our products and services, as well as other content that may be of interest to you. For more information, please review our Privacy Policy. You may unsubscribe at any time. Your data will not be passed on to third parties.

I agree to receive other communications from Aliz.ai.

badge

New opportunities with cloud solutions!

Aliz is a proud Google Cloud Partner with specializations in Infrastructure, Data Analytics, Cloud Migration and Machine Learning. We deliver data analytics, machine learning, and infrastructure solutions, off the shelf, or custom-built on GCP using an agile, holistic approach.

logo

© Copyright 2022 Aliz Tech Kft.

logologologologo