5 Key concepts of Cloud Spanner in under 2 minutes.

SP Kumar Rachumallu
2 min read · Jun 23, 2021

Google Cloud Spanner is often described as a NewSQL database for its ability to combine features of relational and non-relational databases. It became generally available in May 2017.

Data in Cloud Spanner is distributed and replicated across multiple machines and zones, which is how it achieves high availability. Spanner can also dynamically split data and move the resulting chunks between servers, and its replication across zones or regions provides disaster tolerance.

Here are the top 5 concepts you may want to know before starting with Cloud Spanner:

Split points

Spanner is a distributed database, which means that as the database grows, the data is divided into smaller chunks based on ranges of the primary key. These chunks are called splits, and the key values at which one split ends and the next begins are called split points or split boundaries. Spanner can also add split points in response to load, not only size.
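To make the idea of split points concrete, here is a minimal sketch in Python. It is only a toy model, not Spanner's implementation: the integer keys, the `SPLIT_POINTS` values and the `split_for_key` helper are all made up for illustration.

```python
from bisect import bisect_right

# Hypothetical split points: each value is the first key of the next split.
# With three split points the key space is divided into four splits.
SPLIT_POINTS = [1000, 2000, 3000]


def split_for_key(key, split_points=SPLIT_POINTS):
    """Return the index of the split whose key range contains `key`."""
    # Keys below 1000 land in split 0, keys 1000..1999 in split 1, and so on.
    return bisect_right(split_points, key)


if __name__ == "__main__":
    for key in (42, 1000, 2500, 9999):
        print(key, "-> split", split_for_key(key))
```

Each split covers a contiguous range of keys, so rows with neighbouring keys tend to live in the same split.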

Primary Key

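The concept of a primary key is the same as in an RDBMS.

So why call it out separately?

In Spanner, the choice of primary key affects how data is split across servers and how tables can be interleaved. Many relational databases let you create a table without a primary key, but in Spanner declaring one is mandatory. Unlike most RDBMSs, Spanner allows primary key columns to be nullable, and a primary key can even have zero columns, in which case the table can hold only one row.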

Hotspots

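A hotspot is a single server in a distributed database that receives a disproportionate share of the load, typically writes, which drags down overall performance. Hotspots usually result from a poorly chosen primary key, such as a monotonically increasing ID or timestamp, which sends every new row to the same split.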
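The effect of key choice on write distribution can be sketched with a small simulation. This is a toy model under stated assumptions: the key space of 0..9999, the `SPLIT_POINTS` values and every function name below are hypothetical, and hashing the ID stands in for any technique that scatters keys (for example a hashed key prefix or a random UUID).

```python
from bisect import bisect_right
from collections import Counter
import hashlib

# Hypothetical split points over an integer key space 0..9999 (four splits).
SPLIT_POINTS = [2500, 5000, 7500]
KEY_SPACE = 10000


def split_for_key(key):
    """Return the index of the split whose key range contains `key`."""
    return bisect_right(SPLIT_POINTS, key)


def sequential_keys(start, count):
    """Monotonically increasing keys, like an auto-increment ID."""
    return [start + i for i in range(count)]


def hashed_keys(start, count):
    """Scatter the same IDs over the key space by hashing them.

    This only models where writes land; the modulo can produce duplicate
    values, so it is not a recipe for generating unique keys.
    """
    keys = []
    for i in range(start, start + count):
        digest = hashlib.sha256(str(i).encode()).digest()
        keys.append(int.from_bytes(digest[:4], "big") % KEY_SPACE)
    return keys


def load_per_split(keys):
    """Count how many writes each split receives."""
    return Counter(split_for_key(k) for k in keys)


if __name__ == "__main__":
    print("sequential:", dict(load_per_split(sequential_keys(9000, 1000))))
    print("hashed:    ", dict(load_per_split(hashed_keys(9000, 1000))))
```

With sequential keys, all 1,000 writes land in the last split, while the hashed keys spread across all four.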

Secondary Index

As in an RDBMS, a secondary index provides an alternate sort order for a table's data. Spanner automatically indexes the primary key, and you can add secondary indexes on non-key columns to speed up queries that filter or sort on those columns.
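Conceptually, a secondary index is a second sorted structure that maps an indexed column back to the primary key. The sketch below models that idea in plain Python. The `SINGERS` data and the `build_index` and `lookup` helpers are invented for illustration and say nothing about how Spanner stores indexes internally.

```python
from bisect import bisect_left

# Hypothetical base table: primary key -> row.
SINGERS = {
    1: {"first_name": "Marc", "last_name": "Richards"},
    2: {"first_name": "Catalina", "last_name": "Smith"},
    3: {"first_name": "Alice", "last_name": "Richards"},
    4: {"first_name": "Lea", "last_name": "Martin"},
}


def build_index(table, column):
    """Return a list of (column value, primary key) pairs sorted by value."""
    return sorted((row[column], pk) for pk, row in table.items())


def lookup(index, value):
    """Return the primary keys of all rows whose indexed column equals `value`."""
    # (value,) sorts before every (value, pk) pair, so this finds the first match.
    start = bisect_left(index, (value,))
    matches = []
    for indexed_value, pk in index[start:]:
        if indexed_value != value:
            break
        matches.append(pk)
    return matches


if __name__ == "__main__":
    by_last_name = build_index(SINGERS, "last_name")
    print(lookup(by_last_name, "Richards"))
```

Without the index, finding every singer with a given last name means scanning the whole table. With it, the lookup is a binary search followed by a short walk over the matching entries.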

Interleaving

Interleaving creates a parent-child relationship between two tables in which the child table's primary key begins with the parent table's primary key.

As a result, rows in the child table are physically co-located with their respective parent rows. When creating a child table in Cloud Spanner, you add an INTERLEAVE IN PARENT clause that names the parent table. The child is then called an interleaved table, and the table at the top of the hierarchy is called the root table.

The child's primary key must start with the parent's primary key columns, with the same names and in the same order. The main purpose of interleaving is to speed up queries and joins that read a parent row together with its children.
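Here is a rough sketch of what interleaving means for row ordering. The `Singers`/`Albums` schema, the sample rows and the helper functions are assumptions made up for this example. The DDL string is there for orientation only and is not executed.

```python
# Illustrative Spanner DDL for a parent and an interleaved child table.
# The table and column names are hypothetical.
DDL = """
CREATE TABLE Singers (
  SingerId INT64 NOT NULL,
  Name     STRING(MAX)
) PRIMARY KEY (SingerId);

CREATE TABLE Albums (
  SingerId INT64 NOT NULL,
  AlbumId  INT64 NOT NULL,
  Title    STRING(MAX)
) PRIMARY KEY (SingerId, AlbumId),
  INTERLEAVE IN PARENT Singers ON DELETE CASCADE;
"""

# Toy rows keyed by their full primary key, written as tuples.
SINGERS = {(1,): "Marc Richards", (2,): "Catalina Smith"}
ALBUMS = {(2, 1): "Green", (1, 2): "Go, Go, Go", (1, 1): "Total Junk"}


def child_key_is_valid(parent_key_cols, child_key_cols):
    """The child's key must start with the parent's key columns, in order."""
    return child_key_cols[:len(parent_key_cols)] == parent_key_cols


def interleaved_layout(parents, children):
    """Sort parent and child rows by key so children follow their parent."""
    rows = [(key, "Singers", value) for key, value in parents.items()]
    rows += [(key, "Albums", value) for key, value in children.items()]
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for key, table, value in interleaved_layout(SINGERS, ALBUMS):
        indent = "  " if table == "Albums" else ""
        print(f"{indent}{table}{key}: {value}")
```

Because the child key shares the parent's key as a prefix, sorting by key places each singer's albums directly after that singer. This ordering is what the co-location of parent and child rows relies on.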

If you are interested in reading Google’s white paper on Spanner, presented at the Special Interest Group on Management of Data conference (SIGMOD ’17), here is the link. It is an absolutely fascinating and insightful read.
