Joins
When your database tables relate to each other, you can represent those relationships directly in your data models. This allows you to build dashboards using data from multiple tables without writing manual JOIN statements.
Imagine the following database:
Table | Key Relationships |
---|---|
customers | A customer can have many orders. |
orders | Each order is linked to one customer and one product. |
products | Each product can appear in many orders. |
With these relationships, you can answer questions like:
- "How many orders did a specific customer place last month?"
- "What’s the most popular product in a specific country this week?"
- "Which customer has spent the most money?"
There are two ways to approach combining these tables:
1. One Model for Multiple Tables
You can use the sql
parameter to combine tables. This approach is often very useful if multiple raw database tables need to be joined together to represent one clean user-facing concept.
cubes:
- name: customers_and_orders_and_products
sql: >
SELECT *
FROM public.customers
JOIN public.orders ON customers.id = orders.customer_id
JOIN public.products ON products.id = orders.product_id
...
2. Multiple Models with Joins
The general rule of thumb is to try and have one data model for each user-facing concept.
E.g. Spotify would probably want a data model to represent each of the following concepts: playlists
, listeners
, artists
, tracks
, etc.
Since customers
, orders
, and products
are user-facing concepts, the best approach is therefore to create separate models and then link them, like so:
cubes:
- name: customers
sql_table: public.customers
joins:
- name: orders # the name of the model you're joining to
sql: "{CUBE}.id = {orders}.customer_id" # references the customer_id *column* in the orders model, but you can also reference dimensions (learn more below)
relationship: one_to_many
dimensions:
- name: id
sql: id
type: number
primary_key: true
...
cubes:
- name: orders
sql_table: public.orders
joins:
- name: products # the name of the model you're joining to
sql: "{CUBE}.product_id = {products}.id" # references the id column in the products model, but you can also reference dimensions (learn more below)
relationship: many_to_one
- name: customers # the name of the model you're joining to
sql: "{CUBE}.customer_id = {customers}.id" # references the id column in the customers model, but you can also reference dimensions (learn more below)
relationship: many_to_one
dimensions:
- name: id
sql: id
type: number
primary_key: true
...
cubes:
- name: products
sql_table: public.products
joins:
- name: orders
sql: "{CUBE}.id = {orders}.product_id" # references the product_id column in the orders model, but you can also reference dimensions (learn more below)
relationship: one_to_many
dimensions:
- name: id
sql: id
type: number
primary_key: true
...
-
name
: In ajoins
block,name
refers to another model (cube), not the underlying database table. -
sql
: Specifies the SQL condition that defines how two models are related. You can reference either columns or dimensions:{products}.some_column
points to the some_column field in the database table associated with the products model.{products.some_dimension}
refers to the some_dimension dimension that’s defined inside the products model.my_products.some_column
accesses the some_column in the actual my_products table in your database (i.e. the raw table name).
-
relationship
: Setrelationship
to one_to_one (opens in a new tab), one_to_many (opens in a new tab), or many_to_one (opens in a new tab) depending on your schema. -
primary_key
: Each model should have a dimension markedprimary_key: true
. This ensures Cube can generate valid JOIN queries. By default, primary keys are hidden in the UI - usepublic: true
if you want to expose them as dimensions. If no single column uniquely identifies rows, build a composite key with a custom SQL expression:dimensions: - name: id sql: CONCAT_WS('|', tableA.id, tableB.id) type: string primary_key: true
Joining data from multiple data sources
You can also join data stored in different data sources. Visit Cube's documentation (opens in a new tab) to learn more.