Why You Need a Default Partition
Required default partitions to avoid lost data in Postgres and YugabyteDB
Postgres and YugabyteDB allow you to define partitions of parent tables. Partitions are useful in at least two ways:
- You can take advantage of partition pruning. The database doesn’t need to look at partitions it knows won’t meet the parameters of the query.
- You can easily archive data by disconnecting and/or dropping partitions instead of managing expensive delete queries.
Here’s one gotcha I ran into recently. What happens if you insert a row into a partitioned table, but there’s no partition for it? The insert fails with an error – see below for a reproduction of this scenario.
Create the parent table:
create database partition_test;
\c partition_test
create table my_logging (
id int,
name text,
date_timestamp timestamp without time zone
) partition by range (date_timestamp);
Create the partitions. Here I have intentionally skipped creating a partition for 202402-18. Note also that partitions are defined as _inclusive at the lower end of the range and exclusive at the upper end of the range.
create table my_logging_2024_02_17 partition of my_logging
for values FROM ('2024-02-17 00:00:00.000000') to ('2024-02-18 00:00:00.000000');
create table my_logging_2024_02_19 partition of my_logging
for values FROM ('2024-02-19 00:00:00.000000') to ('2024-02-20 00:00:00.000000');
create table my_logging_2024_02_20 partition of my_logging
for values FROM ('2024-02-20 00:00:00.000000') to ('2024-02-21 00:00:00.000000');
Add some data:
insert into my_logging values (1, 'a', '2024-02-17 00:00:10');
INSERT 0 1
insert into my_logging values (2, 'b', '2024-02-18 00:00:10');
ERROR: no partition of relation "my_logging" found for row
DETAIL: Partition key of the failing row contains (date_timestamp) = (2024-02-18 00:00:10).
insert into my_logging values (3, 'c', '2024-02-19 00:00:10');
INSERT 0 1
insert into my_logging values (4, 'd', '2024-02-20 00:00:10');
INSERT 0 1
You can see that the insert of the row with a date on the 18th fails. Hopefully your application is handling errors and not just ignoring them. If the application allows this to fail silently, that row never makes it to the database and could be lost forever.
This is what the data looks like after the above insert statements:
select id, name, date_timestamp from my_logging;
id | name | date_timestamp
----+------+---------------------
1 | a | 2024-02-17 00:00:10
3 | c | 2024-02-19 00:00:10
4 | d | 2024-02-20 00:00:10
(3 rows)
The solution to avoiding failed inserts for data outside defined partitions is to create a default partition:
create table my_logging_default partition of my_logging default;
insert into my_logging values (2, 'b', '2024-02-18 00:00:10');
INSERT 0 1
Under ideal circumstances, of course, nothing should go into this default partition, because you’ve defined partitions for all cases. If you do end up with a large default partition, it takes away the benefit of having smaller distinct partitions (for query optimization and maintenance). The solution is to ensure that your partition management scheme is kept up-to-date.