Jacob Swanner Development Blog

PostgreSQL MERGE and Sequences

Recently released PostgreSQL version 15 contains a new command: MERGE. We can use MERGE to do “upsert” operations, much like when providing an ON CONFLICT clause to INSERT, but MERGE can be used for other tasks as well. There’s an interesting difference when doing upserts between the two commands, and that’s what we’re going to discuss here.

Using INSERTON CONFLICT

Suppose we have a table defined as (note the unique email constraint):

CREATE TABLE people (
  id bigserial PRIMARY KEY,
  email citext,
  name text,
  UNIQUE(email)
)

After the table is created, we can run the follow INSERT with ON CONFLICT 4 times. The first time will insert a new record, and the other times will update the name since the email conflicts with the existing record:

INSERT INTO people (email, name)
VALUES ('alex@email.com', 'Alex')
ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name

Then, we run the following query to insert another record:

INSERT INTO people (email, name)
VALUES ('billie@email.com', 'Billie')
ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name

Afterwards, we query the table for all of its records:

SELECT * FROM people
 id |      email       |  name
----+------------------+--------
  1 | alex@email.com   | Alex
  5 | billie@email.com | Billie

As we can see, there’s a gap in the value of the IDs, and that’s because the sequence used for the id column is advanced with every INSERT query even it ends up doing an update via ON CONFLICT. Because of that behavior, it’s highly suggested to use bigint/bigserial for auto-incrementing columns when you are going to upsert data into the table.

Using MERGE

Now, let’s see what happens if we use MERGE instead. We’ll keep the table as it was, and we’ll upsert that last record several times using a MERGE query:

MERGE INTO people
USING (VALUES ('billie@email.com', 'Billie')) AS input(email, name)
ON people.email = input.email
WHEN NOT MATCHED THEN
  INSERT (email, name) VALUES (input.email, input.name)
WHEN MATCHED THEN
  UPDATE SET name = input.name

If we then use MERGE to insert a new record:

MERGE INTO people
USING (VALUES ('charlie@email.com', 'Charlie')) AS input(email, name)
ON people.email = input.email
WHEN NOT MATCHED THEN
  INSERT (email, name) VALUES (input.email, input.name)
WHEN MATCHED THEN
  UPDATE SET name = input.name

And check all the records:

SELECT * FROM people
 id |       email       |  name
----+-------------------+---------
  1 | alex@email.com    | Alex
  5 | billie@email.com  | Billie
  6 | charlie@email.com | Charlie

We now see that there’s no gap with the ID of the last two records, and that’s because our sequence was not advanced when the MERGE was used as an update task. Granted, the MERGE queries are more verbose than the INSERTON CONFLICT equivalents, but I believe that verboseness also provides clarity into the intentions of the query. I should also note that neither Ecto nor ActiveRecord currently provide first class support for MERGE.