Eric Radman : a Journal

Partition Pruning

Declarative partitioning in enables the PostgreSQL query planner to optimize queries in several ways, most notably by pruning partitions that are irrelevant for the query. This results in tremendous performance gains, but only works for values known during the planning stage.

If the partition key is also a natural key, such as a date, pruning may appear to work in all cases, but if an intermediate lookup is required (name → id), queries will resort to scan all partitions.

Memorizing values with PL/pgSQL

Standard SQL does not provide a way to define a session variable but PL/pgSQL does. By assigning a value, a subquery can generate a result that is treated by the planner as a consent

BEGIN;

DO $$
DECLARE
  vol_id integer = (SELECT id FROM sf_volumes.volume WHERE name='mnt-local');
  cursor1 CURSOR FOR
    SELECT name
    FROM file_current
    WHERE volume_id=vol_id
    AND name ~ '.iso$';
BEGIN
  OPEN cursor1;
END
$$;

FETCH ALL FROM cursor1;
COMMIT;

Stable Lookup Functions

Perhaps the most versatile method of providing a constant to a query is to define an IMMUTABLE or STABLE lookup function

CREATE FUNCTION volume_id_from_name(volume_name text)
RETURNS int
LANGUAGE sql IMMUTABLE
AS $$
  SELECT id FROM volume WHERE name = volume_name;
$$;
This function may used anywhere in the query
SELECT name
FROM file_current
WHERE volume_id = volume_id_from_name('mnt-local')
AND name ~ '.iso$';
Or used as a parameter to a prepared statement
PREPARE plan1 (int) AS
SELECT name
FROM file_current
WHERE volume_id = $1
AND name ~ '.iso$';

EXECUTE plan1(volume_id_from_name('mnt-local'));

CTEs remember the lookup result as a stable value, but subqueries serve as an optimization barrier

-- limited to one volume
WITH volume AS (
  SELECT volume_id_from_name('mnt-local') AS id
)
SELECT volume_id, file_current.id, parent_id, name
FROM file_current, volume
WHERE volume_id = volume.id
AND name ~ '.iso$';

-- scans all volumes
SELECT volume_id, file_current.id, parent_id, name
FROM file_current
WHERE volume_id = (SELECT volume_id_from_name('mnt-local'))
AND name ~ '.iso$';

Temporary Tables

Sometimes the easiest way to structure a query is to write intermediate results to a table. When combined with a single-use lookup function multiple partitions may be matched using ANY()

CREATE TABLE pg_temp.candidates AS (
  SELECT volume_id, id, parent_id, name
  FROM file_current
  WHERE name ~ '.iso$'
);
CREATE FUNCTION pg_temp.top_values()
RETURNS int[]
LANGUAGE sql IMMUTABLE PARALLEL SAFE
AS $$
  SELECT array_agg(DISTINCT volume_id) FROM pg_temp.candidates;
$$;
SELECT file_current.*
FROM file_current
JOIN pg_temp.candidates USING (id, volume_id)
WHERE volume_id = ANY(pg_temp.top_values())