Check if you have an Index on your IDs in PostgreSQL
Why?
PostgreSQL does not automatically create an index for your ID column. This article will explain why you should always create an Index on your ID columns and how to check if all your existing tables have one. You don't have a lot of time? Just click here to go directly to the final query.
How to proceed for new tables?
Let's say you create a table:
CREATE TABLE test (
id serial,
column1 text,
column2 text
);
You should always create an Index for your ID afterwards:
CREATE INDEX ON test (id);
The associated PostgreSQL documentation can be found here.
Performance demonstration
I have a table with 1.25 million data:
SELECT count(*) FROM my_table;
count
---------
1259722
(1 row)
Time: 540.203 ms
I want to update an ID but I do not have an Index on my ID:
UPDATE my_table SET update = now() WHERE id = 12345;
UPDATE 1
Time: 1213.705 ms
For the second test, I create the index:
CREATE INDEX ON my_table (id);
And I run again the sane query:
UPDATE my_table SET update = now() WHERE id = 12345;
UPDATE 1
Time: 2.108 ms
The exact same query went 575 times faster with the Index.
How do I check all my tables?
We just saw that everyone should always create an Index on the ID column for each table. It is (almost) never going to be useless.
But now, let's find out how to scan all the tables of our database to find out were did we forgot to put an Index of our IDs.
First, let's use the "information_schema" schema from PostgreSQL to find all the tables of our database:
postgres@db # SELECT table_name
FROM information_schema.tables
WHERE table_schema='public' -- Default is usually 'public'
AND table_type='BASE TABLE';
table_name
------------
test
test_2
test_3
test_4
test_5
(5 rows)
We can now use a second query to find all the columns for one table:
postgres@db # SELECT ordinal_position, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public' -- Default is usually 'public'
AND table_name = 'test' -- the table we chose
ORDER BY ordinal_position;
ordinal_position | column_name | data_type
------------------+-------------+-----------
1 | id | integer
2 | column1 | text
3 | column2 | text
(3 rows)
Using both tables (information_schema.tables and information_schema.columns) we can now find all the tables with an ID (we don't need the tables with no ID) :
postgres@db # SELECT table_name
FROM information_schema.tables
WHERE table_schema='public'
AND table_type='BASE TABLE'
AND EXISTS (
SELECT 1
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = information_schema.tables.table_name
AND column_name = 'id'
);
table_name
------------
test
test_2
test_3
test_4
(4 rows)
We just found out that "test_5" has no ID and does not need to be checked.
Now, let's find all the indexes for the table "test" :
postgres@db # SELECT a.attname
FROM pg_class t
JOIN pg_index ix ON t.oid = ix.indrelid
JOIN pg_class i ON i.oid = ix.indexrelid
JOIN pg_attribute a ON a.attrelid = t.oid
WHERE
a.attnum = ANY(ix.indkey) AND
t.relkind = 'r' AND
t.relname ='test' -- table name
ORDER BY
t.relname,
i.relname;
attname
---------
column1
column2
id
(3 rows)
Using the last 3 queries, we can now find:
- The list of all the tables in our database
- The list of all the columns for each table (to skip tables with no IDs)
- The list of Indexes for each table
It's now easy to find out which tables have an ID column but no Index on it:
postgres@db # WITH tables_with_id AS ( -- list of tables with an ID column
SELECT table_name
FROM information_schema.tables
WHERE table_schema='public'
AND table_type='BASE TABLE'
AND EXISTS (
SELECT 1
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = information_schema.tables.table_name
AND column_name = 'id'
)
),
tables_ok AS ( -- list of tables with an ID column AND an Index on the ID column
SELECT t.relname
FROM pg_class t
JOIN pg_index ix ON t.oid = ix.indrelid
JOIN pg_class i ON i.oid = ix.indexrelid
JOIN pg_attribute a ON a.attrelid = t.oid
WHERE
a.attnum = ANY(ix.indkey) AND
t.relkind = 'r' AND
t.relname IN (SELECT table_name FROM tables_with_id) AND -- we seach only tables with an ID
a.attname ='id' -- we chack only Indexes on the column ID
ORDER BY
t.relname,
i.relname
)
SELECT CONCAT('CREATE INDEX ON ', table_name, ' (id);') as you_need_to_add_the_following_indexes
FROM information_schema.tables
WHERE table_schema='public' AND
table_type='BASE TABLE' AND -- Default is usually 'public'
table_name IN (SELECT table_name FROM tables_with_id) AND -- the table have an ID
table_name NOT IN (SELECT relname FROM tables_ok) -- the table is not in the list of tables with an Index on the column ID
;
you_need_to_add_the_following_indexes
----------------------------------------------
CREATE INDEX ON test_3 (id);
CREATE INDEX ON test_4 (id);
(2 rows)
That's it !