postgresql
install
# apt install gnupg2
# wget -q -O /etc/apt/trusted.gpg.d/postgresql.asc https://www.postgresql.org/media/keys/ACCC4CF8.asc
# echo "# PostgreSQL" >> /etc/apt/sources.list
# echo "deb https://apt.postgresql.org/pub/repos/apt bookworm-pgdg main" >> /etc/apt/sources.list
# apt update
# apt install postgresql
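quick check (the Debian package creates a postgresql service and a postgres system user; assumed, not from the original notes):
# systemctl status postgresql
# su postgres -c 'psql -c "SELECT version()"'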
config
create admin and database
# "local" is for Unix domain socket connections only
local all all scram-sha-256
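pg_hba.conf changes need a reload (path below assumes the Debian layout):
# vi /etc/postgresql/<VERSION>/main/pg_hba.conf
# systemctl reload postgresql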
# su postgres
$ createuser pguser -P # roles starting with pg_* are reserved
$ createdb <DB_NAME> -O pguser -W # -O owner, -W force password prompt
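equivalent in plain SQL (createuser/createdb are wrappers around these commands):
postgres$ psql
CREATE ROLE pguser LOGIN PASSWORD '<PASSWORD>';
CREATE DATABASE <DB_NAME> OWNER pguser;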
fine-tuning
shared_buffers = 256MB
work_mem = 4MB
maintenance_work_mem = 64MB
effective_cache_size = # planner hint; ~50-75% of RAM is common advice
max_wal_size = # replaces checkpoint_segments (removed in 9.5)
checkpoint_completion_target = # fraction of the checkpoint interval, 0.9 typical
max_connections = 100
# avoid disconnection on persistent connections
# default is 0 (i.e. use OS defaults)
tcp_keepalives_idle = 300
tcp_keepalives_interval = 60
tcp_keepalives_count = 5
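these live in postgresql.conf; ALTER SYSTEM can set them too (shared_buffers still needs a full restart):
ALTER SYSTEM SET work_mem = '4MB'; -- written to postgresql.auto.conf
SELECT pg_reload_conf();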
psql
list all schemas
select schema_name from information_schema.schemata;
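or the psql shortcut (\dnS to include system schemas):
\dn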
list all tables for a schema
\dt <SCHEMA_NAME>.*
clients
DBeaver
Right-click on db name > Edit connection > Connection settings > Initialization:
. check auto-commit
. set keep-alive to 60s
. set close idle connection after to 14400s (or uncheck)
python
psycopg
$ pip3 install psycopg
pool
$ pip3 install "psycopg[pool]" # quotes: [] is a shell glob
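a minimal pool sketch (DSN and query are placeholders):
from psycopg_pool import ConnectionPool

# open the pool, borrow a connection, return it on exit
with ConnectionPool("postgres://pguser:<PWD>@localhost:5432/<DB_NAME>") as pool:
    with pool.connection() as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            print(cur.fetchone())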
backup & restore
URI: postgres://user:pwd@host:port/db_name
Backup
$ pg_dump -Fc -d <DB_NAME> -n <SCHEMA> -U <USER> -f <FILENAME> -W
-Fc : postgres' custom format, mandatory for pg_restore
-n <SCHEMA> : dump specific schema only
-N <SCHEMA> : exclude schema
-t <TABLE> : dump specific table only
-T <TABLE> : exclude table
-h <HOST> : remote host
-p <PORT> : remote port
-W : force password prompt
Restore
Warning
With --no-owner, pg_restore creates schemas owned by postgres.
If that is a problem (e.g. usernames differ between servers), use -U for a specific user or create the schemas manually before pg_restore.
postgres$ pg_restore -d <DB_NAME> --clean --no-owner -U <USER> <FILENAME>
--clean : drop all objects in db before restore
-n <SCHEMA> : public or specific schema
--no-owner : skip original ownership; restored objects belong to the connecting user (-U <USER>)
$ psql -d <DB_NAME> -U pguser -W # check restored data
server pooling - pgcat
HA
patroni
# apt install etcd
# pip3 install patroni
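a minimal patroni.yml sketch, one per node (IPs, version and passwords are placeholders; check key names against your patroni version, and install the DCS client extra, e.g. pip3 install "patroni[etcd]"):
scope: <CLUSTER_NAME>
name: node1
restapi:
  listen: 0.0.0.0:8008
  connect_address: <NODE_IP>:8008
etcd3:
  host: <ETCD_IP>:2379
bootstrap:
  initdb:
    - encoding: UTF8
postgresql:
  listen: 0.0.0.0:5432
  connect_address: <NODE_IP>:5432
  data_dir: /var/lib/postgresql/<VERSION>/main
  bin_dir: /usr/lib/postgresql/<VERSION>/bin
  authentication:
    superuser:
      username: postgres
      password: <PWD>
    replication:
      username: replicator
      password: <PWD>
$ patroni /etc/patroni.yml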
tips
cursor
use one cursor per query
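e.g. with psycopg (placeholder DSN/queries):
import psycopg

with psycopg.connect("<DSN>") as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
    with conn.cursor() as cur: # fresh cursor for the next query
        cur.execute("SELECT 2")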
update
On UPDATE,
postgres creates a new row version, then marks the old one dead (MVCC); vacuum reclaims it later.
So there should be no performance difference between updating 1 column or many.
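dead row versions can be observed (hypothetical events table; stats may lag a little):
UPDATE events SET status = 'done' WHERE id = 1;
SELECT n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'events';
VACUUM events; -- reclaims the dead row versions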
partitions
Partitioning is an efficient way to split a table and limit performance degradation as it grows.
Physically, data is stored in separate tables, but partitions are logically bound as one table: application queries stay simple (only one table). Index creation is also done on the main structure, then postgresql creates the necessary indexes on each partition under the hood.
-- table migration to a partitioned table
ALTER TABLE events RENAME TO events_old;
CREATE TABLE events (
...events_old schema
) PARTITION BY LIST (col_name);
CREATE TABLE events_ping PARTITION OF events FOR VALUES IN ('ping');
CREATE TABLE events_default PARTITION OF events DEFAULT;
INSERT INTO events SELECT * FROM events_old;
DROP TABLE events_old;
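check that partition pruning kicks in:
EXPLAIN SELECT * FROM events WHERE col_name = 'ping'; -- plan should scan events_ping only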
PRIMARY KEY = UNIQUE + NOT NULL + index
It also adds semantics to the table definition.
Other differences: only one pkey per table, and pkey indexes are b-tree only.
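i.e. these two are functionally close, but only the first declares intent:
CREATE TABLE a (id text PRIMARY KEY);
CREATE TABLE b (id text UNIQUE NOT NULL);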
TEXT vs VARCHAR(n)
TEXT and VARCHAR(n) are stored in an identical way by postgresql; VARCHAR(n) only adds a check on string length. So read performance is identical, and writes have a negligible overhead.
BUT, if the column is indexed (or a primary key), postgresql can optimize VARCHAR because the maximum length is known. So for a primary key, VARCHAR might be the better choice.
typical lengths: ulid 26, uuid 36, email 100, last/first name 50
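e.g. a hypothetical table using those lengths:
CREATE TABLE users (
    id         varchar(26) PRIMARY KEY, -- ulid
    email      varchar(100) UNIQUE NOT NULL,
    last_name  varchar(50),
    first_name varchar(50)
);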
TEXT vs JSONB
JSONB exact-match queries are ~1.2x slower than a similar TEXT query (both indexed), so they can be considered equivalent.
-- JSONB index best practices
-- use btree (default) for single-field equality queries; note the double parens (expression index)
CREATE INDEX idx_data_field ON <TABLE> ((<JSONB_COL>->>'field'));
-- use a GIN index for complex queries (containment @>, key existence ?)
CREATE INDEX idx_data ON <TABLE> USING GIN (<JSONB_COL>);
-- cast inside the index for numeric/integer field queries
CREATE INDEX idx_data_field_int ON <TABLE> (((<JSONB_COL>->>'field')::int));
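queries these indexes serve:
SELECT * FROM <TABLE> WHERE <JSONB_COL>->>'field' = 'x'; -- btree expression index
SELECT * FROM <TABLE> WHERE <JSONB_COL> @> '{"field": "x"}'; -- GIN containment
SELECT * FROM <TABLE> WHERE (<JSONB_COL>->>'field')::int > 10; -- btree on the cast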