PostgreSQL Performance: Optimizing PostgreSQL for High-Traffic Applications
Has your PostgreSQL-backed application ever suddenly slowed to a crawl? Or have you been hit with a "database connection timeout" alert right as traffic peaks? I ran into this at an e-commerce startup: a query that normally finished in 50 ms suddenly took 5 seconds during a flash sale!
After digging in, it turned out the problem wasn't the hardware but suboptimal configuration and query design. The interesting part: after tuning, performance improved by up to 10x without upgrading the server at all!
In this guide we'll explore PostgreSQL optimization techniques from basic to advanced, focusing on practical changes that actually make a difference for production applications.
Why PostgreSQL for High-Traffic Applications?
Before we get into tuning, let's recap why PostgreSQL is worth considering:
- ACID compliance: solid data integrity for critical transactions
- Advanced features: JSONB, full-text search, geospatial types, window functions (see the short example below)
- Extensibility: can be extended with extensions such as PostGIS and TimescaleDB
- Performance: a sophisticated optimizer and parallel query execution
- Community: open source with active development
Companies using PostgreSQL include Instagram, Spotify, Apple, Reddit, and Airbnb.
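As a quick taste of those built-in features, here is a small sketch of a window function that ranks each user's orders by amount without a self-join; the table and columns match the orders examples used later in this guide:
-- Rank each user's orders by amount using a window function
SELECT user_id,
       id AS order_id,
       amount,
       rank() OVER (PARTITION BY user_id ORDER BY amount DESC) AS amount_rank
FROM orders;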
Diagnosing Performance Issues
Optimizing without measuring is like driving blindfolded. These are the essential diagnostic tools:
1. PostgreSQL Log Analysis
# Enable slow query logging in postgresql.conf
log_min_duration_statement = 1000   # log statements that run longer than 1 second
log_statement = 'none'              # don't log every statement
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
-- After changing postgresql.conf, reload the configuration
SELECT pg_reload_conf();
2. Built-in Monitoring Views
-- Slowest queries (requires pg_stat_statements; column names below are for PostgreSQL 13+, older versions use total_time / mean_time)
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Table activity monitoring
SELECT
schemaname,
relname,
seq_scan,
seq_tup_read,
idx_scan,
idx_tup_fetch,
n_tup_ins,
n_tup_upd,
n_tup_del
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;
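It also pays to check the opposite problem: indexes that are never used still slow down writes and take disk space. A quick check against pg_stat_user_indexes (the counters cover the period since statistics were last reset, so read them over a representative window):
-- Indexes that have never been scanned since statistics were last reset
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;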
3. EXPLAIN ANALYZE Deep Dive
-- Basic EXPLAIN
EXPLAIN SELECT * FROM orders WHERE user_id = 1234;
-- EXPLAIN with actual execution times and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 1234;
-- JSON format for detailed analysis or visualizer tools
EXPLAIN (ANALYZE, FORMAT JSON)
SELECT * FROM orders WHERE user_id = 1234;
Configuration Optimization
1. Memory Settings (postgresql.conf)
# Rule of thumb for a dedicated database server with 16 GB RAM:
shared_buffers = 4GB            # ~25% of total RAM
effective_cache_size = 12GB     # ~75% of total RAM (planner hint, not an allocation)
work_mem = 64MB                 # per sort or hash operation; one query can use several
maintenance_work_mem = 1GB      # for VACUUM, CREATE INDEX
-- Check current settings
SELECT name, setting, unit FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem');
2. Connection and Checkpoint Settings
max_connections = 200                            # size this to what the application actually needs
shared_preload_libraries = 'pg_stat_statements'  # for query monitoring (requires a restart)
# Checkpoint optimization
checkpoint_completion_target = 0.9   # spread checkpoint I/O over time
wal_buffers = 16MB                   # write-ahead log buffers
max_wal_size = 2GB
min_wal_size = 1GB
3. Parallel Query Settings
max_parallel_workers_per_gather = 4  # parallel workers per Gather node
max_worker_processes = 8             # total background worker processes
max_parallel_workers = 8             # system-wide cap on parallel workers
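To check whether a given query actually benefits from these settings, look for a Gather node and the number of workers launched in its plan; a minimal sketch against the orders table used throughout this guide:
-- A large aggregate is a typical candidate for a parallel plan
EXPLAIN (ANALYZE)
SELECT count(*) FROM orders;
-- In the output, look for "Gather" and "Workers Planned/Launched"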
Indexing Strategies for PostgreSQL
PostgreSQL ships with several powerful index types:
1. B-Tree Index (Standard)
-- Single column index
CREATE INDEX idx_orders_user_id ON orders(user_id);
-- Composite index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Partial index (indexes only a subset of rows)
CREATE INDEX idx_orders_active ON orders(user_id)
WHERE status = 'active';
-- Expression index
CREATE INDEX idx_orders_lower_email ON orders(LOWER(email));
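One caveat with expression indexes: the query has to use the exact same expression for the planner to consider the index. A small usage sketch (the email value is illustrative):
-- Case-insensitive lookup that can use idx_orders_lower_email
SELECT * FROM orders WHERE LOWER(email) = LOWER('Alice@Example.com');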
2. BRIN Index (Block Range Index)
-- Ideal for naturally ordered data such as timestamps; tiny compared to a B-tree
CREATE INDEX idx_orders_created_brin ON orders
USING BRIN(created_at);
-- Alternatively, tune pages_per_range (drop the index above first, since this reuses its name)
CREATE INDEX idx_orders_created_brin ON orders
USING BRIN(created_at) WITH (pages_per_range = 16);
3. GIN Index (Generalized Inverted Index)
-- For JSONB and array columns
CREATE INDEX idx_products_tags_gin ON products
USING GIN(tags);
-- For full-text search
CREATE INDEX idx_products_description_gin ON products
USING GIN(to_tsvector('english', description));
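A query only uses that index when it matches against the same to_tsvector('english', description) expression; a usage sketch with illustrative search terms:
-- Full-text search that can use idx_products_description_gin
SELECT id, name
FROM products
WHERE to_tsvector('english', description) @@ to_tsquery('english', 'wireless & keyboard');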
4. GiST Index (Generalized Search Tree)
-- For geospatial data with PostGIS
CREATE INDEX idx_locations_geom_gist ON locations
USING GIST(geom);
-- GiST also supports full-text search (GIN is usually faster for lookups; GiST is cheaper to update)
CREATE INDEX idx_documents_content_gist ON documents
USING GIST(to_tsvector('english', content));
Query Optimization Techniques
1. Avoid N+1 Query Problem
-- ❌ BAD: N+1 queries
-- The application first runs: SELECT * FROM users WHERE id = 123
-- and then issues a separate query per user for their orders: SELECT * FROM orders WHERE user_id = 123
-- ✅ GOOD: A single query with a JOIN
SELECT u.*, json_agg(o) as orders
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.id = 123
GROUP BY u.id;
2. Efficient Pagination
-- ❌ BAD: OFFSET for deep pages
SELECT * FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000; -- Very slow: the 10,000 skipped rows are still read and sorted
-- ✅ GOOD: Keyset pagination (pass the last created_at seen on the previous page)
SELECT * FROM products
WHERE created_at < '2024-01-01'
ORDER BY created_at DESC
LIMIT 20;
-- ✅ OR: a CTE with row_number() when arbitrary page jumps are needed (it still scans the preceding rows)
WITH numbered_products AS (
SELECT *, row_number() OVER (ORDER BY created_at DESC) as rn
FROM products
)
SELECT * FROM numbered_products
WHERE rn BETWEEN 10001 AND 10020;
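If created_at is not unique, plain keyset pagination can skip or repeat rows at page boundaries. A common fix is to break ties with the primary key using a row-constructor comparison; a sketch assuming the previous page ended at created_at '2024-01-01' and id 98765 (both values are illustrative):
-- Keyset pagination with a tie-breaker column
SELECT * FROM products
WHERE (created_at, id) < ('2024-01-01', 98765)
ORDER BY created_at DESC, id DESC
LIMIT 20;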
3. Batch Operations
-- ❌ BAD: Individual INSERTs (one round trip each)
INSERT INTO orders (user_id, amount) VALUES (1, 100);
INSERT INTO orders (user_id, amount) VALUES (2, 200);
-- ...
-- ✅ GOOD: Multi-row INSERT
INSERT INTO orders (user_id, amount) VALUES
(1, 100), (2, 200), (3, 300), ...;
-- ✅ BETTER: COPY for large bulk loads
COPY orders (user_id, amount) FROM STDIN WITH CSV;
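COPY ... FROM STDIN is what psql and most client drivers use under the hood; from psql, the client-side variant is the \copy meta-command, sketched here with an illustrative file name:
-- In psql: stream a local CSV file into the table (client-side COPY)
\copy orders (user_id, amount) FROM 'orders.csv' WITH (FORMAT csv)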
Advanced PostgreSQL Features for Performance
1. Materialized Views
-- For complex aggregations whose results change infrequently
CREATE MATERIALIZED VIEW sales_summary AS
SELECT
date_trunc('month', order_date) as month,
product_category,
COUNT(*) as order_count,
SUM(amount) as total_sales
FROM orders o
JOIN products p ON o.product_id = p.id
GROUP BY month, product_category;
-- A unique index is required before REFRESH ... CONCURRENTLY can be used
CREATE UNIQUE INDEX idx_sales_summary_month_category
ON sales_summary (month, product_category);
-- Refresh the materialized view without blocking readers
REFRESH MATERIALIZED VIEW CONCURRENTLY sales_summary;
2. Partial Aggregation with CTEs
-- Complex aggregation with Common Table Expressions
WITH monthly_sales AS (
SELECT
user_id,
date_trunc('month', order_date) as month,
SUM(amount) as monthly_total
FROM orders
WHERE order_date >= current_date - interval '1 year'
GROUP BY user_id, month
),
user_summary AS (
SELECT
user_id,
COUNT(*) as active_months,
AVG(monthly_total) as avg_monthly_spend,
MAX(monthly_total) as max_monthly_spend
FROM monthly_sales
GROUP BY user_id
)
SELECT
u.name,
us.active_months,
us.avg_monthly_spend,
us.max_monthly_spend
FROM users u
JOIN user_summary us ON u.id = us.user_id
WHERE us.active_months >= 3;
3. JSONB for Flexible Data
-- Store flexible attributes in a JSONB column
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
attributes JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- GIN index for JSONB containment queries
CREATE INDEX idx_products_attributes_gin ON products
USING GIN(attributes);
-- Efficient containment query on JSONB (can use the GIN index)
SELECT * FROM products
WHERE attributes @> '{"color": "red", "size": "large"}';
-- Aggregate JSONB data
SELECT
attributes->>'category' as category,
COUNT(*) as product_count
FROM products
GROUP BY attributes->>'category';
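The @> containment query above is what the GIN index accelerates; equality filters or GROUP BY on a single extracted key such as attributes->>'category' do not use it. A sketch of a B-tree expression index for that access pattern (the index name and category value are illustrative):
-- B-tree expression index for queries that filter on one extracted JSONB key
CREATE INDEX idx_products_attr_category ON products ((attributes->>'category'));
-- This predicate can now use the expression index
SELECT * FROM products WHERE attributes->>'category' = 'electronics';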
Partitioning for Large Tables
1. Range Partitioning
-- Partition orders by month
CREATE TABLE orders (
id BIGSERIAL,
user_id INTEGER,
amount DECIMAL(10,2),
order_date DATE NOT NULL
) PARTITION BY RANGE (order_date);
-- Create partitions
CREATE TABLE orders_2024_01 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Index each partition (on PostgreSQL 11+, an index created on the parent table propagates to every partition)
CREATE INDEX idx_orders_2024_01_user_id ON orders_2024_01(user_id);
CREATE INDEX idx_orders_2024_02_user_id ON orders_2024_02(user_id);
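To confirm that queries only touch the relevant partition, check the plan; with a range condition on order_date the planner should prune the other partitions:
-- Only orders_2024_01 should appear in this plan (partition pruning)
EXPLAIN
SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01';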
2. Automatic Partition Creation
-- With declarative partitioning, an INSERT into a month that has no partition fails
-- before any trigger on the parent can run, so missing partitions cannot be created
-- on the fly from a trigger. Create them ahead of time instead, e.g. from a
-- scheduled job (cron / pg_cron) or with the pg_partman extension.
CREATE OR REPLACE FUNCTION create_orders_partition(p_date DATE)
RETURNS void AS $$
DECLARE
    partition_name TEXT;
    range_start DATE;
    range_end DATE;
BEGIN
    range_start := date_trunc('month', p_date);
    range_end := range_start + interval '1 month';
    partition_name := 'orders_' || to_char(range_start, 'YYYY_MM');
    IF NOT EXISTS (SELECT 1 FROM pg_tables
                   WHERE tablename = partition_name) THEN
        EXECUTE format(
            'CREATE TABLE %I PARTITION OF orders
             FOR VALUES FROM (%L) TO (%L)',
            partition_name, range_start, range_end
        );
    END IF;
END;
$$ LANGUAGE plpgsql;
-- Create next month's partition ahead of time (schedule this call, e.g. daily)
SELECT create_orders_partition((current_date + interval '1 month')::date);
Connection Pooling and High Availability
1. PgBouncer for Connection Pooling
# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
# Start pgbouncer
pgbouncer -d pgbouncer.ini
2. Read Replicas for Load Balancing
# On the primary server (postgresql.conf)
wal_level = replica
max_wal_senders = 10
# On the replica (postgresql.conf; since PostgreSQL 12 there is no recovery.conf or
# standby_mode -- create an empty standby.signal file in the data directory instead)
hot_standby = on
primary_conninfo = 'host=primary-server port=5432 user=replica password=secret'
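Once a replica is attached, you can see its state and how far it lags by querying pg_stat_replication on the primary:
-- Run on the primary: replication status and lag per standby
SELECT client_addr, state, sync_state,
       sent_lsn, replay_lsn, replay_lag
FROM pg_stat_replication;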
Monitoring and Maintenance
1. Automated Vacuum Management
# Autovacuum settings (postgresql.conf)
autovacuum_vacuum_scale_factor = 0.1    # vacuum when ~10% of a table's rows are dead
autovacuum_analyze_scale_factor = 0.05  # analyze after ~5% of rows change
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = 200
-- Monitor vacuum activity
SELECT
schemaname,
relname,
n_dead_tup,
n_live_tup,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC;
2. Query Performance Monitoring
-- Enable pg_stat_statements (also requires shared_preload_libraries = 'pg_stat_statements' and a restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top 10 slowest queries
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Reset statistics
SELECT pg_stat_statements_reset();
Case Study: Optimizing an E-commerce Platform
Before Optimization (query time: 3.2 seconds)
-- The original slow query
SELECT
p.id,
p.name,
p.price,
c.name as category_name,
(SELECT AVG(rating) FROM reviews r WHERE r.product_id = p.id) as avg_rating,
(SELECT COUNT(*) FROM order_items oi WHERE oi.product_id = p.id) as sales_count
FROM products p
LEFT JOIN categories c ON p.category_id = c.id
WHERE p.price BETWEEN 10000 AND 1000000
AND p.stock > 0
AND EXISTS (SELECT 1 FROM product_attributes pa
WHERE pa.product_id = p.id AND pa.attribute_id = 5)
ORDER BY p.created_at DESC
LIMIT 50;
After Optimization (query time: 0.15 seconds)
-- Optimized query using CTEs and better indexing
WITH product_stats AS (
SELECT
product_id,
AVG(rating) as avg_rating,
COUNT(*) as review_count
FROM reviews
GROUP BY product_id
),
product_sales AS (
SELECT
product_id,
COUNT(*) as sales_count
FROM order_items
GROUP BY product_id
)
SELECT
p.id,
p.name,
p.price,
c.name as category_name,
ps.avg_rating,
pss.sales_count
FROM products p
LEFT JOIN categories c ON p.category_id = c.id
LEFT JOIN product_stats ps ON p.id = ps.product_id
LEFT JOIN product_sales pss ON p.id = pss.product_id
WHERE p.price BETWEEN 10000 AND 1000000
AND p.stock > 0
AND p.id IN (SELECT product_id FROM product_attributes WHERE attribute_id = 5)
ORDER BY p.created_at DESC
LIMIT 50;
-- Indexes added
CREATE INDEX idx_products_price_stock ON products(price, stock)
WHERE stock > 0;
CREATE INDEX idx_reviews_product ON reviews(product_id);
CREATE INDEX idx_order_items_product ON order_items(product_id);
CREATE INDEX idx_product_attributes_attribute ON product_attributes(attribute_id, product_id);
Best Practices Checklist
- ✅ Run EXPLAIN ANALYZE on every new query
- ✅ Review pg_stat_statements regularly
- ✅ Implement connection pooling with PgBouncer
- ✅ Use appropriate index types (B-Tree, GIN, GiST, BRIN)
- ✅ Partition large tables according to their access patterns
- ✅ Configure autovacuum appropriately
- ✅ Use materialized views for complex aggregations
- ✅ Add read replicas for read-heavy workloads
- ✅ Schedule REINDEX and, only when bloat demands it, VACUUM FULL during maintenance windows (both take heavy locks)
- ✅ Test with production-like data volumes
Tools for PostgreSQL Optimization
- pgBadger: log analyzer that produces performance reports
- pgAdmin: GUI tool with performance dashboards
- PostgreSQL Explain Visualizer: visual explain plans
- pg_stat_monitor: enhanced statistics collection
- Prometheus + Grafana: time-series monitoring
Conclusion: Performance Is a Journey, Not a Destination
PostgreSQL optimization is not a one-time task but a continuous process. What separates a mediocre database from a high-performance one is:
- Proactive monitoring: don't wait until something breaks
- Deep understanding: know how PostgreSQL works internally
- Iterative improvement: small, continuous optimizations
- Capacity planning: anticipate growth before it becomes a problem
With the techniques in this guide, your PostgreSQL instance should be ready to handle heavy traffic with excellent performance. Remember: measure, optimize, validate, repeat!
Happy optimizing! 🚀