PostgreSQL Performance: Optimizing PostgreSQL for High-Traffic Applications
Has your PostgreSQL-backed application ever suddenly slowed to a crawl? Or have you been hit with a "database connection timeout" alert right as traffic peaks? I ran into this at an e-commerce startup: a query that normally finished in 50 ms suddenly took 5 seconds during a flash sale!
After digging in, it turned out the problem wasn't the hardware but suboptimal configuration and query design. The interesting part: after tuning, performance improved by up to 10x without upgrading the server at all!
In this guide we'll explore PostgreSQL optimization techniques from basic to advanced, focusing on practical changes that actually make a difference for production applications.
Why PostgreSQL for High-Traffic Applications?
Before we get into tuning, let's recap why PostgreSQL is worth considering:
- ACID compliance: solid data integrity for critical transactions
- Advanced features: JSONB, full-text search, geospatial types, window functions (see the short example below)
- Extensibility: can be extended with extensions such as PostGIS and TimescaleDB
- Performance: a sophisticated optimizer and parallel query execution
- Community: open source with active development
Companies using PostgreSQL include Instagram, Spotify, Apple, Reddit, and Airbnb.
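As a quick taste of those built-in features, here is a small sketch of a window function that ranks each user's orders by amount without a self-join; the table and columns match the orders examples used later in this guide:
-- Rank each user's orders by amount using a window function
SELECT user_id,
       id AS order_id,
       amount,
       rank() OVER (PARTITION BY user_id ORDER BY amount DESC) AS amount_rank
FROM orders;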
Diagnosing Performance Issues
Optimizing without measuring is like driving blindfolded. These are the essential diagnostic tools:
1. PostgreSQL Log Analysis
# Enable slow query logging in postgresql.conf
log_min_duration_statement = 1000   # log statements that run longer than 1 second
log_statement = 'none'              # don't log every statement
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
-- After changing postgresql.conf, reload the configuration
SELECT pg_reload_conf();
2. Built-in Monitoring Views
-- Slowest queries (requires pg_stat_statements; column names below are for PostgreSQL 13+, older versions use total_time / mean_time)
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Table activity monitoring
SELECT
schemaname,
relname,
seq_scan,
seq_tup_read,
idx_scan,
idx_tup_fetch,
n_tup_ins,
n_tup_upd,
n_tup_del
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;
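It also pays to check the opposite problem: indexes that are never used still slow down writes and take disk space. A quick check against pg_stat_user_indexes (the counters cover the period since statistics were last reset, so read them over a representative window):
-- Indexes that have never been scanned since statistics were last reset
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;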
3. EXPLAIN ANALYZE Deep Dive
-- Basic EXPLAIN
EXPLAIN SELECT * FROM orders WHERE user_id = 1234;
-- EXPLAIN with actual execution times and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 1234;
-- JSON format for detailed analysis or visualizer tools
EXPLAIN (ANALYZE, FORMAT JSON)
SELECT * FROM orders WHERE user_id = 1234;
Configuration Optimization
1. Memory Settings (postgresql.conf)
# Rule of thumb for a dedicated database server with 16 GB RAM:
shared_buffers = 4GB            # ~25% of total RAM
effective_cache_size = 12GB     # ~75% of total RAM (planner hint, not an allocation)
work_mem = 64MB                 # per sort or hash operation; one query can use several
maintenance_work_mem = 1GB      # for VACUUM, CREATE INDEX
-- Check current settings
SELECT name, setting, unit FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem');
2. Connection and Checkpoint Settings
max_connections = 200                            # size this to what the application actually needs
shared_preload_libraries = 'pg_stat_statements'  # for query monitoring (requires a restart)
# Checkpoint optimization
checkpoint_completion_target = 0.9   # spread checkpoint I/O over time
wal_buffers = 16MB                   # write-ahead log buffers
max_wal_size = 2GB
min_wal_size = 1GB
3. Parallel Query Settings
max_parallel_workers_per_gather = 4  # parallel workers per Gather node
max_worker_processes = 8             # total background worker processes
max_parallel_workers = 8             # system-wide cap on parallel workers
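To check whether a given query actually benefits from these settings, look for a Gather node and the number of workers launched in its plan; a minimal sketch against the orders table used throughout this guide:
-- A large aggregate is a typical candidate for a parallel plan
EXPLAIN (ANALYZE)
SELECT count(*) FROM orders;
-- In the output, look for "Gather" and "Workers Planned/Launched"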
Indexing Strategies for PostgreSQL
PostgreSQL ships with several powerful index types:
1. B-Tree Index (Standard)
-- Single column index
CREATE INDEX idx_orders_user_id ON orders(user_id);
-- Composite index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Partial index (indexes only a subset of rows)
CREATE INDEX idx_orders_active ON orders(user_id)
WHERE status = 'active';
-- Expression index
CREATE INDEX idx_orders_lower_email ON orders(LOWER(email));
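One caveat with expression indexes: the query has to use the exact same expression for the planner to consider the index. A small usage sketch (the email value is illustrative):
-- Case-insensitive lookup that can use idx_orders_lower_email
SELECT * FROM orders WHERE LOWER(email) = LOWER('Alice@Example.com');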
2. BRIN Index (Block Range Index)
-- Ideal for naturally ordered data such as timestamps; tiny compared to a B-tree
CREATE INDEX idx_orders_created_brin ON orders
USING BRIN(created_at);
-- Alternatively, tune pages_per_range (drop the index above first, since this reuses its name)
CREATE INDEX idx_orders_created_brin ON orders
USING BRIN(created_at) WITH (pages_per_range = 16);
3. GIN Index (Generalized Inverted Index)
-- For JSONB and array columns
CREATE INDEX idx_products_tags_gin ON products
USING GIN(tags);
-- For full-text search
CREATE INDEX idx_products_description_gin ON products
USING GIN(to_tsvector('english', description));
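A query only uses that index when it matches against the same to_tsvector('english', description) expression; a usage sketch with illustrative search terms:
-- Full-text search that can use idx_products_description_gin
SELECT id, name
FROM products
WHERE to_tsvector('english', description) @@ to_tsquery('english', 'wireless & keyboard');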
4. GiST Index (Generalized Search Tree)
-- For geospatial data with PostGIS
CREATE INDEX idx_locations_geom_gist ON locations
USING GIST(geom);
-- GiST also supports full-text search (GIN is usually faster for lookups; GiST is cheaper to update)
CREATE INDEX idx_documents_content_gist ON documents
USING GIST(to_tsvector('english', content));
Query Optimization Techniques
1. Avoid N+1 Query Problem
-- ❌ BAD: N+1 queries
-- The application first runs: SELECT * FROM users WHERE id = 123
-- and then issues a separate query per user for their orders: SELECT * FROM orders WHERE user_id = 123
-- ✅ GOOD: A single query with a JOIN
SELECT u.*, json_agg(o) as orders
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.id = 123
GROUP BY u.id;
2. Efficient Pagination
-- ❌ BAD: OFFSET for deep pages
SELECT * FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000; -- Very slow: the 10,000 skipped rows are still read and sorted
-- ✅ GOOD: Keyset pagination (pass the last created_at seen on the previous page)
SELECT * FROM products
WHERE created_at < '2024-01-01'
ORDER BY created_at DESC
LIMIT 20;
-- ✅ OR: a CTE with row_number() when arbitrary page jumps are needed (it still scans the preceding rows)
WITH numbered_products AS (
SELECT *, row_number() OVER (ORDER BY created_at DESC) as rn
FROM products
)
SELECT * FROM numbered_products
WHERE rn BETWEEN 10001 AND 10020;
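If created_at is not unique, plain keyset pagination can skip or repeat rows at page boundaries. A common fix is to break ties with the primary key using a row-constructor comparison; a sketch assuming the previous page ended at created_at '2024-01-01' and id 98765 (both values are illustrative):
-- Keyset pagination with a tie-breaker column
SELECT * FROM products
WHERE (created_at, id) < ('2024-01-01', 98765)
ORDER BY created_at DESC, id DESC
LIMIT 20;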
3. Batch Operations
-- ❌ BAD: Individual INSERTs (one round trip each)
INSERT INTO orders (user_id, amount) VALUES (1, 100);
INSERT INTO orders (user_id, amount) VALUES (2, 200);
-- ...
-- ✅ GOOD: Multi-row INSERT
INSERT INTO orders (user_id, amount) VALUES
(1, 100), (2, 200), (3, 300), ...;
-- ✅ BETTER: COPY for large bulk loads
COPY orders (user_id, amount) FROM STDIN WITH CSV;
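COPY ... FROM STDIN is what psql and most client drivers use under the hood; from psql, the client-side variant is the \copy meta-command, sketched here with an illustrative file name:
-- In psql: stream a local CSV file into the table (client-side COPY)
\copy orders (user_id, amount) FROM 'orders.csv' WITH (FORMAT csv)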
Advanced PostgreSQL Features for Performance
1. Materialized Views
-- For complex aggregations whose results change infrequently
CREATE MATERIALIZED VIEW sales_summary AS
SELECT
date_trunc('month', order_date) as month,
product_category,
COUNT(*) as order_count,
SUM(amount) as total_sales
FROM orders o
JOIN products p ON o.product_id = p.id
GROUP BY month, product_category;
-- A unique index is required before REFRESH ... CONCURRENTLY can be used
CREATE UNIQUE INDEX idx_sales_summary_month_category
ON sales_summary (month, product_category);
-- Refresh the materialized view without blocking readers
REFRESH MATERIALIZED VIEW CONCURRENTLY sales_summary;
2. Partial Aggregation with CTEs
-- Complex aggregation with Common Table Expressions
WITH monthly_sales AS (
SELECT
user_id,
date_trunc('month', order_date) as month,
SUM(amount) as monthly_total
FROM orders
WHERE order_date >= current_date - interval '1 year'
GROUP BY user_id, month
),
user_summary AS (
SELECT
user_id,
COUNT(*) as active_months,
AVG(monthly_total) as avg_monthly_spend,
MAX(monthly_total) as max_monthly_spend
FROM monthly_sales
GROUP BY user_id
)
SELECT
u.name,
us.active_months,
us.avg_monthly_spend,
us.max_monthly_spend
FROM users u
JOIN user_summary us ON u.id = us.user_id
WHERE us.active_months >= 3;
3. JSONB for Flexible Data
-- Store flexible attributes in a JSONB column
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
attributes JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- GIN index for JSONB containment queries
CREATE INDEX idx_products_attributes_gin ON products
USING GIN(attributes);
-- Efficient containment query on JSONB (can use the GIN index)
SELECT * FROM products
WHERE attributes @> '{"color": "red", "size": "large"}';
-- Aggregate JSONB data
SELECT
attributes->>'category' as category,
COUNT(*) as product_count
FROM products
GROUP BY attributes->>'category';
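The @> containment query above is what the GIN index accelerates; equality filters or GROUP BY on a single extracted key such as attributes->>'category' do not use it. A sketch of a B-tree expression index for that access pattern (the index name and category value are illustrative):
-- B-tree expression index for queries that filter on one extracted JSONB key
CREATE INDEX idx_products_attr_category ON products ((attributes->>'category'));
-- This predicate can now use the expression index
SELECT * FROM products WHERE attributes->>'category' = 'electronics';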
Partitioning for Large Tables
1. Range Partitioning
-- Partition orders by month
CREATE TABLE orders (
id BIGSERIAL,
user_id INTEGER,
amount DECIMAL(10,2),
order_date DATE NOT NULL
) PARTITION BY RANGE (order_date);
-- Create partitions
CREATE TABLE orders_2024_01 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Index each partition (on PostgreSQL 11+, an index created on the parent table propagates to every partition)
CREATE INDEX idx_orders_2024_01_user_id ON orders_2024_01(user_id);
CREATE INDEX idx_orders_2024_02_user_id ON orders_2024_02(user_id);
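To confirm that queries only touch the relevant partition, check the plan; with a range condition on order_date the planner should prune the other partitions:
-- Only orders_2024_01 should appear in this plan (partition pruning)
EXPLAIN
SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01';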
2. Automatic Partition Creation
-- With declarative partitioning, an INSERT into a month that has no partition fails
-- before any trigger on the parent can run, so missing partitions cannot be created
-- on the fly from a trigger. Create them ahead of time instead, e.g. from a
-- scheduled job (cron / pg_cron) or with the pg_partman extension.
CREATE OR REPLACE FUNCTION create_orders_partition(p_date DATE)
RETURNS void AS $$
DECLARE
    partition_name TEXT;
    range_start DATE;
    range_end DATE;
BEGIN
    range_start := date_trunc('month', p_date);
    range_end := range_start + interval '1 month';
    partition_name := 'orders_' || to_char(range_start, 'YYYY_MM');
    IF NOT EXISTS (SELECT 1 FROM pg_tables
                   WHERE tablename = partition_name) THEN
        EXECUTE format(
            'CREATE TABLE %I PARTITION OF orders
             FOR VALUES FROM (%L) TO (%L)',
            partition_name, range_start, range_end
        );
    END IF;
END;
$$ LANGUAGE plpgsql;
-- Create next month's partition ahead of time (schedule this call, e.g. daily)
SELECT create_orders_partition((current_date + interval '1 month')::date);
Connection Pooling and High Availability
1. PgBouncer for Connection Pooling
# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
# Start pgbouncer
pgbouncer -d pgbouncer.ini
2. Read Replicas for Load Balancing
# On the primary server (postgresql.conf)
wal_level = replica
max_wal_senders = 10
# On the replica (postgresql.conf; since PostgreSQL 12 there is no recovery.conf or
# standby_mode -- create an empty standby.signal file in the data directory instead)
hot_standby = on
primary_conninfo = 'host=primary-server port=5432 user=replica password=secret'
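Once a replica is attached, you can see its state and how far it lags by querying pg_stat_replication on the primary:
-- Run on the primary: replication status and lag per standby
SELECT client_addr, state, sync_state,
       sent_lsn, replay_lsn, replay_lag
FROM pg_stat_replication;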
Monitoring and Maintenance
1. Automated Vacuum Management
# Autovacuum settings (postgresql.conf)
autovacuum_vacuum_scale_factor = 0.1    # vacuum when ~10% of a table's rows are dead
autovacuum_analyze_scale_factor = 0.05  # analyze after ~5% of rows change
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = 200
-- Monitor vacuum activity
SELECT
schemaname,
relname,
n_dead_tup,
n_live_tup,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC;
2. Query Performance Monitoring
-- Enable pg_stat_statements (also requires shared_preload_libraries = 'pg_stat_statements' and a restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top 10 slowest queries
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Reset statistics
SELECT pg_stat_statements_reset();
Case Study: Optimizing an E-commerce Platform
Before Optimization (query time: 3.2 seconds)
-- The original slow query
SELECT
p.id,
p.name,
p.price,
c.name as category_name,
(SELECT AVG(rating) FROM reviews r WHERE r.product_id = p.id) as avg_rating,
(SELECT COUNT(*) FROM order_items oi WHERE oi.product_id = p.id) as sales_count
FROM products p
LEFT JOIN categories c ON p.category_id = c.id
WHERE p.price BETWEEN 10000 AND 1000000
AND p.stock > 0
AND EXISTS (SELECT 1 FROM product_attributes pa
WHERE pa.product_id = p.id AND pa.attribute_id = 5)
ORDER BY p.created_at DESC
LIMIT 50;
After Optimization (query time: 0.15 seconds)
-- Optimized query using CTEs and better indexing
WITH product_stats AS (
SELECT
product_id,
AVG(rating) as avg_rating,
COUNT(*) as review_count
FROM reviews
GROUP BY product_id
),
product_sales AS (
SELECT
product_id,
COUNT(*) as sales_count
FROM order_items
GROUP BY product_id
)
SELECT
p.id,
p.name,
p.price,
c.name as category_name,
ps.avg_rating,
pss.sales_count
FROM products p
LEFT JOIN categories c ON p.category_id = c.id
LEFT JOIN product_stats ps ON p.id = ps.product_id
LEFT JOIN product_sales pss ON p.id = pss.product_id
WHERE p.price BETWEEN 10000 AND 1000000
AND p.stock > 0
AND p.id IN (SELECT product_id FROM product_attributes WHERE attribute_id = 5)
ORDER BY p.created_at DESC
LIMIT 50;
-- Indexes added
CREATE INDEX idx_products_price_stock ON products(price, stock)
WHERE stock > 0;
CREATE INDEX idx_reviews_product ON reviews(product_id);
CREATE INDEX idx_order_items_product ON order_items(product_id);
CREATE INDEX idx_product_attributes_attribute ON product_attributes(attribute_id, product_id);
Best Practices Checklist
- ✅ Run EXPLAIN ANALYZE on every new query
- ✅ Review pg_stat_statements regularly
- ✅ Implement connection pooling with PgBouncer
- ✅ Use appropriate index types (B-Tree, GIN, GiST, BRIN)
- ✅ Partition large tables according to their access patterns
- ✅ Configure autovacuum appropriately
- ✅ Use materialized views for complex aggregations
- ✅ Add read replicas for read-heavy workloads
- ✅ Schedule REINDEX and, only when bloat demands it, VACUUM FULL during maintenance windows (both take heavy locks)
- ✅ Test with production-like data volumes
Tools for PostgreSQL Optimization
- pgBadger: log analyzer that produces performance reports
- pgAdmin: GUI tool with performance dashboards
- PostgreSQL Explain Visualizer: visual explain plans
- pg_stat_monitor: enhanced statistics collection
- Prometheus + Grafana: time-series monitoring
Conclusion: Performance Is a Journey, Not a Destination
PostgreSQL optimization is not a one-time task but a continuous process. What separates a mediocre database from a high-performance one is:
- Proactive monitoring: don't wait until something breaks
- Deep understanding: know how PostgreSQL works internally
- Iterative improvement: small, continuous optimizations
- Capacity planning: anticipate growth before it becomes a problem
With the techniques in this guide, your PostgreSQL instance should be ready to handle heavy traffic with excellent performance. Remember: measure, optimize, validate, repeat!
Happy optimizing! 🚀