PostgreSQL Benchmarking

PostgreSQL is an excellent RDBMS. To be sure of getting everything possible out of the system, benchmarking is not just important, it is critical. One needs to be able to show, study and predict behaviour, and benchmarking is one method of accomplishing this. This document describes how to create benchmarks, analyze the results and compare them.

Benchmarking Overview

In the world of PostgreSQL the standard tool of measurement is pgbench. This is the tool used here. An initial benchmark is performed to determine the baseline from which all other benchmarks will be measured. Between benchmarks the configuration of the operating system or database server can be altered to determine impact on performance. See the PostgreSQL Performance Tuning page for more details.

Creating a Benchmark

This initial benchmark is the starting reference point for your system. Simply initialize the benchmark database and have at it. The commands below are executed as a normal user with appropriate permissions on the database.

~ $ createdb pgbench
CREATE DATABASE
~ $ pgbench -i -s 50 pgbench
[ lots of output, truncated ]
~ $ pgbench -c 10 -t 100 pgbench

The -s 50 option sets the scaling factor, which should be at least as large as the maximum number of clients to be tested. pgbench is then run with 10 clients (-c 10) and 100 transactions per client (-t 100).

This benchmark should be run many times with different combinations of clients and transactions. Save the results for later comparison with results generated by other tests after configuration tweaks.
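The repeated runs described above can be scripted. The following is only a sketch: the function name run_matrix and the variables DB, OUT, CLIENTS, RUNS and TXNS are illustrative and not part of pgbench. It assumes the pgbench report contains a line beginning with "tps = " and records the first such line of each run.

```shell
#!/bin/sh
# Sketch: run pgbench for several client counts, repeating each run,
# and append one "clients run tps" line per run to a results file.
# DB, OUT, CLIENTS, RUNS and TXNS are illustrative names, not pgbench settings.
run_matrix() {
    db=${DB:-pgbench}
    out=${OUT:-baseline.txt}
    for c in ${CLIENTS:-1 5 10 25 50}; do
        run=1
        while [ "$run" -le "${RUNS:-5}" ]; do
            # Keep the third field of the first "tps = N ..." line of the report.
            tps=$(pgbench -c "$c" -t "${TXNS:-100}" "$db" 2>/dev/null |
                  awk '/^tps = / && !seen { print $3; seen = 1 }')
            echo "$c $run $tps" >> "$out"
            run=$((run + 1))
        done
    done
}
```

Calling run_matrix with the defaults writes 25 lines to baseline.txt; setting OUT to a different file name after each configuration change keeps the result sets separate for later comparison.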

Benchmark Testing

To run a benchmark successfully, each test must be repeated many times and the results averaged; the standard deviation then indicates the quality of the results. Between benchmark tests change only one configuration parameter, otherwise the impact of each individual change will be lost.
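As a rough illustration of the averaging step, the sketch below computes the mean and the sample standard deviation of a list of TPS values, one per line on standard input. The function name tps_stats is hypothetical, and the choice of the sample (n - 1) form of the standard deviation is an assumption made here.

```shell
#!/bin/sh
# Sketch: mean and sample standard deviation of TPS values read from stdin,
# one value per line. Blank lines are skipped.
tps_stats() {
    awk '
        NF { n++; sum += $1; sumsq += $1 * $1 }
        END {
            if (n < 2) { print "need at least two runs"; exit 1 }
            mean = sum / n
            var = (sumsq - n * mean * mean) / (n - 1)
            if (var < 0) var = 0    # guard against rounding error
            printf "runs=%d mean=%.2f stddev=%.2f\n", n, mean, sqrt(var)
        }'
}
```

For example, printf '410\n420\n430\n' | tps_stats prints runs=3 mean=420.00 stddev=10.00. A standard deviation that is large relative to the mean suggests the runs are too noisy to compare with confidence.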

pgbench is an excellent testing tool, but keep in mind that when pgbench reports 417 TPS the production application will not get 417 TPS. The number will likely be very different unless the production application is doing exactly the same thing as pgbench. What this means is that pgbench tests the database, not the application. Because it performs the same simple tests repeatedly, its results are accurate only for measuring that same flavor of queries.

To benchmark the production application, tests must be performed that are similar to production loads and query types. The pgbench queries, for example, do not perform joins, WHERE clauses using regular expressions on text fields, or other query types that would be found in many applications. A suggestion for this type of testing would be to enable statement logging in PostgreSQL, process the logs to find common queries and then build a test suite based on them.
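One possible way to find the common queries is sketched below. It assumes statement logging is on (log_statement = 'all'), that each statement fits on one log line, and that entries contain the text "LOG:  statement: " ahead of the query; the part before that depends on the log_line_prefix setting. The function name top_statements is hypothetical. Quoted strings and runs of digits are crudely replaced with "?" so that queries differing only in their literal values are counted together; this also alters digits inside identifiers.

```shell
#!/bin/sh
# Sketch: rank the most frequent statements in a PostgreSQL log file.
# $1 = log file, $2 = number of statements to show (default 10).
# Assumes single-line "LOG:  statement: ..." entries.
top_statements() {
    sed -n 's/^.*LOG:  statement: //p' "$1" |
        sed -e "s/'[^']*'/?/g" -e 's/[0-9][0-9]*/?/g' |
        sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

Each output line is a count followed by the normalized statement, most frequent first. The statements at the top of the list are the natural candidates for a test suite that resembles production.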

Much time can be wasted waiting for results from benchmark programs. The Creo section has a utility called pg_bench_suite for automating tests with pgbench. Run the test suite once with the default configuration, then change one parameter only, re-run the tests and compare the results.

ChangeLog

14 Jan 2006 - Created /djb
