Jun 11, 2021
Benchmark: Using machine learning to optimize Amazon RDS PostgreSQL performance
Bohan Zhang
Amazon RDS PostgreSQL makes deploying and operating a PostgreSQL database instance easy. When you provision an RDS PostgreSQL instance, Amazon launches it with their default configuration settings for dozens of tunable “knobs.” These knob settings affect database performance, and the default settings are often not optimal for your database’s specific workload.
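The scale of those defaults is easiest to see with a concrete knob. As one example, RDS's default parameter group derives `shared_buffers` from instance memory using a formula of the form `{DBInstanceClassMemory/32768}` (a value in 8 KB pages). The sketch below is a hedged illustration of what that formula works out to, not AWS's actual implementation:

```python
def rds_default_shared_buffers_bytes(instance_memory_bytes: int) -> int:
    """Approximate the RDS default shared_buffers for a given instance size.

    Assumes the commonly documented default-parameter-group formula
    {DBInstanceClassMemory/32768}, which yields a page count that works
    out to roughly 25% of instance memory (8 KB pages).
    """
    pages = instance_memory_bytes // 32768
    return pages * 8192


# For a db.m5.4xlarge with 64 GiB of RAM:
print(rds_default_shared_buffers_bytes(64 * 1024**3))  # 17179869184 bytes = 16 GiB
```

Note that this is a one-size-fits-all rule based only on instance memory; it knows nothing about your workload, which is exactly why defaults leave performance on the table.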
Tuning the knob settings in order to optimize price-performance for your database requires:
Specialized knowledge of PostgreSQL internals
Time to iteratively make changes and test the impact of each change on performance until it is optimized (i.e., trial and error)
If you run Amazon RDS PostgreSQL and you lack (1) or (2) above, or if your workload changes over time (thus requiring re-tuning), you’ll likely pay higher cloud costs than you ought to for your database.
Putting Amazon RDS PostgreSQL knob settings to the test
The question you’re probably asking is “how far off the mark could my Amazon RDS knob settings be?”
The question you’re probably not asking (yet) is “Can machine learning auto-tune my database knob settings to optimize its performance?”
To help answer those questions, we ran an experiment to test Amazon RDS PostgreSQL performance with three database configurations: (a) the default Amazon RDS knob settings, (b) knob setting recommendations from PGTune, and (c) knob setting recommendations from OtterTune's machine learning algorithms.
To run the test, we used the OLTP-bench framework to run the TPC-C benchmark, a well-known transactional workload. We collected runtime metrics from the database, and calculated the throughput based on those metrics. Specifically, for Amazon RDS PostgreSQL, we optimized for the xact_commit metric (the number of transaction commits in the database).
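For reference, `xact_commit` is a cumulative counter exposed in PostgreSQL's `pg_stat_database` view (readable with `SELECT xact_commit FROM pg_stat_database WHERE datname = current_database();`), so throughput falls out of two samples taken some interval apart. A minimal sketch of that calculation:

```python
def throughput_tps(xact_commit_before: int, xact_commit_after: int,
                   elapsed_seconds: float) -> float:
    """Transactions per second from two samples of pg_stat_database.xact_commit.

    The counter is cumulative, so throughput is the delta between two
    samples divided by the time between them.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (xact_commit_after - xact_commit_before) / elapsed_seconds


# Two samples taken 60 seconds apart:
print(throughput_tps(1_000_000, 1_180_000, 60))  # 3000.0 transactions/sec
```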
We ran the tests on an Amazon db.m5.4xlarge RDS instance equipped with:
16 vCPUs,
64 GB RAM, and
4,000 provisioned IOPS.
For the PGTune test, we applied its recommended configuration to the database.
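The exact values aren't reproduced here; the fragment below is illustrative of the style of PGTune's postgresql.conf output for a machine of this size, not the precise configuration used in the test:

```ini
# Illustrative PGTune-style settings for 64 GB RAM / 16 vCPUs
# (not the exact values applied in this benchmark)
shared_buffers = 16GB
effective_cache_size = 48GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
random_page_cost = 1.1
effective_io_concurrency = 200
max_worker_processes = 16
```

PGTune derives these values from static heuristics over the hardware profile; like the RDS defaults, it does not observe the running workload.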
Result: OtterTune more than doubles performance (throughput)
In this test, the configuration changes that PGTune recommended increased throughput (transactions per second) by 41.9%.
OtterTune’s machine learning increased throughput by 146.7% over the default RDS settings, more than triple the improvement that PGTune delivered.
Double the performance, or half the cost?
In the previous test, we showed that using machine learning to find the optimal configuration knob settings more than doubled database performance. That’s a nice outcome if you’re seeking faster database performance.
For people who are satisfied with database performance, but would like to lower costs, knob-tuning can also help. To demonstrate that, we re-ran the test on a smaller RDS instance costing half as much, a db.m5.2xlarge equipped with:
8 vCPUs,
32 GB RAM, and
2,000 provisioned IOPS.
Optimizing Amazon RDS PostgreSQL on the db.m5.2xlarge instance enabled it to match the performance of PostgreSQL (with the default settings) on the twice-as-large db.m5.4xlarge, at half the instance cost.
How long does it take OtterTune to find “optimal”?
OtterTune works by iterating through different knob setting configurations and observing the database’s performance. Its machine learning models are trained to make educated decisions about new configuration changes for the observed workload and optimization goals. It’s similar to the approach an expert DBA would take, but it works much faster.
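That observe-recommend-apply loop can be caricatured in a few lines. The sketch below is a toy hill climber over a single hypothetical knob against a simulated throughput curve; OtterTune's real recommender uses trained ML models (e.g., Gaussian process regression) rather than random perturbation, and `observe` would actually run the workload and read `xact_commit` deltas:

```python
import random


def observe(config: dict) -> float:
    """Hypothetical stand-in for running the workload and measuring throughput.

    Toy response surface: throughput peaks when shared_buffers is near
    25% of a 64 GB instance's memory (16 GB).
    """
    sb = config["shared_buffers_gb"]
    return 500.0 * (1.0 - abs(sb - 16.0) / 64.0)


def tune(iterations: int = 20) -> tuple[dict, float]:
    """Iteratively propose configurations, keeping the best one observed."""
    cfg = {"shared_buffers_gb": 4.0}          # starting configuration
    best_cfg, best_tps = dict(cfg), float("-inf")
    for _ in range(iterations):
        tps = observe(cfg)                    # apply config, measure throughput
        if tps > best_tps:
            best_cfg, best_tps = dict(cfg), tps
        # Propose the next configuration near the best one seen so far.
        # A real ML tuner picks this point from a trained model instead.
        proposal = best_cfg["shared_buffers_gb"] + random.uniform(-4.0, 4.0)
        cfg = {"shared_buffers_gb": min(64.0, max(1.0, proposal))}
    return best_cfg, best_tps
```

Each iteration here corresponds to one observation period against the live database, which is why wall-clock tuning time is measured in hours rather than seconds.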
The iterations required to find optimal performance in the second test are shown below. From the first iteration to the final one, the process took two hours.
In addition, OtterTune will continue to monitor the database workload and automatically adjust knob settings if the workload changes.