How does PostgreSQL scale?

When the day comes that a single node no longer cuts it, first of all, try all the common vertical scale-up approaches and tricks. Try to avoid using derivative Postgres products, distributed approaches, or home-brewed sharding at all costs, at least until you have, say, less than 1 year of breathing room available. Wow, what kind of advice is that, you might ask? Well, I would like to provide some food for thought to those who are overly worried about database performance and thus prematurely latch onto some overly complex architecture.

The daemon we created detects when a table is blocking traffic. Finally, we come to pgcompacttable. This tool is excellent for reducing table bloat in a completely non-blocking manner. It works by reordering tuples from the end of a table towards the front, which allows the table size to shrink.

Major upgrades of PostgreSQL are used as opportunities to change the on-disk format of data.

Upgrading requires the data to be rewritten in the new format. There are two upgrade approaches that provide different allowances for service availability. The first is pg_upgrade, the tool that rewrites the database from the old format to the new.

It requires that the database be offline while the upgrade is occurring. This requirement is a big problem if you have even a modestly sized data set and availability requirements, which is why we've never used this method to upgrade our database. Instead, we use logical replication to perform our major version upgrades. Logical replication is an extension of streaming replication typically used for hot standbys.

Streaming replication works by writing raw disk block changes from the upstream server to the replica, which makes it unsuitable for performing a major upgrade. Logical replication can be used instead because changes are decoded and applied as if a stream of SQL statements were being sent to the replica, as opposed to simply writing the page changes to disk.
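As an illustration, a minimal publication/subscription setup might look like the sketch below; the publication, subscription, and connection names are hypothetical, and the schema is assumed to already exist on the target, since logical replication does not copy DDL:

    -- On the source (old major version), with wal_level = logical:
    CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

    -- On the target (new major version), after the schema has been created there:
    CREATE SUBSCRIPTION upgrade_sub
        CONNECTION 'host=old-primary dbname=app user=replicator'
        PUBLICATION upgrade_pub;

    -- On the source, watch how far the subscriber has confirmed:
    SELECT slot_name, confirmed_flush_lsn FROM pg_replication_slots;

Once the subscriber has caught up, traffic can be switched over to the new-version server.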

There is one major caveat, though. The decoding process on the destination database is single-threaded; if the write load on your database is high enough, it will overwhelm the decoding process and cause the lag to increase until some limit is reached (typically, available disk space). The replication target can be on an upgraded version of PostgreSQL, which is what makes this approach work for major version upgrades.

This means that your application must be able to select different databases for different tables, and it requires you to handle the switchover in application code. To get started with logical replication, I recommend first reviewing the official PostgreSQL manual and also checking out the pglogical extension, which provides more sophisticated control over conflict resolution under logical replication.

Minor upgrades are almost a footnote after the major upgrade section: they can be performed simply by updating the PostgreSQL binary and restarting the process.

Most SLAs will provide plenty of buffer to support a quick restart of a database process. In cases where you need extreme availability, streaming replication and switchover may allow zero-downtime minor upgrades. Process restarts take on the order of tens of seconds with our data set, server size, and load, and we take this simple approach to keep our databases on the latest minor version.

Every transaction in PostgreSQL is assigned a transaction ID (XID). That XID is used to track row versions and determine which row versions can be seen by a particular transaction.

If the XID counter were to wrap around, transactions that are in the past would appear to be in the future, and this would result in data corruption.

XIDs can be viewed as lying on a circle, or circular buffer. As long as the end of that buffer does not jump past the front, the system will function correctly. Old tuples normally get frozen so the buffer can keep advancing; however, there are failure modes which prevent vacuum from freezing extremely old tuples, and the oldest unfrozen tuple limits the number of past XIDs visible to a transaction: only about two billion past XIDs are visible.

If the remaining XID count reaches one million, the database will stop accepting commands and must be restarted in single-user mode to recover.

Therefore, it is extremely important to monitor remaining XIDs so that your database never gets into this state. We trigger alerts if this value ever drops below a threshold set relative to the maximum XID age allowed before a forced vacuum. We have gotten into or near XID exhaustion in the past when a wraparound vacuum appeared to be stuck, running indefinitely.

We think this was due to some extremely long-running transactions on the relation being vacuumed, but we cannot say conclusively why this occurred. Nonetheless, we keep tabs on this number because of its ability to force a database out of service. The great folks over at Sentry have written about their own run-in with XID wraparound. For additional information, check out the manual section about preventing XID wraparound failures. If you plan to operate PostgreSQL yourself, I would consider this section of the manual required reading.
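As a starting point for that monitoring, the age of the oldest unfrozen XID can be read from the system catalogs. The queries below are a generic sketch rather than the exact alerting rules described above:

    -- Age of the oldest unfrozen XID per database; autovacuum forces a
    -- wraparound vacuum once this exceeds autovacuum_freeze_max_age
    -- (200 million by default).
    SELECT datname, age(datfrozenxid) AS xid_age
    FROM pg_database
    ORDER BY xid_age DESC;

    -- Approximate headroom left before the roughly two billion hard limit:
    SELECT datname, 2000000000 - age(datfrozenxid) AS xids_remaining
    FROM pg_database
    ORDER BY xids_remaining;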

There are times when you need to promote a replica, possibly urgently. The way we achieve this is by putting our databases and their replicas behind HAProxy.

Increasing the server's pool of background worker processes can result in a speedup of maintenance jobs, parallel queries, and index creation. Starting with version 9.6, Postgres can execute queries in parallel: parallel querying involves spawning workers, distributing work amongst them, and then collecting (gathering) the results. If you have idle CPUs and parallelizable queries, increasing the worker-related settings can speed up such queries; see the sketch below.
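A sketch of the settings I believe are meant here, the standard worker-process knobs; the values are illustrative and should be tuned to your CPU count:

    -- max_worker_processes requires a server restart; the parallel settings
    -- take effect for new sessions after a reload.
    ALTER SYSTEM SET max_worker_processes = 16;             -- total background workers
    ALTER SYSTEM SET max_parallel_workers = 8;              -- of those, how many may serve parallel queries
    ALTER SYSTEM SET max_parallel_workers_per_gather = 4;   -- workers a single Gather node may use
    ALTER SYSTEM SET max_parallel_maintenance_workers = 4;  -- parallel index builds (Postgres 11+)
    SELECT pg_reload_conf();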

In Postgres 11, support for parallel creation of B-Tree indexes was added.

Logical replication (available in Postgres 10 and above) relies on worker processes at the subscription side to fetch changes from the publisher.

By asking Postgres to spawn more logical replication workers, the changes can be fetched and applied in parallel, especially when many tables are involved.

One configuration setting increases the total number of logical replication workers. In streaming replication, you can start off a sync with a base backup; for logical replication, however, changes have to be pulled in via the replication protocol itself, over the network, which can be time consuming. Allowing for more workers during this initial sync phase can speed up the process. Both settings are sketched below.
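Assuming the standard logical-replication worker settings are what is meant, a sketch would be:

    -- max_logical_replication_workers covers both apply workers and
    -- table-sync workers, is drawn from the max_worker_processes pool,
    -- and requires a restart; values are illustrative.
    ALTER SYSTEM SET max_logical_replication_workers = 8;
    ALTER SYSTEM SET max_sync_workers_per_subscription = 4;  -- parallel initial table syncs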

Postgres also vacuums tables in the background to clean up dead rows; this is, of course, autovacuum, and the number of workers that the autovacuum launcher spawns each time can be set via a configuration setting, sketched below.
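A sketch, assuming the setting in question is the standard autovacuum_max_workers knob:

    -- Default is 3; takes effect after a server restart. Value is illustrative.
    ALTER SYSTEM SET autovacuum_max_workers = 6;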

Compressing full-page writes in the WAL reduces the amount of data that needs to be written to disk, at the expense of more CPU cycles to compress the data. It also reduces the size of data that needs to be sent across the wire for streaming replication. Practically, the benefits of WAL compression are well worth the very reasonable overhead. The setting to turn it on is sketched below.
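Assuming the wal_compression setting is what is meant, turning it on looks like this:

    -- Compresses full-page images written to the WAL; a reload is enough.
    ALTER SYSTEM SET wal_compression = on;
    SELECT pg_reload_conf();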

In this post, we discuss how to scale read throughput and provide high availability in PostgreSQL using streaming replication. We then dig deeper into the several supported replication modes. We conclude with some performance numbers, measuring the impact of each of the replication modes on insert and read performance. As part of our analysis, we include numbers on TimescaleDB, an open-source time-series database that we have developed, which, as a PostgreSQL extension, supports streaming replication out of the box.

An important part of our design is to ensure that TimescaleDB has a similar performance profile to vanilla PostgreSQL for core features like replication, so we ran extra benchmarks to confirm our design satisfies that requirement. It can be installed on an existing PostgreSQL instance or set up entirely from scratch. Note: this post will focus on metrics and high-level concepts. For a hands-on walkthrough on how to set up streaming replication, please see our Replication Tutorial.

At a high level, PostgreSQL streaming replication works by streaming records of database modifications from the primary server to one or more replicas, which can then be used as read-only nodes to scale queries or as failover targets for HA. Replication works by continuously shipping segments of the WAL from the primary to any connected replicas. Each replica then applies the WAL changes and makes them available for querying.
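For orientation, a bare-bones primary-side configuration might look like the sketch below; the slot name is hypothetical, and a modern PostgreSQL release (12 or later) is assumed:

    -- On the primary (values are illustrative):
    ALTER SYSTEM SET wal_level = 'replica';      -- the default on modern versions
    ALTER SYSTEM SET max_wal_senders = 10;       -- restart required
    SELECT pg_create_physical_replication_slot('replica_1');

    -- A replica is then typically cloned with pg_basebackup -R, which writes
    -- primary_conninfo and standby.signal so the clone starts streaming on boot.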

The Write Ahead Log (WAL) is an append-only series of instructions that captures every atomic database change. Using a WAL is a common approach in database systems for ensuring atomicity and durability. In particular, durability is quite important: it is the notion that when a database commits a transaction, the resulting data is queryable by future transactions, even in the case of a server crash.

This is where the WAL comes in. When we run a query that modifies data (or makes schema changes), PostgreSQL first writes that data modification to memory in order to record the change quickly. Memory is very fast to access, but also volatile, meaning that if the server were to crash, our recent data changes would disappear after a server restart. So we also need to eventually write the changes to a persistent store. Why utilize a separate structure and not write directly to the main data files?

The answer lies in the speed difference of writing sequentially versus randomly. Writes to the main data directory could be spread across multiple files and indexes, causing the disk to jump around a lot.

On the other hand, writes to the WAL are sequential, which is always faster (especially on spinning disks, but even on SSDs). The transaction can then choose to return a commit after writing to the WAL, but before writing to the main data files. Now, if the server crashes, on restart it can replay all the changes in the WAL since the last checkpoint.

In other words, the WAL is the canonical record of all database changes, so that we can replay the changes that were in memory but had not yet been written to the main data directory in the case of a server crash. The WAL helps if the server crashes and then restarts. But it does have one glaring limitation: it cannot help if the disk becomes corrupted, suffers another common unrecoverable issue, or is stomped on and beaten with a baseball bat.

So we also need something to make sure we are resilient to unrecoverable failures. That is where replication comes in. It ships the WAL to other servers; the other servers replay the WAL as if they were recovering from a server restart; and voila!
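On the primary, the pg_stat_replication view shows each connected replica and how far it lags; for example (column names as of PostgreSQL 10 and later):

    -- One row per connected standby; the *_lsn columns show how much WAL each
    -- standby has received, flushed, and replayed.
    SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn, replay_lag
    FROM pg_stat_replication;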

The primary plans each transaction and determines where to store, update, delete, or find data. Any resulting instructions for modifying the database are then written to the WAL, and a commit message is returned to the client so it knows that the data has been persisted.


