Deep dive into postgres stats: pg_stat_bgwriter reports

Deep dive into postgres stats: pg_stat_bgwriter reports

Everything you always wanted to know about Postgres stats

Today, I would like to make a little detour from the main series and will dive into pg_stat_bgwriter. If you’ve been following my previous posts, you will remember that pg_stat_bgwriter view has summary statistics about bgwriter and checkpointer. Here I would like to show an interesting report query which is based on pg_stat_bgwriter. Sources of the query were found in postgres mailing lists and shared by my colleague Viсtor Yegorov and slightly modified by me. This report provides comprehensive information about bgwriter and checkpointer activity and helps to better configure them.
A tiny recommendation to run this query with expanded output in psql. Report produces only one row and looks like this :

-[ RECORD 1 ]————————–+—————————————

Uptime | 112 days 12:58:21.394156

Since stats reset | 132 days 14:37:34.149519

Forced checkpoint ratio (%) | 7.6

Minutes between checkpoints | 56.61

Average write time per checkpoint (s) | 775.36

Average sync time per checkpoint (s) | 0.21

Total MB written | 4194915.7

MB per checkpoint | 105.26

Checkpoint MBps | 0.03

Bgwriter MBps | 0.24

Backend MBps | 0.10

Total MBps | 0.37

New buffer allocation ratio | 34.925

Clean by checkpoints (%) | 8.5

Clean by bgwriter (%) | 64.3

Clean by backends (%) | 27.2

Bgwriter halt-only length (buffers) | 0.00

Bgwriter halt ratio (%) | 0.25

————————————– | ————————————–

checkpoints_timed | 3117

checkpoints_req | 256

checkpoint_write_time | 2615284047

checkpoint_sync_time | 716686

buffers_checkpoint | 45447168

buffers_clean | 345439953

maxwritten_clean | 861

buffers_backend | 146062083

buffers_backend_fsync | 0

buffers_alloc | 18752891779

stats_reset | 2016-10-12 23:41:35.737168+03

total_checkpoints | 3373

total_buffers | 536949204

startup | 2016-11-02 01:20:48.492531+03

checkpoint_timeout | 1h

checkpoint_segments | 256

checkpoint_completion_target | 0.9

The report consists of two parts which are separated by horizontal dotted line, the first part is the report itself and the second is raw values from pg_stat_bgwriter and auxiliary items used in the report.

The first part is more interesting and here is why.

The first values that we see are the stats interval and uptime. General idea here is that the bigger the interval you have since last stats reset , the more inaccurate report you will get. Thus, I’d recommend resetting stats periodically, weekly or monthly. The uptime only shows how long postgres has been working. However, please note that in the example the uptime is less than stats collecting interval — it’s not bad, but for your report try using stats that were collected only after postgres startup, since at shutdown postgres runs forced checkpoint that isn’t related to the workload but might affect the stats.

Next is the information about checkpoints – „Forced checkpoint ratio“ is the ratio of checkpoints which occurred by xlog. Current value is 7.6 and it’s good enough. High values, for example more than 40%, indicate that xlog checkpoints occur too frequently. As you might remember, xlog checkpoints are less preferred than time checkpoints, so general idea is to reduce the number of xlog checkpoints by increasing the number of WAL segments required to trigger checkpoint. „Minutes between checkpoints“ is time interval between occured checkpoints. When everything is ok, this value should be near checkpoint_timeout. Values significantly lower than checkpoint_timeout also indicate on occurrence frequency of xlog checkpoints. General recommendation in both cases is raise max_wal_size (or checkpoint_segments for 9.4 or over).

Next is the average write and sync time in seconds. Write time is near the 13 minutes and that’s a good value, it shows that write stage of checkpoints was performed fast enough despite the long interval between checkpoints. Values that are closer to the earlier mentioned intervals between checkpoints aren’t good – it’s an indicator that storage spent too much time writing buffers. Average sync time should be near zero – values that far from zero would indicate on low performance of the storage.

Next items have informative value, they tell us about average throughput of checkpointer, bgwriter and backends. These numbers give us additional information about the workload. The „Total MB written“ is obviously the size of written data by all subprocesses. The „MB per checkpoint“ is an average value for checkpoints. Next values are measured in Mbps and they are about process throughput. In the example above, there are low values less than 1 Mbps means that server doesn’t have a huge amount of dirty data or maybe the report has been built with longer stats interval and throughput values are spread throughout.

„New buffer allocation ratio“ field is the ratio of new allocated buffers to all written buffers. When backends handle data, firstly they check are data already in shared buffers area? If there are no required data in shared buffers, backends allocate new buffers and load data from main storage to shared buffers and then process it (see details in BufferAlloc() and StrategyGetBuffer() functions). Thus, high number here tell us that backends allocated a lot of buffers since required data didn’t exist among shared buffers.

There are two reasons for this, the first is that the backends read rarely uses „cold“ data, old archived partitions or something similar; the second reason is that the early used data had been evicted from shared buffers because of lack of shared buffers. That’s not all however, this number means how many times data were read to shared buffers more than it had been written out from them. This item potentially comes with cache hit ratio and high allocation ratio and low cache hit ratio can indicate insufficient shared buffers, though it’s hard to know for sure.

Next set of values are on how many buffers in percent are cleaned by the sub-processes. High „Clean by checkpoints“ value is the sign of write-intensive workload. High „Clean by bgwriter“ tells us about read workload. High number of „Clean by backends“ is the sign that backends done a lot of bgwriter‘ work and that’s not good – values more than 50% tells us about ineffective setting of bgwriter, and in this case I would suggest trying to make it more aggressive.

Next values are „Bgwriter halt-only length“ and „Bgwriter halt ratio„. They are about frequency with which bgwriter was delayed due to exceeded bgwriter_lru_maxpages. The values in our example are perfect and high values conversely indicate that bgwriter went to sleep mode too frequently and didn’t do its work fast enough. In this case, I’d recommend to configure bgwriter in a more aggressive way — decrease delay and increase maxpages parameters.

The second part of the report is the raw values from pg_stat_bgwriter and configuration parameters which also related to bgwriter and checkpointer – they are used in report’s query, hence, you don’t need to see them in separate queries.

Here I prepared a few reports with my comments from different pgbench workloads:

here is eight hours of read-only workload with 4GB shared buffers (default bgwriter)
here is eight hours of read-only workload with 8GB shared buffers (default bgwriter)
here is eight hours of read/write workload with 4GB shared buffers (default bgwriter)
here is eight hours of read/write workload with 8GB shared buffers (default bgwriter)
here is eight hours of read/write workload with 8GB shared buffers (aggressive bgwriter)

Tests were performed on the server with 16CPU, 32GB RAM, RAID1 on 2xSSD (datadir), RAID1 on 2xSAS (wal) with PostgreSQL 9.6.2 and test database size is 96GB.

That is all for this time and I hope you enjoyed this post and found it helpful.

Bemerkungen: 6

6 Antworten zu “Deep dive into postgres stats: pg_stat_bgwriter reports”

Unknown sagt:

März 30, 2017 um 12:21 p.m. Uhr

Alexey, thanks for this great explanation!
I noticed, that in read/write workload reports with default bgwriter settings changing only shared_buffers from 4GB to 8GB reduces the fsync time from 8ms to 1.85ms. Why does one affect the other?

Antworten
Alexey Lesovsky sagt:

März 31, 2017 um 7:08 a.m. Uhr

Yep, there is no any changes except shared_buffers and sync time is seemed to me suspicious too. Finally I think, issue lies is in the customer's SSDs, and I don't have any other ideas. The way to confirm/refute that is check disks with fio and take a look on latency spreading.

Antworten
amine sagt:

Oktober 15, 2020 um 1:19 p.m. Uhr

Thanks for your sharing. In my db environment, I saw that it still writes backend dirt page even though I increased the bgwriter values a lot. and if you notice the maxwritten_clean value is 0. Where could the problem be? Why is it still writing backends dirty pages?

checkpoints_timed | 13
checkpoints_req | 1
checkpoint_write_time | 19993770
checkpoint_sync_time | 642
buffers_checkpoint | 522273
buffers_clean | 107335
maxwritten_clean | 0
buffers_backend | 394018
buffers_backend_fsync | 0
buffers_alloc | 3148261
stats_reset | 2020-10-15 09:35:03.154167+03
total_checkpoints | 14
total_buffers | 1023626
startup | 2020-02-12 22:22:34.563076+03
checkpoint_timeout | 30min
max_wal_size | 15GB
checkpoint_completion_target | 0.9
bgwriter_delay | 10ms
bgwriter_lru_maxpages | 100000
bgwriter_lru_multiplier | 10

Antworten
- Valeria K sagt:
  
  Oktober 16, 2020 um 9:13 a.m. Uhr
  
  Hi, Amine!
  The timestamp in stats_reset indicates that the stats accumulated by pg_stat_bgwriter covers period of time that is too short. For better analysis of bgwriter/checkpointer stats should be collected over longer period, at least one-two-three weeks.
  In either case, it’s impossible to completely avoid writes by backends, but it’s important to minimize them and ensure that these writes are smaller than those made by bgwriter/checkpointer.
  
  Antworten
amine sagt:

Oktober 17, 2020 um 5:53 a.m. Uhr

Hi Valeria
Yes I have reset the bgriter statistics shortly before, because I saw the buffers_backend value is far more than buffers_clean. I have tuned the bgriter and checkpointer again and then reset the bwriter statistics. But as I said I saw that the situation has not changed after resetting. thank u i will watch a few more weeks..

Antworten
- Alexey Lesovsky sagt:
  
  Oktober 26, 2020 um 11:31 a.m. Uhr
  
  Hi, Amine!
  After changing setting, it’s better to observe behavior using monitoring. Using charts it’s easier (and faster) to see changes in postgres behavior.
  
  Antworten

Schreibe einen Kommentar Antwort abbrechen

Paket	Preis*
PostgreSQL Configuration-Analyse	ab 500 €
Hardware und Betriebssystem Analyse	ab 600 €
Analyse langsamer SQL-Abfragen	ab 500 €
Umfassender Health Check	ab 3500 €
* für eine einzelne PostgreSQL Installation (1 master und 1 replica)

Basic	Premium	VIP
Inkl. bis zu 5 Stunden Beratungszeit pro Monat	Inkl. bis zu 15 Stunden Beratungszeit pro Monat	Inkl. bis zu 30 Stunden Beratungszeit pro Monat
Slack, Mattermost, Telegram	Slack, Mattermost, Telegram	Slack, Mattermost, Telegram
Antwortzeit bis zu — 1 Stunde	Antwortzeit bis zu — 1 Stunde	Antwortzeit bis zu — 30 Minuten
2 PostgreSQL Instance max	5 PostgreSQL Instance max	10 PostgreSQL Instance max

Basic

Premium

VIP

Inkl. bis zu 5 Stunden Beratungszeit pro Monat

Inkl. bis zu 15 Stunden Beratungszeit pro Monat

Inkl. bis zu 30 Stunden Beratungszeit pro Monat

Slack, Mattermost, Telegram

Antwortzeit bis zu — 1 Stunde

Antwortzeit bis zu — 30 Minuten

2 PostgreSQL Instance max

5 PostgreSQL Instance max

10 PostgreSQL Instance max

Nachrichten und Blog zurück

Deep dive into postgres stats: pg_stat_bgwriter reports

Everything you always wanted to know about Postgres stats

Das könnte Sie auch interessieren:

Sovereignty Needs an Exit Strategy — PostgreSQL Makes It Possible

pgBackRest PITR in Docker: a simple demo

Data Egret Services Now Available on AWS Marketplace

Operating PostgreSQL as a Data Source for Analytics Pipelines – Recap from the Stuttgart Meetup