Automated performance monitoring in WordPress core

Gathering performance metrics automatically lets us track performance over time and ensure that WordPress continues to improve. Automated performance tooling will also help the core team identify issues and resolve them with less effort.

Why add automated performance testing?

Adding automated performance testing will help us monitor performance changes in WordPress core continuously. It gives us a record of how core performance changes over time, and it lets us catch regressions early and accurately identify their underlying causes. Similar to our unit test suite, automated performance testing would help protect core from large performance regressions by flagging problems as soon as they are committed. Automating the tests also saves contributor effort by replacing a time-consuming manual process.

The core performance team is focused on improving core performance. Examples of this work include introducing changes that reduce the number of database queries or improve caching. Each new performance team feature must show measurable gains, and each new WordPress release is performance tested by the team. In the 6.1 release cycle, this led to the discovery of some significant performance regressions, e.g. in this Gutenberg issue or this Trac ticket. Automated testing would catch this type of regression as soon as it was introduced, making it much easier to resolve.

It is worth noting that Gutenberg has already done extensive work on performance, tracking metrics with each commit and publicizing details with each release.  

What automated performance testing would do in core

Similar to Gutenberg, WordPress core would gather a set of automated performance metrics alongside the test runs we already have for each new commit (e.g. unit tests, coding standards). These metrics can be used to identify the exact commit at which a performance regression is introduced into core. At milestones like a major release, the metrics can be compared against the previous release to gauge progress.
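As a rough illustration of the comparison described above, the sketch below computes the percentage change of each metric against a baseline and flags those that got slower by more than a threshold. The metric names, the sample values, and the 5% threshold are assumptions made for this example; they are not part of the proposal.

```python
from typing import Dict


def percent_change(baseline: float, current: float) -> float:
    """Return the change from baseline to current, as a percentage."""
    return (current - baseline) * 100.0 / baseline


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    threshold_pct: float = 5.0,  # hypothetical threshold, chosen for illustration
) -> Dict[str, float]:
    """Return metrics whose value grew by more than threshold_pct.

    Metrics missing from `current`, or with a zero baseline, are skipped
    because no meaningful percentage can be computed for them.
    """
    regressions = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue
        change = percent_change(base_value, current[name])
        if change > threshold_pct:
            regressions[name] = change
    return regressions


# Hypothetical timings in milliseconds for two commits (or two releases).
baseline = {"total_load_time": 200.0, "total_query_time": 20.0}
current = {"total_load_time": 230.0, "total_query_time": 20.2}

regressions = find_regressions(baseline, current)
for name, change in regressions.items():
    print(f"{name}: {change:.1f}% slower than baseline")
```

The same function works for either use case mentioned above: comparing one commit to the previous one, or comparing a release to its predecessor.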

Goals for the initial version

The scope of the initial version of automated performance testing is intentionally limited so that we can deliver quickly and then iterate.

The initial version will include the following features:

  • For each core commit, a GitHub Action will run a set of automated performance tests, collecting key data points about how WordPress is performing (such as total load time and total query time) on block and classic themes.
  • The system will collect server timing metrics using the standard WordPress environment and the current recommended version of PHP.
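To make the idea of collecting server timing metrics more concrete, here is a minimal sketch that assumes the metrics are exposed through the standard Server-Timing HTTP response header. The post does not specify the transport, and the metric names in the example header (`wp-total`, `wp-db-query`) are hypothetical.

```python
from typing import Dict


def parse_server_timing(header: str) -> Dict[str, float]:
    """Extract `dur` values (milliseconds) from a Server-Timing header value.

    This is a simplified parser: it splits on commas and semicolons, so a
    quoted `desc` that itself contains a comma or semicolon is not handled.
    Entries without a numeric `dur` parameter are ignored.
    """
    metrics: Dict[str, float] = {}
    for entry in header.split(","):
        parts = [part.strip() for part in entry.split(";")]
        name = parts[0]
        if not name:
            continue
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "dur":
                try:
                    metrics[name] = float(value.strip().strip('"'))
                except ValueError:
                    pass  # skip malformed durations rather than fail the run
    return metrics


# Hypothetical header value, for illustration only.
example_header = 'wp-total;dur=123.45, wp-db-query;dur=10.5;desc="Database queries"'
timings = parse_server_timing(example_header)
print(timings)
```

A test run could request the home page, read this header from the response, and record the parsed values for the commit being tested.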

Future enhancements

Several areas of work are out of scope for the initial implementation. This is to keep the first release focused, not because they aren’t good ideas! Once we have built a solid foundation for the tests and are confident in the metrics we are collecting, we can consider additional improvements.

  • Collecting additional metrics: initially we will focus on key server timing metrics.
  • Collecting metrics for other contexts: initial metrics will only measure the home page of the latest core block and classic themes with their default demo content.
  • Slack or ticket reporting: initial work will focus only on collecting and recording metrics at each commit.


Will the data be stable enough to be useful?

Performance results can vary from run to run in a CI environment, which makes regressions harder to detect. To mitigate this, the system will run several iterations of each test and use the median value.
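A minimal sketch of this approach is shown below. The `measure` callable stands in for whatever performs one timed request, and the canned timings are invented to show how the median resists a single slow outlier; the iteration count is likewise an assumption.

```python
import statistics
from typing import Callable


def median_of_runs(measure: Callable[[], float], iterations: int = 5) -> float:
    """Call `measure` `iterations` times and return the median sample."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    samples = [measure() for _ in range(iterations)]
    return statistics.median(samples)


# Canned timings in milliseconds; one run (480.0) hit a noisy CI moment.
canned = iter([210.0, 198.0, 205.0, 480.0, 201.0])
result = median_of_runs(lambda: next(canned), iterations=5)
print(result)  # 205.0, whereas the mean of the same samples is 258.8
```

The median is used here rather than the mean because a single unusually slow run shifts the mean substantially but leaves the median close to the typical value.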

What metrics will be collected exactly?

Initially we will collect only a few key metrics, focused on total load time, to keep the dashboard simple. Once the new tool is established, we can consider including additional metrics or adding hooks to make the test runs extensible.

What about testing older PHP versions?

To reduce the time and cost of running these tests, they will be limited to the current recommended version of PHP. Unlike unit tests, performance tests are unlikely to produce significantly different deltas on different PHP versions: a regression is likely to show up across all of them.

Why not test wp-admin or more routes?

To keep the time and cost of running these tests low, and the dashboard simple, initial testing is intentionally minimal. In the future it would be good to consider adding other common routes such as the wp-admin dashboard and a single post page.

Thanks to @tweetythierry @flixos90 @joemcgill @mukesh27 for reviewing this post and to @youknowriad for the inspiration (and foundation) of the Gutenberg performance metrics.

#core-performance, #performance, #testing
