I decided to use the WordPress plugin Warm Cache, as it is simple and it works, but please note: it is far from perfect.
Synonyms: Cache Preload, Warm Cache, Warming, Prime Cache, Cache Priming, Cache Warming, etc
The problem with this is that we need external cache warming, because we also want ngx_pagespeed’s cache to be primed.
W3 Total Cache
W3TC has the needed option, which must be enabled inside the Page Cache module. Just check the option “Automatically prime the page cache” and it will warm the cache externally (yes, I checked in the source), which is exactly what we want.
The default value for “Update interval” is 900 seconds, which is 15 minutes. By default, the server will process only 10 pages in that interval (“Pages per interval”).
I regularly set the interval to around one hour (3000 seconds, which is 50 minutes) and the page count to the maximum (1000), as our server finishes that in less than a minute.
With the “Preload the post cache upon publish events” option, we can force a cache prime on every publish event. This is simply too much and I never enable it.
A lot of users complained that this plugin’s functionality did not work as expected. Mostly the problem was its reliance on the wp-cron mechanism, which triggers only when the site is accessed. Many of them wrote replacement scripts, but I don’t see the point in that.
For me, the real problem was the lack of any report on when priming took place and which pages were warmed. As for triggering wp-cron, you can of course trigger it externally by requesting any page from a cron job.
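The external trigger can be a one-line cron job. A minimal sketch, assuming wget is installed and using the hypothetical site address https://example.com/ (the function name trigger_wp_cron is made up for illustration):

```shell
# trigger_wp_cron URL: request one page so that WordPress gets a chance
# to run any wp-cron tasks that are due. The response body is discarded.
trigger_wp_cron() {
    wget -q -O /dev/null "$1"
}

# Example crontab entry (every 15 minutes), with a hypothetical site URL:
# */15 * * * * wget -q -O /dev/null https://example.com/
```

The interval is an assumption here; it only needs to be at least as frequent as the priming interval configured in W3TC.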
Other WordPress plugins
Oddly enough, there is only one plugin doing this task. But it has almost everything it should have. You can trigger it with a simple URL request and it will display a log of its executions.
Warm Cache is the plugin that does the right thing in the right way. I tested its performance and it also processed my 150 pages in around 40 seconds, with repeated executions finishing in 10 seconds.
This plugin has a couple of annoyances:
- the interface is really messy, at best
- it didn’t automatically detect my sitemap
- the user can’t choose a custom hidden key; it is randomly set and reset
- there is no way to clear the log
- the code spits out a lot of PHP warnings and notices, and that’s really annoying
The good thing is that if you manually specify the main sitemap.xml, it will properly detect the sub-sitemaps and crawl them all, as expected.
Note: Don’t forget to add the cronjob for this plugin, as I did.
Optimus Cache Prime
Optimus Cache Prime is a smart cache preloader written in Go. Note that older versions were written in Python.
Custom server script
We can always run a server script as a cron job for this task. This is the fastest method because, on fast servers like ours, you can execute lots of requests in parallel.
First, we will need support for simple parallelism, so install it:
apt-get -y install parallel
Our command, which does it all in one line and also measures the time needed, is the following:
This command will split the complete file list into chunks of 30 and then execute them all in parallel. The script will also ignore the robots.txt file (-e robots=off).
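The exact command is not reproduced here, so the following is only a sketch of the approach described, not the original one-liner. It assumes GNU parallel, a hypothetical sitemap at https://example.com/sitemap.xml, and a flat sitemap (a sitemap index with sub-sitemaps would need one more fetch step). The function names are made up for illustration.

```shell
# Hypothetical sitemap location; replace with your own.
SITEMAP_URL="https://example.com/sitemap.xml"

# Print every URL found inside <loc>...</loc> elements of the XML on stdin.
extract_urls() {
    grep -o '<loc>[^<]*</loc>' | sed -e 's/<loc>//' -e 's/<\/loc>//'
}

# warm_from_sitemap SITEMAP_URL: fetch the sitemap, then hand its URLs to
# wget in chunks of 30 (-n 30), with the chunks running in parallel.
# Page bodies are discarded; -e robots=off is the option mentioned in the text.
warm_from_sitemap() {
    wget -q -O - "$1" \
        | extract_urls \
        | parallel -n 30 wget -q -O /dev/null -e robots=off
}
```

In bash, the whole run can be timed with: time warm_from_sitemap "$SITEMAP_URL". By default GNU parallel runs one job per CPU core; the -j option changes that.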
It’s extremely fast for a fast site. For ours, it took only 40 seconds to crawl everything, and on repeated executions, after W3TC had built its cache, it was much faster: only 15 seconds.
For me, the conclusion on how to execute it was simple:
- Execute once so that PHP can build its cache of dynamic pages.
- 5 minutes later, execute it several more times; 3 runs at 1-minute intervals should be enough for the pagespeed module to optimize and rewrite static resources.
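The two steps above can be sketched as a small wrapper around whatever warming command is used. The name warm_schedule is hypothetical, and the delays simply mirror the numbers in the list:

```shell
# warm_schedule CMD [ARGS...]: run the warming command once, wait 5 minutes,
# then run it 3 more times with 1 minute between runs.
warm_schedule() {
    "$@"                # first pass: PHP/W3TC builds the page cache
    sleep 300           # wait 5 minutes
    i=1
    while [ "$i" -le 3 ]; do
        "$@"            # later passes: pagespeed optimizes static resources
        if [ "$i" -lt 3 ]; then
            sleep 60    # 1 minute between the repeated passes
        fi
        i=$((i + 1))
    done
}
```

It would be called with the warming command as its arguments, for example with a script of your own that crawls the sitemap.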
We could also use curl instead of wget.
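A minimal sketch of the curl variant for a single URL. The function name fetch_with_curl is made up for illustration; the -w format prints the status code and total time, which gives a simple report of what was warmed:

```shell
# fetch_with_curl URL: request one page, discard the body, and print
# the HTTP status code, the total time in seconds, and the URL.
fetch_with_curl() {
    curl -s -o /dev/null -w '%{http_code} %{time_total}s %{url_effective}\n' "$1"
}
```

To run many requests at once without GNU parallel, the same curl options can be passed to xargs with -n 1 and -P 30, reading URLs from a list.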
Does mobile need extra warming?
At first, it seemed to me that mobile must be warmed separately, so I experimented with adding different user agents to the wget requests. For mobile, I used:
--user-agent="Mozilla/5.0 \(Linux\; Android 6.0.1\; Nexus 5 Build/MMB29Q\) AppleWebKit/537.36 \(KHTML, like Gecko\) Chrome/48.0.2564.95 Mobile Safari/537.36"
and for desktop, I used this:
--user-agent="Mozilla/5.0 \(Windows NT 10.0\; WOW64\; rv:44.0\) Gecko/20100101 Firefox/44.0"
Don’t forget that we must escape both the parentheses and the semicolons, as in the two options above.
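When wget is called directly from a script, rather than through a tool that re-parses the command line, keeping the strings in single-quoted variables avoids the escaping altogether. A sketch under that assumption; the variable names and warm_as are hypothetical:

```shell
# User-agent strings from the text, single-quoted so that no escaping is needed.
MOBILE_UA='Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5 Build/MMB29Q) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.95 Mobile Safari/537.36'
DESKTOP_UA='Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0'

# warm_as USER_AGENT URL: request URL once with the given user agent,
# discarding the response body.
warm_as() {
    wget -q -O /dev/null --user-agent="$1" "$2"
}
```

A mobile pass for one page would then be: warm_as "$MOBILE_UA" followed by the page URL.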
The site UserAgentString.com has a list of all user agent strings and can also show your current user agent when you visit it.
I finally concluded that changing the user agent didn’t bring any improvement, so I abandoned this approach.
To record time measurements in a cron job, we can do it like this:
(time /usr/php /my/location/script.php) >>my_log_file 2>&1
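Cron usually runs commands with /bin/sh, where time may not be a builtin, depending on the system. If that is a problem, or if a timestamp and exit status are wanted in the log as well, a small wrapper can do the measuring itself. A minimal sketch; the function name run_timed is hypothetical, and the duration is measured in whole seconds:

```shell
# run_timed LOG_FILE CMD [ARGS...]: run CMD, append its output to LOG_FILE,
# then append one summary line with timestamp, duration and exit status.
run_timed() {
    log_file=$1
    shift
    start=$(date +%s)
    "$@" >>"$log_file" 2>&1
    status=$?
    end=$(date +%s)
    printf '%s finished in %ss (exit %s): %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$((end - start))" "$status" "$*" \
        >>"$log_file"
    return "$status"
}
```

Placed in a script that cron calls, it would be used as run_timed my_log_file followed by the command to measure.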