A Dutch version of this article has been published on the blog karelgeenen.nl.
Caching delivers an enormous speed boost when optimizing websites. Fast page loading makes for a pleasant user experience. And as speed is one of Google’s top priorities, a fast website is said to be an important factor in deciding your position in the search results. But what is caching exactly? Where and when do I apply it? And who will take care of caching for me as a customer or administrator of a website?
All of this will be touched upon in this article.
What is caching?
Caching can be understood as putting the data you need most often in an easily accessible location. In English the term is pretty straightforward in both the offline and online domain. Etymologically the word derives from the French “cacher” (“to hide”), and a cache is a “hiding place”. For practical reference you can compare caching to a plumber storing his tools and pipes in his van when working on location. Having these close to the working spot is much more convenient than having to drive back and forth between there and his office.
How does this work for webpages?
When someone requests a web page, the web server forwards the request to the WordPress application. WordPress then runs through a number of fixed steps before returning a complete web page to the visitor.
Retrieving and constructing a page will temporarily require server capacity. Doing this for many visitors at the same time can put some considerable load on the server. This may then result in delayed loading times.
The server and WordPress application will compose the page from the same elements for every request. Both work continuously to do the same thing over and over. This is done for every visitor separately. So if two different visitors navigate to page abc.eu/xyz.php this page will be composed two times on the server.
Exactly this duplication of effort can be prevented by using a caching system. Caching saves the result of a composed page (or part of it) at a temporary location, making it easily accessible for the next visitor requesting the same result page.
When the first visitor arrives at abc.eu/xyz.php, the page must be composed from scratch, retrieving and building all information from the database and file server. With caching enabled, this first run will trigger the cache system and save the result for later use. The second visitor to abc.eu/xyz.php will be served much more quickly, since that page comes directly from the cache.
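The first-visitor and second-visitor flow can be sketched in a few lines. This is a minimal illustration in Python rather than WordPress’s own PHP, and the names `PageCache` and `compose_page` are made up for this example; a real page cache would also handle expiry and storage limits.

```python
from typing import Callable, Dict


class PageCache:
    """Stores fully composed pages, keyed by URL path (hypothetical helper)."""

    def __init__(self, compose: Callable[[str], str]) -> None:
        self._compose = compose          # the expensive "build the page" step
        self._store: Dict[str, str] = {}
        self.compose_calls = 0           # how often a page was really built

    def get(self, path: str) -> str:
        if path in self._store:          # cache hit: skip composing entirely
            return self._store[path]
        self.compose_calls += 1          # cache miss: compose once and remember
        page = self._compose(path)
        self._store[path] = page
        return page


def compose_page(path: str) -> str:
    # Stand-in for WordPress querying the database and rendering templates.
    return f"<html><body>Content of {path}</body></html>"


page_cache = PageCache(compose_page)
first = page_cache.get("/xyz.php")    # first visitor: composed from scratch
second = page_cache.get("/xyz.php")   # second visitor: served from the cache
```

Both visitors receive the same page, but `compose_page` only ran once.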
Where is caching applied?
Technically, caching can be applied at many different locations. However, we can make a rough division between server side and visitor side (i.e. browser cache).
A second distinction concerns how much of the page is being cached. There is the self-explanatory page caching and the caching of individual assets. The latter is done by either opcode or database caching.
The scheme below illustrates all possible options in caching a page within the complete stack.
Logically, the closer the cache is stored to the visitor, the faster the user experience will be.
Pagecache layer one: Browser cache
The first layer is cached locally on the user’s system so the content doesn’t need to be retrieved over the internet. This is by far the quickest solution.
Despite the fact that this type of caching stores the data with the visitor, a hosting provider can still decide on what to include in browser cache.
A (good) hosting company will send a header (such as Cache-Control: max-age, ETag or Last-Modified) along with the data when a page is requested. This lets the browser know what data to save and for how long, making it unnecessary to retrieve the same data again online.
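To make the header mechanism concrete, here is a minimal sketch of what a server could do. `Cache-Control` and `ETag` are standard HTTP response headers and `If-None-Match` is the matching request header; the function `respond`, the one-hour lifetime and the way the ETag is derived from the content are assumptions chosen for illustration only.

```python
import hashlib
from typing import Dict, Optional, Tuple


def respond(body: bytes, if_none_match: Optional[str] = None,
            max_age: int = 3600) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a cacheable resource."""
    # Derive a validator from the content: a changed body gives a new ETag.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        # The browser may reuse its stored copy for max_age seconds.
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
    }
    if if_none_match == etag:
        # The browser's copy is still current: send no body at all.
        return 304, headers, b""
    return 200, headers, body


stylesheet = b"body { color: #333; }"
status_first, headers_first, _ = respond(stylesheet)
# Later the browser revalidates by sending the ETag it stored earlier.
status_again, _, body_again = respond(stylesheet, headers_first["ETag"])
```

The first request returns status 200 with the full content. The revalidation returns 304 with an empty body, so only the headers travel over the internet.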
Pagecache layer three: Plugins
WordPress handles page caching in the two lower layers using plugins such as W3 Total Cache (W3TC) or WP Super Cache.
These caching methods will definitely contribute to an increased page speed. However, if you take a look at our scheme you will see they sit far from the visitor. WordPress needs to be accessed before the cache can be reached. This means both adding to and retrieving from the cache takes some time.
Aren’t we missing a layer?
Very sharp indeed, we haven’t discussed a second layer. Luckily we haven’t been sloppy, and there’s a good reason we skipped it earlier. Layer two is by far the least common. Furthermore, it is the most difficult layer for a site owner to set up.
Layer two is all about reverse proxy caching. An example of this is Varnish caching. This system sits in front of the web server that hosts WordPress. It is the first point of contact for a visitor’s computer.
When any visitor requests a page within WordPress, the composed result will be saved in the caching proxy and returned to that visitor. For every subsequent request for the same page, the caching proxy will return the saved copy without WordPress being involved.
Can caching always be used?
Caching is not always the right thing to do. Especially where page elements are unique to each user, you may not want to use caching. The best example is a block showing the contents of a shopping cart on every page. What is displayed in the shopping cart has to be checked individually for every single visitor. Therefore, pages displaying the cart cannot be retrieved from cache.
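One common way to combine a caching proxy with personalised content is to let requests that carry user-specific state bypass the cache. The sketch below assumes a hypothetical cookie named `cart_items` that signals a non-empty cart; the classes `CachingProxy` and `Origin` are also invented for this illustration and do not reflect how Varnish is actually configured.

```python
from typing import Callable, Dict


class Origin:
    """Stand-in for the web server plus WordPress; counts how often it runs."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, path: str, cookies: Dict[str, str]) -> str:
        self.calls += 1
        items = cookies.get("cart_items", "0")
        return f"{path} | cart: {items} item(s)"


class CachingProxy:
    """Sits in front of the origin and stores anonymous responses by path."""

    def __init__(self, origin: Callable[[str, Dict[str, str]], str],
                 bypass_cookie: str = "cart_items") -> None:
        self._origin = origin
        self._bypass_cookie = bypass_cookie   # hypothetical cookie name
        self._store: Dict[str, str] = {}

    def handle(self, path: str, cookies: Dict[str, str]) -> str:
        if self._bypass_cookie in cookies:
            # Personalised request: always ask the origin, never store it.
            return self._origin(path, cookies)
        if path not in self._store:
            self._store[path] = self._origin(path, cookies)
        return self._store[path]


origin = Origin()
proxy = CachingProxy(origin)
anonymous_1 = proxy.handle("/shop", {})                  # composed by origin
anonymous_2 = proxy.handle("/shop", {})                  # served by the proxy
shopper = proxy.handle("/shop", {"cart_items": "2"})     # bypasses the cache
```

The origin is contacted twice in total: once for the first anonymous visitor and once for the shopper, whose page is never stored.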
Additional caching methods
Apart from the full-page caching methods discussed in the above scheme, we can distinguish more elements that can be saved in cache. These are either opcode or the results of database queries. Both are smaller objects that are used in the process of composing a web page. So whenever full-page caching is out of the question (e.g. when using a shopping cart), these smaller objects can often still be cached. We’ve skipped these methods in the above scheme for the sake of clarity.
With database caching, the results of frequently used database queries are stored. For example, instead of having to look up the 10 latest posts in the database on every request, the result is saved in the cache memory and is directly accessible from there.
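The “10 latest posts” example could look roughly like this. The sketch uses Python’s built-in `sqlite3` module with an in-memory database instead of WordPress’s MySQL database; the `posts` table, the `QueryCache` class and the 60-second lifetime are assumptions made purely for illustration.

```python
import sqlite3
import time
from typing import Any, Dict, List, Tuple


class QueryCache:
    """Remembers query results for a limited time (hypothetical helper)."""

    def __init__(self, conn: sqlite3.Connection,
                 ttl_seconds: float = 60.0) -> None:
        self._conn = conn
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, List[Any]]] = {}
        self.db_queries = 0                  # how often the database was hit

    def query(self, sql: str) -> List[Any]:
        now = time.monotonic()
        entry = self._store.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]                  # fresh enough: reuse the result
        self.db_queries += 1                 # missing or expired: ask the DB
        rows = self._conn.execute(sql).fetchall()
        self._store[sql] = (now, rows)
        return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (title) VALUES (?)",
                 [(f"Post {i}",) for i in range(1, 13)])

query_cache = QueryCache(conn)
latest_sql = "SELECT title FROM posts ORDER BY id DESC LIMIT 10"
latest_first = query_cache.query(latest_sql)    # runs against the database
latest_second = query_cache.query(latest_sql)   # answered from memory
```

The time limit is a simple safeguard against serving outdated results forever; it is one design choice among several.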
Many content management systems (WordPress too) are written in PHP. Before a server can use this PHP code, it has to be ‘translated’ into a form the server can execute. This processed version of PHP is called ‘opcode’. With opcode caching the translation is saved in the server’s memory so this step can be skipped on the next request. Again, this makes for a much faster site.
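The idea can be shown by analogy in Python, which also translates source code into an intermediate form before running it. The sketch below uses Python’s built-in `compile()` and `exec()`; it is not how PHP’s opcode cache works internally, and the class `CompiledScriptCache` is a made-up name. A real opcode cache would also notice when a source file changes.

```python
from types import CodeType
from typing import Any, Dict


class CompiledScriptCache:
    """Keeps the translated form of each script in memory (illustrative)."""

    def __init__(self) -> None:
        self._compiled: Dict[str, CodeType] = {}
        self.compilations = 0                # how often translation happened

    def run(self, name: str, source: str,
            variables: Dict[str, Any]) -> Dict[str, Any]:
        code = self._compiled.get(name)
        if code is None:
            self.compilations += 1           # translate once...
            code = compile(source, name, "exec")
            self._compiled[name] = code      # ...and keep the result
        namespace = dict(variables)          # fresh variables for each request
        exec(code, namespace)                # run the already translated code
        return namespace


script = "greeting = 'Hello, ' + visitor"
scripts = CompiledScriptCache()
result_ann = scripts.run("page", script, {"visitor": "Ann"})
result_bob = scripts.run("page", script, {"visitor": "Bob"})
```

Note the difference with page caching: the script still runs for every visitor and can produce a different result each time. Only the translation step is skipped.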
So what happens when I add new content?
Whenever you publish a new blog post you want it to be visible on your homepage right away. But what happens when your page is saved in a caching proxy or cached by your freshly set up caching plugin?
If you’re still with us after reading the above paragraphs, you might have realized that visitors will not be able to see your new blog post at the top of your homepage right away, as the content will be served from the saved cache. The solution is to empty (“flush”) your cache, so that it is built up again and the data is fetched from the original source.
Flushing a cache can be done in many different ways, depending on the implementation. Some interfaces provide you with a button to empty the cache; sometimes it is done automatically when publishing a new post or page. The latter makes use of WordPress’ renowned ‘hooks’ you might have heard about.
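The automatic variant can be sketched as follows. WordPress implements hooks in PHP; the small `Hooks` class below only mimics the idea in Python, and the hook name `publish_post`, the function `flush_homepage` and the cached pages are chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class Hooks:
    """Minimal event registry: callbacks run when a named action fires."""

    def __init__(self) -> None:
        self._callbacks: DefaultDict[str, List[Callable[..., None]]] = \
            defaultdict(list)

    def add_action(self, name: str, callback: Callable[..., None]) -> None:
        self._callbacks[name].append(callback)

    def do_action(self, name: str, *args: object) -> None:
        for callback in self._callbacks[name]:
            callback(*args)


# Pretend these pages were cached before the new post was written.
cached_pages: Dict[str, str] = {
    "/": "<html>homepage with old posts</html>",
    "/about": "<html>about us</html>",
}


def flush_homepage(post_id: int) -> None:
    # The homepage lists the latest posts, so its cached copy is now stale.
    cached_pages.pop("/", None)


hooks = Hooks()
hooks.add_action("publish_post", flush_homepage)
hooks.do_action("publish_post", 42)   # fired when post 42 is published
```

After publishing, the homepage is gone from the cache and will be rebuilt on the next visit, while the unaffected “about” page stays cached.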
Who takes care of caching for me?
Depending on the caching location, you can either set up a caching solution yourself or your web host might be responsible for the caching of your website. In general you can manage page and browser cache yourself using a plugin like W3TC.
The fastest layer after browser caching, the reverse proxy cache, is typically implemented and managed within your hosting architecture. At Savvii we use Varnish’s reverse proxy cache to speed up your websites.
We have tried to give a clear explanation of the caching process and its most important implications. If you have any questions after reading this article, please feel free to leave a comment!