Vanand Mkrtchyan

December 1, 2022

8 min read

Understanding Page Caching to Optimize Website Performance

For some time now, the digital world has put customer experience at the center of everything. No matter how beautiful and functionally flawless your product is, a slow, lagging experience will do enormous damage to your business.

As experts in software engineering and website performance, we want to explain what website performance is, how it is measured, and which factors are critical to track. Let's look in detail at a topic that is, on one hand, underestimated by many specialists and, on the other hand, shrouded in mystery: what is caching, and what is it good for?

Website Performance

The concept of web performance is all about making websites fast. There are two core aspects of website performance:

  1. Page Load: How fast do pages load?
  2. Usability: How quickly does your website or app interact with the visitor?

We will not dive deeper into performance issues and how to fix them from a usability perspective because they are mostly individual. But instead, we will explore page load performance.

Page Load

Page load performance is about how fast your website appears to visitors and how quickly it becomes interactive.

Let’s start with some statistics that show how important page load is and how badly neglecting it can hurt your business.

Page load time is a search ranking factor for Google, and Google’s own research shows how strongly it affects user behavior. The numbers below will help you understand why you need to care about page speed.

  • As load time grows from 1 to 3 seconds, the probability of a bounce increases by 32%.
  • From 1 to 6 seconds - by 106%.
  • From 1 to 10 seconds - by 123%.

What does this mean? There is one simple truth. The faster your page loads, the higher its user engagement, which affects search engine rankings. The page speed should be as fast as possible so as not to compromise the customer experience.

The major steps of a page load, each of which contributes to its speed, are the following:

  1. The page load starts when a user follows a hyperlink, which sends a request to the web application server. This is referred to as the initial request.
  2. The request reaches the web application.
  3. The application sends an HTML response back to the user’s browser. It is referred to as the first byte.
  4. The user’s browser receives the HTML response and starts to process the DOM (Document Object Model).
  5. The user’s browser renders the page.
  6. The window load event fires when page rendering is done. 

Long story short, based on our knowledge of how the web works, we know that whenever a user is visiting our website, the browser is:

  1. Calling our server to get the initial HTML response
  2. Starting to render HTML
  3. Loading static resources (images, CSS, etc.)
  4. Finalizing the initial render
  5. Loading async scripts
  6. Performing later DOM manipulations

This means that we have to split website performance into several parts, which are considered main metrics (KPIs) to measure the overall performance.

  • Time to First Byte (TTFB) - the time between the browser requesting a page and the moment it receives the first byte of information from the server
  • First Contentful Paint (FCP) - the time between the browser rendering the first bit of content from the DOM and providing the first feedback to the user to show that the page is loading
  • Time to Interactive (TTI) - indicates the time it takes for the page to become interactive (displaying content and responding to user interactions within 50 milliseconds)
  • Speed Index (SI) - performance metric referring to the speed at which the content becomes visible to the user
  • Largest Contentful Paint (LCP) - as one of the Core Web Vitals metrics, it represents the speed of the main content load (largest image or text block)
  • Total Blocking Time (TBT) - measures the total amount of time between the First Contentful Paint and Time to Interactive
  • Cumulative Layout Shift (CLS) - indicates unexpected shifting of page elements while the page is still loading (shifts in fonts, images, videos, buttons, etc.)

Since hardware costs have decreased after the mobile/tablet "revolution", high-performance devices have become more accessible to users. Page rendering itself is rarely a bottleneck for ordinary websites (for complex applications it can still be an issue even now), and since we cannot cover EVERYTHING in a single article, let’s focus on the most painful issue: the Time to First Byte metric.

Time to First Byte

Time to First Byte is the time between the browser requesting a page and the time it receives the first byte of information from the server. This time includes a DNS lookup and establishing the connection using a TCP handshake and an SSL handshake if the request is made over HTTPS.

TTFB is the time it takes between the start of the request and the start of the response, in milliseconds:

TTFB = responseStart - navigationStart
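These timestamps come from the browser's Navigation Timing API. As a minimal sketch, here is a hypothetical helper computing TTFB from a navigation timing entry; in modern browsers, the navigation entry's startTime (always 0 for the page navigation) plays the role of the legacy navigationStart:

```javascript
// Hypothetical helper: compute TTFB from a navigation-timing-like entry.
// In modern browsers the entry comes from performance.getEntriesByType("navigation"),
// where startTime replaces the legacy navigationStart timestamp.
function computeTTFB(entry) {
  return entry.responseStart - entry.startTime;
}

// In a browser you would feed it the real navigation entry:
// const [nav] = performance.getEntriesByType("navigation");
// console.log(`TTFB: ${computeTTFB(nav)} ms`);
```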

Caching

So what is caching? Caching is the process of remembering the result of an expensive operation to efficiently reuse previously retrieved or computed data and speed up reads.

Let's imagine that we have a function that is doing some heavy calculations, and let’s call it calculateHeavyStuff(). Let's wrap that function in another function and call it cache().
 

// cache.js
// Wraps a zero-argument function and remembers its first result.
export function cache(func) {
  let value;

  return () => {
    if (typeof value === "undefined") {
      value = func(); // first call: compute and store the result
    }

    return value; // later calls: return the stored result
  };
}


// calculateHeavyStuff.js
import {cache} from './cache'

export function calculateHeavyStuff(){
  // do some heavy calculations here
}

export const cachedCalculateHeavyStuff = cache(calculateHeavyStuff)

As you can see, all the cache function does is wrap the argument function and remember its value. On the first call of cachedCalculateHeavyStuff, value is undefined, so the cache function calls calculateHeavyStuff, assigns the return value to the variable, and returns it. The magic starts with the second and subsequent calls: value is no longer undefined, so the if condition is false, which means calculateHeavyStuff is not called and the previously stored value is returned. (One caveat: if the wrapped function itself legitimately returns undefined, this simple check will call it again on every invocation.)
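The cache above only handles zero-argument functions. Here is a sketch of the same idea for functions that take arguments, keyed by a JSON-serialized argument list (cacheByArgs is a hypothetical name, and the sketch assumes the arguments are JSON-serializable):

```javascript
// Sketch: memoize a function by its arguments.
// The cache key is the serialized argument list; Map.has also avoids
// the "function returns undefined" caveat of the simpler version.
function cacheByArgs(func) {
  const store = new Map();

  return (...args) => {
    const key = JSON.stringify(args);
    if (!store.has(key)) {
      store.set(key, func(...args)); // first call with these args: compute
    }
    return store.get(key); // later calls: reuse the stored result
  };
}

const slowSquare = (n) => n * n; // stand-in for a heavy calculation
const fastSquare = cacheByArgs(slowSquare);
fastSquare(4); // computed
fastSquare(4); // served from the cache
```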

The same kind of thing happens when we cache HTTP responses: the cache service computes the response once, stores it somewhere, and for subsequent calls returns the stored response instead of computing it again.
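A minimal, framework-agnostic sketch of that idea, assuming a hypothetical render function that builds the response body for a given URL path:

```javascript
// Sketch: cache rendered HTTP responses by URL path.
// `render` is a hypothetical function that builds the response body.
function createResponseCache(render) {
  const store = new Map();

  return (path) => {
    if (!store.has(path)) {
      store.set(path, render(path)); // expensive: build the response once
    }
    return store.get(path); // cheap: serve the stored response
  };
}

const getPage = createResponseCache((path) => "<html>page for " + path + "</html>");
getPage("/home"); // rendered
getPage("/home"); // served from the cache
```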

The same applies to DB data aggregations and other expensive operations.

Now let’s talk about the levels of caching in the HTTP request-response workflow. By saying levels, we mean where the cached data is stored in that flow:

  1. Client’s browser
  2. Server
  3. Cloud Edges

Browser Caching

In this caching method, the cached resource is stored in the user's browser. Based on the cache headers of the response, the browser decides whether to keep the resource in its cache storage; on the next visit, cached resources are served from that storage instead of from the server. The downside of this method is that caches are stored per user and per browser.
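For example, the server controls this behavior through response headers like the following (the values here are illustrative, not a recommendation):

```http
HTTP/1.1 200 OK
Content-Type: text/css
Cache-Control: public, max-age=31536000, immutable
ETag: "5f3a2b"
```

Cache-Control tells the browser how long it may reuse the resource without asking the server, and the ETag lets it revalidate cheaply once the cached entry expires.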


Server-side caching

In this caching method, the cached resource is stored on the server. There are different approaches to server-side caching: we can cache only calls to the DB, or heavy functions/calculations as well, which is why in the image below the server is only partially gray.
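As a sketch of caching only the DB layer, assume a hypothetical async client with a query(sql) method. Storing the Promise itself (rather than the resolved rows) means concurrent callers share a single in-flight query:

```javascript
// Sketch: cache only the expensive DB call, keyed by the SQL text.
// `db` is a hypothetical client exposing an async query(sql) method.
const queryCache = new Map();

function cachedQuery(db, sql) {
  if (!queryCache.has(sql)) {
    queryCache.set(sql, db.query(sql)); // store the Promise itself
  }
  return queryCache.get(sql); // every later call reuses it
}
```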


Edge Caching

In this caching method, the cached resource is stored on special cloud servers, called data centers, spread around the world. Here we place an external service that provides edge caching functionality between our clients and servers, so very few requests reach our servers: the caching provider fetches the resources once and distributes them across all necessary data centers. Since these services sit between client and server, they typically also provide DNS services and SSL certificates. By using such a service, you will see a decrease in every factor affecting the TTFB metric:

  • DNS lookup
  • TCP and SSL handshake
  • The round trip between the client and the origin server

Since these services have data centers all around the world, the browser will get the response from the nearest one, meaning not only is the data cached, but the time to connect to the server and get the response back is also minimal.


Everything is perfect, right?

Not always! Let’s talk about the downsides and solutions to this method. 

Winter is coming...

What we have discussed applies perfectly to static or old-school websites.

We've faced several revolutions on the web: from dial-up to wireless connectivity, and from slow devices to multiple types of powerful gadgets that bring new challenges. And now again, we have new challenges along with those new opportunities.

We may think that faster connections and more powerful devices will solve our problems, but in fact they just bring new features and new challenges to consider, and as always, performance remains a top priority for every product.

Dynamic Pages

How are dynamic sites and pages different from static ones? Caches are stored by path, and each page has its own unique path. A dynamic site has an ever-growing number of pages, and those pages also change over time through content updates, DOM manipulations, and so on, so a single cached copy per path quickly goes stale.

In previous decades, we had a simple structure for websites.

  • Static HTML with minimal DOM manipulation
  • One combined CSS file for the whole project (maybe 2-3 based on the layout but the point is that there were just a few)
  • One combined JS file for the whole project (maybe 2-3 based on the layout, but the point is that there were just a few)
  • The majority of the responsibilities and functionality located on the backend

Now we have the following picture:

  • Initial HTML may be different from the final page based on many factors
  • Most common projects are component-based
    • most styles are split into chunks
    • most JS is split into chunks
  • Client-side applications are logic-heavy, and the backend is used for data transfers and storage mostly

Things to consider

  1. A single large file download vs. multiple smaller chunks
  2. Only the initial response is cached, not later client-side updates
  3. Lazy loading vs. SEO
  4. Regular updates and timely cache cleanups (invalidation)
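Point 4 above can be sketched by extending the earlier cache with a time-to-live, so stale entries are recomputed instead of being served forever (cacheWithTTL is a hypothetical name):

```javascript
// Sketch: a cache whose stored value expires after ttlMs milliseconds.
function cacheWithTTL(func, ttlMs) {
  let value;
  let expiresAt = 0; // timestamp after which the entry is stale

  return () => {
    const now = Date.now();
    if (now >= expiresAt) {
      value = func(); // recompute on the first call or after expiry
      expiresAt = now + ttlMs;
    }
    return value;
  };
}
```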
     


FAQ

What is website performance?

The concept of web performance is all about making websites fast, including making slow processes appear faster. There are two core aspects of website performance:

  1. Page Load: How fast do pages load?
  2. Usability: How quickly does your website or app interact with the visitor?

What is caching?

Caching is remembering the result of an expensive operation to speed up reads. 

What is "page load performance"?

Page load performance is about how fast your website appears to visitors and how quickly it becomes interactive.