Wednesday, August 04, 2010

Website Performance: Choose Data Center Carefully

The data center that hosts a website's servers plays a significant role in the website's performance. There are a few things to keep in mind before choosing one:

1) Measuring distance in terms of hops is not the best approach. Latency is the accurate measure of speed on the internet and should be preferred over hop count or physical distance. Thus, the data center closest to your customers might not be the fastest. Check the latency of multiple data centers before choosing one. Latency can be tested by running ping/traceroute against any server already hosted in the data center.

2) Check latency from the regions that represent the majority of the website's audience. So, if your site caters to a specific region (like a city or state), measure latency from that region only. If your site caters to a distributed audience (throughout the country or across continents), test the latency from the regions where you expect the most traffic.

3) Choose an ISP-neutral data center. Data centers run by an ISP will perform really well for clients on the same ISP but might not be great for clients connecting through other ISPs.
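A rough way to compare candidate data centers is to time connections to a server already hosted in each one. The sketch below is only illustrative: the host names in the usage comment are placeholders, TCP connect time stands in for ping (which usually needs raw-socket privileges), and the script should be run from machines located where your audience is.

```python
import socket
import statistics
import time


def connect_time_ms(host, port=80, timeout=2.0):
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def median_latency_ms(host, port=80, samples=5):
    """Median of several connect times, to smooth out one-off spikes."""
    return statistics.median(connect_time_ms(host, port) for _ in range(samples))


def fastest(latencies_ms):
    """Return the candidate with the lowest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)


def compare(hosts):
    """Measure every candidate host and report the fastest one."""
    results = {host: median_latency_ms(host) for host in hosts}
    return results, fastest(results)


# Usage (hypothetical hosts, one per candidate data center):
#   compare(["dc-east.example.com", "dc-west.example.com"])
```

Taking the median of several samples keeps a single slow handshake from deciding the comparison.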

Website Performance: Reverse Proxy for Spoon Feeding

When a client on a slow internet connection requests a page, the server holds on to the thread/process until the complete response has been transferred. Server resources are therefore tied up for longer than they are for clients on fast connections.

To handle this, a reverse proxy (like Squid or nginx) can be used in front of the web server. The web server then only needs to transfer content to the reverse proxy, which is very fast as they are on the same network.

The reverse proxy takes the burden of transferring the response slowly and frees the web server immediately.
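To get a feel for the difference, the time a web server worker stays busy sending one response can be approximated as response size divided by transfer rate. The figures below are assumptions chosen for illustration, not measurements, and the estimate ignores latency and TCP ramp-up.

```python
def hold_time_seconds(response_bytes, bytes_per_second):
    """Approximate time a worker stays busy sending one response."""
    return response_bytes / bytes_per_second


RESPONSE_BYTES = 100 * 1024        # assumed 100 KB page
SLOW_CLIENT_BPS = 50 * 1024        # assumed 50 KB/s client connection
LAN_BPS = 100 * 1024 * 1024        # assumed 100 MB/s link to the reverse proxy

direct = hold_time_seconds(RESPONSE_BYTES, SLOW_CLIENT_BPS)
via_proxy = hold_time_seconds(RESPONSE_BYTES, LAN_BPS)

print(f"direct to client: {direct:.3f} s")
print(f"via reverse proxy: {via_proxy * 1000:.3f} ms")
```

Under these assumptions the worker is tied up for about 2 seconds when it talks to the slow client directly, and for about a millisecond when it hands the response to the proxy.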

Website Performance: Web Server for Static Content

This approach is useful if you are not using a CDN to serve static content (like JS, CSS and images).

Servers which are used for the application (like Apache or Tomcat) are great for requests which require execution of some code before rendering the page.

There are a number of lightweight web servers (like lighttpd and nginx), as well as caching proxies like Varnish, which are tuned to serve static content really fast.

Plenty of benchmark results are available online. You will see clear benefits when a page with multiple static objects is served by any of these lightweight servers.

Website Performance: Delay Processing

When a page is requested from the server, process only what is necessary to generate the required output.

Any additional processing that might be necessary can be delayed or performed asynchronously.

Tasks like sending an acknowledgement mail or logging are good candidates for delayed processing.

One tool which helps you achieve this is Gearman. Gearmand is a simple server: workers register themselves for certain defined functions, and clients send processing requests to the Gearmand server. The Gearmand server queues up these requests and dispatches them to the workers. Client and worker code can be in different languages. For delayed processing, use the asynchronous (do_background) call.
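The idea of delayed processing can be sketched without Gearman at all. The example below is an in-process stand-in that uses only the Python standard library: a queue plays the role of the job server and a background thread plays the worker. The function names are hypothetical, and unlike Gearman the queue lives inside the web process, so jobs are lost if that process dies.

```python
import queue
import threading

jobs = queue.Queue()
sent_mail = []  # stands in for a real mail gateway


def send_ack_mail(address):
    """Hypothetical slow task; here it only records the address."""
    sent_mail.append(address)


def worker():
    """Pull queued jobs and run them outside the request path."""
    while True:
        func, args = jobs.get()
        try:
            func(*args)
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()


def handle_signup(address):
    """Request handler: queue the mail and return without waiting for it."""
    jobs.put((send_ack_mail, (address,)))
    return "signup accepted"
```

The handler returns as soon as the job is queued; the mail is sent whenever the worker gets to it.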

Tuesday, August 03, 2010

Website Performance: Use CDN Effectively

In most cases, we consider a CDN (like Akamai) only for static content like images, stylesheets and JavaScript, and ignore it for HTML content.

Certain HTML might be cacheable. For example, if an HTML page changes every hour, it remains constant for that hour. It would be great if the CDN could cache this page, so that the first request from a region lets the CDN serve it from cache to the whole region.

The CDN is configurable for your site. Once you access the configuration, there should be options like:
1) Cache content on the edge server based on the cache headers sent by the origin OR
2) Cache certain file/folder/url for X minutes/hours

It's imperative to understand and configure the CDN for optimal performance.
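When the CDN is set to follow the origin's cache headers, the origin has to say how long a page may be kept. The sketch below assumes a page that is regenerated at the top of every hour and computes a matching Cache-Control header; the function names are made up for this example, and how the header is attached to the response depends on your framework.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_next_hour(now):
    """Seconds left before the top of the next hour."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return int((next_hour - now).total_seconds())


def hourly_page_headers(now):
    """Headers that let shared caches keep the page until it next changes."""
    return {"Cache-Control": f"public, max-age={seconds_until_next_hour(now)}"}


print(hourly_page_headers(datetime.now(timezone.utc)))
```

A request served at 10:45 would get max-age=900, so the edge copy expires at 11:00 when the page changes.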

Website Performance: Memory as the primary storage

We normally use databases/filesystems as the primary source of storage and add caching (Memory/RAM) to improve performance of the application.

Consider the opposite approach. Use memory as the primary storage and the file system as a recovery source. So, perform all read and write operations directly in memory but log inserts/updates to the file system. The writes to the file system can be asynchronous (delayed inserts) so that they do not become a bottleneck.

This is a risky proposition and should be considered only if:
1) The database operations are becoming a bottleneck and you have tried all possible optimizations. Only the problematic data sets should be considered for this approach.
2) The data is non-critical, i.e. it is acceptable for the data to be unavailable for some time. After a failure, the data will be unavailable for at least as long as it takes to recover it from the file system.
3) Typical database constraints (unique, foreign key etc.) do not apply to the data.
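A minimal sketch of the idea, assuming a simple key-value data set: a dictionary is the source of truth, every write is appended to a log file, and the log is replayed on startup. The class and key names are hypothetical. For brevity the log append here is synchronous; a background writer could take its place, at the cost of losing the last few writes in a crash.

```python
import json
import os
import tempfile


class LoggedMemoryStore:
    """Dict-backed store that appends every write to a log file for recovery."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()

    def _replay(self):
        """Rebuild the in-memory state from the log, if one exists."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                key, value = json.loads(line)
                self.data[key] = value

    def get(self, key, default=None):
        return self.data.get(key, default)       # reads never touch the disk

    def put(self, key, value):
        self.data[key] = value                   # memory is the source of truth
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")


path = os.path.join(tempfile.mkdtemp(), "store.log")
store = LoggedMemoryStore(path)
store.put("views:home", 1)
store.put("views:home", 2)

recovered = LoggedMemoryStore(path)              # simulates a restart
print(recovered.get("views:home"))
```

The recovery time mentioned in point 2 is the time `_replay` needs to read the whole log, which grows with every write unless the log is compacted.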

Monday, August 02, 2010

Website Performance: Cache Database Query Results

Querying the database is an expensive operation and should be kept to a minimum.

Certain databases provide query caching capabilities. MySQL's query cache is great for tables which are used primarily for read operations. Any insert/update query clears all cached query results for the table. Thus, query caching cannot be leveraged for tables requiring regular insert/update operations.

Adding caching capabilities above the database layer can help boost performance. Before passing a read request to the database, an additional layer can check for the content in the cache. If the content is not in the cache, the request is forwarded to the database and the cache is populated before the result is returned to the application.

The caching layer can also trap any insert/update operation so that the cache is up-to-date.
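The read and write paths described above can be sketched as a small wrapper. This is an illustration rather than a production cache: the class name is made up, a plain dictionary stands in for both the cache and the database, and there is no eviction or expiry.

```python
class CachedTable:
    """Read-through, write-through cache in front of a slower data source."""

    def __init__(self, load_from_db, save_to_db):
        self._load = load_from_db
        self._save = save_to_db
        self._cache = {}

    def read(self, key):
        if key not in self._cache:               # miss: go to the database once
            self._cache[key] = self._load(key)
        return self._cache[key]

    def write(self, key, value):
        self._save(key, value)                   # database first
        self._cache[key] = value                 # then keep the cache current


# A dictionary stands in for the database; db_reads records every real read.
db = {"user:1": "Asha"}
db_reads = []


def load(key):
    db_reads.append(key)
    return db[key]


users = CachedTable(load, db.__setitem__)
users.read("user:1")
users.read("user:1")                             # served from the cache
users.write("user:1", "Asha K")                  # trapped by the caching layer
print(users.read("user:1"), db_reads)
```

Only the first read reaches the database; the write updates both the database and the cache, so later reads stay current without another query.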

If you are using Hibernate to persist your objects, the second level cache (and query cache) should be considered. They help achieve the same performance benefits using application level caching.

Website Performance: Cache HTML when possible

HTML content for a dynamic page is generated for each request made to the server. Though the page is dynamic (as it changes from time to time) there are 2 things which should be looked at:

1) What is the frequency of change? Does the content change with each request or does it remain constant for some duration (15 or 30 mins)?
2) If it remains constant for a certain time period, how many requests are made for the same content within that duration?

Based on the combination of these two, caching the content (on the server side) might be feasible.

Example:
1) If the content is constant for 15 mins and only 2-3 requests are made for the same content within 15 mins, then caching is not of much benefit.
2) If content is constant for even 5 mins, and 10 requests will be made for the same content in that time, caching will certainly be beneficial.

Caching complete HTML can be expensive and if you do not have sufficient memory (RAM) to hold this data, it might be feasible to keep this data cached on the disk as well. If disk is chosen, then caching is beneficial only if the read operation from the disk is cheaper than actually generating the content dynamically :)

When caching HTML, an appropriate cache clearing mechanism will have to be built so that stale content is never shown.
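One simple way to combine caching with a clearing mechanism is a time-to-live per page plus an explicit clear call. The sketch below keeps pages in memory; the class and method names are illustrative, and the `render` callable stands for whatever generates the page dynamically.

```python
import time


class HtmlCache:
    """Keeps rendered pages for ttl_seconds; supports explicit clearing."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._pages = {}                          # url -> (expires_at, html)

    def get_or_render(self, url, render):
        entry = self._pages.get(url)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]                       # still fresh
        html = render(url)                        # expired or never cached
        self._pages[url] = (now + self.ttl, html)
        return html

    def clear(self, url):
        """Drop a page early, for example when its content is edited."""
        self._pages.pop(url, None)
```

With a 5 minute TTL and 10 requests in that window, only the first request pays for rendering; the other 9 are served from the cache.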

Website Performance: Choose Appropriate Cache

Caching plays a key role in speeding up a web application. Before looking at what and when to cache, some consideration should be given to the appropriate cache which is suitable for your environment.

In case your application is deployed on a single server, then caching content on that server itself will suffice.

For distributed architecture (application deployed on multiple servers), distributed cache should be used.

Usually, the argument against a distributed cache is that it will involve accessing a remote machine which is expensive.

Let's look at how expensive this operation is.

Following are a few interesting numbers picked from a presentation by Jeff Dean (from Google):

Time taken to read 1 MB sequentially from memory - 250,000 ns (that's nanoseconds)
Time taken for round trip within the same data center - 500,000 ns

So, reading 1 MB from a remote server's memory should take roughly 750,000 ns (0.75 ms).

Considering that 1 second page load time is good enough, this is less than 1/1000th of the time. Thus, when we talk about web applications, reading from a remote server's memory will not degrade performance by any noticeable amount.
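The arithmetic behind this estimate is spelled out below. Like the estimate itself, it counts only the memory read and one round trip; it does not include the time needed to push 1 MB through the network link, which depends on the bandwidth between the two machines.

```python
MEMORY_READ_1MB_NS = 250_000       # sequential 1 MB read from memory
ROUND_TRIP_NS = 500_000            # round trip within the same data center
PAGE_BUDGET_NS = 1_000_000_000     # 1 second page load target

remote_read_ns = MEMORY_READ_1MB_NS + ROUND_TRIP_NS
share_of_budget = remote_read_ns / PAGE_BUDGET_NS

print(remote_read_ns / 1_000_000, "ms")
print(f"{share_of_budget:.3%} of a 1 second page load")
```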

When using a distributed cache, it's advisable to provision a bit more capacity than is required. This ensures that the failure of a single server does not overload the application.

Example: If you need 4 GB of memory and you are using 4 servers (with 1 GB cache on each), add another (5th) server with 1 GB of memory for caching. This way, when 1 server goes down, the application performance will not get impacted much.
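The same sizing rule can be written as a small calculation. The function names are hypothetical, and the sketch assumes that all cache servers have the same amount of memory.

```python
import math


def cache_servers_needed(required_gb, per_server_gb, spares=1):
    """Servers needed to hold the data, plus spare servers for failures."""
    return math.ceil(required_gb / per_server_gb) + spares


def capacity_after_failures(servers, per_server_gb, failed=1):
    """Cache capacity that remains when some servers are down."""
    return (servers - failed) * per_server_gb


servers = cache_servers_needed(required_gb=4, per_server_gb=1)
print(servers, "servers;", capacity_after_failures(servers, 1), "GB left after one failure")
```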

One of the most popular distributed cache implementations is Memcached. It's used by sites like Wikipedia, Flickr, YouTube and Twitter.

Sunday, August 01, 2010

Website Performance: Don’t Let Third Parties Slow You Down

Recently, two Googlers (Arvind Jain and Michael Kleber) gave a presentation at Velocity 2010 where they talked about how third party code can slow down a website.

Third party code (like Google ads, the Digg widget etc.) usually includes an external script. Since browsers block rendering while fetching and executing a script that is included synchronously, this third party code also blocks rendering of your page.

Analyse your page with and without this third party component to understand the impact that it has on your site.

If you have an option to choose from multiple vendors, then choose the one which has least impact on your page.

Example: The new Google Analytics code loads JavaScript asynchronously to ensure minimal impact on page loading.

Website Performance: Separate Static from Dynamic

A dynamic page is one which can potentially change with each request to the server. But in most cases, there is also some content within these pages which does not change. This content remains static even when the dynamic elements change.

Such static content within a dynamic page varies from application to application but mostly it's stuff like the header, footer, drop-down values (like city, state and country) etc.

Analyse the dynamic page from this angle and come up with a list of static elements on it. If the static elements are considerable (like 30% of the content on a page) then consider separation. There are various techniques which can be used to cache the static elements on the browser.

This approach is useful only when a similar layout is rendered repeatedly. So, if the same static content (header, footer etc.) will be shown for multiple page requests, then separating static from dynamic is feasible. If the static portion changes for each request, there is not much to gain by separation; you will only end up slowing down the existing pages.

Some of the techniques which can be used once the static and dynamic elements are separated:
1) Ajax: A popular and commonly used technique for search result pages. The ajax request is made and only the results are updated whereas the layout remains constant.
2) XSLT (XML+XSL): When a page is requested, the static elements are embedded in the XSL file and the dynamic elements are fetched as XML. The XSL can be served with cache headers so that the browser caches it. This is preferable to the Ajax approach in cases where a new page request has to be made and the user leaves the current page to get the content of a new dynamic page. Google for "Browser side XSLT" to get more details on this.
3) HTML in JavaScript as string: This is a simple approach where HTML snippets are added within JavaScript as strings so that they can be rendered (via innerHTML) wherever necessary. It is not a particularly good design, as you will need to add HTML (view) within the code (JavaScript).

Website Performance: Utilize Browser's Idle Time

Once the page loads completely, the user spends a few seconds on the page before moving to the next one. The browser is idle during this time, and that time can be used to speed up subsequent pages on the website.

For example, in the case of search results, it's highly likely that the user will move to the next page once they finish viewing the results on the current page. Thus, developers can use JavaScript to pre-fetch the content of the next page while the browser is idle. This will help the next page load much faster.

Another occasion for pre-fetching is just before launching a new version of the site.

A new version of a website usually has new static content (JavaScript, style sheets and images). When a regular user of the site opens the new version for the first time, they will find the site extremely slow. Thus, many customers initially complain about performance.

If we start fetching the static content in the background a couple of days before the launch of the new version, customers will not face the slowness and will find the performance to be much better. Of course, care has to be taken so that there is no clash in the names of classes (CSS) and functions (JS).

Website Performance: Hosted JavaScript Libraries

Most web applications these days make use of libraries like jQuery.

These libraries offer significant advantages, but increase the initial page load time to a certain extent.

Most popular libraries are hosted publicly by companies like Google and Microsoft. Instead of hosting these libraries yourself, you can include them from such common locations.

When using the common location, chances are that the browser has already cached the same URL and need not re-fetch the library for your site.

All in all, it's a win-win situation for you:
1) Saves bandwidth, as the library is no longer served from your own servers. In some cases this may save you some money.
2) Page loads faster as browsers might already have the libraries cached.

The Google Libraries API provides a wrapper around well known and widely used libraries (jQuery, Dojo, Prototype, YUI etc.). Once included, you can directly load any popular library with a simple function call (example: google.load("jquery", "1.4.2")).