Saturday, June 04, 2011

GWT for client side code

Google Web Toolkit (GWT) is a framework that lets you write client-side code in Java and compiles it into JavaScript. It also lets you write server code, but you are free to pick any server-side framework you like (RoR, Spring MVC etc.). If you choose a Java-based framework for the server side, you get the advantage of using a single language for the entire application (with the exception of CSS, of course).

For a thin client application, where only minimal code is executed by the browser, there is not much to gain from GWT, as most of the code will live on the server side. But if you are building a thick client, where a fair amount of JavaScript execution is expected, GWT is a pretty neat choice.

Some features/tools which help manage the client code better than using plain JavaScript are:

1) Gin for Dependency Injection
2) EventBus to manage interaction between various components (see the sketch after this list)
3) GWTTestCase to test views (if they have any logic). We should keep minimal logic in views, but we all know views are getting more and more complicated these days ;)
4) Widgets which you can simply plug in
5) Built-in framework for MVP development
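
To make the EventBus point concrete, here is a minimal sketch of how decoupled components can communicate. The event and presenter names (UserLoggedInEvent, HeaderPresenter) are made up for illustration; EventBus, GwtEvent and SimpleEventBus are standard GWT classes.

```java
import com.google.gwt.event.shared.EventBus;
import com.google.gwt.event.shared.EventHandler;
import com.google.gwt.event.shared.GwtEvent;
import com.google.gwt.event.shared.SimpleEventBus;

// A custom event fired when a user logs in (name is illustrative).
public class UserLoggedInEvent extends GwtEvent<UserLoggedInEvent.Handler> {

    public interface Handler extends EventHandler {
        void onUserLoggedIn(UserLoggedInEvent event);
    }

    public static final Type<Handler> TYPE = new Type<Handler>();

    @Override
    public Type<Handler> getAssociatedType() {
        return TYPE;
    }

    @Override
    protected void dispatch(Handler handler) {
        handler.onUserLoggedIn(this);
    }
}

// In another file: a presenter listening on the bus (typically injected via Gin).
class HeaderPresenter {
    HeaderPresenter(EventBus eventBus) {
        eventBus.addHandler(UserLoggedInEvent.TYPE, new UserLoggedInEvent.Handler() {
            public void onUserLoggedIn(UserLoggedInEvent event) {
                // refresh the header, show the user's name, etc.
            }
        });
    }
}

// Wiring (usually done by Gin): EventBus eventBus = new SimpleEventBus();
// Anywhere in the app, with no direct reference to the header:
// eventBus.fireEvent(new UserLoggedInEvent());
```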

There are many more reasons for choosing GWT and, depending on your requirements, GWT can be a good contender.

Tuesday, November 09, 2010

Sharing Internet Connection with Others using Ubuntu

We were recently at a client's location where internet access was restricted. We had a single USB mobile broadband device which had to be passed around periodically so that everyone could go online.

This was a pain and wasted a lot of time. The 'Create New Wireless Connection' feature of Ubuntu came to the rescue. Here's how it works:

1) Enable wireless on your laptop
2) Click on the network icon at the top-right corner
3) Select 'Create New Wireless Connection'
4) Enter the Network Name, Wireless Security (I used None) and a Key
5) Ubuntu will try to immediately connect to this local Wireless network
6) Your friends should now be able to connect to this newly created wireless network like they normally connect to any wireless network
7) Connect to the internet the way you normally do (I connected using the USB device)

That's it. With these simple steps, multiple machines can share the same internet connection.

Note: I used Ubuntu 10.10 with the default (Gnome) desktop environment.

Monday, September 20, 2010

Using RSA Software Token with Blackberry

If you need to use RSA SecurID Software Token on your Blackberry phone, follow these steps:

1) Download bb350.zip from ftp://ftp.rsasecurity.com/pub/agents/bb350.zip
2) After unzipping, open the SecurIDTokenBlackBerry350_quickstart PDF document (under bb350\doc\English).
3) Go to the section 'Use the Application Loader to Perform the Installation'. This method of installation worked for me.
4) Once the installation completes, e-mail the .sdtid (token) file provided to you by the administrator to your Blackberry. Before mailing, prefix x-RIMDevicetoken to the filename (e.g. x-RIMDevicetokenfilename.sdtid).
5) Go to the attachment on your phone (do not open the attachment) and select the "Import SecurID Token" option. Enter the password provided for the token.

That's it. When you start the RSA application, it will ask for the PIN and then display the OTP.

Thursday, September 09, 2010

Adding Volume Control Icon to the Panel - Ubuntu 10.04

After upgrading to Ubuntu 10.04, the Volume control icon disappeared from the Panel (Gnome).

To add it back, do the following:

1) Go to System > Preferences > Startup Applications
2) Click the Add button
3) Enter "/usr/bin/gnome-volume-control-applet" as the Command
4) Enter "Volume Control" (or whatever you like) as the Name
5) Click Add

That's it. The next time you start Ubuntu, the Volume Control icon will appear in the panel.

Wednesday, August 04, 2010

Website Performance: Choose Data Center Carefully

The data center that hosts a website's servers plays a significant role in the site's performance. There are a few things to keep in mind before choosing a data center:

1) Measuring distance in terms of hops is not the best approach. Latency is the true measure of speed on the internet and should be considered over physical distance (hops). Thus, the data center closest to your customers might not be the fastest. Check the latency of multiple data centers before choosing one. Latency can be tested by simply running ping/traceroute against any server already hosted in the data center (a rough probe is sketched after this list).

2) Check latency from the regions where the majority of the website's audience is located. If your site caters to a specific region (like a city or state), measure latency from that region only. If your site caters to a distributed audience (across the country or across continents), test latency from the regions where you expect the most traffic.

3) Choose an ISP-neutral data center. Data centers run by a single ISP perform really well for clients on that ISP but might not be great when connecting from other ISPs.
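
For a quick-and-dirty comparison, something like the following Java probe can be used. The hostnames are placeholders for servers hosted in each candidate data center; note that InetAddress.isReachable may fall back to a TCP echo instead of ICMP depending on privileges, so plain ping/traceroute from a shell is just as good or better.

```java
import java.net.InetAddress;

// Rough latency probe against servers in candidate data centers.
public class LatencyProbe {
    public static void main(String[] args) throws Exception {
        String[] hosts = {"server-in-dc1.example.com", "server-in-dc2.example.com"};
        for (String host : hosts) {
            InetAddress address = InetAddress.getByName(host);
            long start = System.nanoTime();
            boolean reachable = address.isReachable(2000); // 2 second timeout
            long elapsedMs = (System.nanoTime() - start) / 1000000;
            System.out.println(host + ": "
                    + (reachable ? elapsedMs + " ms" : "unreachable"));
        }
    }
}
```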

Website Performance: Reverse Proxy for Spoon Feeding

When a client with a slow internet connection requests a page, the server holds on to a thread/process until the complete response is transferred. As a result, resources are held up longer than they would be for clients on fast connections.

To handle this, a reverse proxy (like Squid, nginx etc.) can be placed in front of the web server. The web server then only needs to transfer content to the reverse proxy (which is super fast, as they are on the same network).

The reverse proxy takes the burden of transferring the response slowly and frees the web server immediately.

Website Performance: Web Server for Static Content

This approach is useful if you are not using a CDN to serve static content (like JS, CSS and images).

Servers used for the application (like Apache or Tomcat) are great for requests that require executing some code before rendering the page.

There are a bunch of lightweight web servers (like lighttpd, varnish and nginx) which are tuned to serve static content really fast.

There are plenty of benchmarking results out there. You will see clear benefits when a page with multiple static objects is served by any of these lightweight web servers.

Website Performance: Delay Processing

When a page is requested from the server, process only what is necessary to generate the required output.

Any additional processing that might be necessary can be delayed or performed asynchronously.

Tasks like sending an acknowledgement mail or logging are good candidates for delayed processing.

One tool which helps you achieve this is Gearman. Gearmand is a simple server that lets worker processes register themselves for defined functions, while clients send processing requests to the Gearmand server. The Gearmand server queues these requests and dispatches them to workers. Client and worker code can be written in different languages. For delayed processing, use the asynchronous (do_background) call.
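
If adding Gearman to the stack is overkill for your case, the same fire-and-forget pattern can be sketched in plain Java with an ExecutorService. The Mailer class below is a hypothetical stand-in for your real mail-sending code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Fire-and-forget background work, analogous to Gearman's do_background call.
public class DelayedTasks {
    private static final ExecutorService BACKGROUND = Executors.newFixedThreadPool(4);

    // Called from the request path; returns immediately while the mail is
    // sent on a worker thread.
    public static void sendAcknowledgementLater(final String email) {
        BACKGROUND.submit(new Runnable() {
            public void run() {
                Mailer.send(email, "Your order has been received.");
            }
        });
    }

    // Hypothetical stand-in for real mail-sending code.
    static class Mailer {
        static void send(String to, String body) {
            System.out.println("Sending to " + to + ": " + body);
        }
    }
}
```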

Tuesday, August 03, 2010

Website Performance: Use CDN Effectively

In most cases, we consider using a CDN (like Akamai) for static content like images, stylesheets and JavaScript, while ignoring it for HTML content.

Certain HTML might be cacheable too. For example, if an HTML page changes every hour, it remains constant for that hour. It would be great if the CDN could cache this page, so that the first request from a region lets the CDN serve it for the whole region.

The CDN is configurable for your site. Once you access the configuration, there should be options like:
1) Cache content on the edge server based on the cache headers sent by the origin (a header example follows this list), OR
2) Cache certain files/folders/URLs for X minutes/hours
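
If you go with option 1, the origin just has to emit the right headers. Assuming a Java servlet on the origin, a minimal sketch (class name and content are illustrative):

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Origin response for an HTML page that changes only once an hour.
public class HourlyPageServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Tells the CDN edge (and browsers) the page may be cached for 3600s.
        resp.setHeader("Cache-Control", "public, max-age=3600");
        resp.setContentType("text/html");
        resp.getWriter().write("<html><body>Regenerated hourly</body></html>");
    }
}
```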

It's imperative to understand and configure the CDN for optimal performance.

Website Performance: Memory as the primary storage

We normally use databases/filesystems as the primary source of storage and add caching (Memory/RAM) to improve performance of the application.

Consider the opposite approach. Use memory as the primary storage and the filesystem as a recovery source. Perform all read and write operations directly in memory, but log inserts/updates to the filesystem. The writes to the filesystem can be asynchronous (delayed inserts) and thus never become a bottleneck. A minimal sketch of this pattern follows the list below.

This is a risky proposition and should be considered only if:
1) The database operations are becoming a bottleneck and you have tried all possible optimizations. Only the problematic data sets should be considered for this approach.
2) The data is non-critical, i.e. it is acceptable for the data to be unavailable for some period of time. This period will be at least the recovery time from the filesystem.
3) Typical database constraints (unique, foreign key etc.) do not apply to the data.
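
Here is a bare-bones Java sketch of the memory-first idea: reads and writes hit an in-memory map, and every write is also queued and appended to a log file asynchronously so the data can be replayed after a crash. Class and file names are illustrative, and log rotation/recovery are omitted.

```java
import java.io.FileWriter;
import java.io.PrintWriter;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;

public class MemoryFirstStore {
    private final ConcurrentMap<String, String> data =
            new ConcurrentHashMap<String, String>();
    private final BlockingQueue<String> log = new LinkedBlockingQueue<String>();

    public MemoryFirstStore(final String logFile) {
        Thread writer = new Thread(new Runnable() {
            public void run() {
                try {
                    PrintWriter out = new PrintWriter(new FileWriter(logFile, true));
                    while (true) {
                        out.println(log.take()); // blocks until a write arrives
                        out.flush();
                    }
                } catch (Exception e) {
                    e.printStackTrace(); // recovery/rotation omitted for brevity
                }
            }
        });
        writer.setDaemon(true);
        writer.start();
    }

    public String get(String key) { return data.get(key); }

    public void put(String key, String value) {
        data.put(key, value);          // primary storage: memory
        log.offer(key + "=" + value); // asynchronous durability
    }
}
```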

Monday, August 02, 2010

Website Performance: Cache Database Query Results

Querying the database is an expensive operation and should be kept to a minimum.

Certain databases provide query caching capabilities. MySQL's query cache is great for tables used primarily for read operations. However, any insert/update on a table clears the complete query cache for that table. Thus, query caching cannot be leveraged for tables with regular insert/update operations.

Adding a caching layer above the database can help boost performance. Before passing a read request to the database, this layer checks for the appropriate content in the cache. If the content is not available in the cache, the request is forwarded to the database and the cache is populated before returning the result to the application.

The caching layer can also trap insert/update operations so that the cache stays up-to-date.
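
A bare-bones sketch of such a cache-aside layer in Java. The two private methods are hypothetical stand-ins for real JDBC/ORM calls, and a real implementation would invalidate entries more selectively than clearing everything.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Cache-aside layer sitting between the application and the database.
public class QueryCacheLayer {
    private final ConcurrentMap<String, String> cache =
            new ConcurrentHashMap<String, String>();

    public String read(String query) {
        String result = cache.get(query);
        if (result == null) {                   // cache miss
            result = runQueryOnDatabase(query); // forward to the database
            cache.put(query, result);           // populate before returning
        }
        return result;
    }

    public void write(String updateSql) {
        runUpdateOnDatabase(updateSql); // trap the write...
        cache.clear();                  // ...and drop stale entries (crude but safe)
    }

    // Hypothetical stand-ins for real JDBC/ORM calls.
    private String runQueryOnDatabase(String query) { return ""; }
    private void runUpdateOnDatabase(String sql) { }
}
```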

If you are using Hibernate to persist your objects, the second-level cache (and query cache) should be considered. They help achieve the same performance benefits through application-level caching.
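
With Hibernate this is mostly configuration plus one flag per query. A sketch, assuming the query cache is enabled (hibernate.cache.use_query_cache=true) and a second-level cache provider like Ehcache is configured; the entity and DAO are illustrative:

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.Session;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

// Entity stored in the second-level cache.
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
class Product {
    @Id Long id;
    String name;
    double price;
}

class ProductDao {
    // setCacheable(true) puts the query results into the query cache.
    List findCheapProducts(Session session) {
        return session.createQuery("from Product where price < :max")
                .setParameter("max", 100.0)
                .setCacheable(true)
                .list();
    }
}
```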

Website Performance: Cache HTML when possible

HTML content for a dynamic page is generated for each request made to the server. Though the page is dynamic (it changes from time to time), there are two things worth looking at:

1) What is the frequency of change? Does the content change with each request, or does it remain constant for some duration (15, 30 minutes?).
2) If it remains constant for a certain time period, how many requests are made for the same content within that duration?

Weighing these two factors together, caching the content (on the server side) might be feasible.

Example:
1) If the content is constant for 15 minutes and only 2-3 requests are made for the same content within those 15 minutes, then caching is not of much benefit.
2) If the content is constant for even 5 minutes, and 10 requests will be made for the same content in that time, caching will certainly be beneficial.

Caching complete HTML can be expensive, and if you do not have sufficient memory (RAM) to hold this data, it might be feasible to cache it on disk instead. If disk is chosen, caching is beneficial only if reading from the disk is cheaper than actually generating the content dynamically :)

When caching HTML, an appropriate cache clearing mechanism will have to be built so that stale content is never shown.
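
Putting these pieces together, here is a minimal sketch of a time-bounded server-side HTML cache in Java. Expiry doubles as the cache clearing mechanism, so content that is constant for ~15 minutes is generated once per window; all names are illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class HtmlCache {
    private static final long TTL_MILLIS = 15 * 60 * 1000; // 15 minutes

    private static class Entry {
        final String html;
        final long createdAt;
        Entry(String html) {
            this.html = html;
            this.createdAt = System.currentTimeMillis();
        }
        boolean expired() {
            return System.currentTimeMillis() - createdAt > TTL_MILLIS;
        }
    }

    private final ConcurrentMap<String, Entry> cache =
            new ConcurrentHashMap<String, Entry>();

    public String get(String pageKey, PageRenderer renderer) {
        Entry entry = cache.get(pageKey);
        if (entry == null || entry.expired()) {
            entry = new Entry(renderer.render(pageKey)); // regenerate once per TTL
            cache.put(pageKey, entry);
        }
        return entry.html;
    }

    // Hypothetical callback wrapping your normal page-generation code.
    public interface PageRenderer { String render(String pageKey); }
}
```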

Website Performance: Choose Appropriate Cache

Caching plays a key role in speeding up a web application. Before looking at what and when to cache, some thought should be given to which cache is suitable for your environment.

If your application is deployed on a single server, caching content on that server itself will suffice.

For a distributed architecture (application deployed on multiple servers), a distributed cache should be used.

Usually, the argument against a distributed cache is that it involves accessing a remote machine, which is expensive.

Let's look at how expensive this operation is.

Here are a few interesting numbers picked from a presentation by Jeff Dean (of Google):

Time taken to read 1 MB sequentially from memory: 250,000 ns (that's nanoseconds)
Time taken for a round trip within the same data center: 500,000 ns

So, reading 1 MB from a remote server's memory should take roughly 750,000 ns (0.75 ms).

Considering that a 1 second page load time is good enough, this is less than 1/1000th of that time. Thus, for web applications, reading from a remote server's memory will not degrade performance by any noticeable amount.

When using a distributed cache, it's advisable to provision a bit more capacity than what is required. This ensures that the failure of a single server does not overload the application.

Example: If you need 4 GB of cache memory and you are using 4 servers (with 1 GB of cache on each), add another (5th) server with 1 GB of cache. This way, when one server goes down, application performance will not be impacted much.

One of the most popular distributed cache implementations is Memcached. It's used by companies like Wikipedia, Flickr, YouTube and Twitter.
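
Assuming Java and the spymemcached client, basic usage looks roughly like this; the host and keys are placeholders.

```java
import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

// Minimal Memcached usage via the spymemcached client.
public class MemcachedExample {
    public static void main(String[] args) throws Exception {
        MemcachedClient client = new MemcachedClient(
                new InetSocketAddress("cache1.example.com", 11211));

        // Cache a rendered fragment for 15 minutes (expiry is in seconds).
        client.set("homepage:html", 15 * 60, "<html>cached page</html>");

        // Read it back; null means a miss (e.g. expired or a server went down).
        String html = (String) client.get("homepage:html");
        System.out.println(html);

        client.shutdown();
    }
}
```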