Cloud Architectures – Storage in the Cloud


Brief

Cloud technology is deployed across a wide variety of industries and applications. The term ‘Cloud’ itself has become so prevalent that we’ve devised additional terms to describe which type of cloud we’re talking about. What’s your flavor? IaaS, PaaS or SaaS? Or perhaps it’s Public, Private or Hybrid?

Regardless of the type of cloud you’re using or planning to implement, there’s no denying that storage is an essential component of every cloud architecture that simply cannot be overlooked. In this post, we will look into some of the most common usages of storage in the cloud and peel back the layers to discover exactly what makes them tick. Our goal is to come up with a yardstick to measure storage design.

Drivers towards Cloud Storage adoption


What do Dropbox and Box Inc have in common? Both companies are less than 5 years old, offer services predominantly centered around cloud storage and file sharing, and have been able to attract significant amounts of capital from investors. In fact, Dropbox raised $250 million at a $4 billion valuation, with Box Inc raising another $125 million in mid-2012. It looks like Silicon Valley sees Cloud Storage services as a key piece in the future of cloud. So why is there such tremendous interest in cloud storage? Consumers are drawn to a number of benefits of using cloud:

  • Redundancy: Large clouds incorporate redundancy at every level. Your data is stored in multiple copies on multiple hard drives on multiple servers in multiple data centers in multiple locations (you get the picture).
  • Geographical Diversity: With a global audience and a global demand for your content, we can place data physically closer to consumers by storing it at facilities in their country or region. This dramatically reduces round-trip latency, a common cause of sluggish Internet performance.
  • Performance: Storage solutions in the cloud are designed to scale up dramatically to support events that may see thousands or millions more consumers accessing content over a short period of time. Many services provide guarantees on data throughput and transfer rates.
  • Security & Privacy: Cloud storage solutions incorporate sophisticated data lifecycle management and security features that enable companies to fulfill their compliance requirements. More and more cloud providers are also providing services that are HIPAA compliant.
  • Cost: As clouds get larger, the per unit costs of storage go down, primarily due to Economies of Scale. Many service providers choose to pass on these cost savings to consumers as lower prices.
  • Flexibility: The pay as you use model takes away concerns for capacity planning and wastage of resources due to cyclical variations in usage.

It should be noted that a draft opinion paper released by the EU Data Protection Working Party, while not explicitly discouraging Cloud adoption, recommended that Public Sector agencies perform a thorough risk analysis prior to migrating to the cloud. You can read the report here.

Storage Applications for the Cloud

We’ve listed some of the most common applications for cloud storage in this section:

  • Backup: The cloud is perceived to be a viable replacement for traditional backup solutions, boasting greater redundancy and opportunities for cost savings. The Cloud backup market is hotly contested in both the consumer and enterprise markets.
    • In the consumer market, cloud backup services like Dropbox, Microsoft SkyDrive and Google Drive take part of your local hard drive and sync it up with the cloud. The trend for these pay-for-use services is on the rise, with Dropbox hosting data for in excess of 100 million users within four years of launching its service.
    • In the Enterprise Space, Gartner’s magic quadrant for enterprise backup solutions featured several pure-play Cloud backup providers including Asigra, Acronis and i365. Even leading providers such as CommVault and IBM have launched cloud-based backup solutions. Amazon’s recently launched Glacier service provides a cost-effective backup tier for around $0.01 per gigabyte per month.
  • File Sharing: File sharing services allow users to post files online and then share them with other users via a combination of Web links or Apps. Services like Mediafire, Dropbox and Box offer a basic cloud backup solution that provides collaboration and link-sharing features. On the other end of the spectrum, full-blown collaboration suites such as Microsoft’s Office 365 and Google Apps feature real-time document editing and annotation services.
  • Data Synchronization (between devices): Data synchronization providers such as Apple’s iCloud, as well as a host of applications including the productivity app Evernote, allow users to keep files, photos and even music synchronized across an array of devices (Desktop, Phone, Tablet, etc.), automatically propagating changes.
  • Content Distribution: Cloud content distribution network (CDN) services are large networks of servers distributed across datacenters over the internet. At one point or another, we’ve all used CDNs such as Akamai to enhance our Web browsing experience. Cloud providers such as the Microsoft Windows Azure Content Distribution Network (CDN) and Amazon’s CloudFront offer affordable CDN services for serving everything from static files and images to streaming media to a global audience.
  • Enterprise Content Management: Companies are gradually turning to the cloud to manage organizational compliance requirements such as eDiscovery and Search. Vendors such as HP Autonomy and EMC provide services that feature secure encryption and de-duplication of data assets as well as data lifecycle management.
  • Cloud Application Storage: The trend towards hosting applications in the cloud is driving innovations in how  we consume and utilize storage. Leading the fray are large cloud services providers such as Amazon and Microsoft who have developed cloud storage services to meet specific applications needs.
    • Application Storage Services: Products like Amazon Simple Storage Service (S3) and Microsoft Windows Azure Storage support storage in a variety of formats (blob, queue and table data) and scale to very large sizes (up to 100TB volumes). Storage services are redundant (at least 3 copies of each bit stored) and can be accessed directly over HTTP, typically via REST or other supported APIs. Storage services also support encryption on disk.
    • Performance Enhanced Storage: Performance-enhanced storage emulates storage running on a SAN. Products like Amazon Elastic Block Store provide persistent, block-level network-attached storage that can be attached to running virtual machines; in some cases VMs can even boot directly from these volumes. Users can provision these volumes with a specified level of performance in terms of IOPS.
    • Data Analytics Support: Innovative distributed file systems that support super-fast processing of data have been adapted to the cloud. For example, the Hadoop Distributed File System (HDFS) manages and replicates large blocks of data across a network of computing nodes, to facilitate the parallel processing of Big Data. The Cloud is uniquely positioned to serve this process, with the ability to provision thousands of nodes, perform compute processes on each node and then tear down the nodes rapidly, thus saving huge amounts of resources. Read how the NASA Mars Rover project used Hadoop on Amazon’s AWS cloud here.

Storage Architecture Basics

03 Generic Storage Architecture

So how do these cloud-based services run? If we were to peek under the hood, we would see a basic architecture that is pretty similar to the diagram above. All storage architectures comprise a number of layers that work together to provide users with a seamless storage service. The different layers of a cloud storage architecture are listed below:

  • Front End: This layer is exposed to end users and typically presents APIs that allow access to the storage. A number of protocols are constantly being introduced to broaden the accessibility of cloud systems, and include Web Service front ends built on REST principles, file-based front ends and even iSCSI support. So, for example, a user can use an App running on their desktop to perform basic functions such as creating folders, uploading and modifying files, as well as defining permissions and sharing data with other users. Examples of access methods and sample providers are listed below:
    • REST APIs: REST, or Representational State Transfer, is a stateless Web architecture model built upon communications between clients and servers. Examples include Microsoft Windows Azure Storage and Amazon Web Services Simple Storage Service (S3).
    • File-based Protocols: Protocols such as NFS and CIFS are supported by vendors like Nirvanix, Cleversafe and Zetta*.
  • Middleware: The middleware or Storage Logic layer supports a number of functions including data deduplication and reduction; as well as the placement and replication of data across geographical regions.
  • Back End: The back end layer is where the actual physical hardware is implemented; read and write instructions are handled in the Hardware Abstraction Layer.
  • Additional Layers: Depending on the purpose of the technology, there may be a number of additional layers:
    • Management Layer: This may support scripting and reporting capabilities to enhance automation and provisioning of storage.
    • Backup Layer: The cloud back end layer can be exposed directly to API calls from Snapshot and Backup services. For example, Amazon’s Elastic Block Store (EBS) service supports an incremental snapshot feature.
    • DR (Virtualization) Layer: DR service providers can attach storage to a virtual hypervisor, enabling cloud storage data to be accessed by virtual hosts that are activated in a DR scenario. For example, the i365 cloud storage service automates the process of converting backups of server snapshots into a virtual DR environment in minutes.
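
To make the front end layer concrete, the sketch below shows how a REST front end maps basic storage operations onto plain HTTP verbs. The bucket and object names are hypothetical placeholders; a real request to S3 or Azure Storage would additionally carry signed authentication headers.

```python
from urllib.request import Request

# Hypothetical object URL; real services also require signed auth headers.
url = "https://my-bucket.s3.amazonaws.com/reports/summary.txt"

get_req = Request(url)                                # GET: download the object
put_req = Request(url, data=b"hello", method="PUT")   # PUT: upload/overwrite it
del_req = Request(url, method="DELETE")               # DELETE: remove it

print(get_req.get_method(), put_req.get_method(), del_req.get_method())
```

Because the protocol is stateless, each request carries everything the server needs, which is what allows these front ends to scale out across many servers.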

Conclusion:

This brief post provided a simple snapshot of cloud storage, its underlying architecture and a number of common applications for storage in the cloud. If you’d like to read more, please visit some of the links provided below.

Roadchimp, signing out! Ook!

Reference:

* Research Paper on Cloud Storage Architectures here.
Read a Techcrunch article on the growth of Dropbox here.
Informationweek Article on Online Backup vs. Cloud Backup here.
Read more about IBM Cloud backup solutions here.
Read about Commvault Simpana cloud backup solutions.

Technology in Government -Big Data

Executive Brief

In this article, we continue our series on technology in government by reviewing Big Data. We plan to review the impact of Big Data on government and common applications of technologies to manage this issue. First of all, let’s look at some basic definitions and define the scope of this article.

What is big data?

While Roger Magoulas of O’Reilly Media is most commonly credited with coining the term “Big Data” back in 2005 and launching it into mainstream consciousness, the term has been floating around for a number of years (researchers found examples dating from the mid-1990s in Silicon Graphics (SGI) slide decks). Nevertheless, Big Data basically refers to data sets that are so large that their size becomes an encumbrance when trying to manage and process the data using traditional data management tools.

According to IBM, we create 2.5 quintillion bytes of data each day. Big Data is commonly described by three characteristics:

  • Volume: Big Data refers to large amounts of data generated across a variety of applications and industries. At the time of this article, data sets ranging from hundreds of gigabytes to terabytes and petabytes could easily qualify under the definition.
  • Variety: With a wide and disparate number of sources of Big Data, the data can be structured (like a database), semi-structured (indexed) or unstructured.
  • Velocity: The data is generated at high speeds, and needs to be processed in relatively short durations (seconds).
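
A quick back-of-the-envelope calculation (decimal units assumed, using IBM’s figure above) shows why Volume and Velocity overwhelm single-machine tools:

```python
# Scale check for IBM's 2.5 quintillion bytes/day figure.
daily_bytes = 2.5e18

petabytes_per_day = daily_bytes / 1e15          # 1 PB = 10^15 bytes
terabytes_per_second = daily_bytes / 1e12 / 86400  # 86400 seconds per day

print(round(petabytes_per_day), round(terabytes_per_second, 1))
```

That works out to roughly 2,500 petabytes a day, or about 29 terabytes arriving every second worldwide.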

Why is big data important?

Big data conveys an important shift in how we interpret data to look for meaningful things in the world. The advent of Social Networking and E-commerce brought about a need for suppliers of rapidly commoditizing online services to learn about the behavior of online users in order to tailor a superior user experience. Some of the most successful companies in the world (hint: it starts with the letter ‘G’) have based their entire business models on delivering customized ads to users based on their search queries. Prominent research projects such as NASA’s SETI (Search for Extra-terrestrial Intelligence) and Mars Rover projects, and the Human Genome sequencing program, also called for similar needs:

The ability to perform lightning speed computational processes on extremely large sets of data that were also subject to frequent changes.

The challenges of traditional data management tools

The problem with conventional approaches to managing data is that the data primarily has to be structured. Picture a database that supports the catalog of a conventional online e-commerce website and holds hundreds of thousands of items. The database is structured and relational, meaning that each item put up for sale on the site can be stored as an object and described by a number of attributes, including the name of the item, the item’s SKU number, category, price, description, etc. For each item that we load into the database, we can perform searches according to product categories and descriptions and even sort the products by price. This is great and also efficient, because almost every object in the database will have the same types of attributes. Relational database technologies such as MySQL, Oracle and SQL Server are great at handling this and are still very much in use today.

The problem we encounter when it comes to handling Big Data is that the data is subject to frequent change. With a Relational system, we need to define a structure or schema ahead of time. That’s not a big problem with an Online Shopping Cart database, since most items have the same attributes as described above. But what if we don’t know the types of attributes of the data we’re planning to store? Let’s imagine that we have a service that crawls the Web for Real Estate websites in a particular region. The objective is to build up an aggregated repository of information about properties for sale or rent that users can query.  Very frequently, the data that is being collected can be in a variety of sizes and types. For example, we could have HTML files, media files (JPEGs and MPEGs) or even strings of characters. In some cases it may be impossible to build a structure ahead of time, because we simply don’t know what’s out there.

So what happens each time we need to change the structure of a relational database? Rolling out schema changes for a database is a potentially complex, time- and resource-intensive process, and has a definite performance impact on the database during the change. Conventional solutions such as adding more computing resources or splitting up the database into shards are feasible, but do not fundamentally change how the data is being managed.
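
The rigidity described above can be demonstrated in a few lines against SQLite (used here purely as a stand-in relational store): every attribute must be declared in the schema up front, and introducing a new attribute later means an ALTER TABLE that applies to the whole table.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # throwaway in-memory database

# The schema must be declared before any data is stored.
con.execute("CREATE TABLE items (name TEXT, sku TEXT, category TEXT, price REAL)")
con.execute("INSERT INTO items VALUES ('Desk Lamp', 'SKU-100', 'Lighting', 24.99)")

# Discovering a new attribute after the fact forces a schema change
# that affects every row, not just the new ones.
con.execute("ALTER TABLE items ADD COLUMN description TEXT")

cols = [row[1] for row in con.execute("PRAGMA table_info(items)")]
print(cols)
```

On a production database with millions of rows, that single ALTER TABLE is exactly the complex, performance-impacting change described above; schema-less stores sidestep it by letting each record carry its own attributes.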

Solution: Big Data Technology

In the previous section, we explored the need for corporations and organizations to manage increasingly large amounts of data, as well as the ineffectiveness of existing database management systems in dealing with these large data sets. In this section, we will briefly cover the most commonly deployed solutions in the industry for Big Data management.

Hadoop: Some industry executives have likened Hadoop to the brand “Kleenex”, meaning that Hadoop has become synonymous with Big Data. Hadoop was largely developed at Yahoo and named after the toy elephant of a researcher’s son. Hadoop’s mechanism and components are described briefly below:

  1. Distributed Cluster Architecture: Hadoop comprises a collection of nodes (a Master plus Workers). The Master node is responsible for assigning and coordinating tasks via a JobTracker role. Hadoop has two basic layers:
    1. The HDFS layer: The Hadoop Distributed File System maintains consistency of data distributed across a large number of data nodes. Large files are distributed across the cluster and managed via a metadata server known as the NameNode. Each DataNode serves up data over the network using a block protocol specific to HDFS. HDFS maintains a number of High Availability features, including replication and rebalancing of data across nodes. A major advantage of HDFS is location awareness, where computational processes are scheduled to run on nodes close to the data they operate on, thereby reducing network traffic.
    2. The MapReduce layer: The processing logic of MapReduce consists of the Map function and the Reduce function. The Map function applies a transformation to each item in an input list and emits intermediate key-value pairs (e.g. (word, 1)). The Reduce function then aggregates the values emitted for each key (for example, summing the counts).
    3. Additional Components: Hadoop is commonly implemented with a number of additional services. We’re listing the most common components here:
      1. Pig: Pig is a scripting language for creating MapReduce queries.
      2. Hive: Hive is a data warehouse infrastructure that provides an SQL-like query language (HiveQL) on top of Hadoop.
      3. Sqoop: Sqoop is a relational database connector combined with data analysis tools that allows connectivity into a company’s Business Intelligence layer.
      4. Scheduling: Scheduling tools such as Facebook’s Fair Scheduler and Yahoo’s Capacity Scheduler allow users to prioritize jobs and implement some degree of Quality of Service.
      5. Other tools: A number of other tools are available for managing Hadoop and include HCatalog, a table management service for access to Hadoop data and the Ambari monitoring and management console.
  2. Batch Processing: Hadoop fundamentally uses a batch processing system to manage data. Processing is typically divided up into the following steps:
    1. Data is divvied up into small units and distributed across a cluster
    2. Each data node receives a subset of data and applies map and reduce functions to locally stored data/cloud storage
    3. Jobtracker coordinates jobs across the cluster
    4. Data may be processed in a workflow where outputs of one map/reduce pair become inputs for the next
    5. Data results may be applied to additional analysis/ reporting or BI tools
  3. Hadoop Distributions: Hadoop was originally developed as an Apache project and has very recently (circa October 2012) been released by Microsoft as Microsoft HDInsight Server for Windows and the Windows Azure HDInsight Service for the cloud. Other large-vendor support for Hadoop includes the Oracle Big Data Appliance, which integrates with Cloudera’s distribution of Apache Hadoop; Amazon’s AWS Elastic MapReduce service for the cloud; and Google’s AppEngine-MapReduce on Google App Engine.
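
The Map and Reduce steps described above can be emulated in a few lines of ordinary Python. This word-count sketch mirrors the batch flow: the framework’s shuffle step is simulated here by grouping intermediate pairs by key.

```python
from collections import defaultdict
from itertools import chain

def map_fn(line):
    # Map: emit an intermediate (word, 1) pair for every word in the split
    return [(word, 1) for word in line.split()]

def reduce_fn(key, values):
    # Reduce: aggregate all values emitted under the same key
    return key, sum(values)

lines = ["the quick brown fox", "the lazy dog", "the fox"]

# Shuffle: group intermediate pairs by key (Hadoop does this between phases)
groups = defaultdict(list)
for key, value in chain.from_iterable(map_fn(line) for line in lines):
    groups[key].append(value)

counts = dict(reduce_fn(k, v) for k, v in groups.items())
print(counts["the"], counts["fox"])  # → 3 2
```

In a real cluster the map calls run in parallel on the DataNodes holding each split, and only the small intermediate pairs cross the network.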

Latest Trends in Government

Now that we’ve covered some basics on Big Data, we are ready to explore common implementations in the government sector around the world. Large governments have led the charge for Big Data implementations, with in excess of 160 large Big Data programmes being pursued by the US Government alone.

  • Search Engine Analytics: A pressing need to search vast amounts of data made publicly available by recent policy changes has seen a great practical application for Hadoop and Hive. For example, the UK government uses Hadoop to pre-populate relevant and possible search terms when a user types into a search box.
  • Digitization Programs: The cost implications of ‘going digital’ are large, and regulators are taking notice, with some estimates that online transactions can be 20 times cheaper than by phone, 30 times cheaper than face-to-face, and up to 50 times cheaper than by post (link). For example, the UK government stated in its November 2012 Government Digital Strategy that it can save up to £1.2 billion by the year 2015 just by making public services digital by default. A number of large government bodies have been tasked with identifying large-volume transactions (>100,000 a year) that can be digitized. Successful digitization requires a number of key moves:
    • Non-exclusive policies: Bodies or groups that do not have the capabilities to go digital must not be penalized. This means that the choice to go digital should be open. Users who are not familiar with accessing digital information should also be given alternative mechanisms such as contact centers.
    • Consolidation of processes: A number of governments are moving closer towards a single consolidated online presence. For example, the U.K. government is consolidating all publishing activities across all 24 UK central government websites into the GOV.UK website. Consolidating information without incurring performance penalties requires standardization on common platforms and technologies.
  • Large Agency initiatives: The largest agencies and ministries are spearheading programs on Big Data, with applications in Health, Defense, Energy and Meteorology taking on significant interests:
    • Health Services: The US Centers for Medicare and Medicaid Services (CMS) is developing a data warehouse based on Hadoop to support the analytic and reporting requirements of Medicare and Medicaid programs. The US National Institutes of Health (NIH) is developing the Cancer Imaging Archive, an image data-sharing service that leverages imaging technology used in the assessment of therapeutic responses to treatment.
    • Defense: The US Department of Defense listed 9 major projects in a March 2012 White House paper on the adoption of Big Data analysis across the government. Major applications involved Artificial Intelligence, Machine Learning, Image and Video recognition and Anomaly detection.
    • Energy: The US Department of Energy is investing in research on its Next Generation Networking program to move large datasets (>1 petabyte per month) for the Open Science Grid, ESG and Biology communities.
    • Meteorology: The US National Weather Service uses Big Data in its modeling systems to improve Tornado forecasting. Modern weather prediction systems utilize vast amounts of data collected from ground sources and satellites (a new geostationary satellite is planned for launch in 2014), and because weather conditions are constantly changing, rapid processing of this high-velocity data is paramount to these systems.

Strategic Value

Big Data is transformative in the sense that it provides us with an opportunity to perform deep, meaningful analysis of information beyond what is normally available. The idea is that with more information at our fingertips, we can make better decisions.

Positive Implications

  • Greater Transparency: Big Data provides the opportunity for greater transparency by making data more frequently accessible to greater constituencies of people.
  • More opportunities for enhancing performance: By providing users with access to not only greater amounts of data, but also greater varieties of data, we create more opportunities to identify patterns and trends by connecting information from more sources, leading us to capitalize on opportunities and expose threats. This results in an overall enhanced quality of decision making that could potentially lead to greater performance.
  • Better Decisions: By allowing systems to collect more data and then applying Big Data analysis techniques to draw meaningful information from these data sets, we can make better, more timely and informed decisions.
  • Greater segmentation of stakeholders: By exposing our analytics to greater pools of raw data, we can find interesting ways to segment our constituents, identifying unique patterns at a more granular level and devising solutions and services to meet these needs. For example, we can use Big Data to identify elderly residents of a particular part of a city who live alone and have a unique medical condition requiring specialist care, and use this information to manage staffing and service availability for these users.

Negative Implications

  • Big Brother: Governments are sensitive to the perception of using data to investigate and monitor the individual, and the storing and analysis of data by government has long provoked a strong reaction in the public eye. However, the enactment of information transparency legislation and freedom of information policies, together with the formation of public watchdog sites, has led to a more encouraging environment for governments to pursue Big Data.
  • Implementation Hurdles: Implementing Big Data requires a holistic effort beyond adopting a new technology. Everything from effectively identifying data that can be combined and analyzed to securely managing the data over its lifetime must be carefully handled.

Where to Start?

We’ve distilled a number of important lessons from around the web that could guide your Big Data implementation:

  • Focus first on requirements: Decision makers are encouraged to look for the low-hanging fruit, in other words, situations that have a pressing need for Big Data solutions. Big Data is not a silver bullet, and target implementations should be evaluated thoroughly.
  • Start small: Care should be taken to manage stakeholder expectations before Big Data takes on the image of a large disruptive technology in the workplace. Focusing on small pilot projects that show tangible and visible benefits is the best way to go, and such projects often pave the way for much larger ones down the line. Often, extending the pipeline for Big Data projects allows technology stakeholders time to get over the learning curve of adoption.
  • Reuse infrastructure: Big Data technologies can happily coexist on conventional infrastructure. In fact, Big Data implementations can run alongside Relational Database Systems in existing IT environments.
  • Obtain high-level support: Big Data sees the greatest benefits in terms of performance and cost savings when combining different systems. But with this type of endeavor comes greater complexity and risks from differing priorities. Managing this challenge requires the appointment of senior stakeholders who can align priorities and provide the necessary visibility for forward movement.
  • Push for standardization and educate decision makers: The Policy Exchange, a UK think tank recommends that “… public sector leaders and policymakers are literate in the scientific method and confident combining big data with sound judgment.”
  • Address Ethical Issues first: A major obstacle to adopting Big Data is pressure from groups of individuals who do not wish to be tracked, monitored or singled out. Governments should tackle this issue head on by developing a code for responsible analytics.

Useful Links

Information week article on Microsoft’s Big Data strategy here.
UPenn Research Paper > Development of Big Data here.
Research Trends Report on the evolution of Big Data as a Research topic here.
Cloudera whitepapers on Government Implementations here.
Article on Big Data’s success in Government here.
Article: UK govt. in talks to use Hadoop here.
Paper: UK Government Digital Strategy here.
Paper: US Federal Government Big Data Strategy here.
Article: Big data in government here.
Article: National Weather Service using Big Data here.
Research: Mckinsey Global Institute paper on Big Data here.
Report: Policy Exchange Report on Big Data here.

 

Cloud Architecture – Implementing Nginx in AWS

Brief:

In this post, we will be deploying Nginx as an AMI instance in Amazon’s Elastic Compute Cloud (EC2). This post will document our steps to configure and optimize Nginx for serving static pages to a global audience of users.

Planning and Preparation:

Nginx is a powerful web server that can be deployed in combination with other services such as FastCGI or Apache backends to provide scalable and efficient web infrastructures. AWS makes it extremely fast and convenient to implement your own Nginx instances. You can choose to deploy the Linux variant of your choice and then deploy Nginx on your OS, or you can choose to deploy the Nginx AMI Appliance developed by Nginx Inc., available in the Amazon Marketplace for an additional licensing fee.


A few factors to consider when making this decision:

  • Customization: For those who are comfortable with Linux and interested in tweaking operating system parameters to get the best performance, a custom implementation would make the most sense, as it gives you a lot of control.
  • Cost: The key to running an effective cloud services infrastructure is to ensure that you have a tight cloud implementation that wastes few resources, with the help of cloud features such as auto scaling. That being said, the Nginx AMI has a license fee (US $0.13/hr for an m1.medium instance at the time of writing) that should be factored into your total cost of ownership.

* Prior benchmark testing on loads of up to 30,000 transactions per second revealed that the performance differences between a custom-built AMI and the Nginx Marketplace AMI were insignificant.

Installation:

For this post, we will implement the Nginx AMI, which can be provisioned either from the AWS Web Management console or scripted and launched via Amazon’s EC2 tools. We first need to visit the Nginx AMI appliance page in the Amazon Marketplace to accept the terms and conditions.


After the installation is complete, we can verify that the host is set up via the following steps:

  1. Open a browser and connect to the Public DNS name of the EC2 host. We should be able to view the default welcome page.
  2. We can also SSH into the host and run the following command to verify the status of the host. The default user is ec2-user
    /etc/init.d/nginx status
    We should get a response like "nginx (pid  1169) is running..."
  3. Next, we check for available updates and run sudo yum update to apply them
  4. We can view the default configuration files of Nginx here:
    /etc/nginx/conf.d/default.conf
    /etc/nginx/nginx.conf
  5. We should configure nginx to automatically start at reboot
    chkconfig nginx on
  6. Some basic commands:
    1. Start the nginx service: service nginx start
    2. Reload the nginx configuration: sudo nginx -s reload

Alternatively, I like to use Chef to automate installations of Nginx. This automated deployment is currently supported on the Ubuntu 10.04 and 12.04 and CentOS 5.8 and 6.3 operating systems. This support page gives you more information regarding the implementation.

Component Configuration:

Our next task is to configure Nginx and all the necessary components we require to serve up our web content. For this post, we will be using Nginx to host some static web content.

  1. Configure Remote Access
    1. I provisioned a Security group in AWS known as Webserver, enabling SSH, HTTP and HTTPS traffic from all IPs. Depending on how you choose to secure your deployment, you may choose to deploy a single management host with SSH access and then only enable SSH from that host into the Web server.
  2. HTTP Server Components
    1. We should make sure that we install Nginx with the latest components.
    2. Nginx should be configured with only the required components in order to minimize its memory footprint. We can run the following command:
      ./configure --prefix=/webserver/nginx --without-mail_pop3_module --without-mail_imap_module  --without-mail_smtp_module --with-http_ssl_module  --with-http_stub_status_module  --with-http_gzip_static_module
    3. The Nginx AMI is automatically configured at startup to serve a default index.html page from the location /usr/share/nginx/html
    4. To configure additional components, you can run the nginx-setup command. You will be asked to select which components to install, after which the script will install all prerequisite packages.
    5. After completion, the web application will be installed in the following default location /var/www/default
  3. Load Static content
    1. Our Static content is stored on an EBS volume snapshot, which we can access as follows:
      1. Attach and mount the volume in EC2 console
      2. Run fdisk -l to identify which device is our EBS volume (in this case /dev/xvdf1)
      3. Create a new directory for this EBS volume sudo mkdir /mnt/ec2snap
      4. Set permissions to access this directory sudo chmod 0777 /mnt/ec2snap
      5. Mount the device into this folder sudo mount /dev/xvdf1 /mnt/ec2snap -t ntfs
    2. Our static content can now be copied to our default folder location
      1. First, we rename the default files created at the time of installation
        sudo mv /usr/share/nginx/html/index.html /usr/share/nginx/html/index_old.html
      2. Then we perform the copy and set necessary permissions on the file
        cp /mnt/ec2snap/html/index.html /usr/share/nginx/html
        sudo chmod 644 /usr/share/nginx/html/index.html
      3. Now we should test our web server to ensure that our configuration is working, by browsing to our public web server online.
      4. Lastly, we should also un-mount the EBS volume
        sudo umount /mnt/ec2snap
  4. Backup snapshot
    1. At this point, it’s wise to quickly run a snapshot before delving into the configuration files. (Command Ref.)
      ec2-create-snapshot --aws-access-key <YOUR_ACCESS_KEY> --aws-secret-key <YOUR_SECRET_KEY> --region us-west-1 vol-f41285d5 -d "backup-Nginx-$(date +"%Y%m%d")"
      06 Snapshot

Nginx Optimization:

Nginx allows administrators to perform a considerable number of tweaks to optimize performance based on our underlying system resources. We’ve listed a number of basic tweaks here. Make sure that you thoroughly test these settings before deploying into a production environment.

  1. CPU and Memory Utilization – Nginx is already very efficient with how it utilizes CPU and Memory. However, we can tweak several parameters based on the  type of workload that we plan to serve. As we are primarily serving static files, we expect our workload profile to be less CPU intensive and more disk-process oriented.
    1. Worker_processes – We can configure the number of single-threaded Worker processes to be 1.5 to 2 x the number of CPU cores to take advantage of Disk bandwidth (IOPs).
    2. Worker_connections – We can define how many connections each worker can handle. We can start with a value of 1024 and tweak our figures based on results for optimal performance. The ulimit -n command gives us the numerical figure that we can use to define the number of worker_connections.
    3. SSL Processing – SSL processing in Nginx is fairly processor hungry and if your site serves pages via SSL, then you need to evaluate the Worker_process/CPU ratios. You can also turn off Diffie-Hellman cryptography and move to a quicker cipher if you’re not subject to PCI standards. (Examples: ssl_ciphers RC4:HIGH:!aNULL:!MD5:!kEDH;)
  2. Disk Performance – To minimize IO bottlenecks on the Disk subsystem, we can tweak Nginx to minimize disk writes and ensure that Nginx does not resort to on-disk files due to memory limitations.
    1. Buffer Sizes – Buffer sizes define how much request data Nginx can hold in memory. A buffer size that is too low will force Nginx to write upstream responses to temporary files on disk, which introduces additional latency due to disk read/write I/O response times.
      1. client_body_buffer_size: Specifies the client request body buffer size, used to handle POST data. If the request body is larger than the buffer, the entire body or part of it is written to a temporary file.
      2. client_header_buffer_size: Sets the buffer size for reading the request header from the client. For the overwhelming majority of requests, a buffer size of 1K is completely sufficient.
      3. client_max_body_size: Assigns the maximum accepted body size of a client request, as indicated by the Content-Length header of the request. If the size exceeds this value, the client receives the “Request Entity Too Large” (413) error.
      4. large_client_header_buffers: Assigns the maximum number and size of buffers used to read large request headers. The request line cannot be larger than a single buffer, or nginx returns the “Request URI too large” (414) error. No header line can exceed the size of one buffer either, or the client receives the “Bad request” (400) error. These parameters can be configured as follows:
        client_body_buffer_size 8K;
        client_header_buffer_size 1k;
        client_max_body_size 2m;
        large_client_header_buffers 2 1k;
    2. Access/Error Logging – Access logs record every request for a file and can quickly consume valuable disk I/O. Error logging should not be set too low unless it is our intention to capture every single HTTP error; a warn level of logging is sufficient for most production environments. We can also configure logs to buffer data in chunks, with buffer sizes of 8 KB, 32 KB or 128 KB.
    3. Open File Cache – The open file cache directive caches open file descriptors, along with file metadata such as location and size.
    4. OS File Caching – We can define parameters around the size of the cache used by the underlying server OS to cache frequently accessed disk sectors. Caching the web server content will reduce or even eliminate disk I/O.
  3. Network I/O and latency – There are several parameters that we can tweak in order to optimize how efficiently the server manages a given amount of network bandwidth under peak loads.
    1. Time outs – Timeouts determine how long the server maintains a connection and should be configured optimally to conserve resources on the server.
      1. client_body_timeout: Sets the read timeout for the request body from the client. The timeout applies only if the body is not received in one read step. If the client sends nothing within this time, nginx returns the “Request time out” (408) error.
      2. client_header_timeout: Sets the timeout for reading the request header from the client. The timeout applies only if the header is not received in one read step. If the client sends nothing within this time, nginx returns the “Request time out” (408) error.
      3. keepalive_timeout: The first parameter assigns the timeout for keep-alive connections with the client. The server will close connections after this time. The optional second parameter assigns the time value in the header Keep-Alive: timeout=time of the response. This header can convince some browsers to close the connection, so that the server does not have to. Without this parameter, nginx does not send a Keep-Alive header (though this is not what makes a connection “keep-alive”).  The author of Nginx claims that 10,000 idle connections will use only 2.5 MB of memory
      4. send_timeout: Assigns the response timeout to the client. The timeout applies not to the entire transfer of the response, but only to the interval between two read operations; if the client takes nothing within this time, nginx shuts down the connection. These parameters can be configured as follows:

        client_body_timeout   10;
        client_header_timeout 10;
        keepalive_timeout     15;
        send_timeout          10;
    2. Data compression – We can use Gzip to compress our static data, reducing the size of the TCP packet payloads that need to traverse the web to reach the client. Serving pre-compressed files also reduces CPU load for large files, since responses are not compressed on the fly. The Nginx HTTP Gzip Static module should be used with the following parameters:
      gzip on;
      gzip_static on;
    3. TCP Session parameters – The tcp_* parameters of Nginx:
      1. TCP Maximum Segment Lifetime (MSL) – The MSL defines how long the server should wait for stray packets after closing a connection and this value is set to 60 by default on a Linux server.
    4. Increase System Limits – Specific parameters such as the number of open file parameters and the number of available ports to serve connections can be increased.
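Pulling these tweaks together, a minimal nginx.conf sketch might look like the following. The values shown are illustrative starting points drawn from the figures above, not tuned production settings, and the log paths are assumptions for a default install; test thoroughly before deploying:

```nginx
# Illustrative starting point only -- verify against your own workload
worker_processes  4;               # roughly 1.5-2x the number of CPU cores

events {
    worker_connections  1024;      # tune upwards based on `ulimit -n`
}

http {
    # Buffer sizes: keep requests in memory and avoid temporary files on disk
    client_body_buffer_size     8k;
    client_header_buffer_size   1k;
    client_max_body_size        2m;
    large_client_header_buffers 2 1k;

    # Timeouts: conserve server resources under peak load
    client_body_timeout   10;
    client_header_timeout 10;
    keepalive_timeout     15;
    send_timeout          10;

    # Compression of static content
    gzip        on;
    gzip_static on;                # requires the http_gzip_static_module

    # Buffer access-log writes in chunks; log errors at warn level
    access_log  /var/log/nginx/access.log combined buffer=32k;
    error_log   /var/log/nginx/error.log warn;
}
```

After editing, running nginx -t validates the syntax before the service is reloaded.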

Nginx configuration file:

Prior to rolling out any changes into production, it’s a good idea to first test our configuration files.

  1. We can run the command nginx -t to test our config file. We should make sure that we receive an ‘OK’ result before restarting the service
    09 Test config

Conclusion:

In this post, we revealed how easy it is to set up an Nginx server on Amazon AWS. The reference links below provide a wealth of additional information on how to deploy Nginx under varying scenarios, so please give them a read. Ook!

Road Chimp, signing off.

Reference:

http://nginx.org/en/docs/howto_setup_development_environment_on_ec2.html
http://www.lifelinux.com/how-to-optimize-nginx-for-maximum-performance/
http://blog.martinfjordvald.com/2011/04/optimizing-nginx-for-high-traffic-loads/
https://calomel.org/nginx.html
http://nginxcp.com/forums/Forum-help-and-support
http://wiki.nginx.org/Pitfalls
Configure start-stop script in Nginx
Configuring PHP-FPM on an Nginx AMI
http://theglassicon.com/computing/web-servers/install-nginx-amazon-linux-ami
Custom CentOS: http://www.idevelopment.info/data/AWS/AWS_Tips/AWS_Management/AWS_10.shtml
Chef Configuration for Nginx
Github Cookbooks for Nginx in Chef
http://kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en
http://www.kegel.com/c10k.html
http://forum.directadmin.com/showthread.php?p=137288
https://engineering.gosquared.com/optimising-nginx-node-js-and-networking-for-heavy-workloads
Configure Larger System Open File Limits

Technology in Government – Cloud Computing

Executive Brief

A number of governments have implemented roadmaps and strategies that ultimately require their ministries, departments and agencies to default to Cloud computing solutions first when evaluating IT implementations. In this article, we evaluate the adoption of cloud computing in government and discuss some of the positive and negative implications of moving government IT onto the cloud.

Latest Trends

In this section, we look at a number of cloud initiatives that have been gaining traction in the public sector:

  • Office Productivity Services – The New Zealand Government has identified office productivity services as the first set of cloud-based services to be deployed across government agencies. Considered to be low hanging fruit and fueled by successes in migrating perimeter services like anti-spam onto the cloud, many organizations see email and collaboration as a natural next step of cloud adoption. Vendors leading the charge include Microsoft’s Office 365 for Government, with successful deployments including Federal Agencies like the USDA, Veterans Affairs, FAA and the EPA as well as the Cities of Chicago, New York and Shanghai. Other vendor solutions include Google Apps for Government which supports the US Department of the Interior.
  • Government Cloud Marketplaces – A number of governments have signified the need to establish cloud marketplaces, where a federated marketplace of cloud service providers can support a broad range of users and partner organizations. The UK  government called for the development of a government-wide Appstore, as did the New Zealand Government in a separate cabinet paper on cloud computing in August 2012. The US government has plans to establish a number of cloud services marketplaces, including the GSA’s info.apps.gov and the DOE’s YOURcloud, a secure cloud services brokerage built on Amazon’s EC2 offering. (link) The image below lists the initial design for the UK government App store.
    03 UK App Store
  • Making Data publicly available – The UK Government is readily exploiting opportunities to make available the terabytes of public data that can be used to develop useful applications. A recent example is the release of Met Office UK weather information to the public via Microsoft Azure’s cloud hosting platform. (link)
  • Government Security Certification – A 2012 Government Cloud Survey conducted by KPMG listed security as governments’ greatest concern when it comes to cloud adoption, and noted that governments are taking measures to manage security concerns. For example, the US General Services Administration subjects each successful cloud vendor to a battery of tests that include an assessment of access controls.

01a Canada Mappings

Canadian Government Cloud Architectural Components

Strategic Value

The strategic value of cloud computing can be summed up into a number of key elements in government. We’ve listed a few that appear on the top of our list:

  • Enhancing agility of government – Cited as a significant factor in cloud adoption, cloud computing promises rapid provisioning and elasticity of resources, reducing turnaround times on projects.
  • Supporting government policies for the environment – Reduced data center footprints and lower energy consumption for cooling bring tangible environmental benefits in terms of reduced greenhouse gas emissions and potential reductions in allocations of carbon credits.
  • Enhancing Transparency of government – Cloud enables the development of initiatives that can make government records accessible to the public, opening up tremendous opportunities for innovation and advancement.
  • Efficient utilization of resources – By adopting a pay-for-use approach towards computing, stakeholders are encouraged to architect their applications to be more cost effective. This means that unused resources are freed up to the common pool of computing resources.
  • Reduction in spending – Our research indicates that technology decision makers do not consider this element a significant driver of the move to cloud computing; however, some of the cost-saving figures being bandied about are significant (billions of dollars) and can appeal to any constituency.

Positive Implications

We’ve listed a number of positive points towards cloud adoption. These may not be relevant in every use case, but worthwhile for a quick read:

  • Resource Pooling – leads to enhanced efficiency, reduced energy consumption and cost savings from economies of scale
  • Scalability – Unconstrained capacity allows for more agile enterprises that are scalable, flexible and responsive to change
  • Reallocation of human resources – Freed up IT resources can focus on R&D, designing new solutions that are optimized in cloud environments and decoupling applications from existing infrastructures.
  • Cost containment – Cloud computing requires the adoption of a ‘you pay for what you use’ model, which encourages thrift and efficiency. The transfer of CAPEX to OPEX also smoothes out cash-flow concerns  in an environment of tight budgets.
  • Reduce duplication and encourage re-use – Services designed to meet interoperability standards can be advertised in a cloud marketplace and become building blocks that can be used by different departments to construct applications
  • Availability – Cloud architecture is designed to be independent of the underlying hardware infrastructure and promotes scalability and availability paradigms such as homogeneity and decoupling
  • Resiliency – The failure of one node of a cloud computing environment has no overall effect on information availability

Negative Implications

A sound study should also include a review of the negative implications of cloud computing:

  • Bureaucratic hindrances – when transitioning from legacy systems, data migration and change management can slow down the “on demand” adoption of cloud computing.
  • Cloud Gaps – Applications and services that have specific requirements which are unable to be met by the cloud need to be planned for to ensure that they do not become obsolete.
  • Risks of confidentiality – Isolation has been a long-practiced strategy for securing disparate networks. If you’re not connected to a network, there’s no risk of threats getting in. A common cloud infrastructure runs the risk of exploitation that can be pervasive since all applications and tenants are connected via a common underlying infrastructure.
  • Cost savings do not materialize – The cloud is not a silver bullet for cost savings. We need to develop cloud-aligned approaches towards IT provisioning, operations and management. Applications need to be decoupled and re-architected for the cloud. Common services should be used in order to exploit economies of scale; applications and their underlying systems need to be tweaked and optimized.

05 Cloud Security concerns

Security was cited as a major concern (KPMG)

Where to start?

There is considerable research that indicates government adoption of cloud computing will accelerate in coming years. But to walk the fine line of success, what steps can be taken? We’ve distilled a number of best practices into the following list:

00 USG Roadmap

  1. Develop Roadmaps:  Before Cloud Computing can reap all of the benefits that it has to offer, governments must first move along a continuum towards adoption. For that very purpose, a number of governments have developed roadmaps to aid in developing a course of progression towards the cloud. Successful roadmaps featured the following components:
    • A technology vision of Cloud Computing Strategy success
    • Frameworks to support seamless implementation of federated community cloud environments
    • Confidence in Security Capabilities – Demonstration that cloud services can handle the required levels of security across stakeholder constituencies in order to build and establish levels of trust.
    • Harmonization of Security requirements – Differing security standards will impede and obstruct large-scale interoperability and mobility in a multi-tenanted cloud environment, therefore a common overarching security standard must be developed.
    • Management of Cloud outliers – Identify gaps where Cloud cannot provide adequate levels of service or specialization for specific technologies and application and identify strategies to deal with these outliers.
    • Definition of unique mission/sector/business Requirements (e.g. 508 compliance, e-discovery, record retention)
    • Development of cloud service metrics such as common units of measurement in order to track consumption across different units of government and allow the incorporation of common metrics into SLAs.
    • Implementation of Audit standards to promote transparency and gain confidence
  2. Create Centers of Excellence: Develop Cloud Computing reference architectures, business case templates and best practices that cloud service vendors can map their offerings to (e.g. the NIST Reference Architecture), making it easier to compare services.
  3. Cloud First policies: Implement policies that mandate that all departments across government consider cloud options first when planning for new IT projects.

Conclusion

The adoption of cloud services holds great promise, but because wide-spread adoption is required to achieve objectives such as economies of scale, a comprehensive plan, together with standardization and transparency, becomes an essential element of success.

We hope this brief has been useful. Ook!

Useful Links

Microsoft’s Cloud Computing in Government page
Cisco’s Government Cloud Computing page
Amazon AWS Cloud Computing page
Redhat cloud computing roadmap for government pdf
US Government Cloud Computing Roadmap Vol 1.
Software and Information Industry updates on NIST Roadmap
New Zealand Government Cloud Computing Strategy link
Australian Government Cloud Computing Strategic Direction paper
Canadian Government Cloud Computing Roadmap
UK Government Cloud Strategy Paper
GCN – A portal for Cloud in Government
Study – State of Cloud Computing in the public sector

Technological Transformation in Government

Inauguration Obama

Photo (c) A/P Sandy Huffaker

Foreword

We live in an exciting juncture when the world is undergoing massive and visible transformation. The Internet has given us instant access to information and it has affected how we do things on a global scale. Our children go to school and interact with knowledge in ways that we could have never imagined before; while demand and supply interact within virtual, global marketplaces where consumers are informed and empowered and suppliers are intelligent and efficient. Yet there is no place where the impacts of technology are more visibly felt than in the Public Sector, where technology may be deployed to serve an informed electorate with high expectations, demanding services and efficiency at an ever-accelerating pace.

Brief

In this series of articles, I will explore a number of contemporary issues that technology decision makers in Government are concerned with, and also look into innovative, viable solutions that have been successfully implemented in a number of countries to solve or address these concerns.

  • Cloud Computing – While cloud technology promises to deliver significant cost savings from economies of scale and cut down on deployment costs, cloud has traditionally been shunned by governments for a number of reasons, including security and confidentiality. In recent years, a number of vendors have developed Government Clouds that are designed to integrate with existing Government networks and systems, while meeting government needs for compliance and security.
  • Big Data – Big Data refers to data sets that are so large that they become difficult to manage using traditional tools. With the proliferation of e-government initiatives, governments worldwide face significant challenges in managing vast repositories of information.
  • Open Source and Interoperability – Government’s ability to adopt and enhance open standards that encourage interoperability between different systems and establish an environment of equal opportunities among technology vendors, partners and end-users.
  • Digital Access – The Internet has redefined access to knowledge and learning and it is a priority for governments to ensure that students from all walks of life are not limited in opportunity due to poor access to the web. Here we explore how technology is transforming big cities and communities alike in accessing the web.
  • Mobility and Telecommuting – Governments worldwide are embracing  telecommuting and flex-time work policies as a viable long-term solution to reducing costs and energy consumption. We explore technologies that foster collaboration and productivity for a mobile workforce.
  • Cyber Security – With the call for increased vigilance against acts of cyber terrorism, we explore the extent that governments are prepared to do in order to maintain Confidentiality, Integrity and Availability amidst an increasingly connected ecosystem of public-sector employees, vendors, contractors and other stakeholders.
  • Open Government – Governments are heeding the call for greater transparency, public participation and collaboration by making information more readily available on government websites and also providing the public with greater access for providing feedback and commentary. This has led to the adoption of new technologies and innovations to ensure that confidentiality is not sacrificed in the light of new policies.
  • Connected Health and Human Services – Case management, health records management and health benefits administration are but a few components of government services that many lives depend on to function effectively and efficiently. We will explore technologies that are transforming these services.
  • Accessibility – In an age of information workers, support for differently abled employees has become a source of competitive advantage, enabling governments to tap into additional segments of the workforce.
  • Defense and Intelligence – Technology has long played a vital role in ensuring that vital battlefield decisions can be made with timely access to information; that communication occurs unimpeded in times of emergency; and that cost efficiencies can be maximized in times of tightening budgets.

Dimensions of Exploration

As with any well-thought-out study, we must consider important attributes such as long-term implications, return on investment and practicality of implementation. Therefore, for each of the issues listed above, our analysis will include the following components:

  • Executive Brief
  • Latest Trends
  • Strategic Value
  • Positive Implications
  • Negative Implications
  • Proposed Solutions
  • Reference Implementations
  • Useful Links

Topics

An individual article has been dedicated to each of the following topics; please click on each one for further reading:

  • Cloud Computing
  • Big Data
  • Open Source and Interoperability
  • Digital Access
  • Mobility and Telecommuting
  • Cyber Security
  • Open Government
  • Connected Health and Human Services
  • Accessibility
  • Defense and Intelligence

* This series is a work in progress, and does not support a particular thesis or ideal. It simply reflects research of the solutions that have been devised to solve frequently unique problems and does not reflect an endorsement of a particular technology or ideal.

Why write about Government?

I’ve spent a significant amount of time consulting for government and in truth, nothing has given me greater pleasure than to see the benefits of technology impact my selfless friends and colleagues who have made the altruistic decision to stay in government in order to serve the greater good. These unsung heroes maintain the systems that support our health, education, defense, civil, social and legal infrastructure and many other essential functions of government, which many lives may depend on.

Web Analytics Primer: 1,2,3 Analyze!

Introduction:

Since the beginning of the Internet, when people started visiting websites and (hopefully) buying stuff online, businesses have wanted to know exactly what people were doing in their virtual web storefronts. From the humble page view counter, to cookies and tracking tools, the industry of web analytics was born.

Today, Web Analytics is a well-established tool for obtaining knowledge and insights about the visitors to our websites by tracking their behavior as they click through from page to page. The demand for Web Analytics is growing, with an estimated current market value of US $600 million, growing in the double digits every year according to a number of online sources (see the 2009 study by Forrester Research). Web Analytics comes in a wide variety of solutions, ranging from simple modules that plug into full-blown Business Intelligence suites, to Cloud-based service-oriented analytics tools.

Your humble Roadchimp was recently given the task of advising a close friend and entrepreneur on devising a Web Analytics solution for a growing online business. And so I decided it would be an excellent opportunity to develop a simple primer for our  lovely primate readers.

Further Reading:

Rather than going into too much detail about Web Analytics, I’m providing a quick link to the Wikipedia page that contains a great overview of the topic. You’re one quick click away from obtaining some basic definitions of commonly used terms as well as links to some of the more popular Web Analytics platforms out in the market.

Scenario:

SwingFromLimb (a.k.a. Swing) is a rapidly growing online social tool that helps active-minded chimps find sports facilities close to them. Swing’s customers come from all walks of life and are predominantly primates who are interested in sports activities of all types, from Tree Canopy Judo to Prehensile Yoga and Banana Kickboxing. Swing connects activity enthusiasts to sports facility owners by listing thousands of classes online that users sign up for via the Swing website or App. As part of its enhanced service offering, Swing offers a club management application to facility managers to help them market their classes to online users.

Goals:

Swing wants to analyze the online behavior of its users in order to identify useful information, such as the types of classes that are more popular in a particular part of the jungle, the times of day that chimps book their classes online as well as seasonal variations in class attendance. This can happen especially in  the New Year when holy Saint Chimpolous rides up from the tropics in his banana sled and slides down our chimneys to deliver overripe fruit while consuming all of our household cleaners; and the ensuing hordes of soft-bellied and guilt-ridden chimps making a beeline for their nearest gym after the holidays.  (yes… serious readers I wrote a funny)

Details that the Swing management would like to obtain about their customers are listed below:

  • Methods of Accessing Content: The types of web browser, screen resolution, language and plug-in support (Java, Flash etc) used to access the site or App.
  • Location of Visitors: Using IP Address filtering and location awareness, we can determine which geographical locations users are connecting to the website from as well as which mobile service providers they may be using.
  • In-Site Behavior: Which pages users click to most frequently; how long users tend to dwell on a page before clicking to the next page and also what pages users visit last before leaving the site.
  • Access Patterns: Access behavior for each user based on criteria such as the time of day; geographical location (home or office) and days of the week.
  • Web Referrals: Which search engines, blogs or websites referred users to the website.

Google Analytics:

Our task is to develop a prototype Analytics solution for Swing so that management can get an idea of what a common Web Analytics solution can offer. After this evaluation phase, we can provide a recommendation on the most cost effective and scalable solution for the company. Looking at the widespread number of tools out there, we’ve decided to start with Google Analytics.

This is a free-to-use service provided by Google and, at last count, is the analytics platform of choice on over 17 million web sites. Google Analytics was originally developed by the Urchin Software Corporation, which was acquired by Google in 2005. The product is delivered as a freemium service, features integration with Google AdWords, and requires a number of cookies to be deployed on the end user’s computer.

Implementing Google Analytics

To successfully implement the web analytics platform, Google recommends a simple 3-step process on the Google Analytics website, as follows:

  1. Sign up for Google Analytics
  2. Add tracking code
  3. Learn about your audience

01 Signup

We shall cover these steps in detail below:

1. Sign up for Google Analytics

The first step involves signing in with a Google account (and creating one if you don’t have one already) and entering some of your website’s details. One interesting note is that you can choose whether to share anonymized data with other users, and whether to enable integration with other Google services.
03a Sharing Example

Prior to signing up for the service, you have to agree to the Google Analytics Terms of Service, which among other things, declares that this is a free service for up to 10 million hits.

2. Add Tracking Code

Once you have completed the initial setup, a unique tracking code is generated for your website and you are directed to the Google Analytics dashboard where you can perform some basic configuration settings, as well as obtain a tracking code that will need to be inserted into the HTML code of your website.

05 Dashboard

The tracking code is essentially a short block of JavaScript and looks like this:

06 Tracking Code

The tracking code can be implemented in three different ways:

  • Static implementation: The code block is copied and inserted into the header of each HTML page.
  • PHP implementation: You create a file called analyticstracking.php that contains the tracking code block and include it in each PHP template page.
  • Dynamic content implementation: The code block is referenced through an include or template directive, so that it is injected into each page as the server generates it.
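For reference, the classic asynchronous snippet of this era looks roughly like the following, with UA-XXXXX-Y standing in for your site’s tracking ID; in the static implementation it is pasted into the <head> of each page:

```html
<script type="text/javascript">
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXX-Y']);
  _gaq.push(['_trackPageview']);

  (function() {
    // Create a <script> element that loads ga.js in the background,
    // matching the protocol (HTTP/HTTPS) of the current page.
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') +
             '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
  })();
</script>
```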

That’s the hardest part of the work done; there are no specific firewall requirements to configure on your servers, since it is your website visitors’ end devices that connect to Google’s servers.

The following web URLs are used to pass page statistics to Google’s servers.

https://ssl.google-analytics.com/__utm.gif
http://www.google-analytics.com/__utm.gif
http://www.google-analytics.com/ga.js
http://ssl.google-analytics.com/ga.js
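Each of these endpoints has a role: ga.js is the tracking library itself, while __utm.gif is a one-pixel transparent image whose request carries the measurement data as query parameters. The sketch below is a simplification (a real request carries many more parameters) showing how such a beacon URL might be assembled:

```javascript
// Simplified sketch of a __utm.gif beacon URL. utmwv (tracker version),
// utmac (account) and utmp (page path) are real __utm.gif fields; the
// values and the reduced parameter set here are illustrative only.
function buildBeaconUrl(account, pagePath) {
  var params = [
    'utmwv=5.4.3',                            // example tracker version
    'utmac=' + encodeURIComponent(account),   // tracking ID, e.g. UA-XXXXX-Y
    'utmp=' + encodeURIComponent(pagePath)    // page being reported
  ];
  return 'https://ssl.google-analytics.com/__utm.gif?' + params.join('&');
}

console.log(buildBeaconUrl('UA-XXXXX-Y', '/index.html'));
// https://ssl.google-analytics.com/__utm.gif?utmwv=5.4.3&utmac=UA-XXXXX-Y&utmp=%2Findex.html
```

Google’s servers respond with the transparent GIF and log the query string for post-processing.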

3. Learn about your audience

Once the tracking code is in place, we can start to analyze data about our users. A quick login to the dashboard shows us interesting data about our visitors, and we can create customizable reports on a multitude of dimensions, including the amount of time that Android users versus iPhone users spent on the site.


So how exactly does Google Analytics work? Once we’ve configured Swing’s web pages to serve up the tracking code, Google Analytics’ servers will start collecting data about users visiting our site. Let’s explore this mechanism briefly:

Google uses three sources to obtain information about visitors to our sites:

  • HTTP requests: A common HTTP request from a visitor’s browser typically contains information about the type of browser, the referrer (which site the user is coming from) and the preferred language.
  • Browser/system information: The Document Object Model (DOM) organizes a page as a tree structure; it was developed and standardized by the World Wide Web Consortium (W3C) and is supported by most browsers. Using the DOM, Google’s servers are able to extract detailed browser information such as Flash and Java support, screen resolution and keyboard support.
  • Cookies: Google uses first-party cookies, which contain small amounts of information about the individual user’s current session on a website and can be passed from one page to another as the user browses the site.

The tracking code works according to the following steps (taken from Google Analytics’ site for developers):

  1. A browser requests a web page that contains the tracking code.
  2. A JavaScript array named _gaq is created and tracking commands are pushed onto the array.
  3. A <script> element is created and enabled for asynchronous loading (loading in the background).
  4. The ga.js tracking code is fetched, with the appropriate protocol automatically detected. Once the code is fetched and loaded, the commands on the _gaq array are executed and the array is transformed into a tracking object. Subsequent tracking calls are made directly to Google Analytics.
  5. The script element is loaded into the DOM.
  6. After the tracking code collects data, a GIF request is sent to the Analytics servers for logging and post-processing.
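The queuing behavior in steps 2–4 can be sketched as follows (illustrative code only, not Google’s actual ga.js):

```javascript
// Before ga.js loads, _gaq is a plain array and push() simply queues commands:
var _gaq = [];
_gaq.push(['_setAccount', 'UA-XXXXX-Y']); // placeholder property ID
_gaq.push(['_trackPageview']);

// Once ga.js finishes loading, it drains the queued commands and replaces
// _gaq with a tracker object whose push() executes commands immediately:
function loadTracker(queue) {
  var executed = [];
  var tracker = {
    push: function (cmd) { executed.push(cmd[0]); }, // dispatch immediately
    executed: executed
  };
  queue.forEach(function (cmd) { tracker.push(cmd); }); // replay the backlog
  return tracker;
}

var tracker = loadTracker(_gaq);
_gaq = tracker; // later _gaq.push() calls now execute directly
_gaq.push(['_trackEvent']);
console.log(tracker.executed); // [ '_setAccount', '_trackPageview', '_trackEvent' ]
```

This is why tracking calls made before ga.js has finished downloading are never lost: they simply wait in the array until the library arrives.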

In a nutshell, the JavaScript that is pasted into the HTML pages on your website instructs the visitor’s browser to connect to Google Analytics’ servers and download some additional code. The code is stored in the browser’s DOM tree and continues to push information to Google’s servers whenever a user performs an action on their browser while viewing your site.

A variety of visitor actions can be tracked, and I’ve included a list below to give you an idea of what’s possible:

  • Loading a page
  • An event, such as:
    • Playing a video
    • Downloading a file
    • Clicking on a button
    • Hovering the mouse over an element
  • An e-commerce transaction on your website:
    • Adding an item to a shopping cart
    • Transaction details such as the transaction ID
  • Customized parameters:
    • A member logging in with special privileges

By collecting each of these events, we can start to understand fairly accurately what a user does on each page visit. For example, a user loads the website from her Android-powered phone at 4pm on a Tuesday afternoon, while sitting in a busy commercial part of the city. After scrolling down the main page for 2 minutes, she searches for a particular gym that she visited 2 months ago for available classes that day. She quickly finds a class and signs up online, paying for the class using her stored account information.

Making Sense of the data

So what do we do with this information? For an individual user, not much really, as Web Analytics packages do not present personal information about users visiting the site. On the other hand, if we have hundreds or thousands of users connecting to a site over time, we can start analyzing the data to identify trends that may be useful to the business. For example, if we see an upward trend in the number of visitors searching for a particular product or service in a given area, we might find similar products and services to cater to that demand; in the case of Swing’s management, this might mean advising facility owners in that region to offer more of a given type of product or service. No wonder over 30 million websites on the Internet use some form of Web Analytics.

Conclusion:

In this article, we went over a simple implementation of a Web Analytics tool and also explored the basic mechanics of Google Analytics. In future posts in this series, we will look into performing some customization and also collecting best practices and recommendations for users interested in implementing a Web Analytics tool.

References:

http://antezeta.com/news/what-google-knows-about-google
http://searchenginewatch.com/article/2239469/How-to-Use-Google-Analytics-Advanced-Segments
http://techcrunch.com/2013/01/28/google-makes-using-analytics-easier-with-new-solution-gallery-for-dashboards-segments-custom-reports/
http://analytics.blogspot.co.uk/
http://blog.kissmetrics.com/50-resources-for-getting-the-most-out-of-google-analytics/
http://www.grokdotcom.com/2009/02/16/the-missing-google-analytics-manual/

Cloud Architecture – Serving Static Content

Introduction

A number of cloud hosting providers offer optimized static content delivery services, such as Amazon AWS CloudFront and Microsoft Windows Azure CDN (Content Delivery Network). In this article, we will explore the elements of a scalable infrastructure that can deliver static content at high peak loads, and build a test platform on which we can perform load testing and benchmark performance.

Scenario

Let’s assume that a fictitious company, Chimpcorp, wants to offload the serving of static files from its primary website and stream this data to partners and customers around the world over the Internet. They want a cost-effective yet scalable solution that allows large numbers of users to download the files at the same time. Furthermore, the data must be available at all times and resilient to failure and hacking attempts. Our task is to build a web infrastructure to handle this. We can summarize the goals below:

  • Serve static content
  • Cost effective
  • Low latency
  • Scalable
  • Fault Tolerant
  • World-wide access

We have also developed a list of assumptions and verified them with Chimpcorp in order to narrow down our design:

  • The content consists of files approximately 5 KB in size
  • Content is static and will not change frequently
  • Servers must be able to host more than 10,000 simultaneous connections
  • All users will connect to a public Internet website (download.chimpcorp.com)
  • All users must access the content via HTTP
  • All users access identical content
  • There is no requirement to track user sessions
  • The servers must be secured to prevent tampering with data

Deconstructing the problem

Our first step is to break down the entire conversation between the end user and the web server serving up the content. The conversation can be described as follows:

  1. User launches a browser from their device
  2. User types in a URL into their browser
  3. The browser checks its cache; if requested object is in cache and is fresh, skip to #13
  4. Browser asks OS for server’s IP address corresponding to the DNS name in the URL
  5. The OS checks whether the name is already resolved in its hosts file or DNS cache
  6. OS makes a DNS lookup and provides the corresponding IP address to the browser
  7. Browser opens a TCP connection to server (Socket to Port 80)
  8. HTTP traffic traverses the Internet to the server
  9. Browser sends an HTTP GET request through the TCP connection
  10. Server looks up required resource (if it exists) and responds using the HTTP protocol
  11. Browser receives HTTP response
  12. After sending the response the server closes the socket
  13. Browser checks if the response is a 2xx (200 OK) or redirect (3xx result status codes), authorization request (401), error (4xx and 5xx), etc. and handles the request accordingly
  14. If cacheable, response is stored in cache
  15. Browser decodes response (e.g. if it’s gzipped)
  16. Browser determines what to do with response (e.g. is it a HTML page, is it an image, is it a sound clip?)
  17. Browser renders response, or offers a download dialog for unrecognized types
  18. User views the data .. and so on

Evaluating components and areas for optimization

Our next step is to analyze the individual elements in the conversation for opportunities for optimization and scaling.

  1. End users’ browser/bandwidth/ISP – Besides the fact that users must access our servers via HTTP over the Internet, we have no control over the type and version of browser, the quality and bandwidth of the ISP service, or the device from which the user accesses the content.
  2. DNS Lookup – It takes approximately 20-120 milliseconds for DNS to resolve an IP address. For users connecting from around the world, we can use either geo-aware redirection or anycast DNS to resolve names to a web host close to the user.
  3. Server Location – As users will be accessing the host from locations around the world, the servers should be co-located close to where the users are in order to reduce round trip times. We can use Geo-aware DNS to relay users to servers that are located in their geographical region.
  4. TCP Session Parameters – As we are serving small static content over our website, we can analyze the specific parameters of the TCP session in order to identify potential areas for optimization. Examples of TCP parameters are listed below
    1. TCP Port Range
    2. TCP Keepalive Time
    3. TCP Recycle and Reuse times
    4. TCP Frame Header and Buffer sizes
    5. TCP Window Scaling
  5. Expiry Headers/Caching – We can set an Expires header far into the future to reduce repeat downloads of static content that does not change. We can also use the Cache-Control headers specified in the HTTP 1.1 standard to tweak caching.
  6. HTTP Server Selection – With the wide variety of HTTP servers available in the market, our selection should take into account the stated project objectives. We should be looking for a web server that can efficiently serve static content to large numbers of users, scale out, and offer some degree of customization.
  7. Server Resource Allocation – Upon our selection of the appropriate Web server, we can select the appropriate hardware setup, bearing in mind specific performance bottlenecks for our chosen HTTP server, such as Disk I/O, Memory allocation and Web caching.
  8. Optimizing Content – We can optimize how content is presented to users. For example, compressed files take less time to download from the server, and image files can be optimized and scaled accordingly.
  9. Content Offloading – Javascripts, images, CSS and static files can be offloaded to Content Delivery Networks. For this scenario, we will rely on our web servers to host this data.
  10. Dynamic Scaling – Depending on the load characteristics of our server, we should find a solution to rapidly scale out our web performance either horizontally (adding nodes) or vertically (adding resources).
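The expiry-header point above boils down to a freshness check in the browser’s cache: while a cached response is still fresh, the browser can reuse it without contacting the server at all. A simplified sketch of that decision (assumed logic, not any browser’s actual implementation):

```javascript
// Decide whether a cached response may be reused without revalidation.
// Cache-Control: max-age takes precedence over the Expires header.
function isFresh(response, nowMs) {
  if (response.maxAgeSeconds !== undefined) {
    // Fresh while the age of the stored copy is under max-age
    return (nowMs - response.storedAtMs) / 1000 < response.maxAgeSeconds;
  }
  if (response.expiresMs !== undefined) {
    // Otherwise fall back to the absolute Expires timestamp
    return nowMs < response.expiresMs;
  }
  return false; // no freshness info: revalidate with the server
}

// A static asset cached 10 minutes ago with a one-year max-age is still fresh:
var cached = { storedAtMs: 0, maxAgeSeconds: 365 * 24 * 3600 };
console.log(isFresh(cached, 10 * 60 * 1000)); // true
```

Setting a far-future expiry on immutable static files means returning visitors skip the request entirely, which is the cheapest request a server can serve.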

Design Parameters

Our next stage is to compile the analysis into tangible design parameters that will shape our final design. The design components and related attributes are listed as follows:

  • Server Hosting Platform: A global cloud services provider is a cost efficient way to deploy identical server farms around the world.
  • DNS Hosting: A highly available DNS forwarding solution that incorporates anycast DNS is preferable to a geo-aware resolution service.
  • HTTP Server Selection: An Nginx web server configured on an optimized Linux platform will provide a highly scalable and resource-efficient platform for our task. The primary advantage of Nginx over other, more popular web server technologies such as Apache is that Nginx doesn’t spawn a new process or thread for each incoming connection; instead, existing worker processes accept new requests from a shared listen socket. Nginx is also widely supported and has an active user community and support base.

Nginx Optimization

The following parameters were used to optimize the Nginx server deployment

  1. CPU and Memory Utilization – Nginx is already very efficient in how it utilizes CPU and memory. However, we can tweak several parameters based on the type of workload that we plan to serve. As we are primarily serving static files, we expect our workload profile to be less CPU-intensive and more disk-oriented.
    1. worker_processes – We can configure the number of single-threaded worker processes to be 1.5 to 2 times the number of CPU cores to take advantage of disk bandwidth (IOPS).
    2. worker_connections – We can define how many connections each worker can handle. We can start with a value of 1024 and tune from there based on test results; the ulimit -n command gives us the system’s open-file limit, which caps the usable value of worker_connections. (The theoretical maximum number of clients is worker_processes × worker_connections.)
    3. SSL Processing – SSL processing in Nginx is fairly processor-hungry, so if your site serves pages via SSL you need to re-evaluate the worker_process/CPU ratios. You can also turn off Diffie-Hellman cryptography and move to a quicker cipher if you’re not subject to PCI standards (example: ssl_ciphers RC4:HIGH:!aNULL:!MD5:!kEDH;).
  2. Disk Performance – To minimize IO bottlenecks on the Disk subsystem, we can tweak Nginx to minimize disk writes and ensure that Nginx does not resort to on-disk files due to memory limitations.
    1. Buffer Sizes – Buffer sizes define how much request data Nginx can hold in memory. A buffer size that is too low will force Nginx to spool data to temporary files on disk, which introduces additional latency due to disk read/write response times.
      1. client_body_buffer_size: Specifies the buffer size for the client request body, used to handle POST data. If the request body is larger than the buffer, the whole body or part of it is written to a temporary file.
      2. client_header_buffer_size: Sets the buffer size for the request header from the client. For the overwhelming majority of requests, a buffer size of 1K is completely sufficient.
      3. client_max_body_size: Sets the maximum accepted body size of a client request, as indicated by the Content-Length header of the request. If the size exceeds the configured value, the client receives a “Request Entity Too Large” (413) error.
      4. large_client_header_buffers: Sets the maximum number and size of buffers used for reading large headers from the client request. The request line cannot be bigger than the size of one buffer, or nginx returns a “Request URI Too Large” (414) error. The longest header line of the request must also fit within one buffer, otherwise the client gets a “Bad Request” (400) error. These parameters can be configured as follows:
        client_body_buffer_size 8K;
        client_header_buffer_size 1k;
        client_max_body_size 2m;
        large_client_header_buffers 2 1k;
    2. Access/Error Logging – Access logs record every request for a file and quickly consume valuable disk I/O. Error logs should not be set to too verbose a level unless we intend to capture every single HTTP error; the warn level of logging is sufficient for most production environments. We can also configure access logs to buffer data in chunks, defining chunk sizes of 8 KB, 32 KB or 128 KB.
    3. Open File Cache – The open_file_cache directive caches open file descriptors, along with information about each file’s existence, location and size.
    4. OS File Caching – We can define parameters around the size of the cache used by the underlying server OS to cache frequently accessed disk sectors. Caching the web server content will reduce or even eliminate disk I/O.
  3. Network I/O and latency – There are several parameters that we can tweak in order to optimize how efficiently the server manages network bandwidth under peak loads.
    1. Timeouts – Timeouts determine how long the server maintains a connection and should be configured optimally to conserve resources on the server.
      1. client_body_timeout: Sets the read timeout for the request body from the client. The timeout applies only if the body is not obtained in one read step. If the client sends nothing within this time, nginx returns a “Request Time-out” (408) error.
      2. client_header_timeout: Sets the timeout for reading the request header from the client. The timeout applies only if the header is not obtained in one read step. If the client sends nothing within this time, nginx returns a “Request Time-out” (408) error.
      3. keepalive_timeout: The first parameter sets the timeout for keep-alive connections with the client, after which the server will close the connection. The optional second parameter sets the time value in the Keep-Alive: timeout=time response header. This header can convince some browsers to close the connection themselves, so that the server does not have to. Without this parameter, nginx does not send a Keep-Alive header (though this is not what makes a connection “keep-alive”). The author of Nginx claims that 10,000 idle connections will use only 2.5 MB of memory.
      4. send_timeout: Sets the response timeout to the client. The timeout is established not on the entire transfer of the response, but only between two read operations; if the client takes nothing within this time, nginx shuts down the connection.

        These parameters should be configured as follows:

        client_body_timeout 10;
        client_header_timeout 10;
        keepalive_timeout 15;
        send_timeout 10;

    2. Data compression – We can use gzip to compress our static data, reducing the size of the TCP packet payloads that need to traverse the Internet to reach the client. Serving pre-compressed files also reduces CPU load; the Nginx gzip static module should be used (gzip on; gzip_static on;).
    3. TCP Session parameters – The tcp_* parameters of Nginx (such as tcp_nopush and tcp_nodelay) control how packets are batched onto the wire.
      1. TCP Maximum Segment Lifetime (MSL) – The MSL defines how long the server should wait for stray packets after closing a connection and this value is set to 60 by default on a Linux server.
    4. Increase System Limits – Specific parameters such as the number of open file parameters and the number of available ports to serve connections can be increased.
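Pulling the directives above together, a minimal nginx.conf sketch for this workload might look as follows; the values shown are starting points to be tuned against load-test results, not production-proven settings:

```nginx
worker_processes  4;               # ~1.5-2x CPU cores for a disk-bound workload

events {
    worker_connections  1024;      # raise toward `ulimit -n` as testing allows
}

http {
    # Serve pre-compressed static files where available
    gzip         on;
    gzip_static  on;

    # Cache open file descriptors for frequently served content
    open_file_cache        max=1000 inactive=20s;
    open_file_cache_valid  30s;

    # Conservative timeouts to free connections quickly
    client_body_timeout    10;
    client_header_timeout  10;
    keepalive_timeout      15;
    send_timeout           10;

    # Request buffer sizes suited to small static content
    client_body_buffer_size     8k;
    client_header_buffer_size   1k;
    client_max_body_size        2m;
    large_client_header_buffers 2 1k;
}
```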

Solution Design A: Amazon AWS

The design parameters in the section above were used to build our scalable web solution. The components of our solution are as follows:

  • Amazon EC2 AMIs: Elastic Compute Cloud will be used to host our server farms. Nginx offers a fully supported AMI instance in EC2 that we can tweak to further optimize performance to suit our needs. This AMI is readily deployable from the AWS marketplace and includes support from Nginx Software Inc. We will deploy Nginx on a High-CPU Medium Instance featuring the following build specs:
    • 1.7 GB RAM
    • 5 EC2 Compute Units (2 Virtual Cores)
    • 350 GB instance storage
    • 64-bit architecture
    • Moderate I/O performance
  • Elastic IPs: Amazon provides an Elastic IP Service that allows us to associate a static Public IP Address to our virtual host.
  • Amazon Route 53: This scalable DNS service allows us to implement an Anycast DNS solution for resolving hosts to an environment that is closest to them.
  • Additional Options: A number of automation and deployment tools were utilized to enhance the efficiency of the environment:
    • EC2 Command Line tools
    • Automated deployment and lifecycle management via Chef
    • Development testing via Vagrant
    • Centralized code repository and version control via Github
  • Solution Variants: The following design variants were introduced in order to benchmark our original build selection against alternative deployment scenarios. The scenarios are as follows:
    • Nginx Systems AMI
    • CentOS optimized Nginx with PHP-FPM
    • Apache Web Server on Ubuntu

Measuring our solution’s effectiveness

Here we define a number of simple measures that we can use to benchmark the effectiveness of our solution:

  • Cost per user – defined as the cost of the solution divided by the number of users within a given period, this measures the cost effectiveness of the solution
  • Server Connectivity Metrics – These metrics are relevant to Web Server performance
    • Number of Requests per Second
    • Number of Connections per Second
    • Average/Min/Max Response rates
    • Response Times (ms)
  • System Performance Metrics – these metrics relate specifically to system performance
    • CPU/Memory Load

Testing our design

We provisioned an M2.Medium tier EC2 instance in the same availability zone in order to test host-level performance without the complications of network latency between the test host and the server. We ran tests at increasing levels of concurrency and requests per second, using httperf to measure the performance of our solution against our test scenario.

Test Results:

Baseline (Nginx AMI): Httperf tests returned 1.1 ms reply times for linearly increasing loads up to 30,000 connections/second over large sample sizes. The host started to display increasing standard deviations closer to 30,000 simultaneous connections, indicating potential saturation. Memory usage on the host remained stable at around 17 MB, even during peak loads.

Optimized CentOS (Nginx AMI): Httperf tests returned similar response times and response rates as the baseline host, up to 30,000 connections/second (>1,000 concurrent connections); however, results showed higher consistency over large samples and lower standard deviations.

Apache (Ubuntu Host): Httperf tests returned 100+ms response times for linearly increasing loads up to 10,000 connections/second, quickly saturating at 6,000 connections/sec. Each httpd instance occupied approximately 100MB of memory and quickly consumed the majority of system resources on the 1.7GB RAM virtual host.

Fig: Nginx optimized performance (baseline)

Conclusions:

Overall performance on the Nginx platform for delivering static content (HTML files) was far superior to Apache. Nginx performance out-of-the-box is respectable and with further customization of settings, can provide highly optimal and scalable results.

Recommendations:

In order to further increase the performance of the solution, we propose the following recommendations:

  • Increase sample sizes – Due to time constraints, the number of repetitions run for load testing was low. For real-life production environments, we recommend running httperf in a wrapper like ab.c over a larger number of repetitions (>1000) at varying load factors in order to build a respectable sample. At that point, trends will be easier to identify.
  • Implement in-memory caching – not currently supported natively in Nginx
  • Implement Elastic Load Balancing  – ELB has been load tested to over 20k simultaneous connections
  • Migrate static content to CloudFront – While Nginx can provide superior performance, it is most popularly deployed as a reverse proxy to offload static content from dynamic code like PHP. Amazon’s CloudFront is optimized to provide superior and scalable web content delivery that can synchronize across multiple locations.
  • Cost Management – If costs are a concern, we can move to the open-source Nginx build instead of the provisioned AMI instance and save on support and licensing costs.

References:

http://stackoverflow.com/questions/2092527/what-happens-when-you-type-in-a-url-in-browser
http://www.hongkiat.com/blog/ultimate-guide-to-web-optimization-tips-best-practices/
http://yuiblog.com/blog/2007/04/11/performance-research-part-4/
http://geekexplains.blogspot.co.uk/2008/06/whats-web-server-how-does-web-server.html
http://nbonvin.wordpress.com/2011/03/24/serving-small-static-files-which-server-to-use/
Overview of Nginx architecture
Implement Nginx in AWS
Httperf command reference
Performance testing
http://gwan.com/en_apachebench_httperf.html

Cloud Architectures: The Strategy behind homogeneity

Extract

So what is homogeneity? Homogeneity refers to keeping things identical or non differentiable from one another on purpose. For example, Cloud Service Providers create homogenous infrastructure on a massive scale by deploying commodity hardware in their datacenters, thus enjoying lower incremental costs which they pass on to the consumer via lower prices. Technology architects can similarly design their applications and services to run over these homogenous computing nodes or building blocks, facilitating horizontal scaling capabilities commonly referred to as elastic computing. In this article, we will explore how homogeneity has been in use long before the era of modern technology and also derive meaningful takeaways from each particular adaptation of homogeneity.

Agility  – Military Tactics

From the dawn of the greatest ancient civilizations, battles were fought and won by ambitious generals relying on the strength of their strategies and their ability to quickly adapt tactics to changing battlefield situations. The conquering Romans provide a fascinating study in the implementation of homogeneity and standardization at the fundamental unit level. A Roman Legion was subdivided into 10 units or cohorts, with each cohort further subdivided into six centuries, numbering 80 men on average.

Legion

While each century served as an individual unit in battle and had autonomy of movement under the command of a centurion, the true tactical advantage of the Roman Legion emerged through the combination of these units on the battlefield, known as formations. Each formation presented a different pattern of deploying military units, optimized for various scenarios and terrains, including troop movement and battle formations depending on the relative strength of one’s army and that of opposing forces. For example, the Wedge formation highlighted below drew the strongest elements of the force into the center, where they could drive forward through opposing forces. Similarly, the Strong Right Flank formation provided tactical strength against an opposing army on the principle that a strong right flank could quickly overrun an opposing force’s left flank, since the enemy’s left-hand side would be encumbered by shields and less agile to sideways strikes.

RomanFormations

Under the watchful eye of a skilled and experienced commander, the Roman Legion was a force to be reckoned with, but this ultimately relied on the ability of each individual unit to be deployed rapidly in battle and to act reliably in performing its role. The Romans dominated the battlefield, besting even the famed Greek phalanx, due to the agility with which the Legion could change formations in the heat of battle. Our takeaway here is that homogeneous building blocks provide scalability and flexibility in response to rapidly changing situations.

Operational Efficiency – Southwest Airlines

Southwest

Southwest Airlines has achieved great commercial success in part due to its strategy of operating a homogeneous fleet of Boeing 737 aircraft. The utilization of a single build of aircraft greatly enhances operational efficiency in terms of technical support, training, holding spare parts and even route planning, since passenger and baggage loads are fairly consistent across each plane. It’s also less complicated to plan for fleet growth and to allocate resources in anticipation of future demand with a homogeneous unit. The task of staffing flights is also greatly simplified, since aircrews are trained to operate one type of aircraft across several variants, and scheduling crews to support multiple routes along Southwest’s point-to-point system becomes a much simpler endeavor. Southwest Chairman Gary Kelly announced plans in April 2012 to acquire 74 Boeing 737-800s by 2013 in order to augment the existing fleet of 737-700s. This expansion represents an incremental upgrade to existing fleet capacity, in response to greater demand across Southwest’s expanding network and to capture greater economy amidst rising fuel costs. An important takeaway from Southwest Airlines is that homogenization leads to operational efficiencies and drives competitive advantage, especially in highly price-sensitive markets. Most importantly, homogenization has tangible benefits by directly enhancing profits.

Build Standardization – McDonald’s Fast Food Restaurants

MacDonalds

Probably one of the most classic examples of enhanced operational efficiency derived from a homogeneous product, this iconic fast-food provider was founded on the principles of consistency, homogeneity and ease of preparation. While in recent years McDonald’s has stepped up its efforts to localize its menu to suit local palates, over 30% of revenues are driven by sales of core items, including the Big Mac, hamburger, cheeseburger, Chicken McNuggets and their world-famous french fries, as stated by CEO Jim Skinner during a 2011 earnings call. Sticking to a shorter, standardized menu was part of the company’s push to adhere to stringent quality standards, and it became a crucial factor in building operational efficiencies in resource procurement during the company’s international expansion in the 1970s. McDonald’s was specific and exacting in its standards when expanding into new markets, to the point of defining the genetic variety of potatoes to use in its french fries. This is an important takeaway, particularly from the perspective of a business that is expanding into international markets and whose products are not exactly exportable from one country to another. Standardization and homogenization of a product catalog make it easier to decide early on whether to source materials locally or to bring your own.

Resource Sharing – Automobile Twinning

Auto Twinning

A long-perceived benefit of consolidating automobile manufacturers into several large corporations that govern the production of numerous brands, such as America’s General Motors and Ford Motor Company as well as Germany’s Volkswagen Group, lies in the ability to utilize a common drivetrain or chassis across different brands or makes of automobiles. For example, the Ford Escape, Mercury Mariner and Mazda Tribute are all built on the same chassis and share a large proportion of their components. The differences between these cars tend to be aesthetic in nature, since they’re designed to appeal to different consumer segments; nevertheless, the financial benefits to the automobile manufacturer are far more tangible. By getting design teams across various brands to collaborate at an early stage in a car’s development, manufacturers can assemble a common design platform, elements of which can later be customized to fulfill individual brand attributes. The result is that manufacturers reap huge benefits from doubling, or in some cases tripling, their economies of scale, as well as accelerating returns on their R&D investment dollars. Our takeaway here is that common components or build elements can be identified and jointly developed as shared resources to prevent duplication of effort and wastage of resources. Service-Oriented Architectures that subscribe to a shared-services model are a great example of this philosophy in action.

Conclusion

In this article, we explored how the fundamental Cloud Architecture principle of homogenization provided agility to the Roman Legions, operating efficiencies to Southwest Airlines, standardization to McDonald's restaurants, and cost savings and improved ROI for major automotive manufacturers. Examples abound in the wider world that showcase the rationale behind building applications in the Cloud, and for this very reason the Cloud is an exciting and fundamentally compelling evolution of computing for our generation: it makes sense! Thank you for reading.

About RoadChimp

RoadChimp is a published author and trainer who travels globally, writing and speaking about technology and how we can borrow paradigms from other industries to build our understanding of rapidly emerging technologies. The Chimp started his technology career by building one of the first dot-com startups of its type in Asia and subsequently gained expertise in large-scale computing in North America, the Caribbean and Europe. Chimp is a graduate of Columbia Business School.

References

Article on Roman Military Tactics (http://romanmilitary.net/strategy/legform/)

Blog Article on Southwest Airlines (http://blog.kir.com/archives/2010/03/the_genius_of_s.asp)

McDonald's menu breakdown of profitability (http://adage.com/article/news/mcdonald-s-brings-u-s-sales-years/232319/)

FTC study on McDonald's international expansion (http://www.ftc.gov/be/seminardocs/04beyondentry.pdf)

Report on Automotive Twinning (http://www.edmunds.com/car-buying/twinned-vehicles-same-cars-different-brands.html#chart)

Cloud Architectures: Session Handling

Introduction

Deploying applications to the cloud requires a critical rethink of how applications should be architected and designed to take advantage of what the cloud has to offer. In many cases, this requires a paradigm shift in how we design the components of our applications to interact with each other. In this post, we shall explore how web applications typically manage session state and how cloud services can be leveraged to provide greater scalability.

Web Application Tiers

It is common practice to design and deploy scalable web applications in a 3-tiered configuration, with an optional load-balancing layer, as follows:

1. Web Tier: This tier consists of anywhere from a single server to a large number of identically configured web servers, which are primarily responsible for authenticating and managing requests from users, as well as coordinating requests to subsequent tiers in the web architecture. Cloud-enabled web servers commonly use the HTTP protocol and the SOAP or REST styles to communicate with the Service Tier.

2. Service Tier: This tier is responsible for business logic and business processing. The Service Tier comprises a number of identical stateless nodes hosting services, each responsible for performing a specific set of routines or processes.

3. Data Tier: The Data Tier hosts business data in a number of structured or unstructured formats; most cloud providers offer a variety of storage formats, including relational databases, NoSQL stores and simple file storage, commonly known as BLOBs (Binary Large OBjects).

4. Load Balancing (Optional): An optional tier of load balancers can be deployed on the perimeter of the Web Tier to distribute incoming user requests among the servers in that tier.
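The tiered flow described above can be sketched in a few lines of Python. This is an illustrative sketch only, not a specific framework's API: an in-memory dict stands in for the Data Tier, and the function and class names (handle_request, OrderService) are hypothetical.

```python
# Stand-in for the Data Tier: an in-memory dict in place of a real database.
DATA_TIER = {"user-42": {"name": "Alice", "orders": ["o-1", "o-2"]}}

class OrderService:
    """Service Tier: stateless business logic; holds no per-user state."""
    @staticmethod
    def list_orders(user_id):
        record = DATA_TIER.get(user_id)
        if record is None:
            raise KeyError(f"unknown user {user_id}")
        return record["orders"]

def handle_request(user_id, authenticated):
    """Web Tier: authenticates the caller, then delegates to the Service Tier."""
    if not authenticated:
        return {"status": 401, "body": None}
    return {"status": 200, "body": OrderService.list_orders(user_id)}
```

Because the Service Tier function carries no per-user state of its own, any number of identical nodes could host it behind the optional load-balancing layer.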

Managing Session State

Any web application that serves users in a unique way needs an efficient and secure method of keeping track of each user's requests during an active session. For example, an e-commerce shopping site that provides each user with a unique shopping cart needs to be able to track the individual items in each user's cart for the duration of their active web session. More importantly, the web application serving the user needs to be designed to be resilient to failures and potential loss of session data. In a Web Services architecture, there are a number of methods that can be employed to manage a user's session state. We shall explore the most common methods below:

  • Web Tier Stateful (Sticky) Sessions: A web application can be designed such that the Web Server node a user gets directed to stores the session information locally, and all future requests from that user are served by the same node. This makes the Server Node stateful in nature. This design has several disadvantages: the node serving the user becomes a single point of failure, and any new nodes added to the pool of Web Servers can only share the load of subsequently established sessions, since existing sessions continue to be held on existing nodes. This severely limits the scalability of the design and its ability to distribute load evenly.
  • Web Tier Stateless Session Management: This design overcomes the limitations of the previous approach by storing user session state externally, where it can be referenced by any of the connected Web Server nodes. An efficient way to track small amounts of session data is a small cookie that stores a Session ID for the individual user; this Session ID serves as a pointer to the session state for any inbound request between the user and the Web Application. Cloud Service Providers offer various types of storage to host session state data, including NoSQL stores, persistent cloud storage and distributed web caches. For example, the web tier could use AJAX to call REST services that in turn pull JSON data describing an individual user's session state.
  • Service Tier Stateless Session Management: In most web architectures, the Service Tier is insulated from user requests. Instead, the Web Tier acts as a proxy or intermediary, allowing the Service Tier to run in a trusted environment. Services in this tier do not require state information and are completely stateless. This statelessness gives the Service Tier the benefits of loose coupling: individual services can be written in different languages, such as Java, Python, C#, F# or Node.js, based on the proficiency of the development teams, and still communicate with each other.
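The stateless pattern above, a Session ID cookie pointing at externally stored state, can be sketched as follows. This is a minimal illustration under stated assumptions: an in-memory dict stands in for the external session store (a distributed cache or NoSQL table in practice), and the function names are hypothetical.

```python
import uuid

# Stand-in for the external session store. In production this would be a
# shared service (e.g. a distributed cache) that every Web Server node
# reads and writes, so no node holds session state locally.
SESSION_STORE = {}

def create_session():
    """Issue a new Session ID (the value a browser cookie would carry)."""
    session_id = str(uuid.uuid4())
    SESSION_STORE[session_id] = {"cart": []}
    return session_id

def add_to_cart(session_id, item):
    """Any stateless Web Server node can serve this request:
    the Session ID alone locates the user's shopping cart."""
    SESSION_STORE[session_id]["cart"].append(item)

def get_cart(session_id):
    return SESSION_STORE[session_id]["cart"]
```

Because every node resolves the Session ID against the same shared store, requests within one session can be served by different nodes, which is exactly what sticky sessions prevent.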

Summary

Stateless Session Management allows us to build scalable compute nodes within a Web Application Architecture that are easy to deploy and manage, reduces single points of failure, and takes advantage of the scalability and resiliency offered by Cloud Service Providers to host session state data.

Cloud Architectures: Cloud Adoption Roadmaps

The cloud is becoming a viable option for scalability and resilience, but it requires a fundamental rethink of how web applications are architected and built in order to fully realize the benefits of moving to the cloud. What we are seeing in the industry is a risk-averse, gradual trend in how organizations adopt cloud-based services, which I have categorized into the phases listed below. Which phase is your organization currently in?

P1: Evaluating

Organizations are willing to trial the features of cloud services for trivial or non-business-essential functions. The primary drivers are cost efficiency and elasticity. Use cases include application development sandboxes.

P2: Hosting/Consolidating

Organizations are migrating peripheral services to dedicated virtual hosting providers. Common implementations include peripheral/edge services such as e-mail scanning, DNS hosting and static web page/FTP services. On-premise and cloud boundaries are very clearly defined within the Enterprise Architecture, and the cloud is commonly relegated to the extranet or the perimeter edge of the enterprise computing environment.

P3: Infrastructure Virtualization/ Disaster-Recovery (IAAS)

Organizations approach cloud services as an extension of their computing infrastructure, largely due to interoperability between on-premise and cloud-based virtualized environments. Some organizations also see an opportunity to update their off-site backup, disaster recovery and data compliance/retention practices by employing a dedicated managed services provider. Key drivers are cost efficiencies, managed outsourcing of specific tasks, and automation and alerting of key tasks. The organization is starting to embrace cloud environments, and internal stakeholders perceive that the downside risks of cloud computing are acceptable on a very limited scale.

P4: Hybridization (SAAS)

Organizations start identifying potential application candidates that can be deployed onto a cloud environment, by way of an application upgrade or feature extension. Lead candidates for hybridization may be influenced by software vendors who are offering their products in hybrid or cloud-based variants. Hybridization or Cloud-migration of business applications carries a significant degree of business risk and such decisions usually involve key decision makers.

P5: Cloud Native Architecture Development (PAAS)

Organizations have built up a certain degree of comfort with their cloud service providers, and internal technology stakeholders begin embracing the technical advantages of moving to Platform As A Service (PAAS). Such implementations require significant investment in evaluating the benefits of Cloud Architectural Patterns and potentially re-architecting existing applications to subscribe to Cloud Services paradigms. Cloud Services are seen as a key element of business strategy execution and receive high-level stakeholder support. Key drivers include enhancing competitive advantage as well as the operational efficiencies realized from self-provisioning and elasticity.

Current State of the Nation

RoadChimp sees that the majority of businesses are still moving from the P2 to the P3 stage, with a number of organizations having started adopting IAAS and SAAS through dedicated Managed Services Providers and vendors. Adoption of cloud platform providers is still in its early days, since standards and maturity models need more time to coalesce and gain widespread adoption. The exception is largely technology startups founded after 2008, which are leapfrogging traditional on-premise infrastructure and executing their business logic directly in the cloud.