Cloud Architecture – Implementing Nginx in AWS

Brief:

In this post, we will deploy Nginx as an AMI-based instance in Amazon’s Elastic Compute Cloud (EC2). We will document our steps to configure and optimize Nginx for serving static pages to a global audience.

Planning and Preparation:

Nginx is a powerful web server that can be deployed in combination with other services such as FastCGI or Apache backends to provide scalable and efficient web infrastructures. AWS makes it extremely fast and convenient to implement your own Nginx instances. You can deploy the Linux variant of your choice and then install Nginx on that OS, or you can deploy the Nginx AMI appliance developed by Nginx Inc., available in the Amazon Marketplace for an additional licensing fee.

[Figure 01: Nginx AMI]

A few factors to consider when making this decision:

  • Customization: For those who are comfortable with Linux and interested in tweaking operating system parameters to get the best performance, a custom implementation makes the most sense, as it gives you the most control.
  • Cost: The key to running an effective cloud services infrastructure is a tight cloud implementation that wastes few resources, with the help of cloud features such as auto scaling. That being said, the Nginx AMI has a license fee (US $0.13/hr for an m1.medium instance at the time of writing), which should be factored into your total cost of ownership.

* Prior benchmark testing on loads of up to 30,000 transactions per second revealed that the performance differences between a custom-built AMI and the Nginx Marketplace AMI were insignificant.

Installation:

For this post, we will implement the Nginx AMI, which can be provisioned either from the AWS Management Console or scripted and launched via Amazon’s EC2 tools. We first need to visit the Nginx AMI appliance page in the Amazon Marketplace to accept the terms and conditions.
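
If you prefer to script the launch, a minimal sketch using the AWS CLI is shown below. The AMI ID, key pair name and security group here are placeholders, not values from this post; the Marketplace listing shows the actual AMI ID for your region.

    # Hypothetical launch of the Nginx AMI via the AWS CLI.
    # ami-xxxxxxxx, my-keypair and Webserver are placeholders to replace with your own values.
    aws ec2 run-instances \
      --image-id ami-xxxxxxxx \
      --instance-type m1.medium \
      --key-name my-keypair \
      --security-groups Webserver \
      --count 1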

[Figure 03: EC2 console]

After the installation is complete, we can verify that the host is set up via the following steps (a combined recap of these commands appears after the list):

  1. Open a browser and connect to the Public DNS name of the EC2 host. We should be able to view the default welcome page.
    [Figure 04: Nginx Welcome]
  2. We can also SSH into the host (the default user is ec2-user) and run the following command to verify the status of the host:
    /etc/init.d/nginx status
    We should get a response like "nginx (pid  1169) is running..."
  3. Next, we check for any available updates and run sudo yum update to apply them
  4. We can view the default configuration files of Nginx here:
    /etc/nginx/conf.d/default.conf
    /etc/nginx/nginx.conf
  5. We should configure Nginx to start automatically on reboot:
    chkconfig nginx on
  6. Some basic commands:
    1. Start the nginx service: service nginx start
    2. Reload the nginx configuration after changes: sudo nginx -s reload
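
As a quick recap, here are the verification and setup commands from the steps above, assuming the default layout of the Nginx AMI:

    /etc/init.d/nginx status     # confirm nginx is running
    sudo yum update              # apply available updates
    sudo chkconfig nginx on      # start nginx automatically on reboot
    sudo service nginx start     # start the service if it is stopped
    sudo nginx -s reload         # reload the configuration without dropping connections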

Alternatively, I like to use Chef to automate installations of Nginx. Nginx is currently supported on the Ubuntu 10.04 and 12.04 and CentOS 5.8 and 6.3 operating systems. This support page gives you more information regarding the implementation.
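
As a rough sketch, and assuming you already have a Chef server and a bootstrapped node (the node name web01 below is hypothetical), applying the community nginx cookbook can be as simple as:

    knife cookbook site install nginx                 # pull the nginx cookbook into your chef-repo
    knife node run_list add web01 'recipe[nginx]'     # add the recipe to the node's run list
    # the recipe is applied on the node's next chef-client run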

Component Configuration:

Our next task is to configure Nginx and all the necessary components we require to serve up our web content. For this post, we will be using Nginx to host some static web content; a minimal server block sketch follows the steps below.

  1. Configure Remote Access
    1. I provisioned a security group in AWS named Webserver, enabling SSH, HTTP and HTTPS traffic from all IPs. Depending on how you choose to secure your deployment, you may instead deploy a single management host with SSH access and then only allow SSH into the web server from that host.
  2. HTTP Server Components
    1. We should make sure that we install Nginx with the latest components.
    2. Nginx should be configured with only the required components in order to minimize its memory footprint. If you are building Nginx from source rather than using the AMI, we can run the following command:
      ./configure --prefix=/webserver/nginx --without-mail_pop3_module --without-mail_imap_module  --without-mail_smtp_module --with-http_ssl_module  --with-http_stub_status_module  --with-http_gzip_static_module
    3. The Nginx AMI image is automatically configured at startup to serve a default index.html page at the location /usr/share/nginx/html
    4. To configure additional components, you can run the nginx-setup command. You will be asked to select which components to install, after which the script will install all prerequisite packages.
    5. After completion, the web application will be installed in the following default location: /var/www/default
  3. Load Static content
    1. Our Static content is stored on an EBS volume snapshot, which we can access as follows:
      1. Attach and mount the volume in EC2 console
      2. Run fdisk -l to identify which device is our EBS volume (in this case /dev/xvdf1)
      3. Create a new directory for this EBS volume sudo mkdir /mnt/ec2snap
      4. Set permissions to access this directory sudo chmod 0777 /mnt/ec2snap
      5. Mount the device into this folder sudo mount /dev/xvdf1 /mnt/ec2snap -t ntfs
    2. Our static content can now be copied to our default folder location
      1. First, we rename the default files created at the time of installation
        sudo mv /usr/share/nginx/html/index.html /usr/share/nginx/html/index_old.html
      2. Then we perform the copy and set necessary permissions on the file
        cp /mnt/ec2snap/html/index.html /usr/share/nginx/html
        sudo chmod 644 /usr/share/nginx/html/index.html
      3. Now we should test our web server to ensure that our configuration is working, by browsing to our public web server online.
      4. Lastly, we should also un-mount the EBS volume
        sudo umount /mnt/ec2snap
  4. Backup snapshot
    1. At this point, it’s wise to quickly run a snapshot before delving into the configuration files. (Command Ref.)
      ec2-create-snapshot --aws-access-key <your-access-key> --aws-secret-key <your-secret-key> --region us-west-1 vol-f41285d5 -d "backup-Nginx-$(date +%Y%m%d)"
      [Figure 06: Snapshot]
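
With the content in place, a minimal server block for serving it might look like the sketch below, typically placed in /etc/nginx/conf.d/default.conf. The server_name example.com is a placeholder for your own host name; the root matches the default content path used above.

    server {
        listen       80;
        server_name  example.com;            # placeholder host name
        root         /usr/share/nginx/html;  # location where we copied our static content
        index        index.html;

        location / {
            try_files $uri $uri/ =404;       # serve the file if it exists, otherwise return 404
        }
    }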

Nginx Optimization:

Nginx allows administrators to perform a considerable number of tweaks to optimize performance based on the underlying system resources. We’ve listed a number of basic tweaks here, and a consolidated configuration sketch follows this list. Make sure that you thoroughly test these settings before deploying into a production environment.

  1. CPU and Memory Utilization – Nginx is already very efficient with how it utilizes CPU and memory. However, we can tweak several parameters based on the type of workload that we plan to serve. As we are primarily serving static files, we expect our workload profile to be less CPU-intensive and more disk-bound.
    1. worker_processes – We can configure the number of single-threaded worker processes to be 1.5 to 2x the number of CPU cores to take advantage of disk bandwidth (IOPS).
    2. worker_connections – We can define how many connections each worker can handle. We can start with a value of 1024 and tune from there based on test results. The ulimit -n command gives us the numerical limit we can use when defining the number of worker_connections.
    3. SSL Processing – SSL processing in Nginx is fairly processor-hungry, and if your site serves pages via SSL, you need to evaluate the worker_processes-to-CPU ratio. You can also turn off Diffie-Hellman key exchange and move to a quicker cipher if you’re not subject to PCI standards. (Example: ssl_ciphers RC4:HIGH:!aNULL:!MD5:!kEDH;)
  2. Disk Performance – To minimize IO bottlenecks on the Disk subsystem, we can tweak Nginx to minimize disk writes and ensure that Nginx does not resort to on-disk files due to memory limitations.
    1. Buffer Sizes – Buffer sizes define how much request data Nginx holds in memory. A buffer size that is too low will force Nginx to spool data to temporary files on disk, which introduces additional latency due to disk read/write IO response times.
      1. client_body_buffer_size: This directive specifies the client request body buffer size, used to handle POST data. If the request body is larger than the buffer, the entire body or part of it is written to a temporary file.
      2. client_header_buffer_size: This directive sets the header buffer size for the request header from the client. For the overwhelming majority of requests, a buffer size of 1K is completely sufficient.
      3. client_max_body_size: This directive assigns the maximum accepted body size of a client request, as indicated by the Content-Length request header. If the size is greater than the configured value, the client gets the error “Request Entity Too Large” (413).
      4. large_client_header_buffers: This directive assigns the maximum number and size of buffers used to read large client request headers. The request line cannot be bigger than the size of one buffer; if the client sends a bigger header, nginx returns the error “Request-URI Too Large” (414). The longest header line of the request must also fit within one buffer, otherwise the client gets the error “Bad Request” (400). These parameters can be configured as follows:
        client_body_buffer_size 8K;
        client_header_buffer_size 1k;
        client_max_body_size 2m;
        large_client_header_buffers 2 1k;
    2. Access/Error Logging – Access logs record every request for a file and quickly consume valuable disk I/O. Error logs should not be set too low unless it is our intention to capture every single HTTP error; a warn level of logging is sufficient for most production environments. We can also configure access logs to buffer writes in chunks (e.g. 8 KB, 32 KB or 128 KB).
    3. Open File Cache – The open_file_cache directive caches open file descriptors, along with information about each file’s existence, location and size.
    4. OS File Caching – We can define parameters around the size of the cache used by the underlying server OS to cache frequently accessed disk sectors. Caching the web server content will reduce or even eliminate disk I/O.
  3. Network I/O and latency – There are several parameters that we can tweak to optimize how efficiently the server manages a given amount of network bandwidth under peak loads.
    1. Time outs – Timeouts determine how long the server maintains a connection and should be configured optimally to conserve resources on the server.
      1. client_body_timeout: This directive sets the read timeout for the request body from the client. The timeout applies only if the body is not received in one read step. If the client sends nothing within this time, nginx returns the error “Request Time-out” (408).
      2. client_header_timeout: This directive assigns the timeout for reading the client request header. The timeout applies only if the header is not received in one read step. If the client sends nothing within this time, nginx returns the error “Request Time-out” (408).
      3. keepalive_timeout: The first parameter assigns the timeout for keep-alive connections with the client; the server will close connections after this time. The optional second parameter assigns the time value for the Keep-Alive: timeout=time response header. This header can convince some browsers to close the connection themselves, so that the server does not have to. Without this parameter, nginx does not send a Keep-Alive header (though this is not what makes a connection “keep-alive”). The author of Nginx claims that 10,000 idle connections will use only 2.5 MB of memory.
      4. send_timeout: This directive assigns the response timeout to the client. The timeout applies not to the entire transfer of the response, but only between two successive write operations; if the client accepts nothing within this time, nginx shuts down the connection. These parameters can be configured as follows:

        client_body_timeout   10;
        client_header_timeout 10;
        keepalive_timeout     15;
        send_timeout          10;
    2. Data compression – We can use gzip to compress our static data, reducing the size of the TCP payloads that need to traverse the network to reach the client. Serving pre-compressed files via the gzip static module also avoids compressing large files on the fly, reducing CPU load. The module can be used with the following parameters:
      gzip on;
      gzip_static on;
    3. TCP Session parameters – The TCP_* parameters of Nginx can also be tuned:
      1. TCP Maximum Segment Lifetime (MSL) – The MSL defines how long the server should wait for stray packets after closing a connection; this value is set to 60 seconds by default on a Linux server.
    4. Increase System Limits – Specific parameters such as the system’s open file limit and the number of available ports for serving connections can be increased.
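
Pulling these tweaks together, a consolidated sketch of the relevant nginx.conf directives might look like the following. The values are illustrative starting points (assumptions, not benchmarks from this post) and should be tuned and tested for your own workload:

    worker_processes  4;                   # roughly 1.5-2x the number of CPU cores
    worker_rlimit_nofile  8192;            # raise the per-worker open file limit

    events {
        worker_connections  1024;          # starting point; tune based on ulimit -n
    }

    http {
        sendfile       on;
        tcp_nopush     on;                 # send headers and the start of a file in one packet
        tcp_nodelay    on;                 # do not delay small keep-alive responses

        keepalive_timeout  15;

        open_file_cache          max=2000 inactive=20s;
        open_file_cache_valid    60s;
        open_file_cache_min_uses 2;
        open_file_cache_errors   on;

        access_log  /var/log/nginx/access.log combined buffer=32k;   # buffered access logging
        error_log   /var/log/nginx/error.log warn;                   # warn-level error logging

        gzip        on;
        gzip_static on;                    # serve pre-compressed .gz files where available
    }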

Nginx configuration file:

Prior to rolling out any changes into production, it’s a good idea to first test our configuration files.

  1. We can run the command nginx -t to test our config file. We should make sure that we receive an ‘OK’ result before restarting the service.
    [Figure 09: Test config]

Conclusion:

In this post, we showed how easy it is to set up an Nginx server on Amazon AWS. The reference links below provide a wealth of additional information on how to deploy Nginx under varying scenarios, so please give them a read. Ook!

Road Chimp, signing off.

Reference:

http://nginx.org/en/docs/howto_setup_development_environment_on_ec2.html
http://www.lifelinux.com/how-to-optimize-nginx-for-maximum-performance/
http://blog.martinfjordvald.com/2011/04/optimizing-nginx-for-high-traffic-loads/
https://calomel.org/nginx.html
http://nginxcp.com/forums/Forum-help-and-support
http://wiki.nginx.org/Pitfalls
Configure start-stop script in Nginx
Configuring PHP-FPM on an Nginx AMI
http://theglassicon.com/computing/web-servers/install-nginx-amazon-linux-ami
Custom CentOS: http://www.idevelopment.info/data/AWS/AWS_Tips/AWS_Management/AWS_10.shtml
Chef Configuration for Nginx
Github Cookbooks for Nginx in Chef
http://kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en
http://www.kegel.com/c10k.html
http://forum.directadmin.com/showthread.php?p=137288
https://engineering.gosquared.com/optimising-nginx-node-js-and-networking-for-heavy-workloads
Configure Larger System Open File Limits

PMP Exam Prep – Part 2: PMP Exam Application Tips

Ook!

To a lot of exam candidates, submitting their PMP examination application can be a very daunting task. I’ve built up quite a lot of experience helping my students and colleagues with their PMP Exam Applications and so I decided to write this article in order to provide some tips and tricks that might end up saving you time.

Estimated duration: 20+ hours (no kidding!)
The average applicant takes a minimum of 20 hours to put together their examination application materials; format their experience into projects; quantify their experience into the project knowledge areas and finally click on the submit button. Click here to read up on the certification requirements in a previous post.

Why this much effort?

To understand the amount of effort required to put in a good application, you first need to understand PMI’s rationale. As a non-profit organization, PMI’s major sources of revenue are membership and examination fees. But since most candidates only take an examination once in their lifetime, the bulk of PMI’s revenue comes from membership fees. In other words, PMI must be able to sustain its operations and growth strategies primarily from its pool of members.

It’s also a little-known secret that the PMP examination is not difficult to pass; the fact that there are close to 400,000 certified PMPs globally can attest to that. However, the important thing that PMI has to constantly manage is the validity of its credential. In other words, what quality benchmarks attest to the PMP certification?

By implementing a rigorous screening process, peppered with blind audits, PMI has been able to maintain a successful certification program whose quality can withstand scrutiny of the PMP credential from external auditors.

So what does this boil down to?

  • The PMP examination is not difficult to pass. It reflects knowledge of a framework that is based on industry best practice. If you have the prerequisite amount of experience, you will likely have encountered situations similar to those tested in the exam.
  • PMI puts in place rigorous certification requirements in order to defend the validity of the certification. As a PMP, I take pride in holding this credential, because I know that holders must have met rigorous experience requirements before they could even take the exam.
  • PMI wants you to pass the screening stage. They need more people to take their exams, pass and become lifelong members. That’s how they can grow their organization and keep adding tremendous value to the greater Project Management Profession at large.

Conclusion: The exam application process may take time and effort, but it is well worth it. Also, if 400,000 other people have done this, then you most certainly can as well. The key here is strategy and planning, and I’ve put down some steps to guide you through this process.

PMP Application Steps:

  1. Update your resume
  2. Start looking for previous work experience that qualifies as projects
    1. Definition: Temporary/Unique Deliverable
    2. Role: Show career progression > a contributor role in earlier projects, progressing to the role of PM in more recent projects
    3. Identify 7-10 projects
    4. Each project should represent a minimum of 300 hours of effort (typically 3 months to 1 year in duration)
    5. Each project requires a description:
      1. Scope/scale of project: x customers, y sites, z users
      2. Unique deliverable: what tangible product/service/result was the customer left with? Reports, documentation, application/site, process
      3. Role: What did you do?
      4. Responsibilities:
        1. Early career: Technical deliverables, design, contributing
        2. Lead to PM career: Reports, creating plans, identifying scope/risk
  3. Project hourly breakdown
    1. Out of 100% of hours in a project
      1. 10-15% initiating
      2. 15-25% planning
      3. 25 – 35% executing
      4. 10% monitoring & controlling
      5. 5%  closing

Let’s go through this in greater detail:

1. Update your resume:

You should make sure that your resume is up to date. Pay attention to dates and company information, as you will need this information when you upload all of your examination data to the PMI website.

2. Look for projects:

Now that you have an accurate copy of your resume, start listing out 7-10 sizeable projects from the most recent 5 or 8 years, depending on which category you fit into. (See my previous post for more information.) PMI places no limit on the number of projects you can list in your application, but for the sake of expedience, most candidates find that 7-10 projects is a manageable number to work with effectively.

Each project should fall under PMI’s definition of a project. That is, the project should have a definitive start date and end date (and not carry on indefinitely), and it should have a unique end result or deliverable.

For each project, you will need to obtain the following information that I have listed in this example:

Project Title: Global Infrastructure Migration
Company Name: Monkey Business Inc.
(Company Address and phone number)
Project Duration: 13 months
Hours: 1500 hours
Project Start and Finish: December 2005 – January 2007
Project Description:
1. Scope: 20 locations globally, 3000 users, 350 servers. Team size:  30 IT team members globally. New York was one of six network hub locations within the global network.  I was responsible for the network sites in the North-East region (New York, New Hampshire, Boston).
2. Deliverables: Deliverables included the consolidation and migration of three Windows 2003 Active Directory Domains across 20 sites, 3000 users and 350 servers spanning a global network with offices in North America, Europe and Asia; integration of Exchange 2000 and Exchange 2003 messaging platforms; restructuring of network addressing and routing at offices in the North-East region; deployment of VPN connectivity to branch offices; collaborating with corporate engineers at several key network locations within the global IT infrastructure.  Responsible for updating project team during Weekly status update meetings with IT team, authoring technical elements of project plan, providing duration and cost estimates to project manager.

3. Project Durations

PMI categorizes project effort into 5 categories. Initiating, Planning, Executing, Monitoring and Controlling, and Closing. I’ve provided some guidelines above on how you could break down your project effort into the different process groups. Depending on the phase and scale of the individual project you were working on, as well as your role, these proportions would vary for you. I’ve listed 2 examples below:

Example 1:

Role: Project contributor.
> I was an engineer and performed a lot of implementation work.
Hours: 1500
Initiating: (5%) 75 hours > I attended the kick-off meeting
Planning: (5%) 75 hours > I was involved in some of the initial design and planning work
Executing: (55%) 825 hours > I was heavily involved in the implementation of all sites
Monitoring and Controlling: (20%) 300 hours > I attended all weekly status meetings and provided regular status updates to the PM over the 13-month project.
Closing: (15%) 225 hours > I was involved in the final user acceptance testing and project closeout activities.

Example 2:

Role: Project Manager.
> I was responsible for the deliverables, reporting and risks for this project
Hours: 1500
Initiating: (15%) 225 hours > I attended the kick-off meeting
Planning: (30%) 450 hours > I managed all aspects of the project planning deliverables. I developed the project Budgets, Scope and Timeline and finalized the project risk matrix.
Executing: (10%) 150 hours > I performed Quality Assurance and audits
Monitoring and Controlling: (25%) 375 hours > I was responsible for monitoring project status, compiling progress reports and providing executive summaries to sponsors.
Closing: (20%) 300 hours > I was responsible for final delivery of the project and contractual signoff of all deliverables.

Conclusion:

This seems like a lot of information to put together for each project, but I want you to understand that by helping the application reviewers understand the scale and scope of your projects, as well as your specific responsibilities, you make it easier for them to gauge whether you’re qualified to pass the screening. Don’t forget that the reviewers may not share your industry-specific experience, so try to stay away from industry terminology and nomenclature.