Cloud Architectures – Storage in the Cloud


Brief

Cloud technology is deployed across a wide variety of industries and applications. The term ‘Cloud’ itself has become so prevalent that we’ve devised additional terms to describe what type of cloud we’re talking about. What’s your flavor? IaaS, PaaS or SaaS? Or perhaps it’s Public, Private or Hybrid?

Regardless of the type of cloud you’re using or planning to implement, storage is an essential component of every cloud architecture and cannot be overlooked. In this post, we will look at some of the most common uses of storage in the cloud and peel back the layers to discover exactly what makes them tick. Our goal is to come up with a yardstick to measure storage design.

Drivers towards Cloud Storage adoption


What do Dropbox and Box Inc have in common? Both companies are less than 5 years old, offer services predominantly centered around cloud storage and file sharing, and have been able to attract significant amounts of capital from investors. In fact, Dropbox raised $250 million at a $4 billion valuation, and Box Inc raised another $125 million in mid 2012. It looks like Silicon Valley sees Cloud Storage services as a key piece in the future of cloud. So why is there such tremendous interest in cloud storage? Consumers are drawn to a number of benefits of using the cloud:

  • Redundancy: Large clouds incorporate redundancy at every level. Your data is stored in multiple copies on multiple hard drives on multiple servers in multiple data centers in multiple locations (you get the picture).
  • Geographical Diversity: With a global audience and a global demand for your content, providers can place data physically closer to consumers by storing it at facilities in their country or region. This dramatically reduces round trip latency, a common cause of sluggish Internet performance.
  • Performance: Storage solutions in the cloud are designed to scale dramatically upwards to support events that may see thousands or millions more consumers accessing content over a short period of time. Many services provide guarantees in the form of data throughput and transfer.
  • Security & Privacy: Cloud storage solutions incorporate sophisticated data lifecycle management and security features that enable companies to fulfill their compliance requirements. More and more cloud providers are also offering services that are HIPAA compliant.†
  • Cost: As clouds get larger, the per unit costs of storage go down, primarily due to Economies of Scale. Many service providers choose to pass on these cost savings to consumers as lower prices.
  • Flexibility: The pay-as-you-use model removes much of the burden of up-front capacity planning and avoids wasting resources when usage varies cyclically.
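
The Flexibility point can be made concrete with a little arithmetic. The sketch below compares paying only for the storage used each month with paying all year for enough capacity to cover the busiest month. The price and the monthly usage figures are hypothetical numbers chosen for illustration, not any provider's real pricing.

```python
# Hypothetical numbers for illustration only; not any provider's real pricing.
PRICE_PER_GB_MONTH = 0.10  # assumed price in dollars per gigabyte per month

# Assumed storage in use during each month of one year (GB), with a seasonal peak.
monthly_usage_gb = [200, 220, 250, 300, 450, 800, 1000, 750, 400, 300, 250, 220]


def pay_per_use_cost(usage_gb, price):
    """Bill each month only for the capacity actually consumed."""
    return sum(gb * price for gb in usage_gb)


def provisioned_cost(usage_gb, price):
    """Bill every month for enough capacity to cover the busiest month."""
    return max(usage_gb) * price * len(usage_gb)


if __name__ == "__main__":
    on_demand = pay_per_use_cost(monthly_usage_gb, PRICE_PER_GB_MONTH)
    fixed = provisioned_cost(monthly_usage_gb, PRICE_PER_GB_MONTH)
    print(f"pay per use:            ${on_demand:.2f}")
    print(f"provisioned for peak:   ${fixed:.2f}")
    print(f"idle capacity paid for: ${fixed - on_demand:.2f}")
```

The gap between the two totals is the cost of capacity that sits idle outside the peak, which is the waste the pay as you use model is meant to avoid.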

It should be noted that a Draft opinion paper released by the EU Data Protection Working Party, while not explicitly discouraging Cloud adoption, recommended that Public Sector agencies perform a thorough risk analysis prior to migrating to the cloud. You can read the report here.

Storage Applications for the Cloud

We’ve listed some of the most common applications for cloud storage in this section:

  • Backup: The cloud is perceived to be a viable replacement for traditional backup solutions, boasting greater redundancy and opportunities for cost savings. The Cloud backup market is hotly contested in both the consumer and enterprise markets.
    • In the consumer market, cloud backup services like Dropbox, Microsoft SkyDrive and Google Drive take part of your local hard drive and sync it with the cloud. These pay-for-use services are on the rise, with Dropbox hosting data for in excess of 100 million users within four years of launching its service.
    • In the Enterprise Space, Gartner’s magic quadrant for enterprise backup solutions featured several pureplay Cloud backup providers including Asigra, Acronis and i365. Even leading providers such as CommVault and IBM have launched cloud-based backup solutions. Amazon’s recently launched Glacier service provides a cost-effective backup tier for around $0.01 per gigabyte per month.
      [Figure 01: Gartner Magic Quadrant]
  • File Sharing: File sharing services allow users to post files online and then share them with other users through a combination of Web links and Apps. Services like Mediafire, Dropbox and Box offer a basic cloud backup solution that provides collaboration and link sharing features. On the other end of the spectrum, full-blown collaboration suites such as Microsoft’s Office 365 and Google Apps feature real-time document editing and annotation services.
  • Data Synchronization (between devices): Providers such as Apple’s iCloud, as well as a host of applications including the productivity app Evernote, allow users to keep files, photos and even music synchronized across an array of devices (Desktop, Phone, Tablet etc.), with changes propagated automatically.
  • Content Distribution: Cloud content distribution network (CDN) services are large networks of servers that are distributed across datacenters over the internet. At one point or another, we’ve all used CDNs such as Akamai to enhance our Web browsing experience. Cloud providers offer affordable CDN services, such as the Microsoft Windows Azure Content Distribution Network (CDN) and the Amazon CDN, for serving everything from static files and images to streaming media to a global audience.
  • Enterprise Content Management: Companies are gradually turning to the cloud to manage Organizational compliance requirements such as eDiscovery and Search. Vendors such as HP Autonomy and EMC provide services that feature secure encryption and de-duplication of data assets as well as data lifecycle management.
  • Cloud Application Storage: The trend towards hosting applications in the cloud is driving innovations in how we consume and utilize storage. Leading the way are large cloud service providers such as Amazon and Microsoft, who have developed cloud storage services to meet specific application needs.
    • Application Storage Services: Products like Amazon Simple Storage Service (S3) and Microsoft Windows Azure Storage Account support storage in a variety of formats (blob, queue and table data) and scale to very large sizes (up to 100TB volumes). Storage services are redundant (at least 3 copies of each bit stored) and can be accessed directly over HTTP, typically through REST APIs that exchange XML, or through a number of other supported protocols. Storage services also support encryption on disk.
      [Figure 02: Azure storage]
    • Performance Enhanced Storage: Performance enhanced storage emulates storage running on a SAN. Products like Amazon Elastic Block Store provide persistent, block-level network attached storage that can be attached to running virtual machines, and in some cases VMs can even boot directly from these volumes. Users can allocate performance to these volumes in terms of IOPS.
    • Data Analytics Support: Innovative distributed file systems that support super-fast processing of data have been adapted to the cloud. For example, the Hadoop Distributed File System (HDFS) manages and replicates large blocks of data across a network of computing nodes, to facilitate the parallel processing of Big Data. The Cloud is uniquely positioned to serve this process, with the ability to provision thousands of nodes, perform compute processes on each node and then tear down the nodes rapidly, thus saving huge amounts of resources. Read how the NASA Mars Rover project used Hadoop on Amazon’s AWS cloud here.
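
To illustrate the block replication idea behind the Data Analytics Support item, here is a minimal sketch, assuming a simple round-robin placement. It is not how HDFS itself decides where replicas go (real HDFS placement is rack-aware and considerably more involved). The function names and node names are hypothetical and exist only for this example.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int = 3) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes, round-robin.

    A simplification for illustration: consecutive blocks start on
    consecutive nodes, so replicas are spread evenly across the cluster.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def all_blocks_readable(placement: dict[int, list[str]], failed: set[str]) -> bool:
    """True if every block still has at least one replica on a live node."""
    return all(any(node not in failed for node in replicas)
               for replicas in placement.values())


if __name__ == "__main__":
    blocks = split_into_blocks(b"x" * 10, block_size=4)
    placement = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"])
    for block_id, replicas in placement.items():
        print(block_id, replicas)
    print("readable with n1 and n2 down:", all_blocks_readable(placement, {"n1", "n2"}))
```

With three replicas per block, any two nodes can fail and every block remains readable. Because each node holds a share of the blocks, compute jobs can also run on many nodes in parallel, each working on the blocks stored locally.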

Storage Architecture Basics

[Figure 03: Generic Storage Architecture]

So how do these cloud-based services run? If we were to peek under the hood, we would see a basic architecture that is pretty similar to the diagram above. All storage architectures consist of a number of layers that work together to provide users with a seamless storage service. The different layers of a cloud storage architecture are listed below:

  • Front End: This layer is exposed to end users and typically provides APIs that allow access to the storage. New access protocols are regularly introduced to broaden support for cloud systems, including Web Service front ends using REST principles, file-based front ends and even iSCSI. So for example, a user can use an App running on their desktop to perform basic functions such as creating folders, uploading and modifying files, defining permissions and sharing data with other users. Examples of access methods and sample providers are listed below:
    • REST APIs: REST, or Representational State Transfer, is a stateless Web architecture model built on communication between clients and servers. Examples include Microsoft Windows Azure storage and Amazon Web Services Simple Storage Service (S3).
    • File-based Protocols: Protocols such as NFS and CIFS are supported by vendors like Nirvanix, Cleversafe and Zetta*.
  • Middleware: The middleware or Storage Logic layer supports a number of functions including data deduplication and reduction; as well as the placement and replication of data across geographical regions.
  • Back End: The back end layer is where the actual physical hardware is implemented, and where read and write instructions are carried out through the Hardware Abstraction Layer.
  • Additional Layers: Depending on the purpose of the technology, there may be a number of additional layers:
    • Management Layer: This layer may support scripting and reporting capabilities to enhance automation and provisioning of storage.
    • Backup Layer: The cloud back end layer can be exposed directly to API calls from Snapshot and Backup services. For example, Amazon’s Elastic Block Store (EBS) service supports an incremental snapshot feature.
    • DR (Virtualization) Layer: DR service providers can attach storage to a Virtual hypervisor, enabling cloud storage data to be accessed by Virtual Hosts that are activated in a DR scenario. For example, the i365 cloud storage service automates the process of converting backups of server snapshots into a virtual DR environment in minutes.
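
The way the three core layers hand work to each other can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not any vendor's implementation: the class names, the region names and the choice of SHA-256 content hashing for deduplication are all hypothetical choices made for this example.

```python
import hashlib


class BackEnd:
    """Stands in for the physical storage in one region (here, just a dict)."""

    def __init__(self, region: str):
        self.region = region
        self.chunks: dict[str, bytes] = {}


class Middleware:
    """Storage logic: deduplicates by content hash and replicates across regions."""

    def __init__(self, back_ends: list[BackEnd]):
        self.back_ends = back_ends

    def store(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        for back_end in self.back_ends:
            # Identical content hashes to the same key, so it is kept only once.
            if digest not in back_end.chunks:
                back_end.chunks[digest] = data
        return digest

    def load(self, digest: str) -> bytes:
        # Any region that still holds the chunk can serve the read.
        for back_end in self.back_ends:
            if digest in back_end.chunks:
                return back_end.chunks[digest]
        raise KeyError(digest)


class FrontEnd:
    """User-facing API: maps file paths to stored content."""

    def __init__(self, middleware: Middleware):
        self.middleware = middleware
        self.index: dict[str, str] = {}  # path -> content digest

    def put(self, path: str, data: bytes) -> None:
        self.index[path] = self.middleware.store(data)

    def get(self, path: str) -> bytes:
        return self.middleware.load(self.index[path])


if __name__ == "__main__":
    regions = [BackEnd("us-east"), BackEnd("eu-west")]  # hypothetical region names
    storage = FrontEnd(Middleware(regions))
    storage.put("/docs/report.txt", b"quarterly numbers")
    storage.put("/backup/report.txt", b"quarterly numbers")  # duplicate content
    print(len(regions[0].chunks))            # one chunk stored for two paths
    regions[0].chunks.clear()                # simulate losing a region
    print(storage.get("/docs/report.txt"))   # still served from the other region
```

Keeping the layers separate means the front end never needs to know how many copies exist or where they live, which is what lets providers change replication and deduplication policies without changing the API that users see.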

Conclusion:

This brief post provided a simple snapshot of cloud storage, its various uses and a number of common applications for storage in the cloud. If you’d like to read more, please visit some of the links provided below.

Roadchimp, signing out! Ook!

Reference:

* Research Paper on Cloud Storage Architectures here.
Read a Techcrunch article on the growth of Dropbox here.
Informationweek Article on Online Backup vs. Cloud Backup here.
Read more about IBM Cloud backup solutions here.
Read about Commvault Simpana cloud backup solutions.
