Last modified on 1 July 2009, at 09:00

Cloud Computing

Revision as of 09:00, 1 July 2009 by Lakshmikantg (Talk | contribs) (Cloud computing applications in different industry segments)

Summary
Cloud computing is one of the most talked-about topics today, widely described as a phenomenon set to change the way we use computers. It refers to the delivery of software and other technology services over the Internet by a service provider, and has been widely acknowledged as a viable way to reduce capital expenditures and operational costs. Although many companies have embraced this technology, some are unwilling to switch from internally owned and managed IT systems to cloud computing because of fears about security threats and loss of control over company systems and data.

With its growing popularity, a large number of firms have started providing this service. This report compares some of these offerings on parameters such as infrastructure, data storage system, supported applications and frameworks, scalability and security.

The worldwide market for cloud services is expected to grow from $46.4 billion in 2008 to $150.1 billion in 2013, a CAGR of 26.5%. Among the various cloud services, business process services alone contributed 84% of the total revenue generated in 2008.
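
As a quick check, the growth rate can be recomputed from the two market-size figures quoted above. The sketch below assumes the standard compound annual growth rate formula applied over the five years from 2008 to 2013; the function name `cagr` is only illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes quoted above, in billions of US dollars.
rate = cagr(46.4, 150.1, 2013 - 2008)
print(f"CAGR 2008-2013: {rate:.1%}")
```

Under that assumption the result rounds to the 26.5% quoted in the text.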

In this report, we describe some of the industry segments that have successfully used cloud computing. We also take a brief look at how photo-sharing websites such as SmugMug are using cloud computing.


Overview

  • Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.
  • Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.
  • Cloud computing services often provide common business applications online that are accessed from a web browser, while the software and data are stored on the servers.


The concept generally incorporates combinations of the following:

Market overview

Cloud services provided worldwide are:

Companies providing cloud computing technology

Y-on-Y revenue from cloud services

Cloud services - revenue breakup

Pros and cons of cloud computing

Key drivers of cloud computing


Primary research - Amazon Web Services

In our primary research, we talked to an AWS executive to gain insights into how startups and enterprise companies are using cloud computing, and into the drivers and constraints for each. The insights are summarized below:

Startups

[Figure: Startups-cloud.jpg]

Enterprise

[Figure: Enterprise.jpg]

Cloud computing comparison of different vendors

Offerings:

  • Amazon.com Inc.: EC2 (Elastic Compute Cloud) plus S3 (Simple Storage Service)
  • Microsoft: Windows Azure
  • GoGrid (ServePath LLC): GoGrid
  • Google: Google App Engine
  • Joyent Inc.: Accelerator
  • Layered Technologies Inc. (3tera Inc. partner): GridLayer
  • Mosso (Rackspace US Inc.): The Hosting Cloud
  • Terremark Worldwide Inc.: Enterprise Cloud
  • Xcalibre Communications Ltd.: FlexiScale

Provider's Infrastructure:

  • Amazon: Runs on Amazon.com's infrastructure. Each availability zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable
  • Microsoft: Microsoft's infrastructure
  • GoGrid: Uses the ServePath network, including OC-12 and GB connections from UUNet, Level 3 Communications Inc., NTT Communications/Verio Inc. and AboveNet Inc.
  • Google: Google's infrastructure
  • Joyent: AMD Opteron x64 multicore servers with 4GB RAM per core. Joyent operates two carrier-grade datacenters
  • Layered Technologies: Uses other providers' networks, including SAVVIS Inc.'s backbone IP network, which is based on Juniper core routers and Cisco Communications Inc. carrier-class equipment and has OC-48/OC-192 backbone trunks; the DataBank IP network. Runs 3tera's AppLogic grid OS
  • Mosso: Built on Rackspace's Superstructure, including its T1, SAS70-certified datacenters
  • Terremark: Powered by Terremark's Infinistructure utility-computing platform and runs on the multinational hosting provider's network; includes plug-and-play access to 160 global carriers
  • Xcalibre: High-end x86-based servers, high-speed multi-GB network, high-end storage back end

Provider's Data-Storage System:

  • Amazon: Amazon S3 Web service
  • Microsoft: Azure Data Storage REST APIs; data storage functionality through three services: BLOB, TABLE and QUEUE
  • GoGrid: Storage is part of the server image; vendor plans to offer additional storage options
  • Google: Persistent storage with queries, sorting and transactions
  • Joyent: Sun Microsystems Sun Fire X4500 NAS storage
  • Layered Technologies: Storage is built into the grid; data is stored on multiple hard drives across multiple machines for redundancy
  • Mosso: Uses network-attached storage devices
  • Terremark: Undisclosed
  • Xcalibre: Persistent storage based on a fully virtualized SAN/NAS back end

Supported Operating Systems:

  • Amazon: Linux
  • Microsoft: Communication protocols such as SOAP, REST and XML allow use of other OSes
  • GoGrid: Linux, Microsoft Windows, CentOS and Red Hat Enterprise
  • Google: Linux, Microsoft Windows and Mac OS X
  • Joyent: OpenSolaris
  • Layered Technologies: Linux and Solaris; plans to support Windows
  • Mosso: Linux and Microsoft Windows
  • Terremark: Linux, Microsoft Windows and Solaris
  • Xcalibre: Linux and Microsoft Windows

Supported Languages:

  • Amazon: Linux and Red Hat Enterprise
  • Microsoft: .NET, Eclipse, Ruby, PHP, Java and Python
  • GoGrid: Java, .NET, Perl, PHP, Python, Ruby on Rails and most shell-scripting languages
  • Google: Python
  • Joyent: Java, PHP, Python and Ruby on Rails
  • Layered Technologies: Grid nodes will run any software that runs on a normal dedicated computer
  • Mosso: .NET, Perl, PHP, Python and Ruby on Rails
  • Terremark: Undisclosed
  • Xcalibre: Linux and Microsoft Windows

Supported Applications/Framework:

  • Amazon: MySQL Enterprise and OpenSolaris
  • Microsoft: ASP.NET and WCF (Windows Communication Foundation)
  • GoGrid: Apache, Facebook applications, IIS, MySQL Enterprise, PostgreSQL and Windows Server
  • Google: Django. Services include URL Fetch, Memcache and image manipulation
  • Joyent: Ruby on Rails
  • Layered Technologies: Apache, JBoss, MySQL Enterprise; anything that runs under a supported OS
  • Mosso: Apache, Microsoft SQL and MySQL Enterprise
  • Terremark: Undisclosed
  • Xcalibre: Through a partnership with CohesiveFT, customers can access a large number of preconfigured application stacks

Scalability:

  • Amazon: Limited to 20 virtual computer instances during beta period; additional instances are allowed
  • Microsoft: No limits on scalability; additional VMs are allocated as processing load increases
  • GoGrid: No limits on scalability
  • Google: Up to 5 million page views per month with preview release
  • Joyent: Contact vendor
  • Layered Technologies: Up to 43 nodes. Bandwidth, RAM and CPU are changed on-the-fly. Process can be resized (2 minutes per 1GB of data)
  • Mosso: Unlimited. Current users are pushing hundreds of millions of requests on single domains
  • Terremark: Undisclosed
  • Xcalibre: 5 VDSes (Virtual Dedicated Servers) per account; more machines are available on request

Security:

  • Amazon: Provides Web-service interfaces to configure firewall settings that control network access to and between groups of instances
  • Microsoft: Executes applications in dedicated virtual machines; each VM provides a 64-bit Windows Server 2008 environment, and VMs prevent data leakage between applications executing on the same server hardware
  • GoGrid: Provided via ServePath's secure infrastructure and telecom facility
  • Google: Service runs on Google's secure infrastructure. App Engine provides a secure sandbox environment
  • Joyent: Spam protection; advanced traffic security, SSL acceleration and Advanced DNS available as add-ons
  • Layered Technologies: All grid nodes are locked down to maintain access only when firewalls and other security features are put in place. Also, a front-end DDoS (distributed denial of service) mitigation service is available
  • Mosso: Enterprise firewalls; email accounts include anti-virus and spam protection. SSL capabilities available as an add-on service
  • Terremark: Infrastructure is SAS 70 Type II certified. Network includes integrated firewalls and private VLAN architecture; connections to the Infinicenter management console are secured by SSL
  • Xcalibre: Each customer has a VLAN; VDSes are separated by a Xen implementation; and customer data is stored in a T1 storage back end

Virtualization Technology:

  • Amazon: Xen
  • Microsoft: Modified Hyper-V hypervisor
  • GoGrid: Xen
  • Google: Undisclosed
  • Joyent: Solaris Zones
  • Layered Technologies: Based on 3tera AppLogic
  • Mosso: Undisclosed
  • Terremark: Undisclosed
  • Xcalibre: Xen

Redundancy Features:

  • Amazon: Ability to place server instances in multiple locations, and elastic IP addresses
  • Microsoft: Triple-layer redundancy to keep data safe and availability of services high
  • GoGrid: RAID servers; plans to offer server snapshots and cloning
  • Google: Fault-tolerant servers
  • Joyent: Undisclosed
  • Layered Technologies: Backup and snapshot feature for customers' data
  • Mosso: Clusters
  • Terremark: Automated resource balancing for monitoring and optimization
  • Xcalibre: A failed server is automatically removed from the cluster; VDSes running on it are automatically and transparently restarted on other servers

Load Balancing:

  • Amazon: Undisclosed
  • Microsoft: Yes; routes traffic to active nodes only
  • GoGrid: Yes; F5 Network BigIP load balancers
  • Google: Yes
  • Joyent: Yes; F5 Network BigIP load balancers
  • Layered Technologies: No. Customers can set up their own load balancing
  • Mosso: Yes; load-balancing layer includes logic for multiple IP addresses for each customer site
  • Terremark: Undisclosed
  • Xcalibre: Yes; as optional add-on

Control Panel:

  • Amazon: Web-service interface
  • Microsoft: Web interface
  • GoGrid: Yes; proprietary multiserver hosting control panel lets you manage servers and scale Web applications and networks
  • Google: Proprietary, the Administration Console
  • Joyent: Undisclosed
  • Layered Technologies: All servers, storage, applications and users are managed from a single, browser-based management console
  • Mosso: Proprietary to Mosso
  • Terremark: Infinicenter console for deploying, configuring and managing server and network infrastructure
  • Xcalibre: Yes; API (application programming interface) also available. Control panel includes usage-tracking tool

Development Tools:

  • Amazon: Command-line tools for building AMIs
  • Microsoft: Integration into Visual Studio, support for any .NET languages, complete command-line SDK tools
  • GoGrid: No, not necessary; plans to release public API with the same control as the Web interface
  • Google: Python runtime environment, Sandbox
  • Joyent: Accelerators, central development and deployment, version control, unit test site and staging site available as add-ons
  • Layered Technologies: AppLogic has a scriptable command-line interface for provisioning and scaling applications
  • Mosso: None
  • Terremark: Undisclosed
  • Xcalibre: No

Additional Cloud-Storage Service:

  • Amazon: Included in cloud service
  • Microsoft: Undisclosed
  • GoGrid: No
  • Google: Datastore, a distributed data-storage service
  • Joyent: Additional storage available for $.15 per GiB
  • Layered Technologies: DynaVol; ranges from $15/month to $1,300/month
  • Mosso: Limited; CloudFS is in private beta
  • Terremark: No
  • Xcalibre: No

Technical overview

Cloud computing architecture

Three major participants in cloud:

  1. Cloud Providers: those building out clouds, for instance Google and Amazon; effectively the technology providers.
  2. Cloud Adopters / Developers: those developing services over the cloud, some of whom are becoming the first generation of cloud ISVs.
  3. Cloud "End" Users: those using cloud-provisioned services, often without knowing that they are cloud provisioned, e.g. Facebook users generally have no idea that their favorite Facebook app is running on AWS.

Various architectural layers:

  1. Operations: supports functional business processes rather than the technology itself.
  2. Service layer: made up of application code, bespoke code and high-level ISV offerings.
  3. Platform layer: made up of standard platform software, i.e. application servers, database servers, web servers, etc.; an example implementation would be a LAMP stack.
  4. Infrastructure layer: made up of
(i) infrastructure software (i.e. virtualisation and OS software)
(ii) the hardware platform and server infrastructure
(iii) the storage platform
  5. Network layer: made up of routers, firewalls, gateways and other network technology.

Delta of Effort / Delta of Opportunity

The gap between the cloud providers and the end cloud users is known as the delta of effort and also the delta of opportunity.

It is the delta of effort in terms of the skills, abilities, experience and technology that the cloud adopter needs to deliver a functional service to their own “End Users”. This is potentially a major area of cost for cloud adopters, but it is also the delta of opportunity in terms of 'room' to innovate.


Cloud computing applications in different industry segments

Pharmaceutical

  • Historically, most large pharmaceutical firms have run fully integrated vertical business models, doing all they can in-house and outsourcing selectively where appropriate
  • Firms now seek to reduce time and cost by sharing an integrated platform that is cheaper, less time-consuming and more supportive of a networked business model
  • Pharma companies exploring cloud computing have reported positive experiences through:
      • Easier implementation
      • More computational transparency
      • A clear-cut IP policy
      • Scalable invoicing

Pfizer

  • Protein engineers and informaticians at Pfizer’s Biotherapeutics and Bioinnovation Center faced the challenging task of antibody docking, which presented enormous computational roadblocks.
  • Each antibody model requires models of the protein’s three-dimensional structure to be generated on the Rosetta++ platform, which carries out refinement of antibody docking.
  • Refinement involves small local perturbations around the binding site followed by evaluation with Rosetta’s energy function. This is an iterative process that requires a massive amount of computing and was run on a 200-node cluster.
  • An array of Rosetta workers is spun up on Amazon’s EC2. S3 stores inputs and outputs, SimpleDB tracks job metadata, and the Simple Queue Service glues it all together with message passing. The entire process, which previously took months, is now completed overnight.
  • Consequently, research staff can focus on results instead of putting their projects on the back shelf.
Source: bio-itworld
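
The division of labour described above (a queue that hands out jobs, workers that compute, an object store for inputs and outputs, and a small database for job metadata) can be sketched with in-memory stand-ins. This is only an illustration of the pattern, not Pfizer's code: `queue.Queue` stands in for the Simple Queue Service, plain dictionaries stand in for S3 and SimpleDB, threads stand in for EC2 instances, and `refine` is a hypothetical placeholder for a Rosetta refinement run.

```python
import queue
import threading

job_queue = queue.Queue()   # stand-in for the Simple Queue Service
object_store = {}           # stand-in for S3: key -> data
job_metadata = {}           # stand-in for SimpleDB: job id -> status
lock = threading.Lock()     # guards the two dictionaries


def refine(model):
    """Hypothetical placeholder for one docking refinement run."""
    return sum(model) / len(model)


def worker():
    """Pull job ids until a None sentinel arrives, like one EC2 worker."""
    while True:
        job_id = job_queue.get()
        if job_id is None:
            break
        with lock:
            model = object_store[f"input/{job_id}"]
            job_metadata[job_id] = "running"
        score = refine(model)
        with lock:
            object_store[f"output/{job_id}"] = score
            job_metadata[job_id] = "done"


def run_jobs(inputs, n_workers=4):
    """Upload inputs, enqueue one message per job, and wait for the workers."""
    for job_id, model in inputs.items():
        object_store[f"input/{job_id}"] = model
        job_metadata[job_id] = "queued"
        job_queue.put(job_id)
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for _ in threads:
        job_queue.put(None)   # one stop sentinel per worker
    for t in threads:
        t.join()
    return {job_id: object_store[f"output/{job_id}"] for job_id in inputs}


results = run_jobs({"ab-1": [1.0, 2.0, 3.0], "ab-2": [4.0, 6.0]})
print(results)
```

Because the workers share nothing except the queue and the two stores, adding capacity is a matter of raising `n_workers`, which mirrors spinning up more instances.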

Eli Lilly

  • Eli Lilly is using cloud computing services to support scientists with on-demand processing power and storage.
  • The firm uses Amazon Web Services and other cloud services to provide high-performance computing, as needed, to hundreds of its scientists.
  • Eli Lilly foresees the possibility of using cloud services from half a dozen different vendors, and the need for an “orchestration layer” that sits between Eli Lilly and the various cloud services.
  • It would comprise algorithms that determine the best cloud service for a particular job based on lowest cost, highest performance, or other requirement. Such an approach would make it possible for Eli Lilly and other users to write to a single API rather than many, while optimizing service usage.
  • The firm is also exploring the potential to use cloud computing for external collaboration between Eli Lilly and outside researchers.
Source: undertheradarblog
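
The orchestration layer idea can be illustrated with a very small selector. This is a sketch under stated assumptions, not Eli Lilly's design: the vendor names, prices and speed scores are made up, and `choose_service` and `submit_job` are hypothetical names for the single API that callers would write to.

```python
from dataclasses import dataclass


@dataclass
class CloudService:
    name: str
    cost_per_hour: float    # US dollars, made-up figure
    relative_speed: float   # higher is faster, made-up figure


# Hypothetical catalogue; names and numbers are placeholders.
SERVICES = [
    CloudService("vendor-a", cost_per_hour=0.10, relative_speed=1.0),
    CloudService("vendor-b", cost_per_hour=0.25, relative_speed=2.2),
    CloudService("vendor-c", cost_per_hour=0.40, relative_speed=3.0),
]


def choose_service(services, requirement):
    """Pick a service for a job by 'lowest_cost' or 'highest_performance'."""
    if requirement == "lowest_cost":
        return min(services, key=lambda s: s.cost_per_hour)
    if requirement == "highest_performance":
        return max(services, key=lambda s: s.relative_speed)
    raise ValueError(f"unknown requirement: {requirement}")


def submit_job(job_name, requirement):
    """Single entry point: callers never address a vendor directly."""
    service = choose_service(SERVICES, requirement)
    return f"{job_name} -> {service.name}"


print(submit_job("docking-run", "lowest_cost"))
print(submit_job("docking-run", "highest_performance"))
```

Keeping the vendor choice behind one function is what lets users write to a single API while the selection rule changes underneath.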

Johnson and Johnson

  • Johnson and Johnson is seeking to complement its high-performance grid architecture with cloud computing, mainly in the area of drug discovery modeling applications, according to Rick Franckowiak, director for systems engineering at the Pharmaceutical Research & Development IT organization at Johnson & Johnson.
  • The aim is to obtain enhanced computing and storage capabilities and to address spike-type processing demand.
Source: fiercebiotechit

Indigo BioSystems Inc.

  • Indigo BioSystems Inc. is a privately held company offering data management and automated analysis solutions for life science researchers, with a focus on the pharmaceutical industry.
  • It has deployed IBM's Compute on Demand cloud services to provide clients with a highly secure, scalable and compliant environment for data exchange.
  • IBM’s cloud services have met the clients’ requirements for a scalable and globally accessible platform for data exchange, along with the security and regulatory compliance needs of the pharmaceutical industry.
Source: prdomain

Hospitality

Cloud computing is used in the hospitality industry to:

  • Provide disaster recovery infrastructure for mission-critical applications
  • Run online reservation systems
  • Purchase a dedicated pool of computing resources and allocate them as needed
  • Respond to real-time situations
e.g. Preferred Hotel Group is using Terremark's Enterprise Cloud services

Source: phx.corporate-ir.net

Web applications

Cloud computing is used in various web applications, such as Microsoft's Hotmail, Google's Gmail and YouTube, and Yahoo's Flickr photo-sharing service.

  • Consumers run only their browsers on local computers
  • The rest of the software, along with users' email messages, photos or videos, resides on remote machines that the user can't see and doesn't need to know anything about
e.g. Microsoft's Hotmail is using Azure (the cloud computing platform from Microsoft) and Google's Gmail is using Google Apps
  • Google Docs, which offers online versions of word processor and spreadsheet applications, also uses cloud computing

Source: www.htrends.com

Consumer electronics

Cloud computing is being used in consumer electronics at various levels:

  • In laptop computers, for wireless communication and access to the Internet
  • Such laptops need only minimal hardware, which reduces cost, and an Internet connection
  • In the gaming industry, where it will allow iPhones and other thin-client devices to have high-end graphics without a big, expensive, hot video card that drains the battery
e.g. Netbooks, general-purpose laptops that rely on cloud computing, are available at a price of US$400, and some even in the US$50-100 range.

Source: www.htrends.com

Retail

  • Cloud services allow retailers to handle demand peaks in online services dynamically, without worrying about provisioning for high availability
  • The pay-as-you-go model saves cost compared with paying for expensive hardware that meets these peaks but is underused for the rest of the year
  • Cloud services increase productivity and help companies serve their customers better
e.g. Amazon is using cloud computing for its online retail services

Source: www.onwindows.com
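
The cost argument for retailers can be made concrete with a small calculation. Every number below is invented purely for illustration (the server counts, the monthly cost of an owned server and the hourly cloud rate are all assumptions); only the shape of the comparison, paying for peak capacity all year versus paying for what is used each month, comes from the text above.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def owned_cost(peak_servers, monthly_cost_per_server, months=12):
    """Owning hardware: pay for peak capacity in every month."""
    return peak_servers * monthly_cost_per_server * months


def pay_as_you_go_cost(servers_by_month, hourly_rate):
    """Cloud: pay only for the servers actually running in each month."""
    return sum(n * HOURS_PER_MONTH * hourly_rate for n in servers_by_month)


# Hypothetical retailer: 10 servers for ten months, 50 in the two peak months.
demand = [10] * 10 + [50] * 2
owned = owned_cost(peak_servers=max(demand), monthly_cost_per_server=150.0)
cloud = pay_as_you_go_cost(demand, hourly_rate=0.30)

print(f"owned hardware sized for peak: ${owned:,.0f} per year")
print(f"pay-as-you-go:                 ${cloud:,.0f} per year")
```

With these made-up inputs the pay-as-you-go bill is roughly half the cost of hardware sized for the peak; a flatter demand curve or a higher hourly rate would narrow or reverse the gap.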

Financial Services

Cloud computing is used by financial services firms:

  • To store and analyze large amounts of stock market or historical data
  • To build and evaluate new risk analysis programs
  • To bring applications to market quickly and deploy them within a limited budget
e.g. The NASDAQ Market Replay application uses Amazon's S3 cloud for data storage, but the application itself does not currently run in the cloud. NASDAQ plans to develop future applications on Amazon's EC2.

Companies using cloud computing

The New York Times

Major League Baseball

ESPN

Hasbro

Chicago

Cybernet Slash Support (CSS)

Intuit

Activision

Cloud computing in online photo sharing

Leading photo-sharing websites are taking advantage of cloud computing benefits such as scalability, reduced hardware costs and extensive reach.

SmugMug

  • SmugMug is a premium photo sharing web site with an emphasis on professional photography.
  • Members can upload an unlimited number of photos, create new functionality, tag photos, and upload from Picasa, iPhoto and other software packages.
Source: Wikipedia
  • SmugMug uses Amazon S3 mainly as a storage solution for customers’ photos, which can be accessed anywhere and at any time, while also providing real-time backup and failover systems.


The architecture basically consists of three software components:

TweetPhoto

TweetPhoto is a free photo-sharing service that complements the social-media service Twitter. It runs on Mosso, Rackspace Hosting’s cloud-computing unit, and lets users easily upload as many photos as they want and automatically share them with Twitter followers and Facebook friends. It also allows users to conduct photo searches, subscribe to RSS media feeds, geotag photos from GPS-enabled smartphones, transfer bio and friend information from Twitter, and monitor which other users are viewing and commenting on their photos.

Source: www.bizjournals.com

TweetPhoto utilizes the Cloud Sites and Cloud Files services from Rackspace.

  • Cloud Sites is a scalable platform that can store and recall large amounts of metadata in real time and can run Windows or Linux applications across hundreds of servers.
  • Cloud Files provides unlimited online storage for media, which is served out via Limelight Networks' Content Delivery Network (CDN)
  • A CDN is a system of computers networked together on the Internet. It works on the principle that the combined capacity of strategically placed servers can be higher than the network backbone capacity. Strategically placed edge servers:
      • Decrease the load on interconnects, public peers, private peers and backbones
      • Free up capacity
      • Lower delivery costs
      • Offload traffic from peering links by redirecting it to edge servers

Source: Wikipedia, businesswire.com, blog.mosso.com
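
How edge servers take load off the origin and the backbone can be shown with a toy model. This is a minimal sketch, not Limelight's or Rackspace's API: the `Origin` and `EdgeServer` classes, the region names and the file path are hypothetical, and real CDNs add cache expiry, request routing and much more.

```python
class Origin:
    """Stand-in for the origin storage that sits behind the CDN."""

    def __init__(self, files):
        self.files = files
        self.requests = 0   # counts trips that reach the origin

    def fetch(self, path):
        self.requests += 1
        return self.files[path]


class EdgeServer:
    """Edge node that keeps a copy of whatever it fetched from the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path not in self.cache:      # cache miss: one trip to the origin
            self.cache[path] = self.origin.fetch(path)
        return self.cache[path]         # cache hit: served from the edge


origin = Origin({"/photos/1.jpg": b"jpeg-bytes"})
edges = {"us": EdgeServer(origin), "eu": EdgeServer(origin)}

# 1,000 viewers per region request the same photo from their nearest edge.
for region in ("us", "eu"):
    for _ in range(1000):
        edges[region].get("/photos/1.jpg")

print("requests served at the edge:", 2000)
print("requests that reached the origin:", origin.requests)
```

In this model only the first request in each region travels back to the origin; every later request is answered by the edge server.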

Flickr

  • Flickr is an image and video hosting website, web services suite, and online community platform.
  • It is used to share personal photographs and is also widely used by bloggers as a photo repository.


Flickr architecture:

The platform for Flickr is:

  • PHP
  • MySQL
  • Shards
  • Memcached for a caching layer.
  • Squid in reverse-proxy mode for HTML and images.
  • Smarty for templating
  • Perl
  • PEAR for XML and Email parsing
  • ImageMagick, for image processing
  • Java, for the node service
  • Apache
  • SystemImager for deployment
  • Ganglia for distributed system monitoring
  • Subcon stores essential system configuration files in a subversion repository for easy deployment to machines in a cluster.
  • Cvsup for distributing and updating collections of files across a network.

Source: highscalability.com
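
The "Shards" entry in the list above refers to splitting data across several database servers. The sketch below illustrates the general idea only and makes no claim about Flickr's actual scheme: the modulo mapping, the shard count and the function names are assumptions, and each dictionary stands in for one MySQL shard.

```python
NUM_SHARDS = 4  # hypothetical shard count

# Each dict stands in for one MySQL shard: user id -> list of photo titles.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(user_id):
    """Map a user to a shard; all rows for that user live on that shard."""
    return user_id % NUM_SHARDS


def add_photo(user_id, title):
    """Write a photo record to the shard that owns the user."""
    shards[shard_for(user_id)].setdefault(user_id, []).append(title)


def photos_of(user_id):
    """Only one shard has to be queried for a given user."""
    return shards[shard_for(user_id)].get(user_id, [])


add_photo(42, "sunset.jpg")
add_photo(42, "harbour.jpg")
add_photo(7, "cat.jpg")
print(shard_for(42), photos_of(42))
```

A fixed modulo mapping is the simplest choice but makes it hard to add shards later; production systems often use a lookup table from user to shard instead.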


Barriers to the adoption of cloud computing