Difference between revisions of "Cloud Computing"
Lakshmikantg (Talk | contribs) (→Cloud computing applications in different industry segments) |
Lakshmikantg (Talk | contribs) (→Startups) |
Revision as of 00:01, 15 July 2009
Summary
One of the most talked about topics today is Cloud Computing – the new phenomenon set to change the way we use computers forever. Cloud computing refers to the delivery of software and other technology services over the Internet by a service provider and has been widely acknowledged as a viable way to reduce capital expenditures and operational costs. Although many companies have embraced this technology, some are unwilling to switch from internally owned and managed IT systems to cloud computing technologies due to fears of security threats and loss of control over company systems and data.
With its growing popularity, a large number of firms have started providing this service. We compare some of these offerings on parameters such as infrastructure, data storage system, supported applications/frameworks, scalability and security.
The worldwide market for cloud services is expected to grow from $46.4 billion in 2008 to $150.1 billion in 2013, a CAGR of 26.5%. Among the various cloud services, business process services alone contributed 84% of the total revenue generated in 2008.
In this report, we mention some of the industry segments which have successfully utilized cloud computing. We also take a brief look into how some of the photo-sharing websites like SmugMug are using cloud computing.
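The growth rate quoted in the summary can be checked with the standard compound annual growth rate formula. The sketch below is only an arithmetic check; the function name `cagr` is our own choice, and the five-year period is assumed from the 2008 and 2013 endpoints.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size figures from the summary, in billions of US dollars.
rate = cagr(46.4, 150.1, 2013 - 2008)
print(f"CAGR 2008-2013: {rate:.1%}")
```

With these inputs the result rounds to the 26.5% stated in the summary.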
Contents
- 1 Overview
- 2 Market overview
- 3 Primary research - Amazon Web Services
- 4 Cloud computing comparison of different vendors
- 5 Technical overview
- 6 Cloud computing applications in different industry segments
- 7 Companies using cloud computing
- 8 Cloud computing in online photo sharing
- 9 Barriers to the adoption of cloud computing
Overview
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.
- Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.
- Cloud computing services often provide common business applications online that are accessed from a web browser, while the software and data are stored on the servers.
The concept generally incorporates combinations of the following:
Market overview
Cloud services provided worldwide are:
Companies providing cloud computing technology
Y-on-Y revenue from cloud services
Cloud services - revenue breakup
Pros and cons of cloud computing
Key drivers of cloud computing
Primary research - Amazon Web Services
In our primary research, we talked to an AWS executive to gain insights into how startups and enterprise companies are using cloud computing, and into the drivers and constraints involved. The key insights are summarized below:
Startups
Enterprise
Cloud computing comparison of different vendors
Vendor | Amazon.com Inc. | Microsoft | GoGrid (ServePath LLC) | Google | Joyent Inc. | Layered Technologies Inc. (3tera Inc. Partner) | Mosso (Rackspace US Inc.) | Terremark Worldwide Inc. | Xcalibre Communications Ltd. |
Offerings | EC2 (Elastic Compute Cloud) plus S3 (Simple Storage Service) | Windows Azure | GoGrid | Google App Engine | Accelerator | GridLayer | The Hosting Cloud | Enterprise Cloud | FlexiScale |
Provider’s Infrastructure | Runs on Amazon.com’s infrastructure. Each availability zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable | Microsoft’s infrastructure | GoGrid uses the ServePath network, including OC‐12 and GB connections from UUNet, Level 3 Communications Inc., NTT Communications/Verio Inc. and AboveNet Inc | Google’s infrastructure | AMD Opteron x64 multicore servers with 4GB RAM per core. Joyent operates two carrier‐grade datacenters | Uses other providers’ networks, including SAVVIS Inc.’s backbone IP network, which is based on Juniper core routers and Cisco Communications Inc. carrier‐class equipment and has OC‐48/OC‐192 backbone trunks; the DataBank IP network. Runs 3tera’s AppLogic grid OS | Built on Rackspace’s Superstructure, including its T1, SAS70‐certified datacenters | Powered by Terremark’s Infinistructure utility‐computing platform and runs on the multinational hosting provider’s network; includes plug‐and‐play access to 160 global carriers | High‐end x86‐based servers, high‐speed multi‐GB network, high‐end storage back end |
Provider’s Data‐Storage System | Amazon S3 Web service | Azure Data Storage REST APIs, data storage functionality through three services BLOB, TABLE and QUEUE | Storage is part of the server image; vendor plans to offer additional storage options | Persistent storage with queries, sorting and transactions | Sun Microsystems Sun Fire X4500 NAS storage | Storage is built into the Grid; data is stored on multiple hard drives across multiple machines for redundancy | Uses network‐attached storage devices | Undisclosed | Persistent storage based on a fully virtualized SAN/NAS back end |
Supported Operating Systems | Linux | Communication protocols such as SOAP, REST and XML allow use of other operating systems | Linux, Microsoft Windows, CentOS and Red Hat Enterprise | Linux, Microsoft Windows and Mac OS X | OpenSolaris | Linux and Solaris; plans to support Windows | Linux and Microsoft Windows | Linux, Microsoft Windows and Solaris | Linux and Microsoft Windows |
Supported Languages | Linux and Red Hat Enterprise | .NET, Eclipse, Ruby, PHP, Java and Python | Java, .NET, Perl, PHP, Python, Ruby on Rails and most shell‐scripting languages | Python | Java, PHP, Python and Ruby on Rails | Grid nodes will run any software that runs on a normal dedicated computer | .NET, Perl, PHP, Python and Ruby on Rails | Undisclosed | Linux and Microsoft Windows |
Supported Applications/Framework | MySQL Enterprise and OpenSolaris | ASP.NET and WCF (Windows Communication Foundation) | Apache, Facebook applications, IIS, MySQL Enterprise, PostgreSQL and Windows Server | Django. Services include URL Fetch, Memcache and image manipulation | Ruby on Rails | Apache, JBoss, MySQL Enterprise; anything that runs under a supported OS | Apache, Microsoft SQL and MySQL Enterprise | Undisclosed | Through partnership with CohesiveFT, customers can access a large number of preconfigured application stacks |
Scalability | Limited to 20 virtual computer instances during beta period; additional instances are allowed | No limits on scalability, additional VMs are allocated as processing load increases | No limits on scalability | Up to 5 million page views per month with preview release | Contact vendor | Up to 43 nodes. Bandwidth, RAM and CPU are changed on‐the‐fly. Process can be resized (2 minutes per 1GB of data) | Unlimited. Current users are pushing hundreds of millions of requests on single domains | Undisclosed | 5 VDSes (Virtual Dedicated Servers) per account; more machines are available on request |
Security | Provides Web‐service interfaces to configure firewall settings that control network access to and between groups of instances | Executes applications in dedicated virtual machines; each VM provides a 64-bit Windows Server 2008 environment, and VMs prevent data leakage between applications executing on the same server hardware | Provided via ServePath’s secure infrastructure and telecom facility | Service runs on Google’s secure infrastructure. App Engine provides a secure sandbox environment | Spam protection; advanced traffic security, SSL acceleration and Advanced DNS available as add‐ons | All grid nodes are locked down to maintain access only when firewalls and other security features are put in place. Also, a front‐end DDoS (distributed denial of service) mitigation service is available | Enterprise firewalls; email accounts include anti‐virus and spam protection. SSL capabilities available as an add‐on service | Infrastructure is SAS 70 Type II certified. Network includes integrated firewalls and private VLAN architecture; connections to the Infinicenter management console are secured by SSL | Each customer has a VLAN; VDSes are separated by a Xen implementation; and customer data is stored in a T1 storage back end |
Virtualization Technology | Xen | Modified Hyper-V hypervisor | Xen | Undisclosed | Solaris Zones | Based on 3tera AppLogic | Undisclosed | Undisclosed | Xen |
Redundancy Features | Ability to place server instances in multiple locations and elastic IP addresses | Triple-layer redundancy to keep data safe and availability of services high | RAID servers; plans to offer server snapshots and cloning | Fault‐tolerant servers | Undisclosed | Backup and snapshot feature for customer’s data | Clusters | Automated resource balancing for monitoring and optimization | A failed server is automatically removed from the cluster; VDSes running on it are automatically and transparently restarted on other servers |
Load Balancing | Undisclosed | Yes, route traffic to active nodes only | Yes; F5 Network BigIP load balancers. | Yes | Yes; F5 Network BigIP load balancers | No. Customers can set up their own load balancing | Yes; load‐balancing layer includes logic for multiple IP addresses for each customer site | Undisclosed | Yes; as optional add‐on |
Control Panel | Web‐service interface | Web interface | Yes; proprietary multiserver hosting control panel lets you manage servers and scale Web applications and networks | Proprietary, the Administration Console | Undisclosed | All servers, storage, applications and users are managed from a single, browser‐based management console | Proprietary to Mosso | Infinicenter console for deploying, configuring and managing server and network infrastructure | Yes; API (application programming interface) also available. Control panel includes usage‐tracking tool |
Development Tools | Command‐line tools for building AMIs | Integration into Visual Studio, support for any .NET languages, complete command-line SDK tools | No, not necessary; plans to release public API with the same control as the Web interface | Python runtime environment | Sandbox accelerators, central development and deployment, version control, unit test site and staging site available as add‐ons | AppLogic has a scriptable command‐line interface for provisioning and scaling applications | None | Undisclosed | No |
Additional Cloud‐Storage Service | Included in cloud service | Undisclosed | No | Datastore, a distributed data‐storage service | Additional storage available for $.15 per GiB | DynaVol; ranges from $15/month to $1,300/month | Limited; CloudFS is in private beta | No | No |
Technical overview
Cloud computing architecture
Three major participants in cloud:
- Cloud Providers: those building out clouds, for instance Google and Amazon; effectively the technology providers.
- Cloud Adopters / Developers: those developing services over the cloud, some of whom are becoming the first generation of cloud ISVs.
- Cloud "End" Users: those using cloud-provisioned services, often without knowing that they are cloud provisioned, e.g. Facebook users may have no idea that their favorite Facebook app is running on AWS.
Various architectural layers:
- Operations: it supports functional business processes rather than supporting the technology itself.
- Service layer: it is made up of application code, bespoke code, high-level ISV offerings.
- Platform layer: it is made up of standard platform software i.e. app. servers, DB servers, web servers, etc., and an example implementation would be a LAMP stack.
- Infrastructure layer: it is made up of
- (i) infrastructure software (i.e. virtualisation and OS software)
- (ii) the hardware platform and server infrastructure
- (iii) the storage platform
- Network layer: it is made up of routers, firewalls, gateways, and other network technology.
Delta of Effort / Delta of Opportunity
The gap between the cloud providers and the end cloud users is known as the delta of effort and also the delta of opportunity.
It is the delta of effort in terms of the skills, abilities, experience and technology that the cloud adopter needs to deliver a functional service to their own “End Users”. This is potentially a major area of cost for cloud adopters. But it is also the delta of opportunity in terms of 'room' to innovate.
Cloud computing applications in different industry segments
Pharmaceutical
- Historically, most large pharmaceutical firms have run fully integrated vertical business models, doing all they can in-house and choosing to selectively outsource where appropriate
- Firms now seek to reduce the time and cost through sharing an integrated platform which is cheaper, less time consuming, and more supportive of a networked business model
- Pharma companies exploring cloud computing have reported positive experiences through
- Easier implementation
- More computational transparency
- A clear-cut IP policy
- Scalable invoicing
Pfizer
- Protein engineers and informaticians at Pfizer’s Biotherapeutics and Bioinnovation Center faced the challenging task of antibody docking that presented enormous computational roadblocks.
- Each antibody model requires the respective models of the protein’s three-dimensional structure to be generated on the Rosetta++ platform, which carries out refinement of antibody docking.
- Refinement involves small local perturbations around the binding site followed by evaluation with Rosetta’s energy function. This is an iterative process that ran on a 200-node cluster and requires a massive amount of computing.
- An array of Rosetta workers is spun up on Amazon’s EC2. S3 stores inputs and outputs, SimpleDB tracks job metadata, and the Simple Queue Service glues it all together with message passing. The entire process, which previously took months, is now completed overnight.
- Consequently, the research staff can focus on results instead of putting their projects on the back shelf.
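The division of labour described for Pfizer (a queue handing out jobs, a store for inputs and outputs, a table for job metadata, and a pool of workers) can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not Pfizer's implementation: the queue, the two dictionaries and the `refine` function are stand-ins for SQS, S3, SimpleDB and the Rosetta refinement step, and all names are hypothetical.

```python
import queue
import threading

# In-memory stand-ins (assumptions); a real deployment would call the AWS services.
job_queue = queue.Queue()   # plays the role of the Simple Queue Service
object_store = {}           # plays the role of S3 (inputs and outputs)
job_metadata = {}           # plays the role of SimpleDB (job status)


def submit_job(job_id: str, model_input: str) -> None:
    """Store the input, record the job, and enqueue it for a worker."""
    object_store[f"inputs/{job_id}"] = model_input
    job_metadata[job_id] = {"status": "queued"}
    job_queue.put(job_id)


def refine(model_input: str) -> str:
    """Placeholder for the docking refinement step; this is not Rosetta."""
    return f"refined({model_input})"


def worker() -> None:
    """Pull jobs until the queue is empty, writing results back to the store."""
    while True:
        try:
            job_id = job_queue.get_nowait()
        except queue.Empty:
            return
        job_metadata[job_id]["status"] = "running"
        object_store[f"outputs/{job_id}"] = refine(object_store[f"inputs/{job_id}"])
        job_metadata[job_id]["status"] = "done"


# Submit all jobs first, then start a small pool of workers.
for i in range(6):
    submit_job(f"job-{i}", f"antibody-model-{i}")

workers = [threading.Thread(target=worker) for _ in range(3)]
for t in workers:
    t.start()
for t in workers:
    t.join()

print(sorted(job_id for job_id, meta in job_metadata.items() if meta["status"] == "done"))
```

Because workers only communicate through the queue and the store, adding capacity means starting more workers, which is what makes the pattern a natural fit for on-demand instances.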
Eli Lilly
- Eli Lilly is using cloud computing services to support scientists with on-demand processing power and storage.
- The firm uses Amazon Web Services and other cloud services to provide high-performance computing, as needed, to hundreds of its scientists.
- Eli Lilly foresees the possibility of using cloud services from a half dozen different vendors and the need for an “orchestration layer” that sits between Eli Lilly and the various cloud services.
- It would comprise algorithms that determine the best cloud service for a particular job based on lowest cost, highest performance, or other requirement. Such an approach would make it possible for Eli Lilly and other users to write to a single API rather than many, while optimizing service usage.
- The firm is also exploring the potential to use cloud computing for external collaboration between Eli Lilly and outside researchers.
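The orchestration layer idea, a single entry point that picks a cloud service per job according to a stated objective, can be sketched as follows. This is an illustration only: the vendor names, prices and speed scores are invented, and the two objectives are simply the "lowest cost" and "highest performance" criteria mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudService:
    name: str
    cost_per_hour: float    # hypothetical price in USD
    relative_speed: float   # hypothetical benchmark score, higher is faster


def choose_service(services, objective: str) -> CloudService:
    """Pick the service that best satisfies the requested objective."""
    if objective == "lowest_cost":
        return min(services, key=lambda s: s.cost_per_hour)
    if objective == "highest_performance":
        return max(services, key=lambda s: s.relative_speed)
    raise ValueError(f"unknown objective: {objective}")


def run_job(job_name: str, services, objective: str) -> str:
    """Single API for callers; the vendor choice is hidden behind it."""
    service = choose_service(services, objective)
    return f"{job_name} -> {service.name}"


# Invented catalogue for illustration.
catalogue = [
    CloudService("vendor-a", cost_per_hour=0.10, relative_speed=1.0),
    CloudService("vendor-b", cost_per_hour=0.25, relative_speed=2.5),
    CloudService("vendor-c", cost_per_hour=0.08, relative_speed=0.7),
]

print(run_job("nightly-batch", catalogue, "lowest_cost"))
print(run_job("docking-run", catalogue, "highest_performance"))
```

Callers write to `run_job` only, so adding or replacing a vendor changes the catalogue rather than the calling code.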
Johnson and Johnson
- Johnson and Johnson is seeking to complement its high performance grid architecture with cloud computing, mainly in the area of drug discovery modeling applications, according to Rick Franckowiak, director for systems engineering at the Pharmaceutical Research & Development IT organization at Johnson & Johnson.
- The company requires enhanced computing and storage capabilities and needs to address spike-type processing demand.
Indigo BioSystems Inc.
- Indigo BioSystems Inc. is a privately held company offering data management and automated analysis solutions for life science researchers, with a focus on the pharmaceutical industry.
- It has deployed IBM's Compute on Demand cloud services to provide clients with a highly secure, scalable and compliant environment for data exchange.
- IBM’s cloud services have been able to meet the clients’ requirements for a scalable and globally accessible platform for data exchange, along with the security and regulatory compliance requirements of the pharmaceutical industry.
Hospitality
Cloud computing is used in the hospitality industry to provide:
- Disaster recovery infrastructure for mission-critical applications
- Online reservation systems
- A dedicated pool of computing resources that is purchased and then allocated as needed
- Faster response to real-time situations
- e.g. Preferred Hotel Group is using Terremark's Enterprise cloud services
Web applications
Cloud computing is used in various web applications such as Microsoft's Hotmail, Google's Gmail and YouTube, and Yahoo's Flickr photo-sharing service.
- Consumers run only their browsers on local computers
- The rest of the software, along with users' email messages, photos or videos, is on remote machines that the user can't see and doesn't have to know anything about
- e.g. Microsoft's Hotmail is using Azure (cloud computing platform from Microsoft) and Google's Gmail is using Google Apps
- Google Docs, the online versions of word processor and spreadsheet applications, is also using cloud computing
Consumer electronics
Cloud computing is being used in consumer electronics at various levels:
- In laptop computers, for wireless communication and access to the Internet
- Such laptops require only minimal hardware, which reduces cost, and an Internet connection
- In the gaming industry, where it will allow iPhones and other thin-client devices to have really high-end graphics without a big, expensive, hot video card that drains the battery
- e.g. the netbook, a general-purpose laptop, works on cloud computing and is available at a price of US$400, with some even in the US$50-100 range.
Retail
- Cloud services allow retailers to plan for demand peaks in online services dynamically without worrying about provisioning for high availability
- The pay-as-you-go model helps them save cost, rather than paying for expensive hardware that meets these peaks but is underused for the rest of the year
- Increases productivity and helps companies serve their customers better.
- e.g. Amazon is using cloud computing for its online retail services
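The cost argument for retailers can be made concrete with a small calculation. The figures below are entirely hypothetical (server counts, the length of the peak season and the hourly rate are not taken from any vendor), and the sketch assumes the same hourly cost for owned and rented capacity, which simplifies real pricing.

```python
HOURS_PER_YEAR = 365 * 24


def fixed_capacity_cost(peak_servers: int, hourly_cost: float) -> float:
    """Cost of keeping enough servers for the peak running all year."""
    return peak_servers * hourly_cost * HOURS_PER_YEAR


def pay_as_you_go_cost(baseline_servers: int, peak_servers: int,
                       peak_hours: int, hourly_cost: float) -> float:
    """Cost of running the baseline normally and the peak size only during peaks."""
    off_peak_hours = HOURS_PER_YEAR - peak_hours
    server_hours = baseline_servers * off_peak_hours + peak_servers * peak_hours
    return server_hours * hourly_cost


# Hypothetical retailer: 10 servers normally, 100 during a 30-day holiday peak.
fixed = fixed_capacity_cost(peak_servers=100, hourly_cost=0.10)
elastic = pay_as_you_go_cost(baseline_servers=10, peak_servers=100,
                             peak_hours=30 * 24, hourly_cost=0.10)
print(f"provisioned for peak: ${fixed:,.0f} per year")
print(f"pay-as-you-go:        ${elastic:,.0f} per year")
```

Under these assumptions the gap comes entirely from the hours in which peak-sized capacity would otherwise sit idle; a shorter peak season widens it and a longer one narrows it.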
Financial Services
Cloud computing is used by financial services firms:
- To store and analyze large amounts of data related to the stock market or historical data
- To build and evaluate new risk analysis programs
- To bring applications to the market quickly and deploy them within a limited budget
- e.g. NASDAQ Market Replay application uses Amazon's S3 cloud for data storage but the application part is not running in cloud presently. It has plans to develop future applications in Amazon's EC2.
Companies using cloud computing
The New York Times
Major League Baseball
ESPN
Hasbro
Chicago
Cybernet Slash Support (CSS)
Intuit
Activision
Cloud computing in online photo sharing
Cloud computing advantages such as scalability, reduced hardware costs and extensive reach are being utilized by some leading photo sharing websites.
SmugMug
- SmugMug is a premium photo sharing web site with an emphasis on professional photography.
- Members can upload an unlimited number of photos, create new functionality, tag photos, and upload from Picasa, iPhoto and other software packages.
- SmugMug is using Amazon S3 mainly as a storage solution for customers’ photos which can be accessed anywhere, anytime while also providing real-time backup and failover systems.
The architecture basically consists of three software components:
TweetPhoto
TweetPhoto is a free photo-sharing service that complements the social-media service Twitter. It uses Mosso, Rackspace Hosting’s cloud-computing unit, and lets users easily upload as many photos as they want and automatically share them with Twitter followers and Facebook friends. It also allows users to conduct photo searches, subscribe to RSS media feeds, geotag photos from GPS-enabled smartphones, transfer bio and friend information from Twitter, and monitor which other users are viewing and commenting on their photos.
TweetPhoto utilizes the Cloud Site and Cloud File services from Rackspace.
- Cloud Site is a scalable platform that can store and recall large amounts of metadata in real time and can run Windows or Linux applications across hundreds of servers.
- Cloud File provides unlimited online storage for media, which is served via Limelight Networks' Content Delivery Network (CDN).
- A CDN is a system of computers networked together on the Internet. It works on the principle that the combined capacity of strategically placed servers can be higher than the network backbone capacity. Strategically placed edge servers:
- Decrease the load on interconnects, public peers, private peers and backbones
- Free up capacity
- Lower delivery costs
- Offload traffic from peer links and redirect it to edge servers
Source: Wikipedia, businesswire.com, blog.mosso.com
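How edge servers take load off the origin, as described for the CDN above, can be illustrated with a toy model. This is a simplified sketch, not Limelight's or Rackspace's design: the `Origin` and `EdgeServer` classes, the region names and the file path are all hypothetical, and real CDNs add expiry, invalidation and request routing.

```python
class Origin:
    """Central storage that every cache miss has to reach."""

    def __init__(self, files):
        self.files = dict(files)
        self.requests_served = 0

    def fetch(self, path):
        self.requests_served += 1
        return self.files[path]


class EdgeServer:
    """Edge node that keeps a local copy after the first request."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path not in self.cache:
            # Cache miss: go back to the origin once, then serve locally.
            self.cache[path] = self.origin.fetch(path)
        return self.cache[path]


origin = Origin({"/photos/1.jpg": b"jpeg-bytes"})
edges = {"us-east": EdgeServer(origin), "eu-west": EdgeServer(origin)}

# Five user requests arriving at the edge nearest to each user.
for region in ["us-east", "us-east", "eu-west", "us-east", "eu-west"]:
    edges[region].get("/photos/1.jpg")

print(f"origin served {origin.requests_served} of 5 requests")
```

Only the first request per edge reaches the origin; the remaining traffic stays at the edge, which is the offloading effect listed above.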
Flickr
- Flickr is an image and video hosting website, web services suite, and online community platform.
- It is used to share personal photographs and is also widely used by bloggers as a photo repository.
Flickr architecture:
The platform for Flickr is:
- PHP
- MySQL
- Shards
- Memcached for a caching layer.
- Squid in reverse-proxy for html and images.
- Smarty for templating
- Perl
- PEAR for XML and Email parsing
- ImageMagick, for image processing
- Java, for the node service
- Apache
- SystemImager for deployment
- Ganglia for distributed system monitoring
- Subcon, which stores essential system configuration files in a Subversion repository for easy deployment to machines in a cluster.
- CVSup for distributing and updating collections of files across a network.
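Two items in the list above, shards and a Memcached caching layer, are commonly combined as shown in the sketch below. It illustrates the general pattern only and makes no claim about Flickr's actual scheme: the shard count, the CRC32-based routing, and the use of plain dictionaries in place of MySQL shards and Memcached are all assumptions.

```python
import zlib

NUM_SHARDS = 4                                  # assumed shard count
shards = [dict() for _ in range(NUM_SHARDS)]    # stand-ins for MySQL shards
cache = {}                                      # stand-in for Memcached


def shard_for(user_id: str) -> int:
    """Route all of a user's rows to one shard using a stable hash."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS


def save_photo(user_id: str, photo_id: str, metadata: dict) -> None:
    """Write to the owning shard and drop any stale cached copy."""
    key = (user_id, photo_id)
    shards[shard_for(user_id)][key] = metadata
    cache.pop(key, None)


def load_photo(user_id: str, photo_id: str):
    """Cache-aside read: try the cache first, fall back to the shard."""
    key = (user_id, photo_id)
    if key in cache:
        return cache[key]
    value = shards[shard_for(user_id)].get(key)
    if value is not None:
        cache[key] = value
    return value


save_photo("alice", "p1", {"title": "Sunset"})
print(load_photo("alice", "p1"))   # first read comes from the shard
print(load_photo("alice", "p1"))   # second read comes from the cache
```

Keeping each user's data on a single shard means most requests touch only one database, while the cache absorbs repeated reads of popular photos.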