Real-World Use Cases: Cloud Storage Workloads
What is your data workload?

In my previous article, "Cloud Computing Public or Private? How to Choose Cloud Storage," we covered choosing between public and private cloud storage and the appropriate data types for cloud storage. This month we will dig deeper into the workloads and file creation patterns that best fit cloud storage with a focus on private clouds. Rather than file types, the discussion will cover how files are managed and where cloud storage fits, along with a few real-world use cases.

When choosing any storage solution, it's important to consider the workload and data usage patterns. This goes beyond storage: application workloads drive server, network and all other IT infrastructure decisions. Sure, most vendors will tell you that their product is the best solution for any workload, and when choices were few, that was somewhat accurate. Today, however, there are many different offerings, each with strengths and weaknesses in different situations. This article will review six workload scenarios and identify where cloud storage is a good fit and where it is a poor fit.

Rapidly Changing Single-File Workloads
Examples of a rapidly changing single-file workload include the I/O patterns of a database, a source code repository, or an active spreadsheet. In this workload there is either a very powerful single server or many users sharing a single file. In both cases, updates to a single file are constant and rapid, driving the need for a tier-one class of storage. To support this workload, the system should have lots of memory, fast hard drives, and the ability to create snapshots for instant data protection. Today this market is well served by enterprise NAS vendors such as EMC and NetApp, and it is a poor fit for cloud storage.

Data Ingestion Workloads
The best example of a data ingestion workload is video surveillance. Consider, for example, the city of London and its thousands of cameras, each streaming write operations to storage. Every camera creates its own set of files and needs fast access to storage. This is an excellent workload for private cloud storage. A private storage cloud has many storage nodes that can ingest streams of information independently, so there is no data bottleneck. A camera-to-storage-node ratio can be established, say 10 cameras per node, and then replicated out to hundreds of nodes, enabling thousands of cameras. Since the cloud is centrally managed, a single administrator can easily manage the video surveillance storage for the entire city.
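
The camera-to-node ratio lends itself to a quick sizing calculation. The following is a minimal sketch, assuming a fixed ratio and a simple sequential assignment of cameras to nodes; the function names `nodes_needed` and `assign_cameras` are hypothetical and do not come from any particular product.

```python
import math


def nodes_needed(cameras: int, cameras_per_node: int = 10) -> int:
    """Return how many storage nodes are needed to ingest every camera stream."""
    if cameras < 0 or cameras_per_node <= 0:
        raise ValueError("cameras must be >= 0 and cameras_per_node must be > 0")
    # Round up so that a partially filled node is still provisioned.
    return math.ceil(cameras / cameras_per_node)


def assign_cameras(cameras: int, cameras_per_node: int = 10) -> dict[int, list[int]]:
    """Map each node index to the camera IDs that stream to it (assumed sequential fill)."""
    assignment: dict[int, list[int]] = {}
    for camera_id in range(cameras):
        node = camera_id // cameras_per_node
        assignment.setdefault(node, []).append(camera_id)
    return assignment
```

With the 10-to-1 ratio from the example, `nodes_needed(2500)` returns 250, so a few thousand cameras map to a few hundred nodes, each ingesting only its own streams.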

Read-Intensive Workloads
Video streaming and online video sharing are categorized as read-intensive workloads. Consider the example of the Beijing Olympics last summer. There was unbelievable demand for online video of the events, and in the U.S. the focus was on men's swimming. When the U.S. relay team won by a fraction of a second, everybody wanted to watch. Millions of people flocked to the web, and video servers churned out views. This creates a unique storage demand. With thousands of web servers trying to read a single file, the architecture must support parallel reads. With hundreds of independent nodes serving out many copies of the same file, cloud storage provides the ideal solution to read-intensive workloads.
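
One way to picture parallel reads is a selector that rotates requests across the nodes holding a copy of the file. This is only an illustrative sketch, assuming plain round-robin rotation; real systems may weigh load, locality, or node health instead, and the names `ReplicaSelector` and `simulate_reads` are hypothetical.

```python
from collections import Counter
from itertools import cycle


class ReplicaSelector:
    """Rotate read requests across the nodes that hold a copy of one file."""

    def __init__(self, replica_nodes):
        nodes = list(replica_nodes)
        if not nodes:
            raise ValueError("at least one replica node is required")
        self._rotation = cycle(nodes)

    def next_node(self) -> str:
        """Return the node that should serve the next read."""
        return next(self._rotation)


def simulate_reads(replica_nodes, read_count: int) -> Counter:
    """Count how many reads each replica node would serve."""
    selector = ReplicaSelector(replica_nodes)
    return Counter(selector.next_node() for _ in range(read_count))
```

Under this assumption, nine reads against three replicas land as three reads per node, which is why adding copies of a popular file adds read capacity.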

High Performance Computing (HPC) Workloads
HPC workloads are similar to data ingestion workloads with one important difference: access to a single file. Rather than every client creating a unique file, hundreds or thousands of systems access a single file that is striped across many nodes for performance. This workload requires tight coordination between every node in the cluster to ensure data integrity, file locking, and cache coherence. HPC storage is used extensively in oil and gas exploration and financial data modeling, where complex transactions are processed by compute clusters. A number of established HPC storage vendors serve this market, including Panasas, Isilon and NetApp GX.
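
To illustrate what striping a single file across nodes means, the sketch below maps byte offsets to nodes. It assumes fixed-size stripes placed round-robin, which is a simplification chosen for illustration and not a description of any vendor's layout; it also ignores the locking and cache coherence that make this workload hard. The function names are hypothetical.

```python
def locate_stripe(offset: int, stripe_size: int, node_count: int) -> tuple[int, int]:
    """Return (node index, stripe index) for a byte offset in a striped file."""
    if offset < 0 or stripe_size <= 0 or node_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and node_count must be > 0")
    stripe_index = offset // stripe_size
    # Assumed placement: stripes are dealt out to nodes in round-robin order.
    return stripe_index % node_count, stripe_index


def nodes_for_range(offset: int, length: int, stripe_size: int, node_count: int) -> list[int]:
    """List the node that serves each stripe touched by a read of `length` bytes."""
    if length <= 0:
        raise ValueError("length must be > 0")
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    return [stripe % node_count for stripe in range(first_stripe, last_stripe + 1)]
```

A 2,048-byte read starting at offset 512 with 1,024-byte stripes touches three stripes and therefore three different nodes, which shows both the performance benefit and why every node must agree on the state of the file.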

Single-Producer, Many-Consumer Workloads
In June 2008, the NASA Phoenix Mars Lander discovered ice crystals on the surface of Mars. The world reacted, scientists and religious organizations confirmed their unique theories about the universe, and everybody wanted access to the data. Given the challenges of landing on Mars and collecting soil samples, it's safe to say this is an example of a write-once, consume-many workload. Other examples include genomic sequence findings and quarterly business results. All share a single creation event with demand for multiple points of read access. Cloud storage protects data by replicating files to one or more nodes. This same activity can create many access points, enabling a single creation event to be easily shared among many consumers.

Archive or Content Depot Workloads
In most cases, data becomes less active as it ages. Whether it is corporate information or media content, it is important that this data be kept available, but at a cost relative to its value. Private cloud storage economics and scale capabilities are designed to address this use case. Data can be copied to the cloud to free up more expensive tier-one storage devices and delay costly infrastructure upgrades. Cloud storage can be expanded on demand using the latest (or oldest) commodity hardware and a few simple mouse clicks. When it comes time to retire cloud hardware, it can be removed without downtime, preserving access and enabling 50-year archives.
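
As a rough illustration of moving aging data off tier-one storage, the sketch below selects files that have been idle longer than a threshold and totals the space they would free. The 180-day threshold, the `FileRecord` type, and the function names are all assumptions made for the example; an actual policy would depend on the value of the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical summary of one file on tier-one storage."""
    path: str
    size_bytes: int
    days_since_access: int


def select_for_archive(files, min_idle_days: int = 180) -> list[FileRecord]:
    """Return the files idle for at least `min_idle_days` (assumed policy)."""
    return [f for f in files if f.days_since_access >= min_idle_days]


def bytes_freed(files, min_idle_days: int = 180) -> int:
    """Total tier-one capacity reclaimed by archiving the selected files."""
    return sum(f.size_bytes for f in select_for_archive(files, min_idle_days))
```

Running `bytes_freed` over an inventory of tier-one files gives a first estimate of how much expensive capacity an archive tier could reclaim.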

What Is Your Data Workload?
When considering storage choices, ignore the "we can do everything" vendors and think about your workload. Once you understand your requirements and how the data will be used, your answer will emerge.

About Mike Maxey
Mike Maxey is director of product management for ParaScale, a Silicon Valley startup focused on addressing the exploding bulk storage requirements for digital content and archival data.
