

Some Musings on Device Options
To dsync or not to dsync?

In ASE 12.0 we introduced official file system device support via the device-level dsync flag. Since then, many a DBA has pondered "to dsync, or not to dsync?" This tends to be part of the larger question of file versus raw. Like just about anything related to performance, there is not really a yes or no answer that fits all cases. In this article I'll try to further your understanding of this and other device options.

I like to have an understanding of what a flag/switch/etc., really does before making a decision on its use, so I'll start my discussion of device flags with an explanation as to what they really are. Before we talk about the various flags, let's take a look at file system i/o in general (generalizations and simplifications will be fine for this portion).

A typical file system has a cache into which blocks are read and written. You can think of this very much like ASE's data cache, with a file system block playing the role of an ASE page. When a user requests some data from a file (or database), the appropriate blocks (or pages) are read from disk into the cache, and then returned from the cache to the user. When the user writes to a file (or a database), the write is reflected in the cache but is typically not written to disk until a later time. This makes both read and write operations appear faster than the underlying disk i/o really is. The next time that same data is read, it may already be in cache, eliminating the need for a disk i/o. On the write side, because we are only updating memory (cache), we don't need to wait for an expensive disk i/o to complete.

Because nothing in life is free, the improved efficiency comes with some risk. I mentioned that a write will complete when the change is written to the cache, with the actual i/o to disk happening at some later time. Suppose the system crashes before that write to disk takes place? The application believes the write reached disk, because the O/S returned success for it. However, when the system comes back up, the changes will be lost. If the file that lost the write was one of your database devices, ouch. This is the reason that ASE did not officially support file system devices until 12.0.
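To make the risk concrete, here is a minimal sketch of a write-back cache in Python. It is a toy model only: the class and method names (ToyDisk, ToyWriteBackCache, crash) are invented for illustration and do not correspond to any real file system or ASE interface.

```python
class ToyDisk:
    """Stands in for stable storage: its contents survive a simulated crash."""

    def __init__(self):
        self.blocks = {}


class ToyWriteBackCache:
    """Hypothetical write-back cache sitting in front of a ToyDisk."""

    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}  # blocks changed in memory but not yet on disk

    def write(self, block_no, data):
        # The write only touches memory, yet success is reported at once.
        self.dirty[block_no] = data
        return len(data)

    def read(self, block_no):
        # Serve from the cache when possible, otherwise fall back to disk.
        if block_no in self.dirty:
            return self.dirty[block_no]
        return self.disk.blocks.get(block_no)

    def flush(self):
        # The deferred i/o: dirty blocks finally reach the disk.
        self.disk.blocks.update(self.dirty)
        self.dirty.clear()

    def crash(self):
        # Memory is lost; anything not flushed is gone.
        self.dirty.clear()
```

In this model, a write followed by crash() leaves nothing on the ToyDisk even though write() reported success, which is the failure described above.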

In 12.0 we introduced the dsync flag - shorthand for "Data Synchronous." The dsync flag in ASE directly translates to the O_DSYNC open(2) flag. That is to say, when ASE opens a device that has the dsync flag set, ASE will pass O_DSYNC to open(2). This flag tells the file system that a write to that file must pass through the cache and be written to disk before the write is considered complete. In other words, for writes we throw away the cache efficiency and make sure the data goes to disk. This way if the system crashes, everything that we thought had been written to disk has in fact been written to disk.
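As a rough illustration of what passing O_DSYNC to open(2) looks like from an application, here is a small Python sketch. It assumes a Unix-like platform where Python exposes os.O_DSYNC and os.pwrite; the helper names open_dsync and write_page are hypothetical, and this is not ASE's actual code.

```python
import os


def open_dsync(path):
    """Open (or create) a file so that each write is data synchronous."""
    # O_DSYNC: a write is not reported complete until the data is on disk.
    flags = os.O_WRONLY | os.O_CREAT | os.O_DSYNC
    return os.open(path, flags, 0o600)


def write_page(fd, offset, page):
    """Write one page at the given byte offset; returns bytes written."""
    # With O_DSYNC set on fd, this call returns only after the driver or
    # hardware has reported the data as written.
    return os.pwrite(fd, page, offset)
```

A caller would open the file once, call write_page for each page it needs to persist, and close the descriptor with os.close when done. Dropping os.O_DSYNC from the flags gives the cached behavior described earlier.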

(Note: Most file systems support an O_SYNC mode in addition to O_DSYNC. The difference has to do with metadata. File systems maintain metadata about a file in addition to the data in the file. The O_SYNC flag forces both metadata and data changes to be written to disk, whereas the O_DSYNC flag does not concern itself with metadata. O_DSYNC is sufficient for ASE because the metadata on ASE devices does not frequently change in any consequential way.)
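The same data-versus-metadata distinction exists for explicit, per-call flushes: fdatasync(2) is the data-only counterpart and fsync(2) also flushes file metadata. The Python sketch below uses those calls as an analogy for the two open flags; it assumes a platform such as Linux where os.fdatasync is available, and the function name write_and_flush is made up for this example.

```python
import os


def write_and_flush(path, data, include_metadata):
    """Write data to path, then force it to disk before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written = os.write(fd, data)
        if include_metadata:
            # Comparable to O_SYNC: flush data and file metadata.
            os.fsync(fd)
        else:
            # Comparable to O_DSYNC: flush data, plus only the metadata
            # needed to read that data back.
            os.fdatasync(fd)
        return written
    finally:
        os.close(fd)
```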

Now that we know what dsync really is, let's consider when it should and shouldn't be used. This is pretty simple. Any time you are using file system devices and you care about the recoverability of the data, you should use the dsync option. Remember that without dsync, the OS may tell ASE that a write is complete without it ever having been written to the disk. In the event of a system crash, we could have some real problems, like data page changes being on disk without the accompanying log records having made it. The natural exceptions are tempdb devices. Here we don't care about recoverability, and therefore we can optimize write performance by running without dsync.

There are a few common questions and misconceptions about dsync that I want to clarify:

1.  A common misconception is that because dsync is "synchronous," you can't do asynchronous I/O to the file. This is not true. The two terms operate at different levels. With async i/o we are talking about the context in which the i/o is executed, i.e., whether the i/o blocks the caller or is done in a different context (see my first post on async i/o: http://blogs.sybase.com/master/master_01220702.asp). With dsync we're talking about when the write() is considered complete. These are not mutually exclusive, and you can asynchronously do a data synchronous i/o. It is quite simple. The async portion is as always: the application issues an i/o and later polls (or is notified) for completion. The dsync portion means that the application won't be told that the i/o has completed until the data has made it to disk. (Note that the Linux 2.6 kernel will block in the async i/o request unless direct i/o is being used.)

2.  Another question that occasionally comes up is whether or not dsync needs to be used for raw devices. Raw device i/o does not go through a file system, and therefore there is no caching of the i/o. When an application issues a write() to a raw device, the data goes directly to the disk. Therefore, dsync is out of context for raw. One way to think about it is: the safety that dsync provides is already guaranteed by raw i/o.

3.  Finally, some folks point out that even with dsync, we are not guaranteed that the write has hit the physical platter before the application is told it has completed. This is true. With dsync we consider a write complete when the driver/hardware says it has. With most disks/disk controllers/SANs, this means that the write has made it to the controller's cache. However, these devices guarantee that any i/o to the controller cache will eventually make it to the platter, so we don't worry about this.
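Returning to the first point, the sketch below shows one way to issue a data synchronous write without blocking the caller. It is only an illustration: a Python thread pool stands in for the operating system's async i/o interface, which is an assumption made for this example and not how ASE issues its i/o. The names dsync_write and submit_dsync_write are hypothetical, and os.O_DSYNC is assumed to be available.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def dsync_write(path, offset, page):
    """Blocking data synchronous write; returns bytes written."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
    try:
        # Returns only once the data has been reported as on disk.
        return os.pwrite(fd, page, offset)
    finally:
        os.close(fd)


def submit_dsync_write(pool, path, offset, page):
    """Issue the write in another context and return at once."""
    # The caller gets a Future and can poll it with done() or
    # collect the outcome later with result().
    return pool.submit(dsync_write, path, offset, page)
```

The caller would create a ThreadPoolExecutor, call submit_dsync_write, carry on with other work, and check the returned future's done() method from time to time. The future does not report completion until the O_DSYNC write has returned, so the i/o is both asynchronous to the caller and data synchronous to the disk.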

This article has been reprinted with permission from David Wein's blog at http://blogs.sybase.com/bloggers/DavidWein.aspx.

About David Wein
David Wein is an ASE Software Architect at Sybase.
