The pros and cons of business-app implementation via open-source software (Part 1)
Is open-source or Microsoft-licensed software the right choice for better, faster, cheaper and safer business-application implem
By: Paul Murphy
Oct. 18, 2002 12:00 AM
(LinuxWorld) This is the first installment of a series comparing the implementation results for real business applications. We'll examine business-application implementation using Unix tools and ideas, and how this plan of attack compares to what happens when the same apps are implemented using Microsoft-licensed software.

Each application will be the subject of two articles. The first will present the theoretical, or "book-learning," view of the issue and invite readers with real-world experience in using the technologies to contact me in confidence to correct my errors, give their estimates of the time needed for the work, and discuss what goes wrong when you try to go from theory to practice. The follow-up article will then try to summarize community experience with the technologies in order to draw out conclusions everyone can use and to answer the basic question for this series: is open-source software better than Microsoft software?

The focus here is on the technology, but readers should be aware that the most important factors in real architecture decisions usually have little or nothing to do with the technology or cost. The goal in making tech decisions is to get a product that works for its intended users, and getting the best product at the lowest cost is only part of this. As discussed at length in my Unix Guide to Defenestration, any Unix technology, no matter how insanely great or cheap, can be made to fail if the managers who get control of it after implementation wanted something else and, consciously or unconsciously, set out to prove they were right. A manager who only understands how to manage a proprietary system and insists on applying those ideas to Unix can also doom the project to failure.

Nichievo Inc.

All three of the examples planned have a context set by the same project in the same imaginary company: Nichievo Inc.
This setting, which is purely imaginary, is designed to illustrate a political opportunity for Unix and open source in an otherwise closed shop. Be aware, however, of the risks involved: if the people who take over from you on system delivery don't want to make the system work, it won't.

The overall job involves setting up a secure digital exchange for what Nichievo calls an acceptance order. Nichievo insures receivables; an acceptance order is the company's commitment to pay an insured receivable in case of default by the debtor. In its unsigned form, such an order is a quote. Signed, it is a contractual commitment. Our job is to collect quotes as they are issued, make them available for review and signature by senior managers, and make the signed orders available for customer download. Background on Nichievo and the overall systems project under discussion can be found in this extended sidebar.

As envisaged, our solution will require XML-publishing capabilities, so this first article will look at the Windows versus Linux option in terms of the core hardware and licensed software needed for XML publishing. Specifically, we'll look at the choice between Apache/Cocoon and Microsoft's proprietary tools.

The third article in the series will look at the development issue. If we picked Cocoon and open source in round one, we'd already be largely committed to using JavaBeans and Java for forms management and validation, but there are quite a few other things we need to do. For those, should we use Perl, PHP or try to extend our use of Java and the Cocoon framework? On the other hand, if we decided to do this job with Microsoft's tools, we could use a third-party IDE for some tasks but would be using BASIC or some variant of it for others.

The fifth article will look at the database layer. (Ed. Note: The even-numbered installments of this series will be reserved for answering and showcasing reader feedback to prior installments.)
If we made the Microsoft choice in round one, SQL-Server is a given here. The open-source choice, in contrast, has options: should we use leading products like PostgreSQL and MySQL, explore interesting new ideas like eXist or choose a commercial product like Sybase?

That's the plan, but this series need not be limited to these toolkits or the Nichievo case. If you have additional or alternative suggestions for toolkits, please let me know.

The alternatives

The job seems to call for a cross between a centralized XML-document publishing solution and a customer portal. Either way, we need provision for strong authentication, lots of logging and application code to handle the online addition of digital signatures, as well as some back-end database functions to simplify acceptance processing for standing orders. Under this view of the process:
Reality check
It's important to think about who is paying your bills, particularly when putting an application design into a service proposal. In this case, the client, Nichievo, has a dismal technology record and a CIO who is not on side with the managers bringing us in.

In this situation, your clients have to let the CIO review the proposal, and it's not that unusual for him to respond by having some of his people cook up a few screens that prove beyond any doubt that he can do a better job of implementing your ideas than you can. If senior management buys into that, it leaves your sponsors looking like idiots and you without an invoiceable project, or friends you can go back to in that company.

Having been burnt a few times, I now brief the sponsors whenever that kind of thing looks likely. I put something significant in the proposal that hits a few hot buttons among client executives but is as hard as possible for the CIO to promise or do. In this case, we don't need XML to do this project; we need it to win this project. Technically, ordinary HTML with either Perl or PHP would work just fine. However, management would really like to automate much of the customer interaction and, of course, the international outsourcing services firm hired 18 months ago has failed to deliver on its promises to do this.

The CIO isn't the only threat to worry about. That giant international outsourcing services firm doesn't want its billings to fall and could respond by going to the firm's managing director with a story that blames the CIO for their failures. If only he had let them use Domino, everything would long since have been working beautifully.

Technically, Domino would work for the basic job but be very hard to push to full customer-portal operation. Putting together a working Domino demo wouldn't be that hard; in fact, it would be much easier than doing it with Cocoon. The scenario reverses, however, as you add functional complexity.
By the time you've developed a full messaging portal, my guess is that Domino would demand many times the programming effort that Cocoon needs to get to that same level.

So how serious a threat is this? Well, if you pack enough serious suits with pretty business cards into a room, it's often rather easy to convince senior management to feel heroic and dedicated about backstabbing their friend the CIO: a tough decision, of course, but taken in the interests of the company, you understand.

To head this kind of thing off, I've been praising the ebXML standardization effort spearheaded by OASIS (the Organization for the Advancement of Structured Information Standards) as a downstream means of standardizing the business messaging they need to enable the customer message exchange they want. I think acting on this direction would be premature, but I picked XML for this application as a building block toward eventually doing that, knowing that it would be very difficult to do with Lotus while blocking the CIO's credibility if he attempts a putsch.

It would be possible to do this using servers already in place in Nichievo's 34 operating offices but the centralized approach is preferable because:
We could implement the centralized alternative in one of two ways:
Note: Microsoft issued a press release on October 8, 2002, announcing: "The 'Jupiter' Vision Aims to Unify and Extend Current E-Business Server Technologies And Include Standardized Business Process Management Capabilities, Deeper Support For XML Web Services, and Richer Developer and Information Worker Experiences." This consolidation is to be achieved over the next 18 months and may, or may not, ultimately provide a Cocoon-like wrapping for Microsoft's XML publishing environment.

The applications

From a design perspective, we see the central system as a Web-based "order switch" that collects requests from customers, recommended orders from juniors and order approvals from senior partners, and then passes the approved orders back to customers. Diagram 1 below shows typical high-level use cases for this. To make this work we need:
Since we have no control over the client device, reliance on a Web browser as the user interface is a given. This decision essentially determines that we'll use a Web server as our means of communicating with the user client and logging accesses.

Document volume is quite low: we expect a maximum of only about 120,000 acceptance orders per month. On the other hand, we have to keep every document ever filed on the server online in order to support the firm's customer relationship management effort and to provide data for part of its risk-assessment methodology. With this in mind, the numbers build relatively quickly. In three years, we can expect to need online access to something like 3.2 million approved and 400,000 unapproved orders.

It would be possible to store and index these as signed and unsigned documents, but it would be better to store the data for them in simple tables and have the application construct the documents on request. That step complicates processing but enormously facilitates activity logging, reporting, backup, system recovery and statistical uses of the data.

In this case, use of a database would also reduce disk space requirements considerably. A typical order document stored as a Microsoft Word 10 binary takes 19,658 bytes exclusive of the standard contract terms referenced in it, but storing that information in an SQL table takes only about 320 bytes for the addressee (which is normally stored only once per customer) and 190 bytes per covered receivable, for about a 95 percent overall disk space saving (after indexing and overhead). It is not the dollars that are important here; disk is cheap. What's important is the reduction in backup and recovery time. Recovering 60GB from tape takes hours; recovering 60MB takes minutes.

Using a database to construct documents on the fly eliminates concern over varying input file formats, as data entry can be handled via a browser form. On the other hand, it creates two additional problems:
Therefore, as shown below, the design will be based on using a database to store the information going into each document, using an "XML-enabled" application layer to construct documents as needed, and using a Web server as an interface to the user's browser.
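To make the "store the data, construct the document on request" idea concrete, here is a minimal sketch in Python. It is not the Cocoon pipeline the article has in mind; it only illustrates the pattern. The table names, column names and XML element names are all hypothetical, SQLite stands in for whichever database is eventually chosen, and the `signed_by` column is a plain placeholder rather than a real digital signature.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_schema(conn):
    """Create a hypothetical three-table layout for acceptance orders."""
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
        CREATE TABLE acceptance_order (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id),
            signed_by TEXT            -- NULL while the order is still a quote
        );
        CREATE TABLE receivable (
            order_id INTEGER REFERENCES acceptance_order(id),
            debtor TEXT,
            amount_cents INTEGER
        );
    """)


def build_order_xml(conn, order_id):
    """Assemble one order document from table rows, on request."""
    row = conn.execute(
        "SELECT o.id, o.signed_by, c.name, c.address "
        "FROM acceptance_order o JOIN customer c ON c.id = o.customer_id "
        "WHERE o.id = ?", (order_id,)).fetchone()
    if row is None:
        raise KeyError(order_id)
    oid, signed_by, name, address = row

    # Unsigned orders are quotes; signed ones are commitments.
    root = ET.Element("acceptanceOrder", id=str(oid),
                      status="signed" if signed_by else "quote")
    addressee = ET.SubElement(root, "addressee")
    ET.SubElement(addressee, "name").text = name
    ET.SubElement(addressee, "address").text = address

    receivables = conn.execute(
        "SELECT debtor, amount_cents FROM receivable "
        "WHERE order_id = ? ORDER BY rowid", (order_id,))
    for debtor, cents in receivables:
        rec = ET.SubElement(root, "receivable", debtor=debtor)
        rec.text = f"{cents / 100:.2f}"

    if signed_by:
        ET.SubElement(root, "signedBy").text = signed_by
    return ET.tostring(root, encoding="unicode")


def demo():
    """Build one unsigned order (a quote) from made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute("INSERT INTO customer VALUES (1, 'Example Co', '1 Main St')")
    conn.execute("INSERT INTO acceptance_order VALUES (10, 1, NULL)")
    conn.execute("INSERT INTO receivable VALUES (10, 'Debtor A', 125000)")
    return build_order_xml(conn, 10)


if __name__ == "__main__":
    print(demo())
```

Because the addressee lives in its own table, it is stored once per customer and reused by every order, which is where the disk-space and recovery-time savings described earlier come from.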
The processing applications needed can be thought of as modules within an overall framework. Diagram 3, below, shows a typical screen flow for one such module.
Actual definition of these screens is best done using an active prototyping approach in which you start with your best, and usually rather naive, idea of how it should work and then do two things in parallel:
Once your prototype achieves stability, you can implement formal testing and review by users not previously associated with the project and use their comments to refine the thing to the point that they think your prototype "works." Once the system works, phase two will deal with deployment issues including:
The costs

From a capital cost perspective, the new system is to fit into the existing network and support framework. Consequently, initial infrastructure costs are limited to the server and any licensed software needed.

Server sizing is something of a non-issue. We know that the database will be quite small, probably still under 20GB three years from now, and we know that typical usage volume will also be quite low because, on a typical day, the company insures about 7,500 customers of whom around 900 will record some change, usually a receipt or a new receivable on a rolling account.

The weakest link
In this situation, verify that the network can deliver. You may have a 10 Mbps connection with low utilization but that doesn't mean you can add a substantial new load. Particularly on PC-type networks, all kinds of things (firewalls, poorly configured or underpowered routers, "invisible" SMB network use) can foul things up.

If the network is slow, your users won't care about your excuses or your demonstrations of how fast the server is. They'll see poor response and turn off. Be sure to test your connection, repeatedly and at different times of day, before agreeing to its adequacy. If their in-house network won't support your access needs and the local network guru doesn't take action, try to take your test system somewhere else... and make the network effect obvious when the guru's boss has to migrate the box in-house.

It is likely that most users will initially see use of this server as an additional burden imposed by management and respond by fitting their interactions with it into already busy schedules. In practice, that means we can predict usage surges just before or after lunch and just around go-home times in each time zone. Unfortunately, West coasters tend to leave impositions until after lunch while East coasters do them just before going home, leaving us facing the likelihood that the two biggest surges, those from the Pacific and Eastern time zones, will overlap.

Once people see value in this service, usage will balance out. The quickest way to destroy any chance of that happening is to underconfigure the hardware at the beginning. Users who have to wait for your server the first time they connect to it will have their resentment of the new imposition reinforced, and you'll never recover their trust. On the other hand, the cost difference between "about right" and "grossly overpowered" is only a few dollars in this context, so I've intentionally specified an insanely overpowered machine below: a dual-processor Dell Xeon running at 2.4 GHz.
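The surge overlap described above is easy to check with a little clock arithmetic. The sketch below is illustrative only: the one-hour windows (1 to 2 p.m. Pacific for "after lunch," 4 to 5 p.m. Eastern for "just before going home") are assumptions, not measurements, and fixed standard-time offsets are used because daylight saving moves both zones together.

```python
from datetime import date, datetime, timedelta, timezone

# Fixed standard-time offsets (assumption: both zones shift together for DST).
PACIFIC = timezone(timedelta(hours=-8), "PST")
EASTERN = timezone(timedelta(hours=-5), "EST")


def window_utc(day, tz, start_hour, end_hour):
    """Return a local-time window as a (start, end) pair expressed in UTC."""
    start = datetime(day.year, day.month, day.day, start_hour, tzinfo=tz)
    end = datetime(day.year, day.month, day.day, end_hour, tzinfo=tz)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


def overlap_minutes(a, b):
    """Minutes shared by two (start, end) windows; 0 if they do not meet."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, int((end - start).total_seconds() // 60))


def surge_overlap(day):
    """Overlap between the assumed West-coast and East-coast surge hours."""
    west_after_lunch = window_utc(day, PACIFIC, 13, 14)  # assumed 1-2 p.m. Pacific
    east_going_home = window_utc(day, EASTERN, 16, 17)   # assumed 4-5 p.m. Eastern
    return overlap_minutes(west_after_lunch, east_going_home)


if __name__ == "__main__":
    print(surge_overlap(date(2002, 10, 18)), "minutes of overlap")
```

Under these assumed windows the two surges coincide for the full hour, which is the worst case the sizing decision has to absorb.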
I also considered a Sun 480 for this role. The cost wouldn't be much different, and its fiber channel disk, the higher reliability of Solaris on SPARC and its upgradeability to four CPUs give it some advantages. For this article, I want to compare Linux to Microsoft solutions on the same hardware, but a real-world decision would be more influenced by the workload. The twin Xeons are faster than the two UltraSPARCs, but the Sun machine offers a hardware cryptographic accelerator for $2,700 that is capable of doing around 4,300 SSL "hand-shakes" per second. If system usage were going to be high relative to the hardware, that accelerator would make a big difference. But that isn't true here, and the Xeons' shorter completion times on single tasks make them the better choice as far as I am concerned.

On the software side, I've not worked with the Microsoft stuff and am not all that sure what we need or don't need. The list here is deduced from the how-to article on the Microsoft Web site referenced earlier.
As an operational matter, the importance of this data to the company means that I'd recommend redundancy — setting up two servers, in different cities, with different administration, and different Internet backbone connectivity — at somewhat more than twice the cost. As shown below, you could do that with the Linux solution for about the cost of one Windows 2000 system.
Get your feedback in!
This should be an area of intense comment from people who have actually used this stuff. Remember, article two needs your experience and opinions. If you have used Cocoon or the BizTalk/SQL-Server combo in a real application, please contact me.
We do not yet have manpower estimates for either the development or the operational phases of this work. On the development side, the requirements are currently only loosely understood, while operational issues have yet to be discussed at any length. Nevertheless, experience tells us that the first prototype can be developed under Cocoon in about a week and that the process is likely to go through three to five iterations before a full user manual (which is the requirements specification) can be written for user signoff.

Key issues

Clearly, infrastructure costs for the open-source solution are less than half of those for the proprietary solution. In itself, that fact doesn't make the Cocoon solution better. The cost difference, perhaps $100,000 for a two-way redundant system, looks like a lot of money at the personal level but barely registers on Nichievo's bottom line.

Failure would hurt both us and our sponsors, but it won't break the firm. Success, on the other hand, affects the balance of management power in the firm and could lead to radical change starting with the cancellation of the current development contract and the ousting of the CIO. That, in turn, would create opportunities for us in particular and the open-source movement in general; replace 1,500 or so Windows desktops with Unix smart-displays, and we'll have a massive positive impact on the firm's bottom line.

The potential rewards of change are therefore clear, and no one's under the illusion that we're here to sell Microsoft products. But we still need to ask the question as fairly and "straight-up" as we can: What are the relative risks associated with each decision? Use Cocoon, or use Microsoft's tools?

Implementation risk

If you choose the open-source route for this, there's no doubt it will be going into a hostile environment... but the Windows decision isn't all that great either.
Yes, it makes you compliant with the CIO's preferred direction, but it still leaves you in a conflict with the international outsourcing and consulting firm that's been beavering away there for the last eighteen or so months.

Different agendas, different methods
Having the client own and manage the development environment is great if your primary interest is selling time. After all, waiting for the other guy to act (or just for a PC to grind something out) is far more profitable than working because it reduces your average selling cost.

One client I know fell for this twice, not only demanding control of the development servers but once buying the development house's used 486s and once getting another consultant's retired P2s as Oracle development workstations. These worked as Oracle seats, but Windows NT and Oracle on P2 gear made for lots of long, and fully billable, waiting, while mutual finger-pointing and related delays added more billable days to the project's overall duration.

Either way, you'll have people working against you: fewer and more muted with Windows than with Linux, but no picnic either way. This is, of course, the biggest risk there is. But if you've made your clients aware of the danger and they're willing to take the risk, then it's your job to minimize it without undermining their judgment by agonizing over it.

Resource control is the most effective risk-reduction strategy possible here. If you want to succeed, own the hardware and control network access to it, even if that means putting a bunch of their PCs in a room with the server and a small hub. Later, put two phases into your deployment plan:
To facilitate this, I often include an offer in the proposal to develop on production scale hardware that we own until hand-over. At that time, the client can decide to buy it at the pre-agreed price or replace it with hardware of his own. In most cases, this looks like a great risk-reduction strategy to the client... and it is, because they don't spend a hardware nickel until the software works, and they don't face a systems transition either. However, its real purpose is to trap the opposition between rocks and hard places:
Notice, however, that this strategy requires you to buy the machine and any needed licenses up-front. This is a powerful argument for Linux because:
Stability

Both sets of tools are under development, with both subject to change. As a rule, however, Microsoft's changes affect everything from the operating system (which may require new hardware to run) to the client interface layer. Apache's changes tend to be independent of the operating system. There are operating-system patches to consider in both cases, but Linux patches don't generally require application reconfiguration or testing. Windows service packs, in contrast, often change everything from licensing terms to API internals. From a stability perspective, therefore, both choices mean that we will be adapting to technical change as it occurs, but the Cocoon option limits that to the application and is therefore strongly preferable.

Recoverability

The absence of licensing issues, together with the separation of application, database, server and OS on Linux, means that we could recover the application to any Linux machine capable of handling the load. That isn't true on Windows 2000 Server; a failure pretty much has to be recovered on the machine that failed. Otherwise, we're really looking at a new install, something that's usually much harder and more time-consuming to do. Given how critical this application is, recoverability is a killer issue and a strong vote for Linux with Cocoon.

Security

Security is the other killer issue. There have been security issues with Apache, Tomcat, PostgreSQL and Linux, but not many. Those that appeared were quickly remedied. The Microsoft toolset, on the other hand, has dozens of outstanding security issues, including XML-based attacks on SQL-Server and Windows 2000 Server. Remediation is usually slow in coming. This, to me, is a decisive issue: Linux and Cocoon it is.