

Java Developer's Journal: A Blueprint For Developing Language Tools
A proven approach to making them modular, extensible, and maintainable

Language tools such as compilers, interpreters, and code generators are a critical part of the software development landscape. Any software project will include several procured tools and very likely several in-house tools. Experience shows that the only guarantee with such tools is change: the underlying language may change through improvements or extensions, and the functionality provided by the tool expands, driven by user-requested features and the need to stay ahead of the competition. The specific changes are rarely known at the outset, but change is coming.

This means that extensibility is paramount when designing such a tool, and extensibility in turn calls for a modular design. A clean division of responsibilities is needed to support maintainability, which in turn is needed to sustain the rapid pace of change typically associated with language tools.

This article describes a proven approach to developing language-based tools in a way that is modular, extensible, and maintainable. The approach is based on two principles: establishing the core modules at the outset and using the visitor pattern to interact with language sentences.

To illustrate the approach, a simple calculator example is given. This calculator supports the addition, subtraction, multiplication, and division of integers. The language it supports is described informally below:

expression = constant |
   expression op expression
constant = 0 | 1 | 2 | ...
op = + | - | * | /

This language is very simple, but it's sufficient to illustrate the main concepts related to the development of language tools. The example is developed in Java using the JavaCC parser generator. However, the concepts presented are language-independent and apply equally to C++ and C#. All of the code shown in this article is available for download.

A key objective when developing a language tool is to ensure that the tool doesn't depend on the actual representation of the language. This makes it simple to support multiple language formats, imports from other tools, and so on. To meet this objective, a distinction is made between concrete and abstract syntax, as described below. The notion of context is then introduced, followed by a description of the actual mechanism for parsing. Finally, the use of the generated parse tree is explained.

Concrete Syntax
Concrete syntax is the way in which a specific file format represents the input for the tool. It is often described using BNF or a similar structured notation. If a parser generator such as Lex/Yacc, Antlr, or JavaCC is used, the concrete syntax will be described in a generator-specific manner. A simple concrete syntax for the calculator using JavaCC is shown in Listing 1.

Note that normally semantic actions would be included in such a JavaCC description. These are presented later in the article.

In general there can be several concrete syntaxes for a language (plain text, XML, RTF, etc.). The tool design should be sufficiently flexible to support multiple concrete syntaxes (as well as the ability to add further concrete syntaxes) without having a major impact on the rest of the tool.

Abstract Syntax
Abstract syntax is a representation of the language that's independent of the concrete syntax. The abstract syntax representation contains only information that's necessary for the tool to perform its task; any other information is discarded. In principle this necessary information should be present in all concrete syntaxes, so there should be only one abstract syntax regardless of the number of concrete syntaxes. In an OO setting, an abstract syntax is typically a tree structure that reflects the way in which language sentences can be constructed. The abstract syntax fulfils a number of functions. It:

  • Represents the program
  • Stores any necessary information related to the concrete representation (e.g., for pretty printing, relating error messages to specific file locations, etc.). This is explored further in the next section.
  • Exposes the above information to tools without revealing implementation details.
The tree-like structure of the abstract syntax is represented using inheritance; we use context information objects to store the necessary information from the concrete representation; exposing information to tools without revealing implementation is achieved with interfaces. And since the abstract syntax has a tree-like structure, tools will access it using a visitor pattern. This basic structure is shown in Figure 1.
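
The structure described above can be sketched in Java roughly as follows. This is only an illustration for the calculator language: Figure 1 is not reproduced here, so every type name (Expression, Constant, BinaryExpression, ExpressionVisitor, the Impl classes, and Evaluator) is a hypothetical choice rather than the article's actual code. The interfaces are what client tools would see; the implementation classes stand in for what would live in the intern package.

```java
// Hypothetical names throughout; a sketch, not the article's downloadable code.

/** Public view of an abstract syntax node: tools depend only on this. */
interface Expression {
    <R> R accept(ExpressionVisitor<R> visitor);
}

/** A constant such as 0, 1, 2, ... */
interface Constant extends Expression {
    int value();
}

/** An "expression op expression" node. */
interface BinaryExpression extends Expression {
    Expression left();
    char operator();
    Expression right();
}

/** Visitor through which tools walk the tree. */
interface ExpressionVisitor<R> {
    R visitConstant(Constant constant);
    R visitBinary(BinaryExpression binary);
}

// The classes below would be hidden in an "intern" package in a real tool.
final class ConstantImpl implements Constant {
    private final int value;
    ConstantImpl(int value) { this.value = value; }
    @Override public int value() { return value; }
    @Override public <R> R accept(ExpressionVisitor<R> visitor) {
        return visitor.visitConstant(this);
    }
}

final class BinaryExpressionImpl implements BinaryExpression {
    private final Expression left;
    private final char operator;
    private final Expression right;
    BinaryExpressionImpl(Expression left, char operator, Expression right) {
        this.left = left;
        this.operator = operator;
        this.right = right;
    }
    @Override public Expression left() { return left; }
    @Override public char operator() { return operator; }
    @Override public Expression right() { return right; }
    @Override public <R> R accept(ExpressionVisitor<R> visitor) {
        return visitor.visitBinary(this);
    }
}

/** Example client tool: evaluates the tree using only the public interfaces. */
final class Evaluator implements ExpressionVisitor<Integer> {
    @Override public Integer visitConstant(Constant constant) {
        return constant.value();
    }
    @Override public Integer visitBinary(BinaryExpression binary) {
        int l = binary.left().accept(this);
        int r = binary.right().accept(this);
        switch (binary.operator()) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r; // integer division; throws on division by zero
            default:
                throw new IllegalStateException("Unknown operator: " + binary.operator());
        }
    }
}
```

With these definitions, a tree built for 1 + 2 * 3 (a ConstantImpl for 1 added to a BinaryExpressionImpl for 2 * 3) evaluates to 7 when passed an Evaluator. A new tool, such as a pretty printer, would be another ExpressionVisitor implementation and would require no change to the node classes.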

Notice that the package arrangement in Figure 1 follows the convention used in Eclipse that dictates that classes in a package named intern are not to be exposed to other tools.

Context Information
The idea of creating abstract syntax is that the tool will use these objects to achieve its goals. This means that the tool is not dependent on the specific concrete format being used. However, this separation can be problematic since in some situations information from the concrete representation is actually needed. For instance, a type checker might generate an error message that has to be displayed to the user. For this message to be of value, the specific location of the error has to be provided.

The solution to this problem is to associate each abstract syntax node with an object defining the context of that node. This might, for example, be the start and end line and column for the node; the information required here may vary according to the nature of the language, the tool, and the concrete format in question. As with the abstract syntax, client tools should be shielded from the implementation details of context information, so an interface is used and classes implementing this interface are internal. It's possible that there may be multiple context information classes according to the concrete representation. This is suggested in Figure 2 where a plain text context information class is used.
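
As a minimal sketch of this idea, assuming a plain text format in which a node's context is its start and end line and column, the context information might look like the following. The names ContextInfo, PlainTextContextInfo, Node, and Diagnostics are hypothetical and are not taken from Figure 2 or the article's code.

```java
// Hypothetical names; a sketch under the assumptions stated in the text above.

/** Public view of context information; client tools see only this interface. */
interface ContextInfo {
    /** Human-readable location, suitable for prefixing an error message. */
    String describe();
}

/** Context for a plain text concrete syntax: start and end line and column. */
final class PlainTextContextInfo implements ContextInfo {
    private final int startLine;
    private final int startColumn;
    private final int endLine;
    private final int endColumn;

    PlainTextContextInfo(int startLine, int startColumn, int endLine, int endColumn) {
        this.startLine = startLine;
        this.startColumn = startColumn;
        this.endLine = endLine;
        this.endColumn = endColumn;
    }

    @Override
    public String describe() {
        return "line " + startLine + ", column " + startColumn
                + " to line " + endLine + ", column " + endColumn;
    }
}

/** Minimal stand-in for an abstract syntax node that carries its context. */
interface Node {
    ContextInfo context();
}

/** Example client: formats a diagnostic without knowing the concrete format. */
final class Diagnostics {
    private Diagnostics() { }

    static String error(Node node, String problem) {
        return node.context().describe() + ": " + problem;
    }
}
```

A type checker could then report a problem with Diagnostics.error(node, "division by zero") and obtain a message prefixed with the node's location. Supporting, say, an XML format would mean adding another ContextInfo implementation, leaving the client code untouched.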

Parsing
To create a parser, the concrete syntax presented earlier needs to be married with the abstract syntax classes. This is shown in Listing 2. Note that there will typically be one parser for each concrete syntax supported.

This example is based on JavaCC, but the principle applies to other parser generators: the semantic actions in the matching rules in the parser definition are used to create and instantiate abstract syntax objects, resulting in the creation of an abstract syntax tree corresponding to the input text. This abstract syntax tree will be the input to other components in the tool that require the input text.
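
To make the principle concrete without reproducing Listing 2, here is a small hand-written recursive-descent parser in plain Java. It is a stand-in for the JavaCC-generated parser, not the article's code: the class names (Expr, Const, BinOp, CalcParser) are hypothetical, and the sketch assumes conventional precedence (* and / bind tighter than + and -) and left associativity, which the informal grammar leaves unspecified. The lines that construct Const and BinOp objects play the role of the semantic actions.

```java
// Hypothetical stand-in for a generated parser; names and precedence are assumptions.

/** Base class of the (simplified) abstract syntax. */
abstract class Expr { }

final class Const extends Expr {
    final int value;
    Const(int value) { this.value = value; }
    @Override public String toString() { return Integer.toString(value); }
}

final class BinOp extends Expr {
    final Expr left;
    final char op;
    final Expr right;
    BinOp(Expr left, char op, Expr right) {
        this.left = left;
        this.op = op;
        this.right = right;
    }
    // Fully parenthesised form, so the shape of the tree is visible.
    @Override public String toString() { return "(" + left + " " + op + " " + right + ")"; }
}

final class CalcParser {
    private final String text;
    private int pos = 0;

    private CalcParser(String text) { this.text = text; }

    /** Parses the whole input and returns the root of the abstract syntax tree. */
    static Expr parse(String text) {
        CalcParser parser = new CalcParser(text);
        Expr result = parser.expression();
        parser.skipSpaces();
        if (parser.pos < text.length()) {
            throw new IllegalArgumentException(
                    "Unexpected '" + text.charAt(parser.pos) + "' at offset " + parser.pos);
        }
        return result;
    }

    // expression := term (('+' | '-') term)*
    private Expr expression() {
        Expr left = term();
        while (peekIs('+') || peekIs('-')) {
            char op = text.charAt(pos++);
            left = new BinOp(left, op, term()); // "semantic action": build a node
        }
        return left;
    }

    // term := constant (('*' | '/') constant)*
    private Expr term() {
        Expr left = constant();
        while (peekIs('*') || peekIs('/')) {
            char op = text.charAt(pos++);
            left = new BinOp(left, op, constant());
        }
        return left;
    }

    // constant := one or more digits
    private Expr constant() {
        skipSpaces();
        int start = pos;
        while (pos < text.length() && Character.isDigit(text.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a constant at offset " + pos);
        }
        return new Const(Integer.parseInt(text.substring(start, pos)));
    }

    private boolean peekIs(char c) {
        skipSpaces();
        return pos < text.length() && text.charAt(pos) == c;
    }

    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
    }
}
```

For example, CalcParser.parse("1 + 2 * 3") returns a tree whose toString() is (1 + (2 * 3)). The rest of the tool would work from that tree and never from the input text itself.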

About Paul Mukherjee
Paul Mukherjee works as a consultant for Systematic Software Engineering, and is a Sun Certified Java Programmer and Sun Certified Java Developer. In his role as a consultant he is used to helping to make projects successful but also tries to help the individual members of the project to be better at what they do.
