Simeon Simeonov

Related Topics: Apache Web Server Journal, XML Magazine

SOAP - What is this thing called SOAP? Here's the Background, Part 1

I wanted to kick off this new column on XML protocols with an introduction to the hot newcomer in this arena - Simple Object Access Protocol (SOAP). Trouble is, there are too many ways to go about the topic. The first version I wrote was a technical introduction to SOAP laced with references to some of the important ongoing debates on XML protocols and XML distributed computing. When I finished I found that the two threads - detailed technical information and higher-level issues - interfered with each other. So I decided to focus on the technology in the next issue and devote this space to a discussion of the driving forces behind SOAP.

The Making of SOAP
The following story is pieced together from information volunteered by employees and alumni of Microsoft, DevelopMentor and UserLand (the initial creators of SOAP) as well as Allaire, DataChannel, IBM, W3C and webMethods (other active players in the XML distributed computing space). I welcome additions and/or corrections.

Microsoft started thinking about XML distributed computing in 1997; the name SOAP was coined in early 1998. The goal was to enable applications to communicate via Remote Procedure Calls (RPCs) on top of HTTP. DevelopMentor (a long-standing Microsoft ally) and UserLand (a company that saw the Web as the ultimate publishing platform) joined the discussions. Things moved forward, but as the group tried to involve wider circles, politics at Microsoft stepped in and the process stalled. The DCOM camp at the company disliked the idea of SOAP and believed that Microsoft should use its dominant position in the market to push the DCOM wire protocol via some form of HTTP tunneling instead of pursuing XML. Some XML-focused folks at Microsoft believed the SOAP idea was good but had come too early. Perhaps they were looking for some of the advanced facilities that could be provided by XML schema and namespaces (see my article, "The Evolution of XML Protocols," in XML-J, Vol. 1, issue 3). Frustrated by the deadlock, UserLand went public with a cut of the spec published as XML-RPC in the summer of 1998.

In 1999, as Microsoft was working on XML Data and adding support for namespaces in its technologies, the idea of SOAP gained additional momentum. It was still an XML-based RPC mechanism, however. That's why it met with resistance from the BizTalk (www.biztalk.org) team. The BizTalk model was based more on point-to-point messaging than RPCs. It took people a few months to resolve their differences. In December 1999 SOAP 1.0 was submitted to the Internet Engineering Task Force (IETF). DevelopMentor started an open effort to create a Java SOAP implementation.

In typical fashion the Microsoft marketing/PR machine proudly waved the SOAP banner at every opportunity. SOAP was heralded as the effort that would immediately break down barriers to interoperability and shepherd us into a new era. Given Microsoft's positioning, it wasn't surprising that companies such as IBM and Sun responded very defensively and denounced SOAP as all hype and no substance.

Luckily, common sense prevailed quickly. Microsoft changed its positioning more along the lines of "SOAP is a first step toward the establishment of standards for a ubiquitous XML distributed computing infrastructure." A few months ago, in the beginning of May, SOAP 1.1 was submitted as a Note to the W3C with IBM as a coauthor. This was an unexpected and refreshing change. In addition, the SOAP 1.1 spec was much more extensible, eliminating concerns that backing SOAP implied backing some Microsoft proprietary technology. This, and the fact that IBM immediately released an open Java SOAP implementation that was subsequently donated to the Apache XML Project (xml.apache.org) for open source development, convinced even the greatest skeptics that SOAP is something to pay attention to. Sun reversed its stance and voiced support for SOAP in the beginning of June. Not long after, Microsoft released a base-level SOAP implementation and announced that the new version of BizTalk is going to be based entirely on SOAP.

That's the status as of this writing. To keep your finger on the pulse of SOAP, you should join the XML distributed applications mailing list at the W3C (xml-dist-app@w3.org), the SOAP discussion mailing list at DevelopMentor (soap@discuss.develop.com) and the SOAP mailing lists at the Apache XML Project (soap-user@xml.apache.org and soap-dev@xml.apache.org).

What Should SOAP Do?
SOAP claims to be a specification for a ubiquitous XML distributed computing infrastructure. It's a nice buzzword-compliant phrase, but what does it mean? Let's parse it bit by bit to find out what SOAP should do.

XML surely means that SOAP is based on XML and related standards: namespaces, schema, and so on. So far so good.

Distributed computing implies that SOAP can be used to enable the interoperability of remote applications (in a very broad sense of the phrase). It's a fuzzy term and it means different things to different people and in different situations. Here are some "facets" one can use to think about a particular distributed computing scenario: the protocol stack used for communication, connection management, security, transaction support, marshaling and unmarshaling of data, protocol evolution and version management, error handling and audit trails. The actual requirements for different facets will surely vary between scenarios. For example, a stock ticker service that continuously distributes stock prices to a number of subscribers will have needs different from those of an e-commerce payment processing service. The stock ticker service will probably need no support for transactions and only minimal - if any - security or audit trails (it distributes publicly available data). The e-commerce payment processing service will require Cerberean security, heavy-duty transaction support and full audit trails.
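
To make the facet idea concrete, here is a minimal Python sketch that records three of the facets above for the two example services and reports where they differ. The class name, the field names and the "none"/"minimal"/"strong" security levels are assumptions made purely for illustration; none of them come from SOAP or any other specification.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScenarioRequirements:
    """Hypothetical record of what one scenario needs from each facet."""
    name: str
    transactions: bool   # does the scenario need transaction support?
    security: str        # assumed levels: "none", "minimal" or "strong"
    audit_trail: bool    # does the scenario need a full audit trail?


# The two scenarios described in the text above.
STOCK_TICKER = ScenarioRequirements(
    name="stock ticker", transactions=False, security="minimal", audit_trail=False
)
PAYMENTS = ScenarioRequirements(
    name="payment processing", transactions=True, security="strong", audit_trail=True
)


def differing_facets(a, b):
    """Return the names of the facets on which two scenarios disagree."""
    return [
        f.name
        for f in fields(a)
        if f.name != "name" and getattr(a, f.name) != getattr(b, f.name)
    ]


print(differing_facets(STOCK_TICKER, PAYMENTS))
```

Even with only three facets, the two services disagree on every one of them, which is the point of the comparison.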

Infrastructure implies that SOAP is aimed at low-level distributed systems developers, not developers of application/business logic or business users. Infrastructure products such as application servers will become "SOAP enabled." SOAP will work behind the scenes making sure your applications can interoperate without your having to worry too much about it.

Ubiquitous means omnipresent, universal. On first look it seems to be a meaningless term, thrown into the phrase to make it sound grander. It turns out, however, this is probably the most important part. The ubiquity goal of SOAP is both a blessing and a curse. It's a blessing because if there are SOAP-enabled systems everywhere on the Internet, it should be easier to do distributed computing. After all, that's what SOAP is all about. However, the ubiquity of SOAP is also a curse because one technology specification should be able to support many different types of distributed computing scenarios, from the stock ticker service to the e-commerce payment processing service. To meet this goal SOAP needs to be a highly abstract and flexible technology. However, the more abstract SOAP gets, the less support it will provide for specific distributed computing scenarios. This is the eternal tug of war between generality and specificity.

Let's look back at some of the facets of distributed computing and make a best guess about the bare minimum in variations of distributed computing scenarios SOAP should be a base for. Table 1 could be considered a start in this direction. Extensible means that someone can extend how a facet of a new scenario can be handled without having to modify the core specification. Pluggable means that the specification allows for introducing a completely new mechanism for handling a certain facet.
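
One way to picture the difference between the two terms is the following Python sketch. It is not taken from any SOAP specification: the `Core` class, its method names and the use of JSON as the default serializer are all assumptions chosen to keep the example short. Registering a new header handler stands in for "extensible" (new behavior is added while the core stays untouched), and swapping the serializer stands in for "pluggable" (an entirely different mechanism takes over one facet).

```python
import json


class Core:
    """Hypothetical core that is never edited by the code using it."""

    def __init__(self):
        self.header_handlers = {}     # extensible: handlers are added by name
        self.serializer = json.dumps  # pluggable: the whole mechanism can be replaced

    def register_header(self, name, handler):
        """Extend the core with a handler for a new kind of header."""
        self.header_handlers[name] = handler

    def process(self, headers, payload):
        """Run known header handlers, skip unknown headers, serialize the payload."""
        notes = [
            self.header_handlers[name](value)
            for name, value in headers.items()
            if name in self.header_handlers
        ]
        return notes, self.serializer(payload)


core = Core()

# Extensible: teach the core about a "priority" header without modifying it.
core.register_header("priority", lambda value: f"priority={value}")
print(core.process({"priority": "high", "unknown": "x"}, {"price": 10}))

# Pluggable: replace the serialization mechanism wholesale.
core.serializer = repr
print(core.process({}, {"price": 10}))
```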

Wow! What are the chances that a single specification can address the details of all these scenarios? Pretty darn close to zero if you ask me. That's why I can't stress enough that SOAP (at least in its current form) is just a start, a base specification that addresses problems common to all distributed computing scenarios. Saying that you're going to use SOAP for XML distributed computing is like saying you're going to use XML for structured data representation or ASCII for writing text. It's a fairly meaningless statement until you specify a whole lot more: protocols, security, transactions, data marshaling, communication patterns, and so on. Be realistic; don't let the SOAP hype take over!

What Is SOAP, Really?
Despite the hype, SOAP is still of great importance because it's the industry's best effort to date to standardize on the infrastructure technology for XML distributed computing. There is general consensus in the industry that SOAP should be focused on the common aspects of all distributed computing scenarios and therefore needs to provide:

  • An extensibility mechanism so that evolution isn't hindered and there's no lock-in. XML, schema and namespaces really shine here.
  • Packaging of information in a clearly identifiable SOAP "request" or "message." This is done via a SOAP "envelope" that encloses all other information.
  • Identification of the "body" of the request/message where potentially arbitrary XML could be used.
  • Communication of additional information that sits outside the body (via a notion of "headers"). Headers can be used to build more complex protocols on top of SOAP.
  • Basic error handling. This is done via the notion of a SOAP "fault."
  • Data serialization. SOAP specifies a default mechanism for data serialization, but others can be added.

Unfortunately, there's considerably less consensus on how higher-level issues should be addressed. Nearly everyone agrees, however, that to tackle the broad spectrum of interesting problems we're faced with we need to work in parallel on a set of layered specifications for XML distributed computing.
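
Returning to the base-level building blocks listed above, the following Python sketch shows roughly how the envelope, header, body and fault fit together. The envelope namespace URI and the `faultcode`/`faultstring` element names are those of the SOAP 1.1 Note; everything else (the `urn:example:quotes` namespace, the `Transaction` header, the `GetLastTradePrice` call and the helper function names) is hypothetical and exists only for illustration. Treat it as a sketch of the message shape, not as a conforming SOAP implementation.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"                           # hypothetical application namespace

# Serialize the envelope namespace with the conventional SOAP-ENV prefix.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def q(namespace, tag):
    """Build a namespace-qualified tag in ElementTree's {uri}tag notation."""
    return f"{{{namespace}}}{tag}"


def build_request(symbol, transaction_id):
    """Envelope with an (assumed) transaction header and one call in the body."""
    envelope = ET.Element(q(SOAP_ENV, "Envelope"))
    header = ET.SubElement(envelope, q(SOAP_ENV, "Header"))
    ET.SubElement(header, q(APP_NS, "Transaction")).text = transaction_id
    body = ET.SubElement(envelope, q(SOAP_ENV, "Body"))
    call = ET.SubElement(body, q(APP_NS, "GetLastTradePrice"))
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def build_fault(code, message):
    """Envelope whose body carries a SOAP fault instead of a result."""
    envelope = ET.Element(q(SOAP_ENV, "Envelope"))
    body = ET.SubElement(envelope, q(SOAP_ENV, "Body"))
    fault = ET.SubElement(body, q(SOAP_ENV, "Fault"))
    ET.SubElement(fault, "faultcode").text = code
    ET.SubElement(fault, "faultstring").text = message
    return ET.tostring(envelope, encoding="unicode")


def read_fault(xml_text):
    """Return (faultcode, faultstring) if the message is a fault, else None."""
    root = ET.fromstring(xml_text)
    fault = root.find("soap:Body/soap:Fault", {"soap": SOAP_ENV})
    if fault is None:
        return None
    return fault.findtext("faultcode"), fault.findtext("faultstring")


print(build_request("DIS", "txn-5"))
print(read_fault(build_fault("SOAP-ENV:Server", "Backend unavailable")))
```

Note how little the envelope itself dictates: the header and body contents live in the application's own namespace, which is where the extensibility, and the risk of 10 incompatible conventions, comes from.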

Among other things, the XML Protocols Shakedown panel at WWW9 in May focused heavily on the scope of SOAP and on what other specifications we need to put in place to make the technology truly usable. Dan Connolly of the W3C chaired the panel. Present among the panelists and audience were most of the XML distributed computing gurus of our time. Many in the group tended to agree that the important specific problem areas that face us are:

  • Basic RPC, the common model for exposing application "services"
  • Basic point-to-point and publish/subscribe messaging, the dominant mechanism for loosely coupled application integration
  • Programming language data structure mapping, à la WDDX, to enable logical data interoperability
  • Robust security, without which most businesses won't adopt XML distributed computing
  • Full transaction management, required by mission-critical services such as e-commerce-related technologies

Please, heed this warning: If we don't try to establish standards in these areas, the benefits of using SOAP will be severely limited. While the extensibility of SOAP is a great thing, it also means that 10 companies can come up with 10 ways to do messaging/security/whatever with it and interoperability will be lost. Don't get me wrong, I'm not a standards fascist. However, I do believe that if the whole industry is faced with common problems, we'll all benefit by sharing common yet extensible standards. Commonality enables interoperability. And extensibility is key because it eliminates lock-in.

When faced with the trade-off between generality and specificity, complexity and simplicity, we've generally adopted a combination of layering and parallel development. Layered horizontal specifications can be used to address increasingly broad problem sets. Parallel development of vertical specifications can quickly deliver solutions for separate problem domains. This is likely to be how SOAP and related technologies will evolve. Of the problem areas above, the first two qualify as vertical, the last three as horizontal. Figure 1 illustrates one way they can be organized.

The Road Ahead
The SOAP development community is working hard. There have been some great discussions on security on the DevelopMentor SOAP mailing list. Allaire, IBM and UserLand are thinking about application data type mapping. Microsoft is toying with the idea of using the Transaction Internet Protocol (TIP, RFC 2371) it developed together with Tandem as a means of transaction management. The base SOAP specification includes ideas on how RPCs should be handled, and IBM, Microsoft and third parties have developed competing specifications for describing object interfaces and Web services that can be exposed via SOAP. Despite this excitement, the future of SOAP and specifications related to it is uncertain because of the usual politics and conflicting goals of the major players and standards organizations.

The major vendors backing SOAP aren't all on the same page. Microsoft stands firmly behind the philosophy of releasing some baseline technology quickly and letting developers run with it. Companies suspicious of Microsoft interpret this as a move to standardize on the base-level SOAP spec and then use this to put a "standards approved" seal on any proprietary WinDNA extensions to SOAP. IBM and Sun haven't communicated a clear strategy except that they're interested in high-end enterprise-quality technology and are likely to push SOAP that way. The smaller players vary in positioning from publishing companies that don't care even about security to e-business platform vendors that want the technology to scale nicely from simple to complex scenarios.

The standards organizations have plenty to think about, too. SOAP was initially submitted as an IETF draft and then as a W3C Note. This move created some discontent. Although the IETF and the W3C have worked together on some projects, in general they're quite protective of their turf. Furthermore, while the base-level SOAP spec fits the scope of W3C quite well, higher-level specs (transactions, messaging, etc.) seem to be clearly outside it. What organization is going to spearhead standardization in these areas? OASIS is one possible choice. However, there's a lot of overlap between OASIS work and Microsoft's BizTalk framework. There's a good chance that Microsoft is going to be upset if OASIS gets involved in a big way.

I hope as an industry we can figure a way out of this mess quickly, because as you've seen, SOAP by itself just doesn't do that much. We need to put a lot more work into it before it becomes usable by the mass market. To see just how little SOAP does (but how well it does it!), in the next XML in Transit column we'll look at the technical aspects of the latest SOAP spec. The material from this issue's column will help you put things in perspective and evaluate the design trade-offs.

More Stories By Simeon Simeonov

Simeon Simeonov is CEO of FastIgnite, where he invests in and advises startups. He was chief architect or CTO at companies such as Allaire, Macromedia, Better Advertising and Thing Labs. He blogs at blog.simeonov.com, tweets as @simeons and lives in the Greater Boston area with his wife, son and an adopted dog named Tye.
