In my last XML in Transit column (XML-J, Vol. 2, issue 5) we looked in detail at the technical aspects of the service description layer of the Web Service interoperability stack (see Figure 1). In fact, the topic of our discussion - Web Services Description Language - has now been submitted to the W3C for review.
In this column I originally planned to go a step higher and start to map out the space of Web Services advertising and discovery. Instead, I decided to place the subject of service discovery and advertising within the context of a broader framework for using Web Services.
The Publish-Find-Bind Cycle
Perhaps the best way to motivate the need for Web Service discovery is to look at the typical roles and responsibilities of the participants in Web Service interactions. There are three key roles.
Of course, these three roles are stereotypes and there are many scenarios where there's some amount of overlap between real-world entities and the roles they're in. Regardless, the three roles (provider, broker, and requester) and the three Web Service-related tasks (publishing, finding, and binding) provide a good, basic workflow model. You'll often hear it referred to as the "publish-find-bind" process (see Figure 2).
Making It Work
Let's take a closer look at this process to understand how it works. Before anything useful happens the provider must have its service implementation ready as well as some way to access it (e.g., SOAP).
The publishing process involves the exchange of information between the service provider and the service broker. We can view this exchange as a registration of the service with the broker. Two broad categories of information can change hands:
The process of finding a Web Service involves a service requester going to well-known service brokers and executing a search for Web Services that meet some set of parameters. For example, my company's lunch-order intranet application may go looking for (1) restaurants with (2) Web Service-enabled ordering systems that are (3) open at lunch and (4) can deliver to the company's address. Even a trivial search like this combines contact information, geographic location, business hours, and service availability. Because potential searches can be this complex, it's important that the mechanism for accessing service brokers is itself based on Web Services, so we know it has extensible support for complex data encodings. Of course, it would be rather nice if we had a standardized API for accessing all service brokers so we're not locked in to any one of them.
After we've identified the Web Service we want to use we have to bind our applications to it. For this we need to get the low-level technical information about the service described in its WSDL spec. The likely scenario is that the service broker will point us to a location (a URL) where we can find that spec. Our Web Service-access software can look at the spec and adapt to accessing that service.
This adaptation generally takes one of two forms. For more dynamic invocation scenarios our Web Service-access software can just tune itself to encode and decode data in a particular way and to use a particular communication protocol for the duration of that service. In scenarios in which developers will want to write long-lived code that accesses that particular service, we can go in another direction and have the binding process generate a proxy for the Web Service. A proxy is a software component that encapsulates the details of accessing a Web Service and presents developers with a simple API in the programming language of their choice. For example, the Java proxy for a stock quote service may expose a simple method along the lines of:
double getLastTradePrice(String symbol) throws WebServiceException;
When working with Web Service proxies developers don't have to worry about any XML details. They can maximally leverage the capabilities of their runtime environment for type checking and validation. Further, it's likely that the code for accessing Web Services via pregenerated proxies will be a little more efficient than the more dynamic code that, on the spot, has to identify the service location, the type of protocol being used, and the data-encoding rules.
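To make the idea of a proxy more concrete, here is a minimal hand-written sketch of what a generated Java proxy for the stock quote service might look like. Everything except the getLastTradePrice signature is an assumption made for illustration: the Transport interface, the StockQuoteProxy class, the constructor arguments, and the simplified request format (a real proxy would build a proper SOAP envelope from the WSDL and escape the symbol).

```java
// Hypothetical exception type, named after the one in the method signature above.
class WebServiceException extends Exception {
    WebServiceException(String message) { super(message); }
}

// The typed API that developers program against.
interface StockQuoteService {
    double getLastTradePrice(String symbol) throws WebServiceException;
}

// Hypothetical transport abstraction: sends a request body to an endpoint
// and returns the response body. A real one would speak SOAP over HTTP.
interface Transport {
    String send(String endpoint, String requestBody) throws WebServiceException;
}

// Stand-in for a generated proxy: it hides encoding, decoding, and transport.
class StockQuoteProxy implements StockQuoteService {
    private final String endpoint;
    private final Transport transport;

    StockQuoteProxy(String endpoint, Transport transport) {
        this.endpoint = endpoint;
        this.transport = transport;
    }

    @Override
    public double getLastTradePrice(String symbol) throws WebServiceException {
        // Simplified encoding (assumption): not a real SOAP envelope.
        String request = "<getLastTradePrice><symbol>" + symbol
                + "</symbol></getLastTradePrice>";
        String response = transport.send(endpoint, request);
        try {
            // Simplified decoding (assumption): the response body is just a number.
            return Double.parseDouble(response.trim());
        } catch (NumberFormatException e) {
            throw new WebServiceException("Unexpected response: " + response);
        }
    }
}
```

Calling code sees only StockQuoteService, so the compiler checks argument and return types and no XML handling leaks into the application.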
After the binding is complete, the service requester can directly access the service. The whole process is shown in Figure 3. To make repeated invocations efficient, the binding information needs to be cached so that the binding occurs only once. This is automatically done when proxies are used. In more dynamic invocation scenarios other caching strategies become available. The simplest one is to keep a cached local copy of the WSDL specification for a service so there's no need to make a request to a service broker every time you want to access the service.
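The simplest caching strategy described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the BindingCache class, the use of a string service key, and the representation of binding information as a plain string (for example, the text of a WSDL document) are all hypothetical, and the broker lookup is passed in as a function rather than tied to any real broker API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical local cache of binding information, keyed by a service identifier.
class BindingCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> brokerLookup;

    // brokerLookup stands in for a request to the service broker.
    BindingCache(Function<String, String> brokerLookup) {
        this.brokerLookup = brokerLookup;
    }

    // Returns the cached binding, contacting the broker only on a cache miss.
    String bindingFor(String serviceKey) {
        return cache.computeIfAbsent(serviceKey, brokerLookup);
    }

    // Drops a cached entry so the next access goes back to the broker.
    void invalidate(String serviceKey) {
        cache.remove(serviceKey);
    }
}
```

With a cache like this, repeated invocations of the same service cost one broker request in total, until the entry is explicitly invalidated.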
Making It Robust
The previous section described a process for using a Web Service that's simple and efficient, provided there are no changes to the service. Here are some changes we're likely to experience:
To make the model for using Web Services robust, we need a mechanism for clearing our binding cache, or "rebinding." The simplest mechanism that's also quite efficient is retry-on-failure. When we need to access a Web Service we start by using our cached binding information. If the operation fails, we go to the service broker and obtain updated binding information for that service. There are two caveats here. First, we need an efficient mechanism for getting directly to the binding information without executing an expensive search. Second, we need to know which types of errors should trigger a request for updated binding information. It makes sense to do this for any kind of error that indicates the service is unavailable or has changed. For example, if we're accessing a SOAP-based service over HTTP, it makes sense to request the updated binding information if we get back HTTP 404 (not found). On the other hand, errors related to processing on the server probably shouldn't cause us to think the service has changed.
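The second caveat amounts to a small decision rule, which might be sketched as follows for the HTTP case. Only the 404 case comes from the discussion above. Treating 410 (Gone) and 503 (Service Unavailable) as rebind triggers is an assumption on my part, as is the RebindPolicy class name. Status 500 is deliberately excluded because SOAP 1.1 over HTTP uses it to carry faults raised during server-side processing.

```java
// Hypothetical policy: decides whether an HTTP status should trigger rebinding.
final class RebindPolicy {
    private RebindPolicy() {}

    static boolean shouldRebind(int httpStatus) {
        switch (httpStatus) {
            case 404: // not found: the service has probably moved
            case 410: // gone (assumption): the service was deliberately removed
            case 503: // service unavailable (assumption): possibly a backup exists
                return true;
            default:
                // Includes 500, which SOAP 1.1 uses for server-side faults.
                return false;
        }
    }
}
```

Connection-level failures, such as a refused connection or an unknown host, never produce a status code at all, so a fuller policy would need to classify those separately.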
In fact, in some cases we'll have no good way of knowing whether or not a service has changed based on the service-level error messages. Therefore, it's extremely important that service providers follow a good old principle of distributed computing - once a service is published at a particular location, it's immutable. If the service is to evolve and its interface is to change, it should be published to a different location. In this way, requesters using old cached information will get a "not found" type error and know to rebind. The new binding information will point requesters to the new service location.
When we get the new binding information we have to compare it to our cached information. If they're the same, we're experiencing some type of unplanned service failure. We should probably wait a while before retrying the service. If we experience failure again, we should attempt to rebind again in the hope that the service provider has updated the binding information to point to a backup service. We may have to repeat this cycle more than once.
If the new binding information is different, we should rebind and access the service. At this point we're likely to succeed. As you can see, the simple retry-on-failure mechanism can handle a surprisingly large number of failure scenarios with great efficiency of interaction.
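Putting the pieces together, the retry-on-failure cycle might look something like the Java sketch below. All names here (ServiceUnreachableException, BrokerClient, ServiceCall, RebindingInvoker) are hypothetical, binding information is modeled as a plain string, and the fixed wait and the cap on attempts are assumptions added so that the loop terminates. The sketch also assumes the caller has already classified the failure as one that warrants rebinding.

```java
// Hypothetical: signals that the service looks unavailable or moved (e.g., HTTP 404).
class ServiceUnreachableException extends Exception {
    ServiceUnreachableException(String message) { super(message); }
}

// Hypothetical direct lookup of binding information, with no expensive search.
interface BrokerClient {
    String fetchBinding(String serviceKey);
}

// Hypothetical service invocation that uses a given binding.
interface ServiceCall<T> {
    T invoke(String binding) throws ServiceUnreachableException;
}

class RebindingInvoker {
    private final BrokerClient broker;
    private final String serviceKey;
    private final int maxAttempts;
    private final long waitMillis;
    private String cachedBinding;

    RebindingInvoker(BrokerClient broker, String serviceKey, int maxAttempts, long waitMillis) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        this.broker = broker;
        this.serviceKey = serviceKey;
        this.maxAttempts = maxAttempts;
        this.waitMillis = waitMillis;
    }

    <T> T call(ServiceCall<T> call) throws ServiceUnreachableException, InterruptedException {
        if (cachedBinding == null) {
            cachedBinding = broker.fetchBinding(serviceKey); // initial bind
        }
        ServiceUnreachableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.invoke(cachedBinding); // try the cached binding first
            } catch (ServiceUnreachableException e) {
                last = e;
                if (attempt == maxAttempts) break; // out of attempts, give up
                String fresh = broker.fetchBinding(serviceKey);
                if (fresh.equals(cachedBinding)) {
                    // Same binding: unplanned failure, so wait before retrying.
                    Thread.sleep(waitMillis);
                } else {
                    // Binding changed: rebind and retry immediately.
                    cachedBinding = fresh;
                }
            }
        }
        throw last;
    }
}
```

In the common case the broker is contacted once, at the initial bind. It is contacted again only after a failure, which is what keeps this scheme cheap.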
In next month's XML in Transit we'll look at ways to make the usage of Web Services highly scalable as the number of service providers, brokers, and requesters grows.