Web Services are NOT Distributed Objects
I wrote an article for IEEE Internet Computing about the many misconceptions that still surround the fundamentals of web services, titled 'Web Services are NOT Distributed Objects: Common Misconceptions about Service Oriented Architectures'. It deals with the confusion between web services and distributed objects and between web services and RPC, the fact that there are more bindings than just HTTP, the relation between web services and web servers, (un)reliable web services, and why debugging web services is hard but not impossible.
The article is still in 'draft mode', but should be relatively stable. I would very much appreciate feedback and comments, but keep in mind that the audience of the article is the general IT professional, not the web service specialist.
The text below was the first draft of this article. Many comments improved the article before it was published.
The full and final text of this article is now available from this page.
I am leaving the draft article online at this page to show the evolution of the article.
Web Services are not Distributed Objects: Common Misconceptions about Service Oriented Architectures
Dept. of Computer Science, Cornell University
Web services are frequently described as the new incarnation of distributed object technology. This is a serious misconception, made by people from industry and academia alike, and this misconception seriously limits a broader acceptance of the true web services architecture. Even though the architects of distributed systems and internet systems alike have been vocal about the fact that these technologies hardly have any relationship, it appears to be difficult to dispel the myth that they are tied together. In this article I revisit the differences between web services and distributed objects in an attempt to make it clear that web services are an internet-style distributed systems technology that does not rely on, or require, any form of distributed object technology.
Unfortunately, the mix-up about web services and distributed object systems is not the only misconception that is commonly heard. There are at least a dozen other popular statements about web services that are partially incorrect or just plain wrong. This article also contains clarifications of a number of these common misconceptions about web services.
When I visited the WWW 2003 Conference, Peter M. asked me:
“Don’t you think that web services will fail, just like all the other distributed object technologies people have tried to build?”
Peter is a smart and gifted Internet architect, but this statement baffled me. How is it possible that someone like Peter still views web services as distributed object technology? Peter is not alone in his stubbornness in continuing to address web services as distributed objects. Many developers, architects, managers and academics still see web services as the next episode in the continued saga of distributed object technologies such as CORBA, DCOM and RMI. Web services and distributed object systems are both distributed systems technologies, but that is where the common ground ends. They have no real relation to each other, except maybe for the fact that web services are now sometimes deployed in areas where in the past the application of distributed objects has failed. If we look for relationships within the distributed technology world, it is probably more appropriate to associate web services with messaging technologies, as they share a common architectural view but address different types of applications.
Web services are based on XML documents and document exchange, and as such one could call the technological underpinning of web services document-oriented computing. Exchanging documents is a very different concept from requesting the instantiation of an object, requesting the invocation of a method on the specific object instance, receiving the result of that invocation back in a response, and after a number of these exchanges, releasing the object instance.
This misconception does not stand by itself; there are about a dozen similar statements that I frequently encounter that fall more or less in the same category. Popular ones are “Web services is just RPC for the Internet” and “You need HTTP to make web services work.” Below I will try to address a number of the more popular misconceptions. First, however, I will try to establish what a web service actually is in its purest, minimalist form. I believe much of the confusion comes from press and vendor hype, which lacks the technical depth needed to make people understand the real concepts. Of course, the political bickering among standards bodies such as W3C, OASIS, and WS-I doesn’t help to clarify the simple, interoperable nature of web services.
Minimal web services defined
To realize why most of these issues are misconceptions, it is necessary to cut through all of the hype we have seen in the press and from vendors. If we bring back web services to a minimalist core there are three components that make up a web service:
- The Service. This is a software component that is capable of processing an XML document it has received through some combination of transport and application protocols. How this software component is constructed, whether object-oriented techniques have been used, if it operates as a stand-alone process, is part of a web or application server, or is merely a thin layered front-end for a massive enterprise application is not of any importance. The only requirement for a Service Process is that it is capable of processing certain well-defined XML documents.
- The Document. The XML document that is sent to a service to be processed is the keystone of a web service, as it contains all the application-specific information. The documents a web service can process are described using an XML schema, and two processes that are engaged in a web services conversation need to have access to the same description to make sure that the documents that are exchanged can be validated and interpreted. This information is commonly described using the Web Services Description Language (WSDL).
- The Address. Also called a port-reference, this is a protocol binding combined with a network address that can be used to access the service. This reference basically identifies where the service can be found when a particular protocol (e.g. TCP or HTTP) is used.
In principle this should be enough to build a web service, but in practice at least one more component is added to it:
- The Envelope. This is a message encapsulation protocol that ensures that the XML document to be processed is clearly separated from other information the two communicating processes may want to exchange. This allows, for example, routing and security information to be added to the message without the need to modify the XML document. The protocol that is used for almost all web services is SOAP, which originally stood for “Simple Object Access Protocol.” This naming was a mistake as the protocol has nothing to do with accessing objects, and since the SOAP 1.2 specification the protocol name is now used without expanding the acronym. The SOAP message itself, also called the ‘soap-envelope’, is XML and consists of two possible elements: a soap-header, in which all the system information is kept, and a soap-body, which contains the XML document that is to be processed by the web service.
These four components are all that it takes to use a web service. Whether you use your text editor to construct a SOAP message to send in an email, or use an automatically generated proxy-client from within your favorite programming language, this is all that is needed to make it work.
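To make the envelope and document components concrete, here is a minimal sketch in Python using only the standard library. The purchase-order document, its `urn:example:orders` namespace and the `Priority` header are hypothetical and exist only for illustration; the envelope namespace is the one defined by SOAP 1.2. A real service would also validate the document against its schema, which is omitted here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
ORDER_NS = "urn:example:orders"  # hypothetical application namespace


def build_envelope(document, headers=()):
    """Wrap an application document in a SOAP envelope (header is optional)."""
    ET.register_namespace("env", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if headers:
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(headers)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(document)
    return ET.tostring(envelope, encoding="utf-8")


def extract_document(message):
    """First step of any service: parse the envelope, pull out the body document."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("SOAP body is missing or empty")
    return body[0]


# The Document: a hypothetical purchase order.
order = ET.Element(f"{{{ORDER_NS}}}PurchaseOrder")
ET.SubElement(order, f"{{{ORDER_NS}}}Item").text = "widget"
ET.SubElement(order, f"{{{ORDER_NS}}}Quantity").text = "3"

# System information travels in the header and leaves the document untouched.
priority = ET.Element(f"{{{ORDER_NS}}}Priority")
priority.text = "gold"

message = build_envelope(order, headers=[priority])
received = extract_document(message)
```

Nothing in this sketch mentions objects, methods or references: the sender produces bytes, and the receiver recovers a document from them.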
The document-oriented distributed computing world of web services is all about the design of the documents you want to exchange. Protocols and addresses are necessary only as glue to get the documents to the right places. Although web services are centered around documents, this does not mean these documents are targeted to be read by humans. The goal of web services is to enable machine-to-machine communication at the same scale and using the same style of protocols as the human-interface-centered World Wide Web.
Not a misconception: web services are really simple.
At its core, web services technology is really simple; the only thing it does is use standard Internet protocols to move XML documents between service processes. This simplicity guarantees that its primary goal of interoperability can be achieved.
The simplicity also means that many of the more complex distributed applications cannot be easily built without adding other technologies to the basic web services. Over time we will see that the issues the vendors are now bickering about, such as reliability, transactions and asynchronous processing, will become reality in an interoperable manner. The process around security extensions, for example, gives us reasonable hope that vendors are capable of reaching agreement on a set of interoperable primitives.
On the other hand, the process around reliable messaging has many of the distributed system specialists scared to death. In an attempt to preempt the release of the reliable messaging specification by IBM, Microsoft, BEA, and Tibco, a consortium led by Sun Microsystems and Oracle published a reliable messaging specification that was little more than a cut-and-paste effort from the reliability section of ebXML. This was clearly a specification that was released too early under vendor-political pressure, as the specification was ambiguous in many places, incomplete in others, and riddled with errors throughout the document. Any company implementing this specification would end up with a very unreliable system, and as such this specification is a disservice to the community. If there is one threat to web services succeeding at large scale, it will be vendor politics.
As I described in the previous section, document-oriented computing centers around the design of the document; the rest of the web service glue is just support technology to get the document to the right place in the right manner. In contrast with the simplicity of the basic web service technology, the documents can be extremely rich and complex. For example, a web services system I have worked on for the US Air Force publishes flight plans that can easily be up to a megabyte in size. Encoding these rich documents in XML ensures that the documents are extensible at predefined places without breaking any of the existing document consumers.
Misconception #1: Web services are just like distributed objects
Given the strong similarities between web services and distributed objects, it is understandable why the misconception exists that they are the same thing. After all, both have some sort of description language, both have well-defined network interactions, and both have a similar mechanism for registering and discovering available components. What contributes to the misconception that these are similar technologies is that many tool vendors provide simple object-oriented techniques for implementing web services, which give them the appearance of distributed objects. A number of these vendors have a long history in selling distributed object technology and as such have a strong interest in molding web services such that they appear to be a next step in the evolution of distributed object systems.
A first thing to realize however is that the current state of web services technology is very limited compared to distributed object systems. The latter is a well-established technology with very broad support, strong reliability guarantees, and many, many support tools and technologies. For example, web services toolkit vendors have only just started to look at the reliability and transactional guarantees that distributed object systems have supported for years.
An important aspect at the core of distributed object technology is the notion of the object life cycle: objects are instantiated by a factory upon request, a number of operations are performed on the object instance, and sometime later the instance will be released or garbage collected. A special case is the singleton object, which does not go through the instantiate/release cycle. But in both cases the object is identified through a reference, and this reference can be passed around between processes to provide a unique mechanism to access the object. Objects frequently contain references to other objects, and distributed object technology comes with extensive reference management techniques to support correct object lifetime management.
This notion of object reference is essential; without it there is no distributed object system. It is also important to realize that with this reference the caller has a mechanism to return to the same object over and over again and as such access the same state. Distributed object systems enable stateful distributed computing. The state of the object is accessed through a well-defined interface that is typically described in an interface definition language (IDL).
Web services have none of the characteristics of distributed object systems. There is no notion of an object, object reference, factories or life cycle. There is no notion of an interface with methods, data structure serialization or reference garbage collection. The only technology web services have is XML documents and document encapsulation.
With a bit of a stretch, one could force an analogy between a web service and a singleton object. However, such a singleton object would need to be very restrictive to make the comparison work. At the basic level web services cannot offer any of the stateful distributed computing facilities that distributed object systems support as basic functionality.
The difference between the two technologies is also obvious when we look at how information flows between client and server or producer and consumer. In the distributed object system the richness of the information flow is encapsulated in the interfaces an object supports, but in a web services system, the richness of the information flow comes from the design of the XML documents that are passed around.
Another important difference between these two technologies is in the style of distributed computing that they enable. Distributed object systems enable what is often called stateful computing; the remote object on the server can contain data and state that the client can operate on during the lifetime of the object. If a reference to an object is handed to a different application, that process will encounter the same state when accessing the referenced object. Web services, however, have no notion of state, and they fall into the category of distributed system techniques that enable stateless computing. In web services the state of the interaction is contained within the documents that are exchanged. Whether a service is ever able to be truly stateless is disputable; if a web service document includes a customer identification number, which the service then uses to retrieve customer information (state) from a database, does this still constitute statelessness? Identifying stateless versus stateful distributed components should be seen as a way of categorizing technologies, more than strict architectural guidance. In the context of this categorization distributed objects and web services are in opposite camps.
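The contrast between the two styles can be sketched in a few lines of Python. Everything here is a simplified assumption: `CartServer` is a hypothetical in-process stand-in for a remote object system, integer keys play the role of object references, and a plain dict stands in for the XML document a web service would receive.

```python
import itertools


class CartServer:
    """Distributed-object style: the server holds per-instance state and the
    client holds a reference that leads back to that state."""

    def __init__(self):
        self._instances = {}
        self._refs = itertools.count(1)

    def create(self):
        ref = next(self._refs)  # factory step: instantiate and hand out a reference
        self._instances[ref] = []
        return ref

    def add_item(self, ref, item):
        self._instances[ref].append(item)  # operate on the referenced instance

    def items(self, ref):
        return list(self._instances[ref])

    def release(self, ref):
        del self._instances[ref]  # end of the object life cycle


def process_order(document):
    """Document style: the request carries everything the service needs, and
    nothing about the interaction is kept between invocations."""
    return {
        "customer": document["customer"],
        "item_count": len(document["items"]),
        "status": "accepted",
    }


# Stateful: several calls against one reference, then an explicit release.
server = CartServer()
cart = server.create()
server.add_item(cart, "widget")
server.add_item(cart, "gadget")
cart_contents = server.items(cart)
server.release(cart)

# Stateless: one self-contained document in, one document out.
reply = process_order({"customer": "c-42", "items": ["widget", "gadget"]})
```

Handing `cart` to another caller would expose the same state; there is nothing comparable to hand over in the document style, because the document itself is the state of the interaction.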
At the basic level web services have no notion of a relationship between two service invocations at the same service or at related services. The distributed systems that can be built without identifying relationships between components in a computation are very limited and as such one of the first advanced web service specifications that was released dealt with Coordination. This enables multiple services and their consumers to establish a context for their interaction. It is a misconception to see this context as a weak form of object references, as it references an ongoing conversation and does not reference any state at the services.
Distributed object technology is very mature and robust, especially if you restrict its usage to those environments for which it has been designed: the corporate intranet with often homogeneous platforms and predictable latencies. The strength of web services is in internet-style distributed computing, where interoperability and support for heterogeneity in terms of platforms and networks are essential. Over time web services will need to incorporate some of the basic distributed systems technologies that also underpin distributed object systems, such as guaranteed, in-order, exactly-once message delivery. It is unlikely, however, that web services can simply adapt the technology used in the distributed object systems to achieve the same properties.
There are two known approaches in which web services and distributed object technologies can work together. First, there is the approach of wrapping certain objects from an object system, such as J2EE, with a web service. This of course has its limitations and cannot be done for just any object. See Steve Vinoski’s article on interaction models to learn more about this approach. A second approach that can be observed is to use web service protocols such as SOAP as the transport layer for the distributed object system. This is sometimes used to tunnel object-specific interactions over HTTP. It is, however, a poor man’s choice, as alternative solutions such as GIOP are better suited for that interaction pattern.
Misconception #2: Web services is RPC for the Internet
RPC provides a network abstraction for the remote execution of procedure calls in a programming language. It provides mechanisms for identifying a remote procedure, for deciding which arguments to the procedure are ‘in’ arguments and as such need to be provided to the remote procedure at invocation time, and which arguments are ‘out’ arguments and need to be presented to the caller at completion time. It also includes extensive mechanisms for handling errors both at the runtime and the programming level.
Web services in their basic form provide only a networking abstraction for the transfer of XML documents, and the processing of these documents by a remote service entity. Web services have a notion of ‘actor’ or ‘role’ that identifies the service that should consume the document, but there are no predefined semantics associated with the content of the XML document sent to the service.
An RPC-style interaction could be implemented using pairs of SOAP messages and a transport such as HTTP. One would use certain fixed rules for encoding the arguments in an XML document and rules for returning the results to the caller.
The original web service architects assumed that this would be a popular form of using web services and even included a specific encoding in the SOAP specification called RPC/encoded to help with the encoding of data types. However in the SOAP 1.2 specification this encoding has become optional and tool builders are no longer required to implement it, and preference is given to the document/literal encoding.
Even though we like to look at web services as just XML document processors, this doesn’t help the developers that need to build web services and web service clients. Tool vendors will do their best to provide infrastructure that allows traditional procedure calls to be applied to simple web services. For example, Microsoft’s Web Services Enhancements 2.0 toolkit provides a set of object types that can be used to implement a request/response style interaction, where the programming infrastructure tries to interpret the document for the programmer. The toolkit also provides a similar set of types that gives the programmer simple but powerful support for receiving the raw XML documents.
Internet-wide RPC has failed in the past, and web services are not going to be of much help in solving the issues surrounding wide-area RPC. There is no magic in the web services infrastructure that can suddenly overcome what excellent protocol architects were not able to achieve with DCE/RPC or GIOP. Even though web services may solve some of the interoperability issues, they do not solve, for example, the issue that synchronous interaction over the wide area is not scalable or that versioning procedure interfaces at large scale is extremely difficult.
Misconception #3: Web Services need HTTP
Web services are transport-agnostic, meaning that they can be accessed over any type of transport or application protocol. The SOAP protocol, which describes the web service message format, can be used such that messages are transported over HTTP, but can also be used such that messages go over plain TCP or UDP. There are bindings where the messages flow over SMTP by encapsulating a SOAP message in an e-mail message, or over a traditional messaging infrastructure such as MQSeries or JMS. A core scenario of the web services architecture is the case where a message flows over different transport types before it reaches its destination.
For example, a SOAP request is delivered to an enterprise gateway using HTTP. The gateway then uses a load balancing mechanism to pick one of the nodes of a server farm to process this request and uses a persistent TCP connection to forward the incoming document. In another case a purchase order encapsulated in a SOAP message is delivered using an e-mail message addressed to email@example.com over an SMTP transport. The receiving server will take the SOAP content, encapsulate it in a JMS message, and insert it into the order processing workflow system, which might be based on traditional message queuing. The service that actually consumes the SOAP request may not be determined until the message has visited a few intermediate processors that determine whether this is a request that is entitled to ‘gold’ priority treatment, and until some auditing has taken place. Eventually the requesting process (remember, web services are intended for computer-to-computer conversations, no humans involved) will receive an email message with a confirmation or rejection of the order.
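The second scenario can be sketched with the Python standard library. This is an illustration under stated assumptions, not a real deployment: no mail is actually sent, `queue.Queue` merely stands in for a JMS or message-queue binding, and the order document and the function names are hypothetical.

```python
import queue
import xml.etree.ElementTree as ET
from email import message_from_bytes
from email.message import EmailMessage

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"

# A hypothetical purchase order, already wrapped in a SOAP envelope.
MESSAGE = (
    b'<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
    b'<env:Body><order xmlns="urn:example:orders">3 widgets</order></env:Body>'
    b"</env:Envelope>"
)


def to_email(soap_message, recipient):
    """SMTP-style binding: carry the SOAP message as the body of an e-mail."""
    mail = EmailMessage()
    mail["To"] = recipient
    mail["Subject"] = "purchase order"
    mail.set_content(soap_message, maintype="application", subtype="soap+xml")
    return mail.as_bytes()


def from_email(raw_mail):
    """Recover the SOAP message bytes from the e-mail body."""
    return message_from_bytes(raw_mail).get_payload(decode=True)


# Stand-in for a message-queue binding; a real system would use the
# queuing product's own client library here.
order_queue = queue.Queue()


def gateway(raw_mail):
    """Intermediary: take the message off one transport and put it on another."""
    order_queue.put(from_email(raw_mail))


def order_service():
    """The service sees only the envelope and document, never the transports."""
    envelope = ET.fromstring(order_queue.get_nowait())
    document = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    return document.text


gateway(to_email(MESSAGE, "email@example.com"))
result = order_service()
```

The SOAP message is byte-for-byte the same before and after each hop; only the wrapping around it changes, which is what makes the choice of transport a deployment decision rather than a property of the service.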
Even though the web service architecture has been developed with this transport independence in mind, it is true that the majority of the web services that are in use at this moment run over HTTP. One of the reasons for this is that most of the early web services toolkits made use of the existing infrastructure that the major web servers Apache, IBM WebSphere and Microsoft IIS offered. By leaving the parsing of requests and dispatching of messages to the web server, it was possible to abstract all of the grind of web services away using web-server add-ons such as Axis or ASP.NET. These extensions will automatically generate the WSDL for the web service and provide simple service exercise tools, making it a great environment for prototyping and learning web services.
A second reason for the popularity of implementing web services using HTTP is more strategic. In contrast to the period of the dot.com boom, most enterprise software projects currently require a short-term return on investment. This forces most of the production web service projects to focus on improving the access to the corporate data and services for partners and customers, without requiring too much new infrastructure. The first place this is possible is by using the web servers that are already functioning as front ends to J2EE infrastructure. This approach has become rather successful and should be seen as the first step in the path to a deeper integration of web services in the enterprise.
There are people suggesting that the main reason for tunneling web service messages through HTTP is to bypass firewalls. If this were indeed the reason, then it would be a dangerous approach that would seriously weaken a site’s security, and one should only do this in combination with extensive content-based filtering of the HTTP flows.
Misconception #4: Web services need web servers
There has been some discussion that maybe we should actually drop the ‘web’ from web services, as it leads to more confusion and does not contribute to a clear view of the world. This is already becoming obvious in such terms as service-oriented architectures, service-oriented integration, or service bus. None of these enterprise concepts use the term ‘web’, as they do not rely on any web technologies such as HTTP or web servers.
There are quite a few toolkits that allow you to develop and integrate web services without the need for a web server infrastructure. Examples are Simon Fell’s PocketSOAP, Systinet’s WASP, IBM’s Emerging Technologies Toolkit and Microsoft’s WSE. Enterprise integration systems such as Artix and DocSOAP also provide web-server independent web service development.
As explained in the previous section, there has been an initial set of web services that have exploited the application-server functionality of web servers. But now that the initial business case has been made, and a wider choice of transports is required, most systems will move away from implementation inside web servers.
In the past months a high-profile debate has taken place about applying the principles of REST to web services architectures. REST encompasses some of the techniques that make web infrastructures scalable. There is a lot of value in this debate about the web principles, particularly with respect to resource identification and operation visibility, but it is quickly becoming irrelevant for the bigger picture of web services, given that transport independence is surpassing the importance of the ‘web’ part of web services. The REST principles are relevant for the HTTP binding, and for the web server parsing of resource names, but are useless in the context of TCP or message queue bindings where the HTTP verbs do not apply.
Misconception #5: Web services are reliable because they use TCP
TCP is a protocol that guarantees reliable, in-order delivery of messages, so it would appear that web services, if they make use of TCP, can achieve the same guarantees. First of all, the guarantee of reliability is only partially true for TCP programming, as there are a few scenarios under which a message cannot be completely delivered to the remote peer while the local participant has already closed the connection and will not be notified of this error.
What is more important to realize is that document and message routing for web services provides for the use of intermediaries. In the presence of network, node, and component failures there are quite a few scenarios possible under which the initial delivery of the document to the first station was successful, but where the document will never reach its final destination and thus never gets processed by the service.
The type of reliability that is important for web services and distributed systems in general is that of end-to-end reliability. There are a lot of established techniques in achieving this type of reliability and we will see in the coming year whether they can also be simply applied to web services or whether new technology is needed. In general reliability is achieved through the retransmission of messages, but these retransmissions also require you to weed out duplicate messages in case the message was not really lost. Estimating timeouts, etc. in a heterogeneous network such as the internet is not trivial.
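The retransmission and duplicate-elimination idea can be sketched as follows. This is a toy model under explicit assumptions: `LossyChannel` is a hypothetical, deterministic stand-in for an unreliable network that loses a fixed number of acknowledgements, a `None` return stands in for a timeout, and no real timers or transports are involved.

```python
import uuid


class Receiver:
    """Service side: process each message id once, but always acknowledge."""

    def __init__(self):
        self.seen = set()
        self.processed = []

    def deliver(self, message_id, document):
        if message_id not in self.seen:  # weed out retransmitted duplicates
            self.seen.add(message_id)
            self.processed.append(document)
        return message_id  # acknowledge duplicates too, so the sender can stop


class LossyChannel:
    """Deterministic stand-in for an unreliable path: loses the first N acks."""

    def __init__(self, receiver, drop_acks):
        self.receiver = receiver
        self.drop_acks = drop_acks
        self.attempts = 0

    def send(self, message_id, document):
        self.attempts += 1
        ack = self.receiver.deliver(message_id, document)
        if self.drop_acks > 0:
            self.drop_acks -= 1
            return None  # the ack was lost; the sender sees a "timeout"
        return ack


def send_reliably(channel, document, max_attempts=5):
    """Sender side: retransmit under the same id until acknowledged."""
    message_id = str(uuid.uuid4())
    for _ in range(max_attempts):
        if channel.send(message_id, document) == message_id:
            return True
    return False


receiver = Receiver()
channel = LossyChannel(receiver, drop_acks=2)
delivered = send_reliably(channel, "purchase-order-1")
```

In this run the document crosses the channel three times but is processed once. The hard part that the sketch leaves out is the one mentioned above: choosing the timeout and retry budget on a heterogeneous network.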
Frequently when you build reliable distributed systems you would like to let some information flow back about the state of the service request processing, such that the producer of the document can take local actions. Giving feedback about the arrival of the document, about the consumption of the document by the service, and the completion of the processing of the request makes building these systems easier.
In addition to reliability we would also like to make sure that, if the producers care about it, their messages will be consumed by the service in the order they were sent. This puts more stress on the reliability system, because if messages get lost other messages may need to be delayed until the lost message is retransmitted.
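A minimal sketch of that delay, assuming the producer numbers its messages from 1 (the `Resequencer` class and its field names are hypothetical): messages that arrive ahead of a gap are held back until the missing one shows up.

```python
class Resequencer:
    """Hand documents to the service strictly in sequence-number order."""

    def __init__(self):
        self.next_expected = 1
        self.pending = {}    # out-of-order arrivals waiting for a gap to fill
        self.delivered = []  # what the service has consumed, in order

    def receive(self, sequence_number, document):
        if sequence_number < self.next_expected or sequence_number in self.pending:
            return  # duplicate of something already delivered or already buffered
        self.pending[sequence_number] = document
        # Release every message that is now contiguous with what was delivered.
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1


resequencer = Resequencer()
resequencer.receive(1, "a")
resequencer.receive(3, "c")  # message 2 is missing, so 3 has to wait
held_back = list(resequencer.delivered)
resequencer.receive(2, "b")  # the retransmitted message closes the gap
in_order = list(resequencer.delivered)
```

The buffer grows for as long as a gap stays open, which is one concrete form of the extra stress that ordering puts on the reliability layer.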
None of these guarantees are new; they have been around for years and have been made to work in all sorts of distributed systems, such as distributed objects systems and multi-party fault-tolerant systems. Web services will need these technologies also, but until they’re added web services should be considered unreliable, whether they use TCP or not.
Misconception #6: Web services debugging is impossible
As web services enable the internet-scale type of distributed computing, where frequently the parties in the conversation will belong to different organizations, web service developers and those who have to deploy them are confronted with a whole new set of problems that cannot be handled with traditional debugging and monitoring tools. The federated nature of web services means that most of these new challenges are introduced by not 'owning' both ends of the wire. Two of the most prominent challenges facing users are cross-vendor interoperability and WSDL versioning.
Even though traditional tools are of little help with these problems, new web services diagnostic tools such as SOAPscope are emerging to address the development and deployment challenges of web services. SOAPscope is unique in that it focuses on 'watching' the wire, logging the traffic and providing a suite of functionality to detect and resolve these federation-related and other potential problems.
The wide variety of web services toolkits being used to develop both web service clients and servers means that it is becoming more common that different toolkits operate at each end of a SOAP interaction. Each of these toolkits may have interpreted the specification somewhat differently, leading to potential interoperability problems. When a client encounters an obscure error from a server, how does the developer diagnose the problem when they have no access to the code running at the server? The solution available to the developer is to focus on the SOAP traffic on the wire and the WSDL contract between services.
Tools such as SOAPscope offer several capabilities to help understand and fix interoperability problems. SOAPscope has, for example, 'resend' and 'invoke' features that allow testing 'what-if' scenarios against a server to isolate problem requests. The 'viewing' capabilities allow better understanding of the SOAP messages by visualizing the request at higher levels of abstraction than raw XML. And, to maximize interoperability, the WSDL Analysis detects and helps resolve potential interoperability problems prior to deploying a service.
Another challenge of Web services, which will become increasingly common, is caused by change and versioning of Web services. A small change to the XML Document specification in the WSDL contract at the server can easily break existing clients. Clients may not even be aware that the document specification has changed or how to fix their client to accommodate the change. A Web service client may start to receive 'faults' from the server which indicate a problem but are seldom useful in resolving the issue. Tools such as SOAPscope can inspect the XML document specification in the current WSDL and compare it with the specification used to create the client.
These new debugging and deployment tools that use historical data next to a real-time view of the web service interaction provide extremely powerful tools for the web service developers.
There are many misconceptions about web services technology. This is mainly caused by the fact that web service technology is still evolving, even at the most basic level. Many vendors, trade magazines and venture capitalists have already tagged web services as a technology that will trigger a new wave of applications, enabled by federated interoperability. This early exposure has resulted in many incomplete and incorrect publications, frequent releases of toolkits with little or no architectural vision, and different standardization bodies fighting for the right to control the standards underpinning web services. Add to this that many of the vendors who jumped on board to promote web services have a vested stake in web and application servers and/or distributed object technologies, and promote web services only in the context of their flagship technologies.
This has become fertile ground for many misconceptions. In this article I hope to have clarified a few of those common misconceptions that are important for those who have to reason about web services at the architectural level. It is important that we invest significant effort in education about the unique nature of web services to undo the damage that some of the hype reporting has created.
Posted by Werner Vogels at August 26, 2003 02:26 PM
- Felipe Cabrera, et al., Web Services Coordination (WS-Coordination), Joint Specification by BEA, IBM and Microsoft, August 2002, http://www-106.ibm.com/developerworks/library/ws-coor/
- Colleen Evans, et al., Web Services Reliability (WS-Reliability) Ver 1.0, Joint Specification by Fujitsu, NEC, Oracle, Sonic Software, and Sun Microsystems, January 2003, http://developers.sun.com/sw/platform/technologies/ws-reliability.html
- Christopher Ferris and David Langworthy, editors, Web Services Reliable Messaging Protocol (WS-ReliableMessaging), Joint Specification by BEA, IBM, Microsoft and Tibco, March 2003, http://www-106.ibm.com/developerworks/webservices/library/ws-rm/
- Mindreef – Web Services Diagnostics, http://www.mindreef.com
- Steve Vinoski, Web Services Interaction Models, Part 1: Current Practice, IEEE Internet Computing, Vol. 6 No 3, pp 89-91, May/June 2002
- World Wide Web Consortium, SOAP Version 1.2 Part 0: Primer, June 2003, http://www.w3.org/TR/soap12-part0/