Rethinking Web Services, Part 1

by Kurt Cagle

I've been at this game for a while, a fact that has been hammered into my awareness with distressing frequency of late. I worked with Hollerith cards in college, running my programs through a machine with an alarming tendency to shred my carefully typed code into so much confetti if the deck was not perfectly aligned in the bin. I can remember a time when mentioning the object-oriented programming paradigm was a sure invitation to fisticuffs between its adherents and the old guard. In fact, SQL hadn't even been conceived when I went through college.

I guess that puts me among the neo-ancients - not so far back that I can remember when punch cards were considered hot technology, when programming involved switching cables between sockets, and when programmers still went by the moniker of computer scientists and wore white coats, but old enough to call most of the leading players in the tech sphere kids.

The reason I bring this up has to do with an old, old idea in comparatively new clothing: Web services. It sounds shiny and new, an idea that resonates in the imagination. Get the refrigerator to order new milk when the old gallon goes out of date. Schedule appointments with the family dentist by negotiating his schedule with yours via complex Web services protocols. Order that shiny new sports car that you want with the cherry-red paint job and the custom interior, and the orders go off to the factory and tell the robots there what to do. Sounds wonderful. Who wouldn't love it!

Web services are being marketed as the coolest technology since the PC was invented, completely revolutionizing everything and ushering in a brand-new technology boom. Microsoft is at the forefront of this marketing, understandably, but most of the big software vendors are now entering into the fray with their own versions of Web services - not wishing to be left out in the cold and give even the hint of a competitive edge to Redmond.

Yet for all of the sudden movement in the industry toward the new testament of Web services, a number of unresolved questions make me wonder whether Web services are being both seen and marketed incorrectly - and whether we are advancing into battles over whose version of SOAP is to rule supreme, and who is to become the ultimate victor in "servicing" the Web, before it's really all that clear that the war is worth fighting in the first place.

As such, I'm going to sit here, my ancient bones staring at the juggernaut of a dozen multibillion-dollar industries, and ask some questions about the emperor's clothes and his choice of tailors. (What they don't tell you in the fairy tale is that the kid who pointed out that the king was naked was summarily imprisoned and was last seen as a rower on a slave galley.)

And the Answer Is...
So here goes. Contestant No. 1, "What exactly is a Web service?"

"I know! I know! It's a call made by a software program or hardware device to another software program or hardware device asking for a SOAP message."

Well, sort of. SOAP has an interesting history. It started out as an answer to XML-RPC, a quick-and-dirty schema that Dave Winer of UserLand developed to help implement a number of programs, including a compelling cooperative news setup that is rapidly becoming a de facto standard.

The idea there was pretty simple: in a program, you can encapsulate a request against a given program (device, service) in XML, and get the information you need from it or get it to run some function, regardless of where the server is located. Now, except for the XML part, this is certainly not a new idea. COM and CORBA both exist to do much the same thing - a process called marshalling, in which a request from a client is sent across some kind of programming barrier (a different application, a different machine on the network, etc.).

The problem is that both suffer from similar limitations. COM is a binary protocol that is incompatible with non-Windows systems without some heavy-duty translation software, a process that exacts its toll on performance. CORBA is designed to be more open, but it has problems crossing the divide in the other direction because the COM and CORBA specs are sufficiently different that they can't be cleanly mapped from one to the other.

The reasoning with XML-RPC was that, for relatively low-volume transactions where the cost of instantiating a DOM isn't that significant, you could get around many of the translation issues by using a text/XML structure that contained enough information to identify the transaction and carry a payload.

In many respects this isn't all that radically different from the way a typical HTTP POST message is sent: the structures involved are flatter and less rich, but you're still requesting information from a device or server of some sort. Moreover, the typical Web client/server architecture essentially has a program (the browser) sending a request/response series of messages to the server to build a Web page. The user may set up the location of the server in a URL, but there's absolutely no reason that it couldn't be another computer system initiating the request.
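
To make the parallel concrete, here is roughly what such a call looks like on the wire - a hypothetical XML-RPC request (the method name and parameter are invented for illustration, and the HTTP headers are trimmed for brevity) riding in the body of an ordinary HTTP POST:

    POST /RPC2 HTTP/1.0
    Host: dairy.example.com
    Content-Type: text/xml

    <?xml version="1.0"?>
    <methodCall>
      <methodName>refrigerator.orderMilk</methodName>
      <params>
        <!-- order two gallons -->
        <param><value><int>2</int></value></param>
      </params>
    </methodCall>

The response travels the same road in reverse: an ordinary HTTP response whose body is a methodResponse element wrapping either a return value or a fault.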

However, the concept solved a major problem that Microsoft ran into with the Internet. DCOM proved largely to be a bust: you needed to set up both systems to run DCOM, it required some deep programming skills, and it was far more complicated to work with over the largely stateless Internet. SOAP thus came out (pushed largely by Microsoft) as a way of passing COM-style invocation information between separate processes transparently. Somewhere along the way the notion that XML should be used sparingly because of its size and instantiation costs seemed to get lost, and SOAP became the first strike that Microsoft would make into the world of an Internet COM.

The Transport Protocol of Choice
SOAP itself has evolved somewhat (and watching the SOAP discussion board, I'm using the word evolve here in a very...um, tempered...way) and has been taken up by the W3C as the starting point for its XML Protocol working group. It was also adopted in March as the transport protocol of choice for the ebXML e-business initiatives that OASIS is developing.

Some compromises were made. The original SOAP spec wasn't designed for passing non-XML payloads, which still comprise the bulk of all such material over the Web, but the current incarnation of SOAP making its way through the XML Protocol committee does address this issue ... somewhat.

The payload issue highlights one of the other characteristics of SOAP, as well as bringing up another question. SOAP is often described as an envelope, or more properly, a set of protocols for transferring envelopes. In an ideal world, a SOAP message should in fact know nothing about its payload beyond the critical information of who the payload is from, who it is addressed to, and who it should be sent back to; however, like other electronic messaging systems, things aren't always that simple.
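
To make the envelope metaphor concrete, here is a minimal sketch of a SOAP message (the routing entries in the header are hypothetical - SOAP itself defines the envelope structure but leaves addressing semantics to other layers):

    <?xml version="1.0"?>
    <soap:Envelope
        xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Header>
        <!-- illustrative routing information; not defined by the SOAP spec itself -->
        <route:path xmlns:route="http://example.com/routing">
          <route:from>urn:janet-pda</route:from>
          <route:to>urn:dentist-scheduler</route:to>
          <route:replyTo>urn:janet-pda</route:replyTo>
        </route:path>
      </soap:Header>
      <soap:Body>
        <!-- the payload: whatever XML the sender and recipient agree on -->
        <appt:requestAppointment xmlns:appt="http://example.com/appointments">
          <appt:patient>Tim</appt:patient>
        </appt:requestAppointment>
      </soap:Body>
    </soap:Envelope>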

What about the situation where a SOAP message is sent to multiple servers or is intended to be forwarded to other addresses? What, in fact, constitutes an address, anyway? What should the recipient do upon taking in the message? Should information about the fact that the recipient was unable to work with the package, even if it was valid, be sent back? What constitutes an error? What responsibility does the client have to ensure that the recipient can understand the message?

These are all questions that are currently hotly debated in the Web services world, and the fact that there are any number of companies jumping on the Web services bandwagon even before many of these questions are adequately resolved is somewhat worrisome.

There's a second, not-so-subtle concern here. A SOAP message is an envelope; envelopes have been opened, examined, and resealed before, sometimes with their contents altered, other times merely read. SOAP is no different, and in fact the text nature of the contents makes SOAP even more transparent to snooping than binary messages would be.

This in turn means that in many applications, either security will need to be applied at the transport level to prevent intrusion and interception (something currently done with SSL) or it will have to be provided within the message itself, by encrypting the payload. A central tenet of such security is that you attempt to hide as much of the relevant information as possible - if the payload is encrypted but the header names the methods to be called in plaintext, this acts as a flag to would-be hackers, indicating what the content is about.
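
A schematic sketch of the problem (the element names are invented for illustration, not drawn from any particular encryption standard): the body below is opaque, but the header has already told an eavesdropper exactly what the message does:

    <?xml version="1.0"?>
    <soap:Envelope
        xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Header>
        <!-- method advertised in plaintext: a flag to would-be snoopers -->
        <m:call xmlns:m="http://example.com/methods">account.transferFunds</m:call>
      </soap:Header>
      <soap:Body>
        <!-- encrypted payload: unreadable, but its purpose is already given away -->
        <enc:data xmlns:enc="http://example.com/encryption">
          kR7xQm92LtB4aV0pZnJhZ21lbnQgb2YgY2lwaGVydGV4dA==
        </enc:data>
      </soap:Body>
    </soap:Envelope>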

This isn't an insurmountable problem, but it does lead to another one that is perhaps more pertinent. The more secure a message is, the more horsepower and time it takes to perform both encryption and decryption, and the larger the messages themselves become.

Couple this with the fact that SOAP messages are typically used in an RPC fashion - they are issued autonomously by software operating at speeds several billion times faster than human beings - and the question naturally arises of how quickly the server will get overwhelmed.

DDoS Attack Can Incapacitate
Before dismissing this complaint out of hand, keep in mind that a denial-of-service attack on a server is precisely this kind of system - automated, asynchronous, anonymous - and it's instructive to realize that most of the companies that have been pushing Web services technology have also been incapacitated at one time or another by a DDoS attack.

Add to this another, fairly insidious facet of working over the HTTP protocol. HTTP is asynchronous, and asynchrony has a number of advantages over synchronous calls in normal HTTP usage. For instance, you can scale far more effectively with asynchronous calls, and you can increase the number of potential connections dramatically because they aren't really connections in the stateful sense.

Most RPC calls are made over stateful connections, however - and usually for good reason. It's easier to secure a stateful line. You can optimize stateful connections, and enveloping (and hence the size of those envelopes) can be made more efficient. You don't have to interrupt a stateful connection to process another message on another thread. And a stateful connection more closely models the behavior of calls within local COM-type systems.

On the other hand, it also significantly cuts down on the number of clients you can offer your service to. This holds for RPCs in general, and raises some questions about the long-term wisdom of attempting to perform step-wise computations through a series of RPCs on a distant machine, which is in essence the COM model writ large.

On a single dedicated network that services a given organization, such control is not noticeably a bad thing, but over the slow, frequently unreliable Internet too extravagant a use of RPCs could potentially bring the entire network to a crawl (the tragedy of the commons, writ globally).

There is an alternate model, one that in fact works quite well and is not all that radically different from where Web services are developing now.

A Different Mindset
The central idea is to batch RPCs into a single process, transmit this set of instructions to the server asynchronously, have that computer perform the actions that it needs, then pass that information back. This is a goal that is also more consistent with XML in general, since XML generally works best when visualized as the encapsulation of a process rather than a single action. It requires a different mindset from the way that COM apps are programmed - those focus on a series of discrete actions and are ultimately synchronous, even if threaded.
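
As a sketch of what such a batched request might look like (the vocabulary here is invented for illustration - no standard batching format is implied), a single document carries the whole process rather than a series of separate calls:

    <?xml version="1.0"?>
    <batch:process xmlns:batch="http://example.com/batch"
                   replyTo="urn:janet-pda">
      <!-- each step is one would-be RPC; the server performs them all
           in sequence and returns a single result document asynchronously -->
      <batch:step name="findOpenSlots">
        <range start="2001-10-01" end="2001-10-15"/>
      </batch:step>
      <batch:step name="reserveEarliestSlot"/>
      <batch:step name="confirmToCalendar">
        <calendar>urn:janet-calendar</calendar>
      </batch:step>
    </batch:process>

One round trip instead of three, and the server is left free to schedule the work however it sees fit.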

Of course, the danger to many software vendors in particular is that such a methodology doesn't necessarily fit well with their existing offerings. It's sexier to market the ability to have a command or function call operate transparently (regardless of where the machine is) than it is to make developers think about the processes that they want to operate remotely. It's also easier to let those same programmers work poorly in their existing paradigm than to make them work more efficiently in a different one.

XML is a different way of thinking. A good XML developer can tell you that the approach you take to working with XML data varies considerably from the way you handle object-oriented code; it is far more oriented toward the manipulation of sets of information than toward property-driven objects. This thinking extends to SOAP and Web services, which can in fact be powerful, but only if people stop trying to make SOAP into an object container.

Personal Space
Okay, that's one simple question with a remarkably complex answer. The next question falls into the journalistic vein. Contestant No. 2, "Who needs Web services?"

"I know! Web services will replace modern software, will make it easier to do business-to-business transactions, and will make it possible for consumers to talk to their toasters!"

Now personally, if any of you have a compelling need to talk to your toasters, over SOAP or otherwise, then you probably have bigger problems than which Web protocols to use.

Seriously, Web services are often invoked for use in two fairly distinct areas. The first lies in the consumer market, where Web services will replace the Internet as we now know it, while the second is in the business-to-business sphere, where Web services will replace EDI as we know it.

Pounds of Salt
I think in many respects that, contrary to the analysts at any number of multibillion-dollar companies, neither one of these is terribly well suited to Web services at all. Of course, the analysts at these conglomerates make six figures and in most years I'm doing well making five, so take what I have to say with a healthy pound or two of salt.

The Extremely Efficient Home
Look first at the consumer market. The vision here seems to be one of those concocted by the same people who have been pushing the completely automated home, one of extreme efficiency and interconnectedness where human beings don't need to worry about those messy interactions with other human beings. The scenario runs something like this: Janet is heading off to work when her PDA beeps at her indicating that her son, Tim, needs to get his cavity filled by the dentist. She presses a button and her PDA sends SOAP messages off to the dentist that negotiate a time when both her schedule and the dentist's are open. The message gets sent to Janet's calendar, which updates itself and sets up the necessary alarm.

No need to play phone tag, completely transparent, the ultimate plug-and-play operation. It neglects, however, the fact that you're not dealing with scenarios here; you're dealing with people.

Most people don't have their lives scheduled to such a degree that they can have an automated process do all of these negotiations for them. The interfaces involved, if current interface design is any indication, will probably end up taking far more steps than a simple phone call would. Most software, in fact, tends to sacrifice flexibility of design in favor of lower coding costs. And in my experience - having taken myself and a school-age kid to the dentist many times - the appointments are typically made at the end of the previous appointment.

Finally, there is an implicit assumption on the part of the evangelists that Janet's calendar and the dentist's appointment system all agree on a common protocol, which means that they must subscribe to the same set of services. If Janet and the dentist happen, heaven forbid, to belong to different software services, does this mean that Janet can't go to this dentist?

I've heard a number of similar consumer-based scenarios, and what I find most interesting is that they all seem to share a number of common faults:

  • Lack of Social Interactions: In the personal sector, interactions serve many more purposes than the simple ones of exchanging information. In a non-networked computer world, this wasn't so much of an issue, because the majority of computerized interactions were largely related to the immediate task at hand.

    As we move increasingly into a situation where computers could handle tasks that have up to now been largely social or communicative, there's going to be increasing resistance on the part of people who feel they're giving up social interaction for machine efficiency. (If there's any real doubt about this, consider the scan-it-yourself grocery stores, which have discovered that after the novelty has passed, most people stop using the scan lanes in favor of human interaction. There's a lesson here for any service provider.)

  • Efficiency vs Flexibility: Computers are the ultimate efficiency machines. There are few tasks that couldn't be made more efficient; however, one of the first casualties of any system where efficiency is optimized is flexibility. XML by itself is not terribly efficient precisely because it is designed to be flexible, and a good XML design can often handle contingency cases in ways that more traditional code (and make no mistake about it, Web services, for all their XML underpinnings, are very traditional code) will choke on. Personal services are hard to create precisely because they are unpredictable, and any service that fails to handle the vagaries of human interaction will not be accepted by people.

  • Semantic Incompatibility: The question about Janet and her dentist being on different protocols is an example of semantic incompatibility. The vision that underlies Hailstorm (and [not to pick on Microsoft] is pervasive in many other Web services offerings as well) is that one company essentially becomes the broker for every API out there. Your calendar and my scheduler will work fine if both happen to use the same version of the calendar Web service API, but the mere fact that both use Web services doesn't guarantee that they'll be able to talk to each other. The solution according to Microsoft is simple: you have to move to a dentist who does support your service. Think about that one... carefully.

  • Utility vs Privacy: Another common scenario in the direct personal sphere is the notion of being able to call up from your cell phone to tell the thermostat to turn itself on as you're coming home, to start thawing the chicken for supper, and to start the e-mail download. Telematics, the industry of automating objects via external commands, sounds cool in theory, but in practice never lives up to the hype.

    There may be a few people who love gadgets so much that they couldn't live without the poultry defroster and the thermostat control system, but in practice the efficiencies of time don't justify the worry. If the television could potentially record and periodically transmit what you watch, then it can also charge you for certain shows that used to be free - and also provide a powerful tool to advertisers about what does and doesn't motivate you. Remember, any pipe that comes in also has to go out.

Summary
I guess, in the long term, I worry that the marginal benefits most people would personally get out of the use of Web services come at the potentially high cost of letting companies control even more of their lives than they do now. There are exceptions, which I discuss in Part 2 of this article (appearing next month), but the potential for abuse of Web services in the personal sphere is strong enough that people should seriously explore the trade-offs before letting such services become a dominant feature in their lives.

Next month, in Part 2 of this article, Cagle covers B2B integration, intra-business applications, peer-to-peer, and the decentralization of services.

More Stories By Kurt Cagle

Kurt Cagle is a developer and author, with nearly 20 books to his name and several dozen articles. He writes about Web technologies, open source, Java, and .NET programming issues. He has also worked with Microsoft and others to develop white papers on these technologies. He is the owner of Cagle Communications and a co-author of Real-World AJAX: Secrets of the Masters (SYS-CON books, 2006).
