
Originally Posted by kwaclaw
This argument is pretty clear for most to understand, though it gets confusing when the "loose coupling" mantra is thrown into the discussion.
The "loose coupling" mantra is a red herring. XML is no more loosely coupled than the Ice encoding. People endlessly confuse syntax with semantics and assume that, just because I can pull arbitrary XML off the wire, things are more loosely coupled.
This is simply incorrect. The ability to parse a message without a-priori knowledge of its type does not couple things more loosely. For example, suppose I currently send an XML message that looks like this:
Code:
<address>
<housenumber>25</housenumber>
<street>Smith St</street>
<!-- ... -->
</address>
From looking at this, you will immediately recognize that this is some sort of address. Now, because XML has a predefined syntax, it's possible to pull this message off the wire and build the corresponding tree representation in memory. But, so what? All that means is that I now have the tree in memory, no more, no less.
Now, let's make a change to the message for a new version of the system:
Code:
<address>
<number>25</number>
<street>Smith St</street>
<!-- ... -->
</address>
I've renamed the tag from "housenumber" to "number". That's the equivalent of renaming a Slice structure member. What do I have to do in order to deal with the new message? Well, I have to change the code. Where it used to look for "housenumber", it now has to look for "number" instead.
Whether I use XML or Ice, either way, I have to change the code to accommodate the change. XML doesn't couple things more loosely than Slice.
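To make the point concrete, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The function name read_house_number and the two message constants are made up for illustration. It shows that parsing succeeds for both versions of the message, but a reader written against the old tag name silently finds nothing in the new one:

```python
import xml.etree.ElementTree as ET

# The two message versions from above, trimmed to the relevant elements.
OLD_MESSAGE = "<address><housenumber>25</housenumber><street>Smith St</street></address>"
NEW_MESSAGE = "<address><number>25</number><street>Smith St</street></address>"

def read_house_number(xml_text):
    """Hypothetical reader written against the old message layout."""
    root = ET.fromstring(xml_text)        # parsing needs no knowledge of the message type
    return root.findtext("housenumber")   # extracting meaning needs the agreed tag name

old_value = read_house_number(OLD_MESSAGE)  # finds the element
new_value = read_house_number(NEW_MESSAGE)  # parses fine, but the lookup yields None
```

Both messages turn into a tree without complaint. The reader still has to be edited before it can do anything with the second one.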
There is also the old argument that "I can version the system by adding new elements, and everything will be fine because old versions of client and server can just ignore the bits they don't understand".
This argument is so flawed, it isn't funny. For one, most versioning problems cannot be solved by just adding new bits. Real-life versioning is far more complex and, more often than not, requires making incompatible changes. Moreover, what I don't know can be just as important as what I do know. For example, if someone sends me an XML purchase order that contains elements that I do not understand, would I act on it? I very much doubt it. For all I know, the unknown elements might state that the purchaser expects to get an 80% discount on my advertised price.
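As a rough sketch of why "just ignore what you don't understand" is risky, consider a receiver that checks for unknown elements before acting. The tag names (order, item, quantity, discount) and the helper unknown_elements are assumptions made up for this example, not part of any real purchase-order format:

```python
import xml.etree.ElementTree as ET

# Hypothetical: the elements this version of the receiver understands.
KNOWN_TAGS = {"item", "quantity"}

def unknown_elements(xml_text):
    """Return the tags of top-level child elements the receiver does not know."""
    root = ET.fromstring(xml_text)
    return sorted(child.tag for child in root if child.tag not in KNOWN_TAGS)

# A newer sender has added an element the receiver has never heard of.
order = "<order><item>widget</item><quantity>10</quantity><discount>80%</discount></order>"

unexpected = unknown_elements(order)
safe_to_act = not unexpected   # acting on a partially understood order is a gamble
```

A receiver that quietly skips the discount element would happily accept an order whose terms it never saw.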
Yet another flawed argument is that XML is self-describing. Again, this is simply nonsense. XML is not self-describing, and never will be. Here is why:
Code:
<Adresse>
<Hausnummer>25</Hausnummer>
<Strasse>Smith St</Strasse>
<!-- ... -->
</Adresse>
This is the exact same message as the earlier one, with identical semantics. Except that the tags are now in German instead of English. If you happen to speak German, the message appears to be self-describing. But, if you do not, the message is gibberish. In other words, there is nothing self-describing in XML other than syntax. The only reason client and server can make sense of such messages is that they have an a-priori agreement on the semantics of the tags.
To drive this point home, here is the same message once more:
Code:
<x>
<y>25</y>
<z>Smith St</z>
<!-- ... -->
</x>
I've simply renamed the tags. If you want to understand this message, you must know, a priori, that "y" means "house number", and "z" means "street name". The message itself simply does not contain this knowledge.
I cannot even infer the type information. For example, looking at the message, you might conclude that the house number is an integer. Seems like a fair-enough assumption, until we stumble across:
Code:
<housenumber>25a</housenumber>
Oops, the house number is a string after all...
Again, knowing whether house numbers are strings or integers depends on an a-priori agreement. Whether that agreement is formally specified by something like IDL or WSDL, or rests on simple good will, does not matter. The agreement has to be in place a priori. If I change the type of the element's value without telling everyone else, I break the system. We are no more loosely coupled than we are with IDL or Slice.
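The same failure is easy to reproduce. Below is a small Python sketch, again with xml.etree.ElementTree. The function house_number_as_int is a hypothetical reader that guessed "integer" from the sample data it had seen:

```python
import xml.etree.ElementTree as ET

def house_number_as_int(xml_text):
    """Hypothetical reader that inferred the type from earlier messages."""
    return int(ET.fromstring(xml_text).text)

# Works as long as the data happens to match the guess.
first = house_number_as_int("<housenumber>25</housenumber>")

# Breaks as soon as the sender uses a value the guess did not anticipate.
try:
    house_number_as_int("<housenumber>25a</housenumber>")
    guess_survived = True
except ValueError:
    guess_survived = False
```

Nothing in either message says which type is intended. The only thing that can settle it is an agreement made outside the message.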
What piqued my interest in Vinoski's article was rather this:
"For example, distributed systems typically require intermediaries to perform caching, filtering, monitoring, logging, and handling fan-in and fan-out scenarios. In large-scale systems, these intermediation services are “must haves” that ensure that the system will operate and perform as required. Unfortunately, RPC-oriented calls lack the metadata required to support intermediation because it’s simply not a concern for normal local invocations."
Hmmm... Distributed systems "require intermediaries" that are "must haves"?
As far as I can see, that is an assertion without substantiation. Last time I looked, there were tens of thousands of successful distributed systems around that got by without any such intermediaries.
Now, I don't doubt that there are situations where intermediaries might be useful. I can come up with use cases where it's nice to be able to "cook up" the data somehow while it is in transit from one place to another. But does XML actually provide that ability?
Let's think about this... Suppose I have some sort of message switch that performs caching, or otherwise does some sort of transformation on the XML that passes through it. Going back to the address example once more, suppose that there is a <country> element. If an address comes past that lacks this element, the intermediary can add a default that sets the country to "USA".
So, what does the intermediary have to do? Well, it needs to inspect every XML message that comes past, look to see whether it contains an address element without a country element and, if that country element is missing, add the default. Easy.
Does XML help with this? Does it mean that things are any more loosely coupled? Hardly. In order to transform the message, the intermediary must have type knowledge. It must know that there are such things as addresses, that they have a specific structure, that there is a country element that might need adding, and so on. In other words, to do its job, the message switch requires type knowledge. Or, in Steve's words, it requires metadata.
Where does that metadata come from? Certainly not from the XML message itself, because it doesn't have any metadata. Instead, the knowledge must come from an external source, such as WSDL, or the knowledge might be hard-coded into the program. Regardless of where the knowledge comes from, it again constitutes an a-priori agreement as to the semantics of the message. That is no different from establishing a client-server contract with Slice. In other words, we are just as tightly coupled as ever.
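Here is what such an intermediary might look like as a minimal Python sketch. The constants and the function add_default_country are assumptions for illustration. Note that the tag names and the default are baked into the intermediary, which is exactly the a-priori agreement in question:

```python
import xml.etree.ElementTree as ET

# Assumed contract, known to the intermediary ahead of time.
ADDRESS_TAG = "address"
COUNTRY_TAG = "country"
DEFAULT_COUNTRY = "USA"

def add_default_country(xml_text):
    """Add a default country to every address element that lacks one."""
    root = ET.fromstring(xml_text)
    for address in root.iter(ADDRESS_TAG):       # iter() also visits the root itself
        if address.find(COUNTRY_TAG) is None:
            ET.SubElement(address, COUNTRY_TAG).text = DEFAULT_COUNTRY
    return ET.tostring(root, encoding="unicode")

# The agreed tag names are used, so the default is added.
english = add_default_country("<address><street>Smith St</street></address>")

# Same semantics with renamed tags: the intermediary sees nothing to fix.
renamed = add_default_country("<x><z>Smith St</z></x>")
```

The second call leaves the message untouched even though it is also an address without a country. The intermediary's behavior depends on the hard-coded contract, not on anything the XML carries by itself.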
The intermediary argument confuses syntax with semantics. When Steve says that "RPC-oriented calls lack the metadata required to support intermediation", he is right: the metadata isn't inside the message, but comes from somewhere else. What he fails to see is that the XML does not contain that metadata either. In fact, whether the message is encoded as XML or using the Ice encoding is neither here nor there. In order to do anything useful with the message, I don't need to know just its syntax, I need to know its semantics. And neither XML nor Ice-encoded messages contain these semantics.
The same is true for the logging argument. True, an intermediary can log XML messages. Just as an Ice intermediary can log Ice messages. But is either activity actually useful?
If the XML intermediary does not have type knowledge, all it can do is log the raw XML message. That is not very useful because, for logging to be useful, the data has to be cooked up in some way, and that is not possible without the metadata that the XML message does not contain.
Whether to encode things in XML or binary is largely a matter of efficiency and bandwidth. As far as the semantics are concerned, the two are exactly equivalent.
Now, clearly, intermediaries can be useful in distributed systems. No argument there. But XML does not make the job of creating such intermediaries any easier than a binary protocol does.
Cheers,
Michi.