attributes and namespaces in xml

implicit.ly

I released scalaxb 0.0.2 and announced it on mailing lists and on implicit.ly. implicit.ly is a cool service. Basically, I write release notes in Markdown, and the posterous-sbt plugin creates an entry on the website when I type:

$ sbt publish-notes

Magic.

Attributes

I've heard on occasion that the way attributes work in XML is a mess. It is. The fault lies not with attributes per se, but with the confusing way XML namespaces are specified. The spec is called Namespaces in XML 1.0. Try to keep a straight face.

  1. Default namespace declarations do not apply directly to attribute names.
  2. The interpretation of unprefixed attributes is determined by the element on which they appear.
  3. The namespace name for an unprefixed attribute name always has no value.
  4. In all cases, the local name is local part.
  5. No tag may contain two attributes which have identical names.
  6. No tag may contain two attributes which have qualified names with the same local part and with prefixes which have been bound to namespace names that are identical.
  7. All element and attribute names contain either zero or one colon.
  8. No attributes with a declared type of ID, IDREF(S), ENTITY(IES), or NOTATION contain any colons.
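
Rules 5 and 6 are the ones a parser can actually enforce, so they are easy to poke at. The following is a minimal sketch, assuming the JAXP parser bundled with the JDK and with namespace awareness switched on; `DuplicateAttributeCheck` and `parses` are hypothetical names made up for this illustration, not part of scalaxb.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.helpers.DefaultHandler
import scala.util.Try

object DuplicateAttributeCheck {
  // Hypothetical helper: true when a namespace-aware parse succeeds.
  def parses(xml: String): Boolean = {
    val factory = DocumentBuilderFactory.newInstance()
    factory.setNamespaceAware(true) // JAXP ignores namespaces by default
    val builder = factory.newDocumentBuilder()
    // DefaultHandler rethrows fatal errors instead of printing them to stderr.
    builder.setErrorHandler(new DefaultHandler)
    Try(builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))).isSuccess
  }

  // Rule 5: two attributes with identical names.
  val sameName: String =
    """<x><bad a="1" a="2" /></x>"""

  // Rule 6: different prefixes, but both bound to the same namespace name.
  val sameExpandedName: String =
    """<x xmlns:n1="http://www.w3.org" xmlns:n2="http://www.w3.org">
      |  <bad n1:a="1" n2:a="2" />
      |</x>""".stripMargin

  // Same local part, but only one of the two attributes is in a namespace.
  val differentExpandedName: String =
    """<x xmlns:n1="http://www.w3.org" xmlns="http://www.w3.org">
      |  <good a="1" n1:a="2" />
      |</x>""".stripMargin
}
```

With the JDK parser I would expect `parses` to reject the first two documents and accept the third, but other parsers, or the same parser with namespace awareness left off, may behave differently for the rule 6 case.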

In the spec, there used to be an appendix called The Internal Structure of XML Namespaces, which is still available in the first version of the spec from 1999. It's interesting because it whines about the spec within the spec itself.

XML 1.0 does not provide a built-in way to declare "global" attributes; items such as the HTML CLASS attribute are global only in their prose description and their interpretation by HTML applications.

The appendix does contain some useful parts, though they only make sense if namespaces already make sense to you. It claims that an XML namespace is segmented into three partitions:

  1. The All Element Types Partition
  2. The Global Attribute Partition
  3. The Per-Element-Type Partitions

In other words, an attribute can be identified by the following tuple:

(namespace: Option[String], element: Option[(String, String)], localPart: String)

After reading this, does it make sense that the following is valid?

<!-- http://www.w3.org is bound to n1 and is the default -->
<x xmlns:n1="http://www.w3.org" 
   xmlns="http://www.w3.org" >
  <good a="1"     n1:a="2" />
</x>

Try the tuple method. The first attribute a="1" becomes

(None, Some(("http://www.w3.org", "good")), "a")

and the second attribute n1:a="2" becomes

(Some("http://www.w3.org"), None, "a")

so their expanded names are different.
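
The tuple method can be written down directly in Scala. This is only a sketch of the reasoning above: `AttributeNames`, `AttrName`, `unprefixed`, and `prefixed` are hypothetical names introduced for the illustration, and nothing here is how scalaxb or any XML library represents attributes.

```scala
object AttributeNames {
  // (namespace, owning element as (namespace, local name), local part)
  type AttrName = (Option[String], Option[(String, String)], String)

  val ns = "http://www.w3.org"

  // a="1": no namespace of its own, scoped to the <good> element type
  val unprefixed: AttrName = (None, Some((ns, "good")), "a")

  // n1:a="2": a global attribute in the namespace bound to n1
  val prefixed: AttrName = (Some(ns), None, "a")

  // The local parts match, but the tuples as a whole do not.
  val sameLocalPart: Boolean = unprefixed._3 == prefixed._3
  val conflict: Boolean = unprefixed == prefixed
}
```

Since tuple equality compares every component, `conflict` is false even though `sameLocalPart` is true, which is the whole point of the example.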

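This, of course, is exactly the opposite of the way namespaces work for elements: an unprefixed element such as good picks up the default namespace, while an unprefixed attribute never does.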
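
One way to see the contrast between elements and attributes is to ask a namespace-aware DOM parser what namespace it assigned to each name. The sketch below assumes the JAXP parser bundled with the JDK; `NamespaceCheck` and its fields are hypothetical names used only for this illustration.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

object NamespaceCheck {
  val ns = "http://www.w3.org"

  val xml: String =
    """<x xmlns:n1="http://www.w3.org" xmlns="http://www.w3.org">
      |  <good a="1" n1:a="2" />
      |</x>""".stripMargin

  private val factory = DocumentBuilderFactory.newInstance()
  factory.setNamespaceAware(true) // JAXP ignores namespaces by default

  private val doc = factory.newDocumentBuilder()
    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))

  // <good> has no prefix, yet it is found under the default namespace.
  val good: Element =
    doc.getElementsByTagNameNS(ns, "good").item(0).asInstanceOf[Element]

  // DOM reports "no namespace" as null, so wrap the results in Option.
  val elementNs: Option[String] = Option(good.getNamespaceURI)
  val unprefixedAttrNs: Option[String] =
    Option(good.getAttributeNode("a").getNamespaceURI)
  val prefixedAttrNs: Option[String] =
    Option(good.getAttributeNode("n1:a").getNamespaceURI)

  // The two attributes are looked up under different namespaces.
  val unprefixedValue: String = good.getAttributeNS(null, "a")
  val prefixedValue: String = good.getAttributeNS(ns, "a")
}
```

Here the unprefixed element ends up in the default namespace, the unprefixed attribute ends up in no namespace at all, and only the `n1:a` attribute carries the namespace name.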