mixed content revisited
scalaxb added support for mixed content a while back. When a complex type is declared as <xs:complexType mixed="true">, text nodes can appear interleaved with the subelements of the complex type, as in XHTML. Ever since I implemented it, it has bothered me that the generated case class is not DRY.
For example,
<xs:element name="mixedTest">
  <xs:complexType mixed="true">
    <xs:choice maxOccurs="unbounded">
      <xs:element name="billTo" type="Address"/>
      <xs:any namespace="##other" processContents="lax" />
    </xs:choice>
  </xs:complexType>
</xs:element>
would generate case class Element3(arg1: Seq[rt.DataRecord[Any]], mixed: Seq[rt.DataRecord[Any]]). The first parameter, arg1, contains the subelements, and the second, mixed, contains both the subelements and the text nodes. For the case class to round-trip back to XML, it had to store both text nodes and subelements, in order, in mixed. However, because parsing was performed only for subelements, only arg1 would contain the Address case class, while mixed would contain an unparsed scala.xml.Elem instance. Not very nice.
All this was partly because the parsing logic was not very sophisticated.
Now that real parsers are used for parsing, it was time to revisit this issue. We still need to preserve the order, so we need mixed. If parsing were handled properly there, we could get rid of arg1. So I updated the parsing logic to treat text nodes as part of the grammar when the complex type is mixed. Here's what scalaxb generates from the above example:
case class MixedTest(mixed: Seq[rt.DataRecord[Any]])

object MixedTest extends rt.ElemNameParser[MixedTest] {
  val targetNamespace: Option[String] = Some("http://www.example.com/mixed")

  def isMixed: Boolean = true

  def parser(node: scala.xml.Node): Parser[MixedTest] =
    optTextRecord ~
    rep((((((rt.ElemName(targetNamespace, "billTo")) ^^
      (x => rt.DataRecord(x.namespace, Some(x.name), Address.fromXML(x.node)))) ~
      optTextRecord) ^^ { case p1 ~ p2 => Seq.concat(Seq(p1), p2.toList) })) |
      (((any ^^ (x => rt.DataRecord(x.namespace, Some(x.name), x.node))) ~
      optTextRecord) ^^ { case p1 ~ p2 => Seq.concat(Seq(p1), p2.toList) })) ~
    optTextRecord ^^
      { case p1 ~ p2 ~ p3 => MixedTest(Seq.concat(p1.toList, p2.flatten, p3.toList)) }

  def toXML(__obj: MixedTest, __namespace: Option[String], __elementLabel: Option[String],
      __scope: scala.xml.NamespaceBinding): scala.xml.NodeSeq = {
    var attribute: scala.xml.MetaData = scala.xml.Null

    scala.xml.Elem(rt.Helper.getPrefix(__namespace, __scope).orNull,
      __elementLabel getOrElse { error("missing element label.") },
      attribute, __scope,
      __obj.mixed.flatMap(x => rt.DataRecord.toXML(x, x.namespace, x.key, __scope).toSeq): _*)
  }
}
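The flattening step at the end of the generated parser can be hard to read in combinator form, so here is a minimal sketch of the same idea in plain Scala. It only illustrates how an optional leading text, a repeated group of (element, optional text), and an optional trailing text collapse into one ordered sequence. Item and MixedFlatten are hypothetical names for this sketch, not part of scalaxb, and Item is a simplified stand-in for rt.DataRecord.

```scala
// Hypothetical, simplified stand-in for rt.DataRecord (not scalaxb's definition).
final case class Item(key: Option[String], value: Any)

object MixedFlatten {
  // Mirrors the shape of the parser result: p1 ~ p2 ~ p3, where
  // p1 and p3 are optional text records and p2 is a repetition of
  // (element record followed by an optional text record).
  def flatten(
      leading: Option[Item],
      groups: Seq[(Item, Option[Item])],
      trailing: Option[Item]): Seq[Item] = {
    // Each group becomes a short sequence: the element, then its text if any.
    val body: Seq[Seq[Item]] =
      groups.map { case (elem, text) => Seq.concat(Seq(elem), text.toList) }
    // Concatenate everything in document order.
    Seq.concat(leading.toList, body.flatten, trailing.toList)
  }
}

object MixedFlattenDemo {
  val items: Seq[Item] = MixedFlatten.flatten(
    Some(Item(None, "Dear ")),
    Seq((Item(Some("billTo"), "parsed Address"), Some(Item(None, " thanks")))),
    None)
}
```

Because the text records sit in the same sequence as the parsed elements, document order is preserved without keeping a second copy of the subelements.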
In the generated code, the address object is parsed properly, and it's stored only once. That seemed to solve the problem, but it created a whole other issue for round trip. DataRecord.toXML no longer knew how to output XML, since it does not store scala.xml.Elem anymore. mixed is declared as Seq[rt.DataRecord[Any]] so it can store built-in types like Int and String, XML nodes like scala.xml.Elem, and finally user-defined case classes like Address. XML output logic for built-in types and XML nodes can be shipped with scalaxb, but the user-defined types need to be supported too. This looked like a good opportunity for me to try implementing a type class:
trait XMLWriter[A] {
  implicit val ev = this

  def toXML(__obj: A, __namespace: Option[String], __elementLabel: Option[String],
      __scope: NamespaceBinding): NodeSeq
}
All of the companion objects already implement toXML, so they just have to extend XMLWriter[A]. However, I did not find a way to recover the XMLWriter[A] from an Any once the object is stored in DataRecord[Any], which means the XMLWriter[A] needs to be stored in the DataRecord[A] itself. The problem with that approach is that it introduces an extra parameter that is always set to one specific value determined by the value type A of the DataRecord[A]. For String it will always be __StringXMLWriter, and for Address it will always be the Address companion object. On top of that, the extra parameter is not useful during pattern matching. Here's how I worked around it.
First, add a constructor helper method called dataRecord under object DataRecord, which takes the first three parameters explicitly and takes the XMLWriter[A] implicitly using the context-bound syntax:
def dataRecord[A:XMLWriter](namespace: Option[String], key: Option[String], value: A): DataRecord[A] =
  DataRecord(namespace, key, value, implicitly[XMLWriter[A]])
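For readers who have not used context bounds, here is a small sketch of what the [A: XMLWriter] notation stands for. Show, intShow, and ContextBoundDemo are hypothetical names used only for illustration. The two methods below are equivalent: the context bound is shorthand for an extra implicit parameter list, and implicitly retrieves the value that the compiler passed in.

```scala
// Hypothetical type class used only to illustrate the syntax.
trait Show[A] {
  def show(a: A): String
}

object Show {
  // Instance placed in the companion so it is found without an import.
  implicit val intShow: Show[Int] = new Show[Int] {
    def show(a: Int): String = a.toString
  }
}

object ContextBoundDemo {
  // Context-bound form: the evidence has no name, so use implicitly.
  def viaBound[A: Show](a: A): String = implicitly[Show[A]].show(a)

  // Equivalent explicit form: the evidence is a named implicit parameter.
  def viaParam[A](a: A)(implicit s: Show[A]): String = s.show(a)
}
```

Calling ContextBoundDemo.viaBound(42) and ContextBoundDemo.viaParam(42) resolves the same Show[Int] instance in both cases.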
At this point we need to supply implicit values for built-in types and XML nodes that are used in scalaxb:
object XMLWriter {
  implicit object __NodeXMLWriter extends XMLWriter[Node] {
    def toXML(__obj: Node, __namespace: Option[String], __elementLabel: Option[String],
        __scope: NamespaceBinding): NodeSeq = __obj
  }

  implicit object __StringXMLWriter extends XMLWriter[String] {
    def toXML(__obj: String, __namespace: Option[String], __elementLabel: Option[String],
        __scope: scala.xml.NamespaceBinding): scala.xml.NodeSeq =
      Helper.stringToXML(__obj, __namespace, __elementLabel, __scope)
  }
  ...
}
An interesting aspect of the Scala spec is where the compiler looks for implicit values. From Programming in Scala, pp. 440-441:
Moreover, with one exception, the implicit conversion must be in scope as a single identifier.
There's one exception to the "single identifier" rule. The compiler will also look for implicit definitions in the companion object of the source or expected target types of the conversion.
Note the implicit val ev = this in the definition of XMLWriter[A]:
trait XMLWriter[A] {
  implicit val ev = this

  def toXML(__obj: A, __namespace: Option[String], __elementLabel: Option[String],
      __scope: NamespaceBinding): NodeSeq
}
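Since the Address companion object extends XMLWriter[Address], it inherits ev as an implicit member. Because the compiler also searches the companion object of Address when it needs an XMLWriter[Address], this makes the Address object available as an implicit value.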
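The lookup described above can be tried in isolation. The following is a minimal sketch, assuming a cut-down Writer trait that returns a String instead of a NodeSeq so that it runs without the XML library. Writer, render, and ImplicitScopeDemo are hypothetical names for this sketch. The type annotation on ev is added here because newer compilers want explicit types on implicit definitions.

```scala
// Cut-down stand-in for XMLWriter[A]; String replaces NodeSeq for brevity.
trait Writer[A] {
  // Every implementor exposes itself as an implicit member named ev.
  implicit val ev: Writer[A] = this

  def write(a: A): String
}

final case class Address(street: String)

// The companion extends the type class, so it inherits the implicit ev.
object Address extends Writer[Address] {
  def write(a: Address): String = "<address>" + a.street + "</address>"
}

object ImplicitScopeDemo {
  // No import is needed: the compiler searches the companion of Address
  // when it looks for a Writer[Address] and finds Address.ev there.
  def render[A: Writer](a: A): String = implicitly[Writer[A]].write(a)

  val out: String = render(Address("1 Main St"))
}
```

The trick keeps generated companion objects unchanged apart from the extends clause, at the cost of a slightly surprising implicit member on every writer.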
The new def dataRecord will at least solve the construction of DataRecord, but we are still stuck with four parameters for pattern matching.
Pattern matching on a case class is nothing but an application of def unapply. To keep compatibility with the older DataRecord, we can define DataRecord as a trait with the three original values. In object DataRecord, we can define unapply as follows:
def unapply[A](record: DataRecord[A]): Option[(Option[String], Option[String], A)] =
  Some(record.namespace, record.key, record.value)
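To show that a hand-written unapply really does hide extra state from patterns, here is a small self-contained sketch. Entry, Impl, make, and ExtractorDemo are hypothetical names, and the tag field is a stand-in for whatever extra data the implementation carries. The pattern sees exactly three values even though the underlying object holds four.

```scala
// Public view: three values, like the original DataRecord.
trait Entry[+A] {
  def namespace: Option[String]
  def key: Option[String]
  def value: A
}

object Entry {
  // Hidden implementation with a fourth field that patterns never see.
  private final case class Impl[+A](
      namespace: Option[String],
      key: Option[String],
      value: A,
      tag: String) extends Entry[A]

  def make[A](namespace: Option[String], key: Option[String], value: A): Entry[A] =
    Impl(namespace, key, value, "hidden")

  // The extractor decides what a pattern can bind: only the three public values.
  def unapply[A](e: Entry[A]): Option[(Option[String], Option[String], A)] =
    Some((e.namespace, e.key, e.value))
}

object ExtractorDemo {
  def describe(e: Entry[Any]): String = e match {
    case Entry(_, Some(key), value) => key + "=" + value
    case Entry(_, None, value)      => "text:" + value
  }
}
```

Existing code that matched on the three-field case class keeps compiling against such an extractor, which is the compatibility property the post is after.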
Now that pattern matching is faked, we might as well fake the object construction too. Instead of def dataRecord, we can say def apply to mimic the constructor of DataRecord. To actually hold the values, including the XMLWriter[A], we define a private case class within object DataRecord:
object DataRecord {
  private case class DataWriter[+A](
    namespace: Option[String],
    key: Option[String],
    value: A,
    writer: XMLWriter[_]) extends DataRecord[A]

  def apply[A:XMLWriter](namespace: Option[String], key: Option[String], value: A): DataRecord[A] =
    DataWriter(namespace, key, value, implicitly[XMLWriter[A]])

  def apply[A:XMLWriter](value: A): DataRecord[A] =
    apply(None, None, value)

  def unapply[A](record: DataRecord[A]): Option[(Option[String], Option[String], A)] =
    Some(record.namespace, record.key, record.value)

  def toXML[A](__obj: DataRecord[A], __namespace: Option[String], __elementLabel: Option[String],
      __scope: scala.xml.NamespaceBinding): scala.xml.NodeSeq = __obj match {
    case w: DataWriter[_] => w.writer.asInstanceOf[XMLWriter[A]].toXML(__obj.value, __namespace, __elementLabel, __scope)
    case _ => error("unknown DataRecord.")
  }
}
Now we have a backward-compatible DataRecord that also does type-specific XML output.
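To see the payoff end to end, here is a reduced sketch of the same design with String output in place of scala.xml.NodeSeq, so it runs without the XML library. Out, Addr, Rec, Stored, and RoundTripDemo are hypothetical names, and the namespace field is dropped for brevity. It is an approximation of the approach under those assumptions, not scalaxb's actual code. Each record captures its writer at construction time, so a Seq[Rec[Any]] can still render every value with the right logic.

```scala
// Reduced stand-in for XMLWriter[A]; String replaces NodeSeq.
trait Out[A] {
  implicit val ev: Out[A] = this
  def write(a: A): String
}

object Out {
  // Writers for built-in types ship with the library.
  implicit object IntOut extends Out[Int] { def write(a: Int): String = a.toString }
  implicit object StringOut extends Out[String] { def write(a: String): String = a }
}

// A user-defined type whose companion doubles as its writer.
final case class Addr(street: String)
object Addr extends Out[Addr] {
  def write(a: Addr): String = "<billTo>" + a.street + "</billTo>"
}

trait Rec[+A] {
  def key: Option[String]
  def value: A
}

object Rec {
  // The writer is stored but never exposed through the Rec trait.
  private final case class Stored[+A](key: Option[String], value: A, writer: Out[_])
      extends Rec[A]

  def apply[A: Out](key: Option[String], value: A): Rec[A] =
    Stored(key, value, implicitly[Out[A]])

  def render[A](r: Rec[A]): String = r match {
    // The cast is safe because apply only pairs a value with a writer for its own type.
    case s: Stored[_] => s.writer.asInstanceOf[Out[A]].write(r.value)
    case _            => sys.error("unknown Rec")
  }
}

object RoundTripDemo {
  // Writers are resolved here, while the static types are still known.
  val text: Rec[String] = Rec(None, "Ship to ")
  val addr: Rec[Addr] = Rec(Some("billTo"), Addr("1 Main St"))
  val qty: Rec[Int] = Rec(None, 42)

  // After widening to Rec[Any], rendering still dispatches per element.
  val mixed: Seq[Rec[Any]] = Seq(text, addr, qty)
  val rendered: String = mixed.map(r => Rec.render(r)).mkString
}
```

The design choice worth noting is that the implicit is resolved once, at construction, which is why the records are built with precise static types before being widened to Rec[Any].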