XmlSerializer, Ambient XML Namespaces and Digital Signatures 

If you use XmlSerializer type to perform serialization of documents which are digitally signed later on, you should be careful.

XML namespaces which are included in the serialized form could cause trouble for anyone signing the document after serialization, especially in the case of normalized signature checks.

Let's go step by step.

Suppose we have this simple schema, let's call it problem.xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="
http://www.gama-system.com/problems.xsd"
           elementFormDefault="qualified"
           xmlns="
http://www.gama-system.com/problems.xsd"
           xmlns:xs="
http://www.w3.org/2001/XMLSchema">
  <xs:element name="Problem" type="ProblemType"/>
  <xs:complexType name="ProblemType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="Severity" type="xs:int" />
      <xs:element name="Definition" type="DefinitionType"/>
      <xs:element name="Description" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="DefinitionType">
    <xs:simpleContent>
      <xs:extension base="xs:base64Binary">
        <xs:attribute name="Id" type="GUIDType" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:simpleType name="GUIDType">
    <xs:restriction base="xs:string">
      <xs:pattern value="Id-[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-
                         [0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

This schema describes a problem, which is defined by a name (typed as string), severity (typed as integer), definition (typed as byte array) and description (typed as string). The schema also says that the definition of a problem has an Id attribute, which we will use when digitally signing a specific problem definition. This Id attribute is defined as GUID, as the simple type GUIDType defines.

Instance documents validating against this schema would look like this:

<?xml version="1.0"?>
<Problem xmlns="
http://www.gama-system.com/problems.xsd">
  <Name>Specific problem</Name>
  <Severity>4</Severity>
  <Definition Id="c31dd112-dd42-41da-c11d-33ff7d2112s2">MD1sDQ8=</Definition>
  <Description>This is a specific problem.</Description>
</Problem>

Or this:

<?xml version="1.0"?>
<Problem xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
</Problem>

Mark this one as exhibit A.

Only a few of you out there are still generating XML documents by hand, since there exists a notion of schema compilers. In the .NET Framework world, there is xsd.exe, which bridges the gap between the XML type system and the CLR type system.

xsd.exe /c problem.xsd

The tool compiles problem.xsd schema into the CLR type system. This allows you to use in-schema defined classes and serialize them later on with the XmlSerializer class. The second instance document (exhibit A) serialization program would look like this:

// generate problem
ProblemType problem = new ProblemType();
problem.Name = "XML DigSig Problem";
problem.Severity = 5;
DefinitionType dt = new DefinitionType();
dt.Id = Guid.NewGuid().ToString();
dt.Value = new byte[] { 0xa, 0xb, 0xc, 0xd, 0xf };
problem.Definition = dt;
problem.Description = "Ambient namespaces break digsigs.";

// serialize problem
XmlSerializer ser = new XmlSerializer(typeof(ProblemType));
FileStream stream = new FileStream("Problem.xml", FileMode.Create, FileAccess.Write);
ser.Serialize(stream, problem);
stream.Close();
           
Here lie the dragons.

XmlSerializer class default serialization mechanism would output this:

<?xml version="1.0"?>
<Problem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema"
         xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
</Problem>

Mark this one as exhibit B.

If you look closely, you will notice two additional prefix namespace declarations in exhibit B bound to xsi and xsd prefixes, against exhibit A.

The fact is, that both documents (exhibit B, and exhibit A) are valid against the problem.xsd schema.

<theory>

Prefixed namespaces are part of the XML Infoset. All XML processing is done on XML Infoset level. Since only declarations (look at prefixes xsi and xsd) are made in exhibit B, the document itself is not semantically different from exhibit A. That stated, instance documents are equivalent and should validate against the same schema.

</theory>

What happens if we sign the Definition element of exhibit B (XmlSerializer generated, prefixed namespaces present)?

We get this:

<?xml version="1.0"?>
<Problem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema"
         xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    <SignedInfo>
      <CanonicalizationMethod Algorithm="http://www.w3.org/TR/...20010315" />
      <SignatureMethod Algorithm="http://www.w3.org/...rsa-sha1" />
      <Reference URI="#Id-b01cb152-cf93-48df-b07e-97ea7f2ec2e9">
        <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
        <DigestValue>k3gbdFVJEpv4LWJAvvHUZZo/VUQ=</DigestValue>
      </Reference>
    </SignedInfo>
    <SignatureValue>K8f...p14=</SignatureValue>
    <KeyInfo>
      <KeyValue>
        <RSAKeyValue>
          <Modulus>eVs...rL4=</Modulus>
          <Exponent>AQAB</Exponent>
        </RSAKeyValue>
      </KeyValue>
      <X509Data>
        <X509Certificate>MIIF...Bw==</X509Certificate>
      </X509Data>
    </KeyInfo>
  </Signature>

</Problem>

Let's call this document exhibit D.

This document is the same as exhibit B, but has the Definition element digitally signed. Note the /Problem/Signature/SingedInfo/Reference[@URI] value. Digital signature is performed only on the Definition element and not the complete document.

Now, if one would validate the same document without the prefixed namespace declarations, as in:

<?xml version="1.0"?>
<Problem xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    ...
  </Signature>

</Problem>

... the signature verification would fail. Let's call this document exhibit C.

<theory>

As said earlier, all XML processing is done on the XML Infoset level. Since ambient prefixed namespace declarations are visible in all child elements of the declaring element, exhibits C and D are different. Explicitly, element contexts are different for element Definition, since exhibit C does not have ambient declarations present and exhibit D does. The signature verification fails.

</theory>

Solution?

Much simpler than what's written above. Force XmlSerializer class to serialize what should be serialized in the first place. We need to declare the namespace definition of the serialized document and prevent XmlSerializer to be too smart. The .NET Framework serialization mechanism contains a XmlSerializerNamespaces class which can be specified during serialization process.

Since we know the only (and by the way, default) namespace of the serialized document, this makes things work out OK:

// generate problem
ProblemType problem = new ProblemType();
problem.Name = "XML DigSig Problem";
problem.Severity = 5;
DefinitionType dt = new DefinitionType();
dt.Id = Guid.NewGuid().ToString();
dt.Value = new byte[] { 0xa, 0xb, 0xc, 0xd, 0xf };
problem.Definition = dt;
problem.Description = "Ambient namespaces break digsigs.";

// serialize problem
XmlSerializerNamespaces xsn = new XmlSerializerNamespaces();
xsn.Add(String.Empty, "http://www.gama-system.com/problem.xsd");

XmlSerializer ser = new XmlSerializer(typeof(ProblemType));
FileStream stream = new FileStream("Problem.xml", FileMode.Create, FileAccess.Write);
ser.Serialize(stream, problem, xsn);
stream.Close();

This will force XmlSerializer to produce a valid document - with valid XML element contexts, without any ambient namespaces.

The question is, why does XmlSerialzer produce this namespaces by default? That should be a topic for another post.

Categories:  CLR | XML
Wednesday, 19 September 2007 21:57:57 (Central Europe Standard Time, UTC+01:00)  #    Comments

 

Copyright © 2003-2024 , Matevž Gačnik
Recent Posts
RD / MVP
Feeds
RSS: Atom:
Archives
Categories
Blogroll
Legal

The opinions expressed herein are my own personal opinions and do not represent my company's view in any way.

My views often change.

This blog is just a collection of bytes.

Copyright © 2003-2024
Matevž Gačnik

Send mail to the author(s) E-mail