Bug in XmlSerializer, XmlSerializerNamespaces 

XmlSerializer is a great piece of technology. Combined with xsd.exe and friends (XmlSerializerNamespaces, et al.), it's a powerful tool for XML instance serialization and deserialization.

But there is a potentially serious bug present, even in the 3.5 SP1 version of the .NET Framework.

Suppose we have the following XML structure:

<Envelope xmlns="NamespaceA"
          xmlns:B="NamespaceB">
  <B:Header></B:Header>
  <Body></Body>
</Envelope>

This tells you that the Envelope and Body elements are in the same namespace (namely 'NamespaceA'), while Header is qualified with 'NamespaceB'.

Now suppose we need to programmatically insert the <B:Header> element into an empty core document.

Core document:

<Envelope xmlns="NamespaceA"
          xmlns:B="NamespaceB">
  <Body></Body>
</Envelope>

Now insert the following, for example with XmlNode.InsertBefore():

<B:Header>...</B:Header>

We should get:

<Envelope xmlns="NamespaceA"
          xmlns:B="NamespaceB">
  <B:Header>...</B:Header>
  <Body></Body>
</Envelope>

To get the part to be inserted, one would serialize (using XmlSerializer) the following Header document:

<B:Header xmlns:B="NamespaceB">
  ...
</B:Header>

To do this, a little XmlSerializer magic will do the trick:

// h is the Header instance to serialize
XmlSerializerNamespaces xsn = new XmlSerializerNamespaces();
xsn.Add("B", "NamespaceB");

MemoryStream ms = new MemoryStream();
XmlSerializer xSer = new XmlSerializer(typeof(Header));
XmlTextWriter tw = new XmlTextWriter(ms, null);
xSer.Serialize(tw, h, xsn);
tw.Flush(); // push buffered output into the stream before rewinding

ms.Seek(0, SeekOrigin.Begin);

XmlDocument doc = new XmlDocument();
doc.Load(ms);
ms.Close();

This would generate exactly what we wanted. A prefixed namespace based XML document, with the B prefix bound to 'NamespaceB'.

Now, if we import this document fragment into our core document using XmlDocument.ImportNode(), we get:

<Envelope xmlns="NamespaceA"
          xmlns:B="NamespaceB">
  <B:Header xmlns:B="NamespaceB">...</B:Header>
  <Body></Body>
</Envelope>

Which is valid and actually, from an XML Infoset view, an isomorphic document to the original. So what if it's got the same namespace declared twice, right?

Right - until you involve digital signatures. I have described a specific problem with ambient namespaces at length in this blog entry: XmlSerializer, Ambient XML Namespaces and Digital Signatures.

When importing a node from another context, XmlNode and friends do a resolve against all namespace declarations in scope. So, when importing such a header, we shouldn't get a duplicate namespace declaration.

The problem is, we don't get a duplicate namespace declaration, since XmlSerializer actually inserts a normal XML attribute into the Header element. That's why we seem to get another namespace declaration. It's actually not a declaration but a plain old attribute. It's even visible (in this case in XmlElement.Attributes), and it definitely shouldn't be there.

So if you hit this special case, remove all attributes before importing the node into your core document. Like this:

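// h is the Header instance to serialize
XmlSerializerNamespaces xsn = new XmlSerializerNamespaces();
xsn.Add("B", "NamespaceB");

MemoryStream ms = new MemoryStream();
XmlSerializer xSer = new XmlSerializer(typeof(Header));
XmlTextWriter tw = new XmlTextWriter(ms, null);
xSer.Serialize(tw, h, xsn);
tw.Flush(); // push buffered output into the stream before rewinding

ms.Seek(0, SeekOrigin.Begin);

XmlDocument doc = new XmlDocument();
doc.Load(ms);
ms.Close();
// drop the attributes XmlSerializer wrote onto the root element
doc.DocumentElement.Attributes.RemoveAll();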

Note that the serialized document representation will not change, since an ambient namespace declaration (B bound to 'NamespaceB') still exists in the XML Infoset of the doc document.
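
Removing every attribute also throws away any real data attributes the header may carry. A narrower variant, sketched here under the assumption that the fragment has already been loaded into its own XmlDocument, strips only the xmlns and xmlns:* attributes before the import. The HeaderImport class and its method name are hypothetical helpers, not framework types.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class HeaderImport
{
    // Hypothetical helper: imports the fragment's root element as the first
    // child of the core document's root, dropping only xmlns / xmlns:* attributes.
    public static XmlElement ImportWithoutNamespaceAttributes(
        XmlDocument core, XmlDocument fragment)
    {
        XmlElement source = fragment.DocumentElement;

        // Collect first, then remove, so the collection is not modified
        // while it is being enumerated.
        List<XmlAttribute> declarations = new List<XmlAttribute>();
        foreach (XmlAttribute attr in source.Attributes)
        {
            if (attr.Prefix == "xmlns" || attr.Name == "xmlns")
                declarations.Add(attr);
        }
        foreach (XmlAttribute attr in declarations)
            source.Attributes.Remove(attr);

        // ImportNode lives on XmlDocument; the copy belongs to the core document.
        XmlElement imported = (XmlElement)core.ImportNode(source, true);
        XmlElement root = core.DocumentElement;
        root.InsertBefore(imported, root.FirstChild);
        return imported;
    }
}
```

The imported element keeps its prefix and namespace URI, because those are properties of the element itself and not of the removed attributes.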

Categories:  .NET 3.5 - General | XML
Monday, April 06, 2009 1:22:33 PM (Central Europe Standard Time, UTC+01:00)

 

 XmlSerializer: Serialized Syntax and How to Override It 

Recently I needed to specify exactly how I would like a specific class serialized. Suppose we have the following simple schema:

<xs:schema targetNamespace="http://my.favourite.ns/person.xsd"
   elementFormDefault="qualified"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns="http://my.favourite.ns/person.xsd">
  <xs:complexType name="Person">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="Surname" type="xs:string" />
      <xs:element name="Age" type="xs:int" minOccurs="0" />
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Let's call this schema President.xsd.

A valid XML instance against this schema would be, for example:

<Person xmlns="http://my.favourite.ns/person.xsd">
   <Name>Barack</Name>
   <Surname>Obama</Surname>
   <Age>47</Age>
</Person>

Since we are serializing against a specific XML schema (XSD), we have an option of schema compilation:

xsd /c President.xsd

This yields the schema's types in the form of a C# class. All well and good.

Now.

If we serialize the filled up class instance back to XML, we get a valid XML instance. It's valid against President.xsd.

There is a case where your schema changes ever so slightly - read, the namespaces change, and you don't want to recompile the entire solution to support this, but you still want to use XML serialization. Who doesn't - what do you do?

Suppose we want to get the following back, when serializing:

<PresidentPerson xmlns="http://schemas.gama-system.com/president.xsd">
   <Name>Barack</Name>
   <Surname>Obama</Surname>
   <Age>47</Age>
</PresidentPerson>

There is an option to override the default serialization technique of XmlSerializer. Enter the world of XmlAttributes and XmlAttributeOverrides:

private XmlSerializer GetOverridedSerializer()
{
   // set overrides for person element
   XmlAttributes attrsPerson = new XmlAttributes();
   XmlRootAttribute rootPerson =
      new XmlRootAttribute("PresidentPerson");
   rootPerson.Namespace =
      "http://schemas.gama-system.com/president.xsd";
   attrsPerson.XmlRoot = rootPerson;

   // create overrider
   XmlAttributeOverrides xOver = new XmlAttributeOverrides();
   xOver.Add(typeof(Person), attrsPerson);

   XmlSerializer xSer = new XmlSerializer(typeof(Person), xOver);
   return xSer;
}

Now serialize normally:

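XmlSerializer xSer = GetOverridedSerializer();
Stream ms = new MemoryStream();
XmlTextWriter tw = new XmlTextWriter(ms, null);
xSer.Serialize(tw, person); // person is a populated Person instance
tw.Flush();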

This will work even if you only have a compiled version of your object graph, and you don't have any sources. The System.Xml.Serialization.XmlAttributeOverrides class allows you to adorn any XML serializable class with your own XML syntax - element names, attribute names, namespaces and types.

Remember - you can override them all and still serialize your angle brackets.
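
As a rough illustration of overriding more than the root, the sketch below also renames a member element. The Person class here is a hand-written stand-in for the xsd.exe output (its exact shape is an assumption), and the FamilyName element name and the OverrideDemo class are made up for the example.

```csharp
using System.IO;
using System.Xml.Serialization;

// Stand-in for the xsd.exe-generated class; assumed shape, not the real output.
public class Person
{
    public string Name;
    public string Surname;
    public int Age;
}

public static class OverrideDemo
{
    public static string Serialize(Person person)
    {
        // Root override, same idea as GetOverridedSerializer().
        XmlAttributes rootAttrs = new XmlAttributes();
        XmlRootAttribute root = new XmlRootAttribute("PresidentPerson");
        root.Namespace = "http://schemas.gama-system.com/president.xsd";
        rootAttrs.XmlRoot = root;

        // Member override: emit Surname as <FamilyName> (hypothetical name).
        XmlAttributes surnameAttrs = new XmlAttributes();
        surnameAttrs.XmlElements.Add(new XmlElementAttribute("FamilyName"));

        XmlAttributeOverrides overrides = new XmlAttributeOverrides();
        overrides.Add(typeof(Person), rootAttrs);
        overrides.Add(typeof(Person), "Surname", surnameAttrs);

        XmlSerializer serializer = new XmlSerializer(typeof(Person), overrides);
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, person);
            return writer.ToString();
        }
    }
}
```

One practical note: serializers constructed with overrides are not cached by the framework, so it is worth creating such a serializer once and reusing it rather than building it on every call.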


Categories:  XML
Friday, August 29, 2008 7:38:07 PM (Central Europe Standard Time, UTC+01:00)

 

 Demos from the NT Conference 2008 

As promised, here are the sources from my NTK 2008 sessions [1].

Talk: Document Style Service Interfaces

Read the following blog entry, where I tried to describe the concept in detail. Also, this blog post discusses issues when using large document parameters with reliable transport (WS-RM) channels.

Demo: Document Style Service Interfaces [Download]

This demo defines a service interface with the document parameter model, i.e. Guid CreatePerson(XmlDocument person). It shows three different approaches to creating the passed document:

  1. Raw XML creation
  2. XML Serialization of the (attribute annotated) object graph
  3. XML Serialization using the client object model

Also, versioned schemas for the Person document are shown, including the support for document validation and version independence.

Talk: Windows Server 2008 and Transactional NTFS

This blog entry describes the concept.

Demo 1: Logging using CLFS (Common Log File System) [Download]
Demo 2: NTFS Transactions using the File System, SQL, WCF [Download]
Demo 3: NTFS Transactions using the WCF, MTOM Transport [Download] [2]

[1] All sources are in VS 2008 solution file format.
[2] This transaction spans from the client, through the service boundary, to the server.

Categories:  .NET 3.5 - WCF | Transactions | XML
Thursday, May 15, 2008 4:24:19 PM (Central Europe Standard Time, UTC+01:00)

 

 Laws and Digital Signatures 

Suppose we have a document like this:

<?xml version="1.0"?>
<root xmlns="urn-foo-bar">
  <subroot>
    <value1>value1</value1>
    <value2>value2</value2>
  </subroot>
  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    <SignedInfo>
      <CanonicalizationMethod
        Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
      <SignatureMethod
        Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />
      <Reference URI="">
        <Transforms>
          <Transform
            Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
        </Transforms>
        <DigestMethod
          Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
        <DigestValue>1Xp...EOko=</DigestValue>
      </Reference>
    </SignedInfo>
    <SignatureValue>nls...cH0k=</SignatureValue>
    <KeyInfo>
      <KeyValue>
        <RSAKeyValue>
          <Modulus>9f3W...fxG0E=</Modulus>
          <Exponent>AQAB</Exponent>
        </RSAKeyValue>
      </KeyValue>
      <X509Data>
        <X509Certificate>MIIEi...ktYgN</X509Certificate>
      </X509Data>
    </KeyInfo>
  </Signature>
</root>

This document represents data and an enveloped digital signature over the complete XML document. That the signature covers the complete document is expressed by the Reference element, whose URI attribute is set to an empty string (Reference URI="").

Checking the Signature

The following should always be applied during signature validation:

  1. Validating the digital signature
  2. Validating the certificate(s) used to create the signature
  3. Validating the certificate(s) chain(s)

Note: In most situations this is the optimal validation sequence. Why? Signatures are broken far more frequently than certificates are revoked or expire. And certificates are revoked or expire far more frequently than their chains.

1. Validating the digital signature

First, get it out of there:

XmlNamespaceManager xmlns = new XmlNamespaceManager(xdkDocument.NameTable); // [1]
xmlns.AddNamespace("ds", "http://www.w3.org/2000/09/xmldsig#");
XmlNodeList nodeList = xdkDocument.SelectNodes("//ds:Signature", xmlns);
 
[1] xdkDocument should be an XmlDocument instance representing your document.

Second, construct a SignedXml instance:

foreach (XmlNode xmlNode in nodeList)
{
  // create signed xml object
  SignedXml signedXml = new SignedXml(xdkDocument); // [2]

  // verify signature
  signedXml.LoadXml((XmlElement)xmlNode);
}

[2] Note that we are constructing the SignedXml instance from a complete document, not only the signature. Read this.

Third, validate:

bool booSigValid = signedXml.CheckSignature();

If booSigValid is true, proceed.
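
Put together, step 1 might look like the following sketch. It assumes a reference to the System.Security.Cryptography.Xml types is available; the SignatureCheck class and AllSignaturesValid method are hypothetical names, and treating a document without any Signature element as invalid is a design assumption of this example.

```csharp
using System.Security.Cryptography.Xml;
using System.Xml;

public static class SignatureCheck
{
    // Returns true only if the document contains at least one ds:Signature
    // element and every one of them verifies against its own KeyInfo.
    public static bool AllSignaturesValid(XmlDocument document)
    {
        XmlNamespaceManager xmlns = new XmlNamespaceManager(document.NameTable);
        xmlns.AddNamespace("ds", SignedXml.XmlDsigNamespaceUrl);

        XmlNodeList signatures = document.SelectNodes("//ds:Signature", xmlns);
        if (signatures.Count == 0)
            return false; // assumption: an unsigned document is not acceptable

        foreach (XmlElement signature in signatures)
        {
            // Construct from the whole document so that references resolve.
            SignedXml signedXml = new SignedXml(document);
            signedXml.LoadXml(signature);
            if (!signedXml.CheckSignature())
                return false;
        }
        return true;
    }
}
```

The parameterless CheckSignature() only proves integrity against the key carried in KeyInfo. It says nothing about who owns that key, which is why the certificate and chain checks follow.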

2. Validating the certificate(s) used to create the signature

First, get it out of there:

XmlNode xndCert = xmlNode.SelectSingleNode(".//ds:X509Certificate", xmlns); // [3]

[3] There can be multiple X509Certificate elements qualified with the http://www.w3.org/2000/09/xmldsig# namespace in there. The XML Digital Signature specification allows the serialization of the complete certificate chain of the certificate used to sign the document. Normally, the signing certificate should be the first to be serialized.

Second, get the X509Certificate2 instance:

byte[] bytCert = Convert.FromBase64String(xndCert.InnerText);
X509Certificate2 x509cert = new X509Certificate2(bytCert);

Third, validate:

bool booCertValid = x509cert.Verify();

If booCertValid is true, proceed.

3. Validating the certificate(s) chain(s)

Building and validating the chain:

X509Chain certChain = new X509Chain();
bool booChainValid = certChain.Build(x509cert);
int intChainLength = certChain.ChainElements.Count; // [4]

If booChainValid is true, your signature is valid.

Some Rules and Some Laws

We have three booleans:

  • booSigValid - signature validity
  • booCertValid - certificate validity
  • booChainValid - certificate's chain validity

If booSigValid evaluates to false, there is no discussion. Someone changed the document.

What happens if one of the following two expressions evaluates to true:

1. ((booSigValid) && (!booCertValid) && (!booChainValid))
2. ((booSigValid) && (booCertValid) && (!booChainValid))

This normally means that either the certificate is not valid (revoked or expired) [4], or one of the certificates in its chain is not valid or has expired.

[4] The premise is that one checked the signature according to 1, 2, 3 schema described above.

The Question

Is a digital signature valid even if the CA revoked the certificate after the signature was created? Is it valid even after the certificate expires? If the signature is valid and the certificate has been revoked, what is the legal validity of the signature?

In legal terms, the signature would be invalid in both cases above, 1 and 2.

This means that once the generator of the signature is dead, or one of his predecessors is dead, all his children die too.

Timestamps to the Rescue

According to most countries' digital signature laws, the signature is valid only during the validity of the signing certificate and of the signing certificate's chain, both being checked for revocation and expiry date ... if you don't timestamp it.

If the source document has another signature from a trusted authority, and that authority is a timestamp authority, it would look like this:

<?xml version="1.0"?>
<root xmlns="urn-foo-bar">
  <subroot>
    <value1>value1</value1>
    <value2>value2</value2>
  </subroot>
  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    ...
  </Signature>
  <dsig:Signature Id="TimeStampToken"
    xmlns:dsig="http://www.w3.org/2000/09/xmldsig#">
    <dsig:SignedInfo>
      <dsig:CanonicalizationMethod
        Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
      <dsig:SignatureMethod
        Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />
      <dsig:Reference
        URI="#TimeStampInfo-113D2EEB158BBB2D7CC000000000004DF65">
        <dsig:DigestMethod
          Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
        <dsig:DigestValue>y+xw...scKg=</dsig:DigestValue>
      </dsig:Reference>
      <dsig:Reference URI="#TimeStampAuthority">
        <dsig:DigestMethod
          Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
        <dsig:DigestValue>KhFIr...Sv4=</dsig:DigestValue>
      </dsig:Reference>
    </dsig:SignedInfo>
    <dsig:SignatureValue>R4m...k3aQ==</dsig:SignatureValue>
    <dsig:KeyInfo Id="TimeStampAuthority">
      <dsig:X509Data>
        <dsig:X509Certificate>MII...Osmg==</dsig:X509Certificate>
      </dsig:X509Data>
    </dsig:KeyInfo>
    <dsig:Object
      Id="TimeStampInfo-113D2EEB158BBB2D7CC000000000004DF65">
      <ts:TimeStampInfo
         xmlns:ts="http://www.provider.com/schemas/timestamp-protocol-20020207"
         xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
        <ts:Policy id="http://provider.tsa.com/documents" />
        <ts:Digest>
          <ds:DigestMethod
            Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
          <ds:DigestValue>V7+bH...Kmsec=</ds:DigestValue>
        </ts:Digest>
        <ts:SerialNumber>938...045</ts:SerialNumber>
        <ts:CreationTime>2008-04-13T11:31:42.004Z</ts:CreationTime>
        <ts:Nonce>121...780</ts:Nonce>
      </ts:TimeStampInfo>
    </dsig:Object>
  </dsig:Signature>
</root>

The second signature would be performed by an out-of-band authority, normally a timestamping authority (TSA). It would only sign a hash value (in this case a SHA1 hash) which was constructed by hashing the original document and the included digital signature.

This (second) signature should be checked using the same steps 1, 2 and 3. For the purpose of this thought experiment, let's say it would generate a booTimestampValid boolean.

Now, let's reexamine the booleans:

  1. ((booSigValid) && (!booCertValid) && (!booChainValid) && (booTimestampValid))
  2. ((booSigValid) && (booCertValid) && (!booChainValid) && (booTimestampValid))

In this case, even though the signature's certificate (or its chain) is invalid, the signature would pass legal validity if the timestamp's signature is valid, together with its certificate and certificate chain. Note that the TSA signature is generated with a different set of keys than the original digital signature.

Actually booTimestampValid is defined as ((booSigValid) && (booCertValid) && (booChainValid)) for the timestamp signature/certificate/certificate chain [5].

[5] Legal validity is guaranteed only in cases where 1 or 2 are true.
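
The rules in this post can be condensed into a single predicate. This is only a sketch of the reasoning here, not legal advice. The SignatureLaw class and IsLegallyValid method are hypothetical names, and the handling of combinations not listed in 1 and 2 (for example an invalid certificate with a valid chain) is an assumption.

```csharp
public static class SignatureLaw
{
    // booTimestampValid is assumed to already mean: timestamp signature,
    // its certificate and its certificate chain all validated.
    public static bool IsLegallyValid(
        bool booSigValid, bool booCertValid, bool booChainValid,
        bool booTimestampValid)
    {
        // A broken signature means the document changed; nothing can rescue it.
        if (!booSigValid)
            return false;

        // Everything checks out right now; no timestamp needed.
        if (booCertValid && booChainValid)
            return true;

        // Certificate or chain no longer valid: only a valid timestamp helps.
        return booTimestampValid;
    }
}
```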

Categories:  Other | XML
Wednesday, April 16, 2008 6:32:29 PM (Central Europe Standard Time, UTC+01:00)

 

 Happy Birthday XML 

This week XML is ten years old. The core XML 1.0 specification was released in February 1998.

It's a nice anniversary to have.

The Namespaces in XML specification has a built-in namespace declaration of http://www.w3.org/XML/1998/namespace. That's an implicit namespace declaration, a special one, governing all others. One namespace declaration to rule them all. Bound to the xml: prefix.

XML was born and published as a W3C Recommendation on 10th of February 1998.

So, well done XML. You did a lot for the IT industry in the past decade.

Categories:  XML
Wednesday, February 13, 2008 8:36:09 PM (Central Europe Standard Time, UTC+01:00)

 

 Approaches to Document Style Parameter Models 

I'm a huge fan of document style parameter models when implementing a public, programmatic façade to a business functionality that often changes.

[ServiceContract]
public interface IDocumentParameterModel
{
   [OperationContract]
   [FaultContract(typeof(XmlInvalidException))]
   XmlDocument Process(XmlDocument doc);
}

This contract defines a simple method, called Process, which processes the input document. The idea is to define the document schema and validate inbound XML documents, while throwing exceptions on validation errors. The processing semantics is arbitrary and can support any kind of action, depending on the defined invoke document schema.

A simple instance document which validates against a version 1.0 processing schema could look like this:

<?xml version="1.0"?>
<Process xmlns="http://www.gama-system.com/process10.xsd" version="1.0">
   <Instruction>Add</Instruction>
   <Parameter1>10</Parameter1>
   <Parameter2>21</Parameter2>
</Process>

Another processing instruction, supported in version 1.1 of the processing schema, with different semantics could be:

<?xml version="1.0"?>
<Process xmlns="http://www.gama-system.com/process11.xsd" version="1.1">
   <Instruction>Store</Instruction>
   <Content>77u/PEFwcGxpY2F0aW9uIHhtbG5zPSJod...mdVcCI</Content>
</Process>

Note that the default XML namespace changed, but that is not required. Changing it simply allows you to automate schema retrieval using a schema repository (think System.Xml.Schema.XmlSchemaSet), load all supported schemas and validate automatically.

public class ProcessService : IDocumentParameterModel
{
   public XmlDocument Process(XmlDocument doc)
   {
      XmlReaderSettings sett = new XmlReaderSettings();

      sett.Schemas.Add(<document namespace 1>, <schema uri 1>);
      ...
      sett.Schemas.Add(<document namespace n>, <schema uri n>);

      sett.ValidationType = ValidationType.Schema;
      sett.ValidationEventHandler += new
         ValidationEventHandler(XmlInvalidHandler);
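      // XmlReader.Create(string, ...) treats the string as a URI,
      // so wrap the markup in a StringReader (System.IO)
      XmlReader reader = XmlReader.Create(new StringReader(doc.OuterXml), sett);
      while (reader.Read()) { }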

      // processing goes here
      ...
   }

   static void XmlInvalidHandler(object sender, ValidationEventArgs e)
   {
      if (e.Severity == XmlSeverityType.Error)
         throw new XmlInvalidException(e.Message);
   }
}

The main benefit of this approach is decoupling the parameter model and method processing version from the communication contract. A service maintainer has an option to change the terms of processing over time, while supporting older version-aware document instances.

This notion is of course most beneficial in situations where your processing syntax changes frequently and has complex validation schemas. A simple case presented here is informational only.

So, how do we validate?

  • We need to check the instance document version first. This is especially true in cases where the document is not qualified with a different namespace when the version changes.
  • We grab the appropriate schema or schema set
  • We validate the inbound XML document, throw a typed XmlInvalidException if invalid
  • We process the call
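
The validation step can be sketched as a small, self-contained helper. It takes the schema as a string instead of a URI so that it runs without any files. The InboundValidator class and its Validate method are hypothetical names, and collecting messages instead of throwing XmlInvalidException is a simplification made for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class InboundValidator
{
    // Validates xml against the given schema and returns the error messages.
    public static List<string> Validate(string xml, string targetNamespace, string xsd)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            settings.Schemas.Add(targetNamespace, schemaReader);
        }
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e)
            {
                if (e.Severity == XmlSeverityType.Error)
                    errors.Add(e.Message);
            };

        // Reading the document to the end drives the validation.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

With the default settings, a document whose namespace has no schema loaded produces warnings only, not errors. Checking the version or namespace first, as the list above says, is therefore not optional.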

The service side is quite straightforward.

Let's look at the client and the options for painless generation of service calls using this mechanism.

Generally, one can always produce an instance invoke document by hand on the client. By hand meaning using System.Xml classes and DOM concepts. Since this is highly error-prone and gets tedious with increasing complexity, there is the notion of a schema compiler, which automatically translates your XML Schema into the CLR type system. Xsd.exe and XmlSerializer are your friends.

If your schema requires parts of the instance document to be digitally signed or encrypted, you will need to adorn the serializer output with some manual DOM work. This might also be a reason to use the third option.

The third, and easiest option for the general developer, is to provide a local object model, which serializes the requests on the client. This is an example:

ProcessInstruction pi = new ProcessInstruction();
pi.Instruction = "Add";
pi.Parameter1 = 10;
pi.Parameter2 = 21;
pi.Sign(cert); // pi.Encrypt(cert);
pi.Serialize();
proxy.Process(pi.SerializedForm);

The main benefit of this approach comes down to having an option on the server and the client. Client developers have three different levels of complexity for generating service calls. The model allows them to be as close to the wire as they see fit. Or they can be abstracted completely from the wire representation if you provide a local object model to access your services.
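
A local object model of this kind could be sketched roughly as follows. The member names mirror the snippet above, but the implementation is an assumption: signing and encryption are left out, and the version 1.0 namespace is hard-coded for brevity.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical client-side model for the version 1.0 Process document.
[XmlRoot("Process", Namespace = "http://www.gama-system.com/process10.xsd")]
public class ProcessInstruction
{
    private XmlDocument serializedForm;

    [XmlAttribute("version")]
    public string Version = "1.0";

    public string Instruction;
    public int Parameter1;
    public int Parameter2;

    // The wire representation; populated by Serialize().
    [XmlIgnore]
    public XmlDocument SerializedForm
    {
        get { return serializedForm; }
    }

    public void Serialize()
    {
        // Declare only the default namespace, so no xsi/xsd declarations leak in.
        XmlSerializerNamespaces xsn = new XmlSerializerNamespaces();
        xsn.Add(string.Empty, "http://www.gama-system.com/process10.xsd");

        XmlSerializer serializer = new XmlSerializer(typeof(ProcessInstruction));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, this, xsn);
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(writer.ToString());
            serializedForm = doc;
        }
    }
}
```

The caller never touches angle brackets. It sets properties, calls Serialize() and hands SerializedForm to the proxy.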

Categories:  .NET 3.0 - WCF | .NET 3.5 - WCF | Architecture | Web Services | XML
Monday, September 24, 2007 11:19:10 AM (Central Europe Standard Time, UTC+01:00)

 

 XmlSerializer, Ambient XML Namespaces and Digital Signatures 

If you use the XmlSerializer type to serialize documents which are digitally signed later on, you should be careful.

XML namespaces which are included in the serialized form could cause trouble for anyone signing the document after serialization, especially in the case of normalized signature checks.

Let's go step by step.

Suppose we have this simple schema, let's call it problem.xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="http://www.gama-system.com/problems.xsd"
           elementFormDefault="qualified"
           xmlns="http://www.gama-system.com/problems.xsd"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Problem" type="ProblemType"/>
  <xs:complexType name="ProblemType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="Severity" type="xs:int" />
      <xs:element name="Definition" type="DefinitionType"/>
      <xs:element name="Description" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="DefinitionType">
    <xs:simpleContent>
      <xs:extension base="xs:base64Binary">
        <xs:attribute name="Id" type="GUIDType" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:simpleType name="GUIDType">
    <xs:restriction base="xs:string">
      <xs:pattern value="Id-[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

This schema describes a problem, which is defined by a name (typed as string), severity (typed as integer), definition (typed as byte array) and description (typed as string). The schema also says that the definition of a problem has an Id attribute, which we will use when digitally signing a specific problem definition. This Id attribute is defined as GUID, as the simple type GUIDType defines.

Instance documents validating against this schema would look like this:

<?xml version="1.0"?>
<Problem xmlns="http://www.gama-system.com/problems.xsd">
  <Name>Specific problem</Name>
  <Severity>4</Severity>
  <Definition Id="Id-c31dd112-dd42-41da-c11d-33ff7d211252">MD1sDQ8=</Definition>
  <Description>This is a specific problem.</Description>
</Problem>

Or this:

<?xml version="1.0"?>
<Problem xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="Id-b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
</Problem>

Mark this one as exhibit A.

Only a few of you out there are still generating XML documents by hand, since there exists a notion of schema compilers. In the .NET Framework world, there is xsd.exe, which bridges the gap between the XML type system and the CLR type system.

xsd.exe /c problem.xsd

The tool compiles problem.xsd schema into the CLR type system. This allows you to use in-schema defined classes and serialize them later on with the XmlSerializer class. The second instance document (exhibit A) serialization program would look like this:

// generate problem
ProblemType problem = new ProblemType();
problem.Name = "XML DigSig Problem";
problem.Severity = 5;
DefinitionType dt = new DefinitionType();
dt.Id = "Id-" + Guid.NewGuid().ToString(); // GUIDType requires the Id- prefix
dt.Value = new byte[] { 0xa, 0xb, 0xc, 0xd, 0xf };
problem.Definition = dt;
problem.Description = "Ambient namespaces break digsigs.";

// serialize problem
XmlSerializer ser = new XmlSerializer(typeof(ProblemType));
FileStream stream = new FileStream("Problem.xml", FileMode.Create, FileAccess.Write);
ser.Serialize(stream, problem);
stream.Close();
           
Here lie the dragons.

XmlSerializer class default serialization mechanism would output this:

<?xml version="1.0"?>
<Problem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema"
         xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="Id-b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
</Problem>

Mark this one as exhibit B.

If you look closely, you will notice two additional prefix namespace declarations in exhibit B bound to xsi and xsd prefixes, against exhibit A.

The fact is that both documents (exhibit B and exhibit A) are valid against the problem.xsd schema.

<theory>

Prefixed namespaces are part of the XML Infoset. All XML processing is done on XML Infoset level. Since only declarations (look at prefixes xsi and xsd) are made in exhibit B, the document itself is not semantically different from exhibit A. That stated, instance documents are equivalent and should validate against the same schema.

</theory>

What happens if we sign the Definition element of exhibit B (XmlSerializer generated, prefixed namespaces present)?

We get this:

<?xml version="1.0"?>
<Problem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema"
         xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="Id-b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    <SignedInfo>
      <CanonicalizationMethod Algorithm="http://www.w3.org/TR/...20010315" />
      <SignatureMethod Algorithm="http://www.w3.org/...rsa-sha1" />
      <Reference URI="#Id-b01cb152-cf93-48df-b07e-97ea7f2ec2e9">
        <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
        <DigestValue>k3gbdFVJEpv4LWJAvvHUZZo/VUQ=</DigestValue>
      </Reference>
    </SignedInfo>
    <SignatureValue>K8f...p14=</SignatureValue>
    <KeyInfo>
      <KeyValue>
        <RSAKeyValue>
          <Modulus>eVs...rL4=</Modulus>
          <Exponent>AQAB</Exponent>
        </RSAKeyValue>
      </KeyValue>
      <X509Data>
        <X509Certificate>MIIF...Bw==</X509Certificate>
      </X509Data>
    </KeyInfo>
  </Signature>

</Problem>

Let's call this document exhibit D.

This document is the same as exhibit B, but has the Definition element digitally signed. Note the /Problem/Signature/SignedInfo/Reference[@URI] value. The digital signature covers only the Definition element and not the complete document.

Now, if one would validate the same document without the prefixed namespace declarations, as in:

<?xml version="1.0"?>
<Problem xmlns="http://www.gama-system.com/problems.xsd">
  <Name>XML DigSig Problem</Name>
  <Severity>5</Severity>
  <Definition Id="Id-b01cb152-cf93-48df-b07e-97ea7f2ec2e9">CgsMDQ8=</Definition>
  <Description>Ambient namespaces break digsigs.</Description>
  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    ...
  </Signature>

</Problem>

... the signature verification would fail. Let's call this document exhibit C.

<theory>

As said earlier, all XML processing is done on the XML Infoset level. Since ambient prefixed namespace declarations are visible in all child elements of the declaring element, exhibits C and D are different. Explicitly, element contexts are different for element Definition, since exhibit C does not have ambient declarations present and exhibit D does. The signature verification fails.

</theory>

Solution?

Much simpler than what's written above. Force the XmlSerializer class to serialize what should be serialized in the first place. We need to declare the namespace of the serialized document explicitly and prevent XmlSerializer from being too smart. The .NET Framework serialization mechanism contains an XmlSerializerNamespaces class, which can be specified during the serialization process.

Since we know the only (and by the way, default) namespace of the serialized document, this makes things work out OK:

// generate problem
ProblemType problem = new ProblemType();
problem.Name = "XML DigSig Problem";
problem.Severity = 5;
DefinitionType dt = new DefinitionType();
dt.Id = "Id-" + Guid.NewGuid().ToString(); // GUIDType requires the Id- prefix
dt.Value = new byte[] { 0xa, 0xb, 0xc, 0xd, 0xf };
problem.Definition = dt;
problem.Description = "Ambient namespaces break digsigs.";

// serialize problem
XmlSerializerNamespaces xsn = new XmlSerializerNamespaces();
xsn.Add(String.Empty, "http://www.gama-system.com/problems.xsd");

XmlSerializer ser = new XmlSerializer(typeof(ProblemType));
FileStream stream = new FileStream("Problem.xml", FileMode.Create, FileAccess.Write);
ser.Serialize(stream, problem, xsn);
stream.Close();

This will force XmlSerializer to produce a valid document - with valid XML element contexts, without any ambient namespaces.
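
The difference is easy to observe with a small experiment. The ProblemType class below is a simplified, hand-written stand-in for the xsd.exe output (an assumption made to keep the example short), and AmbientNamespaceDemo is a hypothetical helper.

```csharp
using System.IO;
using System.Xml.Serialization;

// Simplified stand-in for the generated class; only two members are kept.
[XmlRoot("Problem", Namespace = "http://www.gama-system.com/problems.xsd")]
public class ProblemType
{
    public string Name;
    public int Severity;
}

public static class AmbientNamespaceDemo
{
    private const string Ns = "http://www.gama-system.com/problems.xsd";

    // suppressAmbient = false: default behaviour (xsi/xsd declarations appear).
    // suppressAmbient = true: only the default namespace is declared.
    public static string Serialize(ProblemType problem, bool suppressAmbient)
    {
        XmlSerializer ser = new XmlSerializer(typeof(ProblemType));
        using (StringWriter writer = new StringWriter())
        {
            if (suppressAmbient)
            {
                XmlSerializerNamespaces xsn = new XmlSerializerNamespaces();
                xsn.Add(string.Empty, Ns);
                ser.Serialize(writer, problem, xsn);
            }
            else
            {
                ser.Serialize(writer, problem);
            }
            return writer.ToString();
        }
    }
}
```

Comparing the two strings for the same object shows the xmlns:xsi and xmlns:xsd declarations only in the default output.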

The question is, why does XmlSerializer produce these namespaces by default? That should be a topic for another post.

Categories:  CLR | XML
Wednesday, September 19, 2007 9:57:57 PM (Central Europe Standard Time, UTC+01:00)

 

 XML - Are parsing APIs too complex? 

I just analyzed my design work for the past day.

It turns out that I implemented a simple two-level hierarchy in strings, rather than XML, which is strange for my liking.

I had to implement this: A, where A has children a1, a2, ..., an, B, where B has children b1, b2, ..., bm, C, where ... and so on. Had to save the definition somewhere.

What was the first implementation like?

"A:a1:a2:a3|B:b1:b2:b3:b4:b5|C:c1:c2:c3:c4:c5:c6", where the first-level node was always array[i][0] if I just parsed the string with the ubiquitous split() method, passing "|" as the second parameter.

This is how it went:

var firstLevel = split(array, "|");
var secondLevel = split(firstLevel[i], ":");
var item = secondLevel(j);
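For comparison, the same two-level split can be sketched in C#, the language used for most code on this blog. HierarchyParser and its method name are hypothetical; the sketch assumes the separators never occur inside node names.

```csharp
using System.Collections.Generic;

public static class HierarchyParser
{
    // Parses "A:a1:a2|B:b1" into { "A": ["a1", "a2"], "B": ["b1"] }.
    public static Dictionary<string, List<string>> Parse(string definition)
    {
        var result = new Dictionary<string, List<string>>();
        foreach (string group in definition.Split('|'))
        {
            string[] parts = group.Split(':');
            // parts[0] is the first-level node; the rest are its children.
            var children = new List<string>();
            for (int j = 1; j < parts.Length; j++)
                children.Add(parts[j]);
            result[parts[0]] = children;
        }
        return result;
    }
}
```

The whole parser is a dozen lines with no reader, no node types and no namespaces, which is the point of the post.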

Why?

My <insert worship preference>, it is still easier to parse strings than XML in JavaScript.

Categories:  XML
Monday, December 11, 2006 5:53:39 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XML Notepad 2006 

A great tool was released today by the XML Team (Webdata) at Microsoft.

Find it here: http://www.microsoft.com/downloads/details.aspx?FamilyID=72D6AA49-787D-4118-BA5F-4F30FE913628&displaylang=en

It's a .NET Framework 2.0 application which can be used as a simple raw XML editor. It's got XSL support, XML differentiation, XML Schema validation, entity name intellisense, and, as the name suggests, it's as simple as notepad.exe. Superb performance on large documents, too.

Great. Tune it up, change the icons and layout then ship it with Vista, I say.

I find it quite attractive, since nowadays I don't spend as much time looking at angle brackets anymore.

Categories:  XML
Tuesday, September 05, 2006 8:25:25 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Article: XML Schema - Specification Primer 

The next article in the XML series discusses XML Schema. It is a two-part article.

Language: Slovenian


Title:

XML Schema (1/2)

XML Schema (2/2)

XML Schema

Categories:  Articles | XML
Sunday, June 04, 2006 8:50:18 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Article: XML Namespaces and PSVI Problems 

The next article in the XML series discusses XML Namespaces and PSVI problems.

Language: Slovenian


Title:

Imenski prostori XML in problemi v PSVI


Categories:  Articles | XML
Saturday, June 03, 2006 11:50:01 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Article: The Importance of XML Typization 

This article starts the XML series. First we dive into the importance of XML typing and the XML Infoset.

Language: Slovenian


Title:

Pomembnost tipizacije XML


Categories:  Articles | XML
Saturday, June 03, 2006 10:29:29 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Article: Type Systems Compared, XML, CLR 

I'm going to publish a series of my articles, which went out the door a couple of months ago.

All articles are in Slovene.

Here goes the first one.


Title:

Tipski sistem XML <> Tipski sistem CLR


Categories:  Articles | CLR | XML
Thursday, June 01, 2006 2:38:54 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 On AJAX being dead 

A fellow MVP, Daniel Cazzulino, has a post titled AJAX may be the biggest waste of time for the web. While I agree with most of the points there, one should think about what Microsoft is doing to lower the barrier to AJAX development.

Having to deal with JavaScript, raw (D)HTML and XML is definitely not going to scale from the developer penetration perspective. Nobody wants to do this in 2006. Therefore, if the Atlas guys make their magic happen, this would actually not be necessary. If they achieve what they started, one would be abstracted from client-side programming in most situations.

<atlas:UpdatePanel/> and <atlas:ScriptManager/> are your friends. And they could go a long way.

If this actually happens, then we are really discussing whether rich web-based apps are more appropriate for the future web. There are scenarios that benefit from all these technologies, obviously. And if the industry concludes that DHTML with XmlHttpRequests is not powerful enough, what would stop the same model from emitting rich WPF/E code out of an Atlas-enabled app?

We have, for the most part, been able to abstract the plumbing that is going on behind the scenes. If it's server side generated code, that should be running on a client, and if it is JavaScript, because all browsers run it, so be it.

We have swallowed the pill on the SOAP stacks already. We don't care if the communication starts with SCT Request+Response messages, followed by the key exchange. We do not care that a simple request-response model produces 15 messages while starting up. We do not care that there is raw XML being transferred. After all, it is all a fog, doing what it is supposed to do best: hiding the plumbing behind our beautiful SOAP/Services stack API.

Categories:  Other | Web Services | XML
Saturday, May 27, 2006 11:07:39 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 System.Xml v2 Performance: Blazing 

Congratulations go to System.Xml team.

This is fabulous! It's kicking butts, that's what it is.

Categories:  XML
Wednesday, June 08, 2005 12:49:47 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XQuery petition feedback 

Based on my previous post about Stylus Studio's XQuery support petition, Jonathan Robie writes:

Does Matevz really believe that the lack of a Microsoft editor on the XQuery spec is the reason it's taken so long?

No (that's why there's a 'maybe' and 'helps' in there). But it doesn't help either. From my point of view there are three real limiting factors behind XQuery limping along for more than six years (workshop in 1998, working group formed in 1999):

  1. Competitive corporate agendas
  2. Becoming tightly coupled with other XML specs
  3. Ambitious spec in the first place

In that order. Microsoft's reasons right now are completely transparent. They would be more than thankful if the spec reached Recommendation status. Including partial support in SQL Server 2005 is a bit of a gamble with development dollars. But holding it back, on the contrary, can backfire too.

Going back to my statement:

I'm wondering why the XQuery spec isn't moving anywhere. Maybe the lack of a Microsoft editor in the editor list helps in ignoring the importance of this technology in the soon-to-be-released products. Current editors don't seem to be bothered with the decisions Microsoft has to take. I'm sure though, that Jonathan Robie (DataDirect Technologies) is pushing hard on Microsoft's behalf.

From Jonathan's response I believe he disagrees with the editor part, and not with the part about him pushing on Microsoft's behalf.

From my perception, major mainstream platform support for XQuery would do well both for the vendors and the XQuery in general. It's been cooking so long that it needs solid support, before becoming overcooked, like XML Schema. And yes, I agree that there are some wonderful implementations out in the wild already. Developer penetration is what this technology still has to achieve.

I'm sure, Jonathan, that Paul Cotton & Co, would be more than willing to wrap up, if things aligned. Looking forward to your viewpoint on why it's taking so long. The last one found is already a bit stale.

Categories:  XML
Wednesday, May 04, 2005 12:44:01 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XQuery Support Petition 

I just received an email from Stylus Studio creators asking me to sign a petition on the lack of XQuery support in .NET Framework 2.0.

I'm sorry. I cannot do that. It's just the rule I have.

Implementing a working draft version of an XML-based technology in a widespread product like .NET Framework 2.0 is just out of the question. It has been done before with the XSL implementation in IE5, which then split into XSLT and XSL-FO, causing havoc for Microsoft.

On the other hand, implementing a stable subset of XQuery in SQL Server 2005 is another thing. While I don't necessarily agree with the necessity, I do agree that SQL 2005 and .NET Framework are two completely different beasts having different life cycle characteristics and flop-survival methods.

I'm wondering why the XQuery spec isn't moving anywhere. Maybe the lack of a Microsoft editor in the editor list helps in ignoring the importance of this technology in the soon-to-be-released products. Current editors don't seem to be bothered with the decisions Microsoft has to take. I'm sure though, that Jonathan Robie (DataDirect Technologies) is pushing hard on Microsoft's behalf.

In any case, it's too late anyway.

Categories:  XML
Friday, April 29, 2005 8:01:35 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XMLisms and binary XML rants 

There is a furious discussion going on (again) on the XML-DEV mailing list about the necessity of binary XML representation format. Tons of press ink has also been spilled on this issue.

 

In essence, the XML data model consists of three layers:

  1. XML Serialization Syntax (often called XML 1.0, XML 1.1)
  2. XML Infoset (abstract data model)
  3. PSVI (Post Schema Validation Infoset), (typed data model, brought by XML Schema)

The problem lies in number 1. XML serialization stacks produce a nasty angle-bracket wire format, which needs to be parsed into the XML Infoset before it can be exposed by any programmatic XML-exposing technology, like DOM, SAX or what have you. Serialization runs the same steps in the opposite direction.

 

If we rule schema out of the view temporarily, there are currently two ways to encode a single XML document. One being the ordinary XML 1.0 + Namespaces 1.0 wire syntax, represented in XML Infoset 1.0 form by the parsers/stacks. The second one is XML 1.1 + Namespaces 1.1 wire syntax, represented in XML Infoset 1.1 form, which didn't gain enough momentum and it's a question whether it will in the future.

 

The question remains whether the XML industry has reached a sweet spot in the (non)complexity of the serialization syntax to allow fast processing in the future. It is my belief that we will not see great adoption of any binary XML serialization format, like XOP or BinaryXML, outside the MTOM area, which pushes XOP into SOAP. That stated, one should recognize the significance of the main vendors not having reached an agreement for quite some time. Even if they do reach it some time in the future, the processing time gap will be long gone, squashed by Moore's law. This will essentially kill the push behind binary serialization advantages outside the transport mechanisms (read SOAP). Actually, a 33% penalty on base64-encoded data is not something the industry should really be concerned about.

 

There are numerous limiting factors in designing an interoperable serialization syntax for binary XML. It all comes down to optimization space. What do we want to optimize? Parsing speed? Transfer speed? Wire size? Generation speed? Even if those goals don't seem connected, it turns out that they sometimes pull in different directions. You cannot optimize for generation speed and expect small wire size.

 

On the contrary, we will see a lot more XML Infoset binary representations that are vendor-centric, being compatible only in intra-vendor-technology scenarios. Microsoft's Indigo is one such technology, which will allow proprietary binary XML encoding (see the System.ServiceModel.Channels.BinaryMessageEncoderFactory class) for all SOAP envelopes traveling between Indigo endpoints, whether they are on the same or different machines.

Categories:  XML
Tuesday, April 12, 2005 3:59:39 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Indigo CTP 

Microsoft “Indigo” is now available to the general public.

If you are an MSDN Subscriber, you can download the bits from here.

Categories:  Web Services | Work | XML
Sunday, March 20, 2005 6:34:26 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Binary XML 

XML industry has reached the mountain top: http://news.com.com/Putting+XML+in+the+fast+lane/2100-7345_3-5534249.html?tag=st.prev

If this thing continues, and adds another stupidity on top of a base stack, we'll be back in the 70s.

Processing power and network throughput will handle the load of cross boundary XML being serialized as XML 1.0 + Namespaces. We do not need XML 1.1, which is a flop anyway, and for sure, we don't need another Infoset.

Let the major vendors deliver binary Infoset for intra-firewall scenarios. Every other form of communication mechanism should use the d*mn angle brackets, if it chooses the XML dialect for the payload.

Categories:  XML
Friday, January 14, 2005 2:46:42 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 WS-Eventing: SQL Persistence Provider 

Update: Memory persistence included

In my previous posts I said I would write a SQL-based persistence provider for the Plumbwork.Orange WS-Eventing implementation and help John Bristowe a bit.

It's done now, as is a memory-based persistence option, but since http://www.gotdotnet.com still has problems with workspaces, I cannot upload it.

Classes can be downloaded here:

All you need to do is replace one line in SubscriptionManagerFactory.cs:

return new XmlSubscriptionManager() as ISubscriptionManager;

With:

return new SqlSubscriptionManager() as ISubscriptionManager;

or

return new MemorySubscriptionManager() as ISubscriptionManager;

Since some members of the workspace are already working on configuration application block integration, all config data should go in there someday.

My implementation now uses SQL Server as subscription storage for durable WS-Eventing subscriptions. System.Collections.Hashtable is used in the memory-based persistence model. Complete support includes:

  • Creating a subscription
  • Removing a subscription
  • Renewing a subscription
  • Expiring a subscription
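The four operations above can be sketched with a small in-memory store. This is a hypothetical illustration, not the Plumbwork.Orange ISubscriptionManager API, whose members are not shown in this post; it uses a generic Dictionary instead of the Hashtable mentioned above and tracks only an identifier and an expiration time per subscription.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the Plumbwork.Orange ISubscriptionManager API.
public class SketchSubscriptionStore
{
    // Maps subscription id to its UTC expiration time.
    private readonly Dictionary<string, DateTime> expirations =
        new Dictionary<string, DateTime>();

    public int Count { get { return expirations.Count; } }

    // Creates a subscription and returns its generated id.
    public string Create(DateTime expiresUtc)
    {
        string id = Guid.NewGuid().ToString();
        expirations[id] = expiresUtc;
        return id;
    }

    // Removes a subscription; returns false if the id is unknown.
    public bool Remove(string id)
    {
        return expirations.Remove(id);
    }

    // Moves the expiration of an existing subscription; returns false if the id is unknown.
    public bool Renew(string id, DateTime newExpiresUtc)
    {
        if (!expirations.ContainsKey(id)) return false;
        expirations[id] = newExpiresUtc;
        return true;
    }

    // Drops every subscription that expires at or before nowUtc; returns how many were dropped.
    public int Expire(DateTime nowUtc)
    {
        var expired = new List<string>();
        foreach (KeyValuePair<string, DateTime> entry in expirations)
            if (entry.Value <= nowUtc) expired.Add(entry.Key);
        foreach (string id in expired) expirations.Remove(id);
        return expired.Count;
    }
}
```

A SQL-backed variant would keep the same four operations and swap the dictionary for table access; a real implementation also needs locking, since a service handles requests concurrently.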

When GDN Workspaces come back online, I will post this to Plumbwork.Orange.

Categories:  Web Services | XML
Friday, August 27, 2004 10:48:47 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 WS-Eventing 

John writes:

I'm thankful Matevz didn't blast me for my über-crappy persistance model (read "save to C:\subscriptions.xml"):

Give LOCAL SERVICE account permissions to write/modify the c:\ directory. By default Plumbwork.Orange.Eventing.dll will write subscriptions file (called subscriptions.xml) there.

This line (above) - due to my extreme laziness while coding - makes me shudder. This is something I really, really, really need to clean up.

[Via http://www.bristowe.com]

I agree that next version should include a config option to use Plumbwork.Orange.Eventing.MemorySubscriptionManager instead of Plumbwork.Orange.Eventing.XmlSubscriptionManager.

Even better would be to add SqlSubscriptionManager, which I can do, when I get into the GDN workspace.

Categories:  Web Services | Work | XML
Tuesday, August 24, 2004 9:11:21 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 WS-Eventing Implementation 

WS-Eventing Application

It took me half of today to implement a WS-Eventing based service, together with service control application and demonstration client.

I took Plumbwork.Orange implementation of WS-Eventing stack. It includes WSE 2.0 based implementation of WS-Eventing specification, written by John Bristowe. Latest post about updates can be found here.

How it works

The Windows service queries the message queue (MSMQ) each time the configured period elapses. If it finds any messages, they are dispatched to all registered clients.
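The poll-and-dispatch cycle can be sketched as follows. This is only an illustration under stated assumptions: an in-memory queue stands in for MSMQ, plain delegates stand in for registered WS-Eventing clients, and SketchDispatcher is a made-up name rather than a class from the service.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for the MSMQ queue and the registered WS-Eventing clients.
public class SketchDispatcher
{
    private readonly Queue<string> queue = new Queue<string>();
    private readonly List<Action<string>> subscribers = new List<Action<string>>();

    // What the service control application does: put a message in the queue.
    public void Enqueue(string message)
    {
        queue.Enqueue(message);
    }

    // What a client registration does: remember where to send notifications.
    public void Subscribe(Action<string> subscriber)
    {
        subscribers.Add(subscriber);
    }

    // One polling pass: drain the queue and send every message to every subscriber.
    // Returns the number of messages dispatched.
    public int Poll()
    {
        int dispatched = 0;
        while (queue.Count > 0)
        {
            string message = queue.Dequeue();
            foreach (Action<string> subscriber in subscribers)
                subscriber(message);
            dispatched++;
        }
        return dispatched;
    }
}
```

In a service, Poll() would be called from a timer whose interval matches the configured query period.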

There is a service control application, which can enroll new messages into the queue.

There's also a simple client which you can use to register and receive notifications.

Availability

You can download the bits from here:

  • Installer package for WS-Eventing windows service, which does registrations and sends notifications back. Grab it here. Source is available here.
  • Source code, which you can use to compile the service control application. Grab it here.
  • Source code, which you can use to compile the WS-Eventing client. Grab it here.

Do the following:

  • Install WS-Eventing service.
  • Update WSEventingService.exe.config with desired endpoint address, MSMQ query period and queue name
  • Give LOCAL SERVICE account permissions to write/modify the c:\ directory. By default Plumbwork.Orange.Eventing.dll will write subscriptions file (called subscriptions.xml) there.
  • Start the service using SCM (Service Control Manager). Service will automatically create the specified queue. Note: You should have Message Queuing installed.
  • Start the service control application.
  • Start the client. It will register automatically.
  • Send notification using the service control application and watch it emerge on the client side.

If you get into trouble, please email me. Have fun!

Categories:  Web Services | Work | XML
Sunday, August 22, 2004 10:49:46 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Services <> Web Services 

A great post by Clemens, in which he draws a beautiful distinction between services and web services.

My thoughts would be:

  • Service is a self consistent piece of software which MAY use one or more web services for communication with the outside world.
  • As always, web services MUST be considered to be within the presentation layer of the solution. They just do not spit HTML out, they prefer XML.

MAY and MUST are to be interpreted as defined in RFC 2119.

Categories:  Web Services | XML
Wednesday, August 18, 2004 1:06:45 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Why only Schematron 

Dare wants to get feedback on System.Xml 3.0 features.

What is wrong with this one?

Categories:  XML
Friday, June 25, 2004 9:33:24 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Daniel does it the right way 

Daniel Cazzulino, one of our breed, shares this perfect blog posting about a super efficient way of passing XML data as XML, but without loading full DOM on the server side.

Started here and here.

I especially like the availability of arbitrary positioning of XPathNavigator to only serialize the bits you are interested in.

The only limitation of this solution is that it does not ship with FX 1.0/1.1 and you have to be a master in XML to fully grok it. But hey, if you don't, you can still use XmlDocument as a return type. :)

Categories:  MVP | XML
Monday, May 31, 2004 7:06:03 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 What happened? 

Harwey said it's code complete around March 1st.

As I got the confirmation today that the first bits from the Indigo team will be available to the selected ones in late July/early August, this is definitely something to look forward to.

Why...

Why is...

Why is it...

Why is it still...

Why is it still brewing?

Categories:  Web Services | XML
Monday, May 17, 2004 6:29:13 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Do it the right way 

With all the talk going on I agree that this is bad:

[WebService]
public class MyWebService
{
   [WebMethod]
   public string GetSomeXML()
   {
     // get some XML (placeholder content for illustration)
     string xml = "<data/>";
     return xml;
   }

}

And:

public class MyClient
{
   public static void Main()
   {
      MyWebService objWS = new MyWebService();
      string strXML = objWS.GetSomeXML();
      XmlDocument doc = new XmlDocument();
      doc.LoadXml(strXML);
      // processing
   }
}

A lot better (in most situations) is:

[WebService]
public class MyWebService
{
   [WebMethod]
   public XmlDocument GetSomeXML()
   {
     XmlDocument doc = new XmlDocument();
     // populate doc here, then return it as XML
     return doc;
   }
}

The latter approach serializes the document directly into the output SOAP stream, therefore bypassing double string (de)serialization.
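The cost of passing XML as a string can be illustrated outside a web service. In this sketch a plain XmlWriter stands in for the SOAP stack, which is an assumption made to keep the example self-contained; PayloadDemo and the Result element name are hypothetical.

```csharp
using System.IO;
using System.Xml;

public static class PayloadDemo
{
    private static XmlWriterSettings Settings()
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        return settings;
    }

    // XML passed as a string: the markup is escaped into character data,
    // so the receiver has to parse the payload a second time.
    public static string AsString(string xml)
    {
        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output, Settings()))
        {
            writer.WriteElementString("Result", xml);
        }
        return output.ToString();
    }

    // XML passed as a node: the markup is written into the stream as markup.
    public static string AsNode(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output, Settings()))
        {
            writer.WriteStartElement("Result");
            doc.DocumentElement.WriteTo(writer);
            writer.WriteEndElement();
        }
        return output.ToString();
    }
}
```

With the input "<a/>", AsString produces a Result element whose content is escaped text (&lt;a/&gt;), while AsNode produces a Result element with a real child element.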

There are special cases in which one would like to bypass the XmlDocument approach on the server side. If you have all the data ready and want to pass it as quickly as humanly possible, without rehydrating an entire full-blown DOM-capable object, you would use System.String (xsd:string in the XML world) and System.Text.StringBuilder to concatenate it.

If you don't know what to choose I propose this:

  • It is year 2004, therefore platform and tool support is available in a way that XML processing is not a limitation from the XSD type system -> platform type system conversion side. Therefore choose XmlDocument.
  • Choose XmlDocument.
  • Choose the string way if and only if you are expecting clients which have no other way to bridge/decouple the raw SOAP XML string into something programmatic inside your platform.

In any case, things will change in late July or early August.

Categories:  XML
Monday, May 17, 2004 6:17:32 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Validation bug in .NET Fx 1/1.1 

I wrote about a bug in the validation engine of .NET Framework 1.0/1.1 a couple of weeks ago. There were a lot of posts/discussions/emails about this issue later on.

As Dare, Web Data Team PM, points out, it turns out that this anomaly manifests itself through System.Xml.XmlValidatingReader because the System.Uri class has a problem. And System.Uri has a problem because RFC 2396 does not support empty values in the BNF notation of a URI.

So, what I propose is this: if you end up in a situation similar to the one we had in a production environment, want to validate XML instances or XML digital signatures (which are likely to be prone to this problem too, depending on the generation engine), and the current Whidbey release is not your cup of tea, THEN CHANGE THE SPECIFICATION/SCHEMA.

Simply replace xsd:anyURI with xsd:string. It will help. :)

I know this is architecturally a bad idea. But there is no other way to get around this bug until Whidbey ships (unless you want to change platforms).

I'm glad that usability is driving the ambiguity choice in this case. I'm glad that decision has been made to support empty strings in System.Uri even though the spec is not clear. Some things are just more natural than others.

Categories:  XML
Thursday, May 13, 2004 7:37:30 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 WebData team 

Today I spent the whole day with the Microsoft Webdata team, which is responsible for the majority of the System.Xml namespace.

We discussed new features in System.Xml v2, some of which have already been mentioned.

Dare, Mark, Arpan, thank you for the insight. I hope we helped you make some decisions.

Categories:  XML
Thursday, April 08, 2004 1:46:10 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Serious bug in System.Xml.XmlValidatingReader 

There is a problem with schema validation of xs:anyURI data type in System.Xml.XmlValidatingReader.

The schema spec and especially RFC 2396 state that an xs:anyURI instance can be empty, but System.Xml.XmlValidatingReader keeps failing on such an instance.

To reproduce the error use the following schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="AnyURI" type="xs:anyURI">
  </xs:element>
</xs:schema>

And this instance document:

<?xml version="1.0" encoding="UTF-8"?>
<AnyURI/>

There is currently no workaround for .NET FX 1.0/1.1. Actually Whidbey is the only patch that fixes this. :)
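A small harness for running this check is sketched below. It is an assumption-laden illustration: it uses the XmlReaderSettings validation API that arrived with .NET 2.0 (Whidbey) rather than XmlValidatingReader, and the AnyUriRepro class and its typeName parameter are hypothetical. Passing "xs:anyURI" runs the repro above on whatever runtime you have; passing "xs:string" shows what a schema with the type relaxed to a string would report.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class AnyUriRepro
{
    // Validates the empty <AnyURI/> instance against a schema that types the
    // element as typeName, and returns any validation messages raised.
    public static List<string> Validate(string typeName)
    {
        string schema =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
            "<xs:element name='AnyURI' type='" + typeName + "'/>" +
            "</xs:schema>";
        var messages = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(schema)));
        settings.ValidationEventHandler += (sender, e) => messages.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader("<AnyURI/>"), settings))
        {
            // Reading the document to the end drives validation.
            while (reader.Read()) { }
        }
        return messages;
    }
}
```

An empty list means the instance validated cleanly against the given type.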

The problem is even more troublesome when one does not have direct control over instance document syntax/serialization, for example with XML that Microsoft Office InfoPath generates automatically during digital signature insertion. The attribute /Signature/SignedInfo/Reference/@URI is (according to the XML Signature schema) typed as xs:anyURI.

Validation problem therefore manifests itself as inability to validate any digitally signed InfoPath documents.

Categories:  XML
Monday, March 08, 2004 6:31:39 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XML 1.1 is alive 

Yesterday the W3C released v1.1 of the complete XML data model and serialization syntax stack. You can get the XML Infoset 1.1, XML Namespaces 1.1 and XML 1.1 specifications.

Now the fun begins.

Categories:  XML
Thursday, February 05, 2004 9:21:52 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XML instance document survival 

This got my attention today:

Ultimately, I think the question is really a distraction. One of the great strengths of XML is that the instances exist independently not only from individual schema definitions, but also independently from the schema language of the day. [From: Don Box]

True indeed. In case of industry shifting to RelaxNG (very unlikely), instance documents would survive nicely. As long as there is a schema, that describes an instance document, everything is fine. When the connection is lost somehow, we can't talk about instances any more.

XML without a defined schema is no better than CSV. It's not even easier to parse.

Categories:  XML
Saturday, January 24, 2004 8:21:14 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Major problem supposed to be fixed: XML Schema versioning 

We'll see how this works out. I have been at the PDC, seen Doug's talk and I agree that this allows schema to be versioned over time. What is bothering me is the structural extension of the schema itself, just to support versioning.

And yes, I know this is the only way, since W3C didn't pay attention to versioning in the first place. It still bothers me, since I like my content models clean.

Categories:  XML
Friday, January 23, 2004 9:51:54 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 WSE on Windows Server 2003 

I can't get one of my solutions to work on a Windows Server 2003 based server. The client works fine, but server-side X509-based decryption fails with an error that should not happen (Cannot find the certificate and private key for decryption).

Everything is installed and set up correctly. Even permissions. :)

Since even the official Microsoft newsgroup didn't help, I'm really stuck. The funny thing is, that if I disallow access to private key and/or remove the certificate, error message changes giving me a clue that WSE looks at the cert unsuccessfully.

Categories:  Web Services | XML
Wednesday, November 26, 2003 2:32:11 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 UDDI level 2 status 

We are unsuccessful with our emails to Microsoft UDDI team to upgrade our UDDI account to level 2 status. If anyone knows an appropriate contact or is able to help directly, please do so.

UDDI Provider name: Gama System
Current UDDI level: 1

Categories:  Web Services | Work | XML
Saturday, October 04, 2003 1:54:35 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Word 2003 namespace change 

While preparing for my Tuesday demonstration of XML features in Office 2003 I found out that Word XML namespace changed from: http://schemas.microsoft.com/office/word/2003/2/wordml to http://schemas.microsoft.com/office/word/2003/wordml.

It's not that I'm opposed to changing beta-time namespaces but all my documents, saved in Office 2003 beta2 as XML, won't open up properly in Office 2003 RTM. I have to change them by hand.
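The by-hand change can be sketched as a plain text replacement of the namespace URI. This is a hedged illustration only: WordMlNamespaceFixer is a hypothetical helper, it assumes the beta2 URI appears nowhere in the document except as the namespace name, and it does not account for any other differences there may be between the beta2 and RTM formats.

```csharp
public static class WordMlNamespaceFixer
{
    public const string Beta2Namespace =
        "http://schemas.microsoft.com/office/word/2003/2/wordml";
    public const string RtmNamespace =
        "http://schemas.microsoft.com/office/word/2003/wordml";

    // Rewrites the beta2 namespace URI to the RTM one in a document held as a string.
    public static string Fix(string xml)
    {
        return xml.Replace(Beta2Namespace, RtmNamespace);
    }
}
```

When applying this to files, read and write them with the encoding named in each document's XML declaration so the rewritten file stays consistent with it.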

Another thing that raises a question is: What if Microsoft releases two Word versions within a year that need different namespaces? That has not happened yet, but this kind of namespace naming convention is not as flexible as a standard year/month, W3C-like one.

Changing the namespace to http://schemas.microsoft.com/office/word/2003/10/wordml would break things too, but wouldn't break the convention.

Conventions, especially namespace declarations, carry a semantic meaning and one should avoid breaking them.

Categories:  XML
Sunday, September 28, 2003 7:58:23 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Mono web services 

Latest daily build of Mono already allows web service proxy creation, compilation and execution. I had to manually compile the wsdl.exe equivalent (called MonoWSDL.exe) and then do:

  • mono MonoWSDL.exe http://www.gama-system.com/webservices/stockquotes.asmx?wsdl (outputs StockQuotes.cs proxy)
  • mcs /r:/usr/local/lib/System.Web.Services.dll /t:library StockQuotes.cs
  • write a simple console web service client app (StockQuotesClient.cs)
  • compile it using mcs /r:StockQuotes.dll StockQuotesClient.cs
  • run it with mono StockQuotesClient.exe NASDAQ MSFT

What did I get?

This:

That's sweet. But Mono can now also do it the other way around. I can generate proxies on our platform using the standard wsdl.exe tool. Mono web services test page looks like this:

When one adds the "?wsdl" suffix to a web service endpoint WSDL is returned as expected.

I like it.

[Note: http://www.gama-system.com/webservices/stockquotes.asmx is our free stock ticker web service]

Categories:  XML | Web Services | Mono
Saturday, September 13, 2003 12:29:49 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Microsoft.com Web Services 

Here's my implementation of Microsoft.com Web Services client: http://www.request-response.com/playground/mscomws.

It's done manually:

  1. Create SOAP Request
  2. Get response
  3. Scrape through response using XSLT
  4. Generate HTML, using XSLT's for-each, value-of, concat, substring-before, ...
  5. Insert HTML into .aspx page
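
Steps 3 and 4 can be sketched with XslCompiledTransform (available from .NET Framework 2.0 on). The stylesheet, the Item/Title/Date element names and the ResponseToHtml class are all hypothetical; the real Microsoft.com Web Services response is namespaced SOAP and would need matching namespace declarations in the stylesheet.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class ResponseToHtml
{
    // Hypothetical stylesheet using for-each, value-of, concat and substring-before.
    private const string Stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:output method='html' indent='no'/>" +
        "<xsl:template match='/'>" +
        "<ul><xsl:for-each select='//Item'>" +
        "<li><xsl:value-of select=\"concat(Title, ' (', substring-before(Date, 'T'), ')')\"/></li>" +
        "</xsl:for-each></ul>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    // Transforms the response XML into an HTML fragment.
    public static string Transform(string responseXml)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        xslt.Load(XmlReader.Create(new StringReader(Stylesheet)));

        StringWriter html = new StringWriter();
        xslt.Transform(XmlReader.Create(new StringReader(responseXml)), null, html);
        return html.ToString();
    }
}
```

The returned fragment can then be assigned to a control on the .aspx page, for example the Text property of a Literal.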
Categories:  XML | Web Services
Saturday, September 06, 2003 9:13:21 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XML schema review done 

I just finished a review of a huge XML schema for the biggest mobile operator in Slovenia.

By huge, I mean around 2000 lines. It takes XML Spy one second to validate an instance document on my machine. Element nesting goes ten levels deep.

Categories:  Work | XML
Wednesday, August 06, 2003 6:37:30 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XQuery in Whidbey 

This just came through:  Finally, Whidbey will provide increased power for performing common tasks involving the manipulation of XML. In addition to delivering increased performance when accessing and managing XML, Whidbey will include support for XML-based data processing using XML Query Language (XQuery).

It just seems strange to me, since XQuery is still in working draft status and will probably stay that way for quite some time.

Categories:  XML
Tuesday, July 29, 2003 7:23:36 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 XPathNavigator discussions 

I just had another lengthy discussion with another developer, trying to convey the benefits of using XPathNavigator.

It seems to me, that Infoset-based access to SQL Server data can only be done using this:

  • Get DataSet
  • DataSet -> XmlDataDocument (constructor, which takes DataSet, public XmlDataDocument(DataSet))
  • XmlDataDocument -> XPathNavigator (XmlDataDocument.CreateNavigator)

One can also create an XmlNode instead of XPathNavigator, but I prefer the XPath data model.

This seems a much more scalable solution than using "FOR XML RAW/AUTO/EXPLICIT" and populating an XmlReader with SqlCommand.ExecuteXmlReader. "FOR XML RAW/AUTO/EXPLICIT" is slow and requires an XML serialization/deserialization pair.

It's bad/wrong/sloppy, when people do that between app tiers.

Categories:  XML
Monday, July 28, 2003 9:33:44 AM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 Querying RSS URLs on UDDI 

Based on this and this I just wrote a web service, which enables RSS URL lookup for web logs registered at UDDI.

It should be simple to integrate this into current RSS aggregators. Email me for code.

Web service URL: http://www.request-response.com/webservices/rssquery/rssquery.asmx

There are two methods. One returns an RSS URL based on a UDDI service key. The other returns an RSS URL based on a UDDI service name. If multiple versions of RSS feeds are found, the service looks for an RSS 2.0 feed first, then RSS 1.0, then RSS 0.9x. It returns '-1' if no feed is found.
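The version preference can be sketched like this. RssFeedPicker and the dictionary input are hypothetical; the sketch assumes the UDDI lookup has already produced a map from RSS version label to feed URL.

```csharp
using System.Collections.Generic;

public static class RssFeedPicker
{
    // Preference order from the post; "0.9x" is used here as a single label
    // for any 0.9x feed.
    private static readonly string[] Preferred = { "2.0", "1.0", "0.9x" };

    // Returns the URL of the most preferred feed, or "-1" if none is found.
    public static string Pick(IDictionary<string, string> feedsByVersion)
    {
        foreach (string version in Preferred)
        {
            string url;
            if (feedsByVersion.TryGetValue(version, out url))
                return url;
        }
        return "-1";
    }
}
```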

Web service does dynamic queries against Microsoft's UDDI server inquire API.

Categories:  RSS | Web Services | XML
Thursday, July 24, 2003 3:45:33 PM (Central Europe Standard Time, UTC+01:00)  #    Comments

 

 UDDI registration 

I just updated our UDDI profile, which until now included only web services that we provide to general public. Now it includes my personal web log as a service linked to RSS 2.0 tModel.

Clemens implemented an OPML view over this, which makes it damn simple for aggregator folks to do what seems logical...

... implement UDDI lookup features, so people can change their virtual addresses without a headache.

Categories:  RSS | XML
Tuesday, July 22, 2003 12:27:32 PM (Central Europe Standard Time, UTC+01:00)

 

 WSE 2.0 is in the building 

WSE 2.0 Tech Preview is out. Finally.

Got a few sleepless nights ahead and some thinking to do. Looks like we will have the option to stop hosting web services exclusively in IIS, which is fun to have.

I like options. Options make people think.

Categories:  XML
Tuesday, July 15, 2003 11:12:50 AM (Central Europe Standard Time, UTC+01:00)

 

 Hell just froze over 
Categories:  XML
Tuesday, June 24, 2003 7:29:45 PM (Central Europe Standard Time, UTC+01:00)

 

 MSMQ limitations 

What seemed to be no obstacle at all is turning out to complicate my architectural designs lately.

Microsoft Message Queuing (MSMQ) has this strange limitation (at least as of 2003) which prevents you from sending messages larger than 4 MB. Since most .NET architects dump object instances into MSMQ, where they get serialized into XML, we all have a problem with binary data. The problem lies in how binary data is serialized to XML: XML Schema's base64Binary datatype is used for the encoding. So we do not get 4 MB of binary content per message, but roughly 3 MB, due to the well-known 1.333 expansion factor of base64 encoding.

The architectural design is now vastly different, since I have to split the binary documents while keeping them linked appropriately to their parent messages. And since I'm building a document management system which will push .doc, .xls and friends onto an MSMQ stack, 4 MB is often not enough.
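
The arithmetic behind the ~3 MB figure can be sketched as follows. Base64 emits 4 characters for every 3 input bytes, hence the 4/3 ≈ 1.333 factor. The sketch assumes one byte per encoded character and ignores the surrounding XML markup, so the real budget is somewhat smaller; the class and method names are made up for illustration.

```csharp
using System;

static class Base64Budget
{
    // Padded base64 length: 4 output characters per started group of 3 input bytes.
    public static long EncodedLength(long rawBytes)
    {
        return 4 * ((rawBytes + 2) / 3);
    }

    // Largest raw payload whose base64 form still fits into 'limit' characters.
    public static long MaxRawBytes(long limit)
    {
        return (limit / 4) * 3;
    }

    static void Main()
    {
        long limit = 4L * 1024 * 1024;            // the 4 MB limit quoted in the post
        long maxRaw = MaxRawBytes(limit);

        Console.WriteLine(maxRaw);                // 3145728 bytes, i.e. 3 MB
        Console.WriteLine(EncodedLength(maxRaw)); // 4194304, exactly the limit

        // Cross-check the formula against the framework's own encoder.
        Console.WriteLine(Convert.ToBase64String(new byte[300]).Length == EncodedLength(300));
    }
}
```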
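
One way to picture the splitting is sketched below. This is not the actual design: DocumentPart and DocumentSplitter are hypothetical types (not part of System.Messaging), and the parent link is simply a Guid carried by every part together with its index and the total part count, so that a receiver could reassemble the document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message part; not an MSMQ type.
class DocumentPart
{
    public Guid ParentId;  // links the part to its parent message
    public int Index;      // zero-based position of this part
    public int Count;      // total number of parts
    public byte[] Data;    // raw bytes of this part (before base64 encoding)
}

static class DocumentSplitter
{
    public static List<DocumentPart> Split(Guid parentId, byte[] document, int maxPartBytes)
    {
        if (document == null) throw new ArgumentNullException("document");
        if (maxPartBytes <= 0) throw new ArgumentOutOfRangeException("maxPartBytes");

        // Ceiling division: number of parts needed to hold the whole document.
        int count = (document.Length + maxPartBytes - 1) / maxPartBytes;
        List<DocumentPart> parts = new List<DocumentPart>(count);

        for (int i = 0; i < count; i++)
        {
            int offset = i * maxPartBytes;
            int size = Math.Min(maxPartBytes, document.Length - offset);
            byte[] chunk = new byte[size];
            Buffer.BlockCopy(document, offset, chunk, 0, size);

            DocumentPart part = new DocumentPart();
            part.ParentId = parentId;
            part.Index = i;
            part.Count = count;
            part.Data = chunk;
            parts.Add(part);
        }
        return parts;
    }

    static void Main()
    {
        byte[] doc = new byte[5 * 1024 * 1024]; // a 5 MB document
        List<DocumentPart> parts = Split(Guid.NewGuid(), doc, 2 * 1024 * 1024);
        Console.WriteLine(parts.Count);         // 3 parts: 2 MB + 2 MB + 1 MB
    }
}
```

The part size would be chosen so that each part, once base64-encoded and wrapped in XML, stays under the 4 MB message limit.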

Categories:  Work | XML
Tuesday, June 24, 2003 3:17:59 PM (Central Europe Standard Time, UTC+01:00)

 

 OTS 2003 
Sure enough, it was its 8th incarnation. It finished yesterday and I gave a talk on Wednesday. I talked about the XML versus the CLR type system, dived into XML Schema specifics and compared early programmatic type systems with modern ones, including the JVM and the CLR.

Later on, I joined in and answered questions at an e-business roundtable. The conference room was half full (~100 folks), which wasn't that bad. See you next year, Maribor guys...
Categories:  XML | Work
Friday, June 20, 2003 11:19:06 AM (Central Europe Standard Time, UTC+01:00)

 

Copyright © 2003-2014 , Matevž Gačnik
Legal

The opinions expressed herein are my own personal opinions and do not represent my company's view in any way.

My views often change.

This blog is just a collection of bytes.
