Judge Rules that Microsoft Must Stop Selling Word
Judge Leonard Davis of the U.S. District Court for the Eastern District of Texas ruled today that Microsoft violated a patent held by i4i of Canada, and as such must stop selling Microsoft Word in the U.S. within 60 days. The dispute centers on Microsoft Word 2007's .docx file format. I just finished reading the abstract for i4i's U.S. Patent 5,787,449 and delved a little deeper into the .docx file format, and here are a few of my thoughts.
- First, it's ironic that Microsoft started developing the .docx format because of governmental pressure - both in the US and especially in Europe. Microsoft had been under attack for having a proprietary and closed format, and as such came up with the .docx format to satisfy its attackers. The format is both open (with published specs) and non-proprietary (submitted to standards bodies as an available format for others). One arm of government first forced the development of a new format, and now another is trying to kill it.
- Patents in the software world are becoming so general and vague that they can cover very broad areas, which in turn can dramatically stifle competitive innovation. This is a perfect example. In summary, this patent (filed in 1994 and granted in 1998) details an idea whereby, instead of storing a document as one large file with content (e.g., text) and formatting intermixed, the content is stored separately from the formatting markup, in separate files, allowing content to be changed independently of style. However, the exact structure of the separation is not detailed in the patent, just the concept of separation. So, theoretically, nobody can come up with any future file format that splits the two, regardless of the technology used to combine or display them. This is an example of a patent that should never have been granted because it is too broad. Here's an analogy from the physical world. Let's make believe that once upon a time all shoes were constructed as single solid pieces that you squeezed onto your foot. Then someone came up with an idea (and patented it), saying that a new approach would be to separate the fitting of the shoe from the shoe itself - but they didn't explicitly invent shoelaces/eyelets, Velcro, metal clasps, etc. So, no future shoe manufacturer could now make shoe designs with clasps, laces, Velcro fasteners, or anything else without violating this patent.
- The reason Microsoft's .docx format theoretically violates this patent is that although a .docx document looks like a single file (that incorporates both content and style), it actually does separate the two. The .docx format is basically a .zip (compressed) file. (See for yourself - change the extension of a Word document from .docx to .zip and then open it.) Within the zip structure are a number of XML files, where content and style are separated. However, because they are incorporated into a single .docx file, in practice nobody works with the formatting and content components separately. Therefore, in the real world, even if there were some merit to the i4i patent, it really doesn't apply to .docx files.
- Moreover, this has much broader implications than Microsoft. By far the most common technology today that uses the concept of separating content from format is CSS (Cascading Style Sheets). This forms the basis of virtually every single web page around the world being created today. (The whole idea of CSS is to separate style information from HTML files, leaving the HTML files with principally content. All of the style information can then be maintained in CSS files, allowing web designers to change the entire look-and-feel of a site by just changing its "theme".) Moreover, virtually every software maker who makes web design tools, including Adobe, Microsoft, Oracle, IBM, and countless others, uses CSS. The CSS specification is managed by the World Wide Web Consortium, an international organization. Are they now the next target for a lawsuit? If so, does that put a halt on everybody's web development? If not, how can they not be targeted when CSS technology much more closely fits the pattern of the patent than the .docx file format?
- Ironically, a few days ago Microsoft secured its own patent covering the XML-based format of its Office 2007 suite (including Word), Office Open XML (OOXML). Although it is patented, Microsoft has also offered the format to the world as a free standard. (It is now accepted as an ISO and an ECMA standard.)
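For readers who would rather check the ".docx is really a .zip" point from the list above in code than by renaming a file, here is a minimal Python sketch. It builds a tiny in-memory stand-in for a .docx package and lists the parts inside it. The part names word/document.xml and word/styles.xml match what a real Word 2007 file contains, but the XML bodies are heavily simplified placeholders and the function names are my own, so treat this as an illustration rather than a valid Word document.

```python
import io
import zipfile


def build_sample_package() -> bytes:
    """Build a tiny stand-in for a .docx package (NOT a valid Word file).

    The XML bodies below are simplified placeholders for illustration.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Content part: the text, tagged with a style name.
        zf.writestr(
            "word/document.xml",
            '<document><p style="Heading1">Hello</p></document>',
        )
        # Style part: what that style name means visually.
        zf.writestr(
            "word/styles.xml",
            '<styles><style id="Heading1" bold="true"/></styles>',
        )
    return buf.getvalue()


def list_parts(package: bytes) -> list[str]:
    """Return the names of the files stored inside the zip package."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.namelist()


if __name__ == "__main__":
    for name in list_parts(build_sample_package()):
        print(name)
```

To try it on a real document, read the file's bytes (for example with open("report.docx", "rb"), where the file name is a placeholder) and pass them to list_parts.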
It's troubling when developers and marketers can't effectively develop and market because of situations like this. Let's just hope that a higher court rectifies this problem.
Domus is a full service advertising agency based in Philadelphia, with areas of expertise in multiple digital and interactive arenas. As both users and developers of technology, we follow closely situations such as this - as should most businesses. For more information on Domus, please visit our web site.
1. The docx format is not an "open standard" in the way that term should be understood. It's a proprietary format developed at MS and then MS successfully lobbied the standards group to adopt their format in order to make it harder for other applications, such as OpenOffice, to compete. Of course, it is possible for OO to retool in order to read/write the MS format. That's the point, to make OO retool, or continue to use a non-standard XML document format.
2. You should have looked at the timeline for XML document generation. If you had done so, you would see that this patent was applied for and granted long before XML documents were anything like a standard way of document generation. At the time of this patent, you couldn't even view an XML document in IE and certainly could not edit or generate one in Word (which would have been Word 98 at the time the patent was granted and Word 6 at the time of application). The patent application was made 4 years before the first public DTD was available.
I'm not a fan of software patents but so far, I haven't seen a legitimate objection to this one.
1. It is an "open standard" - just because you don't like the fact that Microsoft created it and then pushed it out to the standards bodies does not change the fact that those bodies accepted it; therefore, it is open. Are the only allowable "open" standards those that are created by someone other than Microsoft?
2. I did look at the timeline. The fact that the .docx format uses XML as its underlying format is a ruse in the patent dispute. The real focus of the patent is the concept of separating content from formatting. XML IS and WAS an open standard (developed in 1996-1997 and submitted to the W3C in early 1998), not a proprietary standard for i4i to patent for a particular use. Microsoft did not copy i4i's DTD; they just used XML when separating content and formatting.
You are missing the point of "open standards" -- they are not intended to be a cover for proprietary formats, with that cover then used to submarine other implementations. The fact that MS has a patent on the format means that they are free to modify terms of use at any time and even withdraw the technology from the standards at any time. Further, by submitting an already existing technology, they put themselves ahead of every other implementation. Because the format is patented, much current software cannot implement it because the "open format" which is not really "open" is incompatible with their licenses. Finally, you have no knowledge of how much of the implementation they did not release to the standards committee, and said implementation remains protected. If I reverse engineered the .docx implementation and reimplemented (accidentally or on purpose) some of their protected implementation, I would still be subject to suit.
You can't have it both ways. You can't have an "open" standard that is subject to manipulation by a patent holder. Ironical that you would defend this practice by MS while condemning a far less devious practice by some small Canadian company.
I understand your point, but I still disagree. Just because one company creates a standard and then offers it to the public standards committees does not make it a cover for a proprietary format; it just means that one company created it (as opposed to requiring that it be created by committee). And it also does not imply that the intent is to "submarine other implementations". In fact, there is no evidence that Microsoft has tried, is trying, or intends to try to submarine any other implementation.
Next, although it is patented, other companies are allowed to implement it because Microsoft explicitly offered it up to both ISO and ECMA. If you have a specific situation that you know of where an implementation is prohibited, I'd be interested to find out.
You also have no knowledge of exactly what aspect of the implementation was released or not released to the standards committees, so hypothesizing about an unproven protected component that would impact your implementation is unfair. On the other hand, there obviously is enough of the implementation that is not protected because a number of organizations have products that can read the .docx format - they're currently available today.
I don't know explicitly how the OOXML standard is subject to "manipulation by the patent holder" and I don't think you stated an explicit, non-hypothetical example either. The format as defined in the committees is open.
Finally, I don't condemn anyone for trying to create patents, but my beef with the i4i/Microsoft suit is that i4i is not claiming that it owns a specific file format or that it owns a specific algorithm. Rather, they claim that they own the concept of separating formatting from content (using XML). That's much too vague, and I don't believe it should stand up in court (and hopefully it will get overturned in a higher court). That's very different from what Microsoft did - invest money to create a specific format, implement it, patent the specific format, and then offer it up as is to standards committees for all to use for free - with no evidence (thus far) of any past, present, or future litigation to prevent the fair use of the accepted open standard.
(I'm sure, though, that this is an issue where we'll never agree, but the exchange has been interesting. Thanks!)
As I read the i4i patent, it does not cover separating formatting from content in the way that CSS separates formatting from content. In fact, in the opening, i4i distinguishes its invention from traditional stylesheets. Rather, as I understood it, 'content' is pure text with no markup. Formatting information is kept outside of the pure text by counting characters and recording in a separate file the markers that would otherwise appear in the content as markup. Word 2003 does something conceptually similar, but not the same. In Word 2003, in a single file, content includes 'structural' markup. The markup in the content includes identifiers (names). Those identifiers (names) reference style information elsewhere in the file. In Word 2007, the files are separated and combined into a .zip file, but all of the files (except images) contain markup, including, importantly, the file with content. You can't find a file that is pure text with no markup. The result is that I do not see how this infringes the i4i patent.
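To make the distinction in the comment above concrete, here is a rough Python sketch of the "count characters and record the markers separately" idea as that comment describes it. This is my own illustration with hypothetical function names and a simplistic tag pattern; it is not taken from the patent text and is not i4i's or Microsoft's implementation.

```python
import re

# Simplistic pattern for a tag such as <p> or </b>; an assumption for this
# sketch, not a real XML parser.
TAG = re.compile(r"<[^>]+>")


def separate(marked_up: str) -> tuple[str, list[tuple[int, str]]]:
    """Split marked-up text into pure text plus a list of (offset, tag).

    Each offset is a character position in the pure text where the tag
    would otherwise have appeared.
    """
    content_parts: list[str] = []
    tag_map: list[tuple[int, str]] = []
    pos = 0      # current position in the marked-up input
    offset = 0   # current position in the pure-text output
    for match in TAG.finditer(marked_up):
        text = marked_up[pos:match.start()]
        content_parts.append(text)
        offset += len(text)
        tag_map.append((offset, match.group()))
        pos = match.end()
    content_parts.append(marked_up[pos:])
    return "".join(content_parts), tag_map


def combine(content: str, tag_map: list[tuple[int, str]]) -> str:
    """Rebuild the marked-up text from pure text and the (offset, tag) list."""
    out: list[str] = []
    pos = 0
    for offset, tag in tag_map:
        out.append(content[pos:offset])
        out.append(tag)
        pos = offset
    out.append(content[pos:])
    return "".join(out)


if __name__ == "__main__":
    content, tag_map = separate("<p>Hello <b>world</b></p>")
    print(repr(content))   # the content holds no markup at all
    print(tag_map)
    print(combine(content, tag_map))
```

The point of the sketch is the contrast: here the content string contains no markup whatsoever, whereas in a Word 2007 package the content part is itself an XML file full of tags that merely refer to styles by name.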