by Vern Sheridan Poythress
updated, Mar. 7, 2008
I believe that moral principles affect our use of digital file formats. To see how, we first need a brief explanation of what file formats do.
The nature of file formats
Programs such as word processors (like Microsoft Word, Microsoft WordPad, Corel WordPerfect, and OpenOffice Writer) or audio editors (like Audacity) produce files that store the author’s work on a permanent medium (a hard disk or a CD-ROM). At the lowest level, the stored file is a long series of “bits,” symbolized by 0s and 1s. To understand the file contents, one must have a computer program that interprets the bits. The interpreting program must know what the bits mean, and that meaning is fixed by a specific “format,” which defines the correspondence between the bits and the content they represent. For text, the format specifies how characters and their formatting are represented in bits. For graphic objects (pictures, diagrams, etc.), the format specifies how the graphic object is represented in bits. For audio files, the format specifies the relation between sounds and the bits.
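To make this concrete, here is a minimal sketch in Python. The four bytes and the three "formats" applied to them are illustrative assumptions, not taken from any real office or audio program. The point is only that the same bits yield text, a number, or sound samples depending on the format the reading program assumes.

```python
import struct

# Four arbitrary bytes standing in for a tiny "file" (illustrative data only).
raw = bytes([0x48, 0x69, 0x21, 0x0A])


def as_bits(data: bytes) -> str:
    """Show the underlying 0s and 1s, eight bits per byte."""
    return " ".join(f"{byte:08b}" for byte in data)


def as_text(data: bytes) -> str:
    """Interpret the bytes as ASCII characters."""
    return data.decode("ascii")


def as_integer(data: bytes) -> int:
    """Interpret the same bytes as one little-endian 32-bit unsigned integer."""
    return struct.unpack("<I", data)[0]


def as_samples(data: bytes) -> tuple:
    """Interpret the same bytes as two little-endian 16-bit signed audio samples."""
    return struct.unpack("<2h", data)


if __name__ == "__main__":
    print("bits:   ", as_bits(raw))
    print("text:   ", repr(as_text(raw)))
    print("integer:", as_integer(raw))
    print("samples:", as_samples(raw))
```

None of the three readings is more "correct" than the others. Only knowledge of the intended format tells a program which one the author meant.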
The importance of file formats
The issue of formats is important because not all formats are publicly disclosed, and not all are free from royalties or patent encumbrances. A person or a corporation that has exclusive control over a particular format can use that control to pressure people into using its product exclusively in order to access the materials that are recorded in its format.
Some formats, like HTML (.html), the primary format for web pages, are publicly specified, royalty free, and unrestricted. Others are not. Among these latter are most of the formats used by Microsoft Corporation’s programs.
Microsoft Corporation has produced several versions of its suite of office programs. The office programs include Microsoft Word (for text writing), Microsoft PowerPoint (for presentations), and Microsoft Excel (for spreadsheets). The files produced by these programs have names that include an “extension” identifying the program from which the files derive. Thus files with the extension “.doc” have been produced in Microsoft Word. Files with the extension “.ppt” have been produced in Microsoft PowerPoint. Various versions of all these programs have appeared in MS Office 6.0, 95, 97, 2000, XP, 2003, and 2007 (see note 1). Office versions 97, 2000, and XP share a common format (as far as I know), but the other versions differ in native format for Word documents, and sometimes for other types of document. The shifts in format put pressure on users to buy the latest version of Microsoft Office, because otherwise they may not be able (or may be able only with difficulty) to read and revise a file produced by someone else with a more recent version of the program.
In addition, Microsoft Corporation has never made public the formatting used in Office 6.0, 95, or 97/2000/XP. The formats are secret. Consequently, it has been difficult for anyone using another program to read these files correctly. This fact puts pressure on everyone who receives a secret-format document to buy from Microsoft in order to be able to read and revise the files that someone else sends.
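An extension is only part of a file's name; the format itself lives in the bits. The following Python sketch illustrates the difference by looking at the first bytes of a file. The function name `sniff_container` and its labels are hypothetical, and the byte signatures come from general knowledge of common container formats rather than from this article. Recognizing the container is also far from understanding the document: for the secret formats, the meaning of everything after the signature is exactly what remains undisclosed.

```python
# Well-known leading byte signatures (an assumption of this sketch, not part
# of the article): OLE2 compound files, ZIP archives, and XML text.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # container of legacy binary Office files
ZIP_MAGIC = b"PK\x03\x04"                       # container of OpenDocument and Office 2007 files
XML_PROLOG = b"<?xml"                           # plain XML text such as Office 2003 XML


def sniff_container(data: bytes) -> str:
    """Guess the outer container of a file from its first bytes (hypothetical helper)."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    # Skip an optional UTF-8 byte-order mark and leading whitespace before the prolog.
    if data.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(XML_PROLOG):
        return "xml"
    return "unknown"


if __name__ == "__main__":
    # Renaming a file changes its extension but not these leading bytes.
    print(sniff_container(OLE2_MAGIC + b"\x00" * 8))
    print(sniff_container(ZIP_MAGIC + b"rest of archive"))
    print(sniff_container(b'<?xml version="1.0"?><doc/>'))
```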
An ethical issue
When we share files as email attachments or post files at a website, the format becomes an ethical issue. Few people in the Third World have sufficient wealth to afford Microsoft Office easily. Still less can they afford to keep buying multiple versions of Microsoft products as the formats change. Sharing files in secret formats effectively excludes these people from the information process, or else makes them pay a “tax” to Microsoft for obtaining information that should be freely available. Moreover, even outside the Third World, among wealthier nations, some people do not wish to support Microsoft Corporation, because they think it is arrogant and prone to use monopolistic practices. It is not courteous to send people files in a secret format that implies that they should support Microsoft.
Gradually, through hundreds of hours of work, programmers outside of Microsoft have decoded large parts of the secret formats. That has enabled programs like OpenOffice to read from these secret formats and write to them. But because of the secrecy, the exchange between formats is still not absolutely perfect. Pressure is therefore still in place to buy Microsoft products in order to access the secret formats.
In Microsoft Office 2003 there is a new “.xml” format available for Word and for Excel. (There is no new format for PowerPoint.) This format is easier for other programs to understand. Office 2007 has similar, but not identical, formats.
Office 2007 finally has publicly specified formats for most of its pieces. Moreover, Microsoft has posted on the internet a promise concerning open use of the Office 2007 formats.
But the future for Microsoft formats is still under discussion. Here is one evaluation that is less than encouraging:
In other words, even though the MS XMLRS [the new specification for Office 2007] may be fully unencumbered through patent grants and a covenant not to sue, a number of the features and functions that the MS Office applications implement remain proprietary, private, and are not available for implementation by other developers.
The litmus test to apply is whether, even in theory, a competitor could develop an application that implements the entire set of features and functionality represented in the current MS binary format or MS XMLRS, in a platform independent manner and without infringing on MS intellectual property. We believe such an implementation is not possible, thus necessarily limiting the fidelity of MS binary to ODF conversion. (from Andy Updegrove, <http://www.consortiuminfo.org/standardsblog/article.php?story=20060621194845935>, quoting from Sun Microsystems)
The situation continues to change. Eventually, open programs for translation between Microsoft Office 2007 formats and other formats may be available. But they will necessarily be incomplete, because not everything is publicly specified in the new Microsoft formats. Moreover, as of Mar. 7, 2008, Microsoft still talks about “licensing fees” for commercial use of its specifications, which by itself makes them unacceptable as a basis for free exchange of information.
I have decided to wait to see what works out with respect to the Microsoft formats. I have been waiting since 2005, and the formats are still encumbered. Meanwhile, the international standards body OASIS in 2005 officially approved the “open document” formats, and they have become an ISO standard. These formats have no encumbrances. Moreover, they can be read by the OpenOffice program, which is available for free. The program is guaranteed to remain permanently free because the code is freely available and is freely modifiable under a generous license. For further discussion, see the Wikipedia article on OpenDocument, and its subsection under “Licensing.”
OpenOffice will convert files to Microsoft formats, if one so desires. But the conversion is not perfect, for the reasons already indicated. The conversion of text files to .doc is very good. The conversion of presentation files to .ppt occasionally fails to carry over animations.
For the sake of freedom of access, I am making my documents available in OpenDocument formats. Their status and the openness of the formats are clear. Moreover, the formats are already in use with more than one office suite and with other programs.
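Because the OpenDocument formats are publicly specified, a reader does not need any particular office suite to get at the content. As a minimal sketch, the Python code below builds a toy package that mimics the layout of an OpenDocument text file (a ZIP archive holding a `mimetype` entry and a `content.xml` entry) and then reads the paragraphs back using only the standard library. The helper names are hypothetical, and the toy package is an illustration under these assumptions, not a complete or validated .odt file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by OpenDocument for document structure and text.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def build_toy_package(paragraphs):
    """Build a toy ODF-like ZIP package in memory (hypothetical helper)."""
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for paragraph in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = paragraph
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry is written first and left uncompressed.
        archive.writestr("mimetype", ODT_MIMETYPE)
        archive.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
    return buffer.getvalue()


def read_paragraphs(package_bytes):
    """Return the declared mimetype and the paragraph texts of a package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        mimetype = archive.read("mimetype").decode("ascii")
        root = ET.fromstring(archive.read("content.xml"))
    paragraphs = ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
    return mimetype, paragraphs


if __name__ == "__main__":
    package = build_toy_package(["First paragraph.", "Second paragraph."])
    print(read_paragraphs(package))
```

The same `read_paragraphs` function could be pointed at the bytes of a real .odt file, although real documents contain nested elements and styles that this sketch does not handle.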
All of these issues have ties with the larger issues of laws concerning copyrights and patents. As long as laws exist restricting the use of copyrighted documents and patented ideas, Christians should obey the laws, according to Romans 13. But one must also consider whether the laws ought to be changed, and whether they are in conformity with biblical standards. I think they need radical rethinking, as argued by John Frame in a posted article. I have myself explored the ethics of copying in an article.
Note 1: One should note that the distinct new formats for MS Office 2003 have a different extension (.xml), and that the Office 2007 formats have new extensions of their own (.docx, .xlsx, and .pptx).
Copyright (c) 2005, 2006, 2007, 2008 by Vern Sheridan Poythress.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license can be found at the Free Software Foundation website.