So I want some form of migration. In the example above, this is known as "Save as"!
However, I know that every time I do migration I introduce some sort of errors. So if I migrate from those PowerPoint 4 files to today's PowerPoint, and then from today's to tomorrow's PowerPoint, and then from tomorrow's to the next great thing, I will introduce cumulative errors whose impact I will only be able to assess at some horribly cringe-making moment, like in the middle of a presentation using a host's machine. So the best way to do migration is to start from the original file and migrate to today's version. Always. It's nuts for Microsoft to drop old file format support from its software (at least from this pint of view).
This approach of migrating from original version to today's version is called Migration on Request, and was described in a paper by Mellor, Wheatley and Sergeant back in 2002 (I referred to it earlier), but the idea hasn't caught on much. They had some other great ideas, like writing the migration tool in a specially portable version of C with all the nasty bits removed, called C--.
I have wondered from time to time however, for that class of documents we call Office Documents (word processing, spreadsheets, presentations), whether tacking onto an open source project which has a strong developer community might be a better approach. Something like OpenOffice. I'm not sure how many file formats this already supports (always growing, I guess, but Chapter 3 of the "Getting Started" documentation lists the following:
Microsoft Word 6.0/95/97/2000/XP) (.doc and .dot)... which is not a bad list (just the word processing bit, too)... and maybe extended in more up to date versions. For interest their FAQs have a question "Why does OpenOffice not support the file format my application uses?"
Microsoft Word 2003 XML (.xml)
Microsoft WinWord 5 (.doc)
StarWriter formats (.sdw, .sgl, and .vor)
AportisDoc (Palm) (.pdb)
Pocket Word (.psw)
WordPerfect Document (.wpd)
WPS 2000/Office 1.0 (.wps)
Ichitaro 8/9/10/11 (.jtd and .jtt)
Hangul WP 97 (.hwp)
.rtf, .txt, and .csv
"There may be several reasons, for example:Making legacy file formats more open was the subject of my previous post, and I guess we have to wait and see. But there are plenty of legacy word processing formats not on that list (Samna, for example, later to evolve into Lotus Word Pro, as well as formats for obsolete computers like the Atari, such as the German word processor SIGNUM, supposedly very good for mathematical formulae). What about earlier version of MS Word? Wikipedia lists a bunch of word processors; there must be many documents in obscure locations in these formats.
- The file formats may not be open and available.
- There may not be enough developers available to do the work (either paid or volunteer).
- There may not be enough interest in it.
- There may be reasonable, available workarounds."
With a concerted effort, we could gradually build OpenOffice input filters for these obsolete document types, thus brining them into the preservable digital world. And this is an effort that could bring in that extraordinary community of enthusiasts who do so much to build document converters and other kinds of software, so much ignored by the digital preservation community!