People are always looking for "PDF" or "Excel". But what do they really want? This blog entry is essentially a copy of a recent posting I made to comp.databases.pick (with new material added at the bottom). I reproduced it here because the exact same question, how to create a PDF file, came up a day later in the jBase forum. This is a real FAQ and I’ve discussed the topic here before, but it seems it’s time to revisit it.
When someone says they want PDF, there are two concepts in play:
1) They want output in the PDF medium.
2) They may want the special formatting and other features available in the PDF medium.
It’s like when people say they want "Excel" but all they really want is a delimited file that can be opened in Excel. There’s a big difference. A Pick developer usually thinks CSV when they get a request for Excel, but end-users who use Excel really want colors and borders and special fonts and many other features provided by that feature-rich product.
[AD] For this I created NebulaXLite which also works for OpenOffice and Google Spreadsheets.
For "PDF", if you just want the output format, I recommend PDF Creator, which is what I use for sending all output as-is to a .pdf file. There are many other tools like it, of varying quality and price. But if you want formatting per #2 above, you will want to be able to generate formatting codes from BASIC. Many people choose PostScript to do this, but the code is virtually unmaintainable 5 minutes after it’s written. I used to use HtmlDoc, written by Michael Sweet, the same guy who wrote CUPS. It takes pure HTML (easily generated from BASIC) and converts it to PDF. It was the best of its type as far as freeware goes (IMHO), but it was never fully compliant or current with standards: another one of those great but generally dead FOSS projects.
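To make "pure HTML, easily generated from code" concrete, here is a minimal sketch in Python rather than BASIC (the logic ports directly). The function name, report title, and sample data are all hypothetical, and the HTML-to-PDF conversion step itself is not shown: the assumption is only that whatever converter you use accepts a plain HTML file like this one.

```python
from html import escape


def build_html_report(title, headers, rows):
    """Return a small HTML document containing a heading and one table.

    Hypothetical helper for illustration; not part of any product named here.
    """
    head_cells = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    lines = [
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
        '<table border="1">',
        f"<tr>{head_cells}</tr>",
    ]
    for row in rows:
        # Escape every value so characters like & and < cannot break the markup.
        cells = "".join(f"<td>{escape(str(v))}</td>" for v in row)
        lines.append(f"<tr>{cells}</tr>")
    lines += ["</table>", "</body>", "</html>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample data is made up; in practice the rows would come from your files.
    print(build_html_report(
        "Open Invoices",
        ["Customer", "Amount"],
        [["Smith & Sons", "125.00"], ["Acme", "80.50"]],
    ))
```

The point is that the formatting lives in readable tags that anyone with a little HTML exposure can follow and change later, which is exactly what PostScript does not give you.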
So after trying many others I took a close look at PrintWizard and got hooked. Just look at the BASIC to PDF demos I created. We don’t sell it and I don’t get commissions on referrals. I just advocate its use because I think it’s good software.
If you’d like more info on PrintWizard, check out this other blog entry I wrote, and see the contact info at the bottom of our PrintWizard page (link in the last paragraph). Also see my update to PickWiki on the topic, which (no surprise) contains a reference back here.
New stuff not posted to CDP…
Let’s step back a minute. I keep talking about using products to solve the problem. The real focus is on what the products do for us. When you’re generating attractively formatted output, you need to put formatting code somewhere. You’re not going to turn your greenscreen UI into a dynamic browser interface with no code changes – the @(x,y) cursor addressing won’t work. (Well, I actually did write a utility to do that, but the project was shelved and I’d rather not get into that here.) When you’re creating Excel or PDF, you can’t simply use PRINT to output text and suddenly get nice columns, colors, separator lines, images, and all of those other features.
So for formatting, my personal choice is to use common syntax that’s familiar to the widest possible audience, which in most cases is XML. It’s not important whether it’s strict HTML or something similar but slightly different. A "markup language" defines a view for a specific user interface. There are many XML dialects for all sorts of user interfaces (mentioned in my upcoming blog discussing "Balsamiq"). So if you have the slightest exposure to HTML, you’ll understand immediately how to generate formatted documents using XML syntax. With reference to the above products, NebulaXLite renders spreadsheets as SpreadsheetML, a well-documented schema that is 100% compatible with Excel, OpenOffice, and Google Spreadsheets. Even without knowing the details, someone can look at the XML generated from NebulaXLite, understand to some extent what it’s doing, and even modify that text file with code. PrintWizard uses PWML (need I define that?), which is really an extended HTML that you can easily generate from code. This is not the case with PostScript, which consists of sequences of code that mean nothing without a manual in front of you, and the rendered document is unmaintainable after it’s produced.
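For a feel of how readable this kind of markup is, here is a minimal hand-rolled sketch that writes a tiny SpreadsheetML (Excel 2003 XML) workbook with a bold, shaded header row. It is written in Python for illustration and is not NebulaXLite's code or its actual output; the function name, the style ID "hdr", and the sample data are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def build_spreadsheetml(sheet_name, headers, rows):
    """Return a one-sheet SpreadsheetML document as a string.

    Hypothetical helper for illustration only.
    """
    def cell(value, style=None):
        # Numbers get Type="Number" so the spreadsheet can total them;
        # everything else is written as a string.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        kind = "Number" if is_number else "String"
        style_attr = f' ss:StyleID="{style}"' if style else ""
        return (f'<Cell{style_attr}><Data ss:Type="{kind}">'
                f'{escape(str(value))}</Data></Cell>')

    lines = [
        '<?xml version="1.0"?>',
        '<?mso-application progid="Excel.Sheet"?>',
        '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"',
        '          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">',
        '<Styles>',
        # "hdr" is an arbitrary style ID chosen for this sketch.
        '<Style ss:ID="hdr"><Font ss:Bold="1"/>'
        '<Interior ss:Color="#FFFF99" ss:Pattern="Solid"/></Style>',
        '</Styles>',
        f'<Worksheet ss:Name={quoteattr(sheet_name)}>',
        '<Table>',
        '<Row>' + ''.join(cell(h, "hdr") for h in headers) + '</Row>',
    ]
    for row in rows:
        lines.append('<Row>' + ''.join(cell(v) for v in row) + '</Row>')
    lines += ['</Table>', '</Worksheet>', '</Workbook>']
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample data is made up for the example.
    print(build_spreadsheetml(
        "Invoices",
        ["Customer", "Amount"],
        [["Smith & Sons", 125.0], ["Acme", 80.5]],
    ))
```

Save the result with an .xml extension and open it in a spreadsheet program. Even someone who has never seen the schema can guess what Row, Cell, and Font ss:Bold are doing, which is the whole argument for markup over opaque formatting codes.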
I don’t care what product you use to generate Excel or PDF, but be kind to yourself and use easy-to-create, easy-to-read formatting that allows for rich results. Your code will be easier to maintain later, reducing your development expenses and justifying the purchase of any low-cost product like those mentioned here.