Solutions for Excel, CSV, Leading Zeros, and related problems

This seems to be a frequent topic in MV forums so I’m posting notes here. In short, people want to use CSV to import product IDs or other text data into Excel. Sure, you can do it.

I’ve worked with Excel for many years and have written several interfaces between MV and Excel including NebulaXLite, NebulAnalysis, and (upcoming) NebulaXChange.

One thing that needs to be understood is that CSV isn’t even a defined standard, it’s a convention and subject to interpretation in various implementations. This explains part of the reason why “C” SV implies commas but you can as easily use tabs, and different software interprets quotes and other delimiters differently. CSV is a data transport mechanism, not a data representation mechanism, so the desire and assumption that Excel will interpret the data as we wish is a faulty premise on which to begin. When data is surrounded in quotes in a CSV file, that’s not an indication that it’s a Text field, it’s just a different way for Excel to recognize delimited fields. The bottom line is that Excel will always interpret fields that are imported with a simple click on an associated file.

To get more control over data that you didn’t create, you need to use the Text Import Wizard where you can specify that specific columns are to be treated as Text versus numbers or dates. Yes, it’s an extra step and a real pain.

Some people may be familiar with using a single quote to denote text fields. If you are in Excel and you want to enter a code into a cell like
   00123
you know it will get immediately translated to 123. However, if you prefix the data as
   ‘00123
then the single cell will get converted to Text on the fly and the data will be shown with the leading zeros. The same holds true if you enter other text that may get translated into a numeric or date type. Unfortunately that trick doesn’t work when importing data from CSV. You can’t say this:
   "‘00123","‘00456"

There is one other devious option, which is to use formulas. In your CSV file, try this:
   123,="00123","12E3",="12E3"
That gets imported as:
    123   00123   1.20E+04   12E3
That data can be used anywhere in your spreadsheet. If you don’t like the inconsistency of having a field begin with an equal sign rather than a quote, go head and quote that field with standard "escaping":
   "=""00123"""
Just remember that when you import a number like this, you’re explicitly telling Excel that you want a cell to be a text field and not a number. If in some other cell you wish to operate on the data as a number, you need to be explicit there as well. Given the data above as row 1, put this in line 2 of a test CSV:
   ,=VALUE(B1)
You’ll see that the text 00123 is now a number 123 and can be used in calculations.

I don’t believe there is a way to override the default behavior that Excel uses when opening a CSV file that has been associated with Excel (click on it and Excel opens it by default. However another option is to take more control over Excel. If someone really wants Excel to Import CSV without interpreting the data, or to interpret the data in a way other than the default, I can write an Excel add-in that will do the job. You would just need to open Excel and go to File>Open CSV rather than going through the Text Import Wizard or customizing your CSV. Of course anyone who wanted this functionality would need to install the plugin.

Finally, and you can probably do this yourself – rather than associating Excel with CSV files, create a new program using your favorite language, and associate CSV files with that program. Your program should open a CSV file on demand, re-write all 00123 references as ="00123", re-file the CSV, then open the file with Excel.

For you Pick/MV folks, my recommendation would be to either modify your BASIC code to use the ="00123" format, or to just write your ,00123,00456 data as normal, and then execute a special post-processor on generated CSV files which prep that data for Excel. Remember that there is a difference between CSV files which are a general purpose transport, and Excel files which can include formulas.

I would be remiss if I didn’t mention that NebulaXLite not only allows you to get around all of these issues but it also allows you to create stunningly attractive Excel workbooks in the process. Don’t just give someone a CSV when they ask for Excel, give them a real Excel workbook. And remember: documents created with NebulaXLite can also be opened in Open Office and Google Docs, and you don’t need to do a single thing differently. NebulaXLite is completely free for all developers, even end-users in development mode, and all we ask is a one-time $200 purchase of a permanent license when reports actually go production. No other purchases or commitments are required. Download and install the free developer/evaluation account from the product page.

Leave a Reply