CSV File Format

Prev Next

CSV File Format

The CSV (Comma Separated Values) is a simple structured text file format existing since the nostalgic 1970's along with the introduction of FORTRAN 77 and is supported by Beyond4P, along with HTML/MHTML, XML and JSON.

THe biggest drawback is the ignorance towards country specific locales. CSV generated from Excel files use symbols such as

  • List separator (could be commas, semicolons, tab stops, etc.)
  • Decimal point (as used in US, GB, CH) or decimal comma (as used in DE)
  • Thousand separator (various symbols and blanks exist)

The files can be loaded with and without specifying the list separators. If no list separator is specified, then Beyond4P tries to automatically detect the applicable separators by checking for commas, semicolons and tabs. Regarding decimal symbols, the standard function table load() will check for numerals in text and CSV files (but not HMTL files and likes) with decimal commas and convert them to decimal points. The new function table load unchanged() will not do the conversion. You may need this if you need to discriminate between actual numbers and other numeric data separated with commas but not considered as regular numeric data.
Consider using the clean num() function to prepare numbers containing commas (and possibly more, such as thousand separators, currency symbols, etc.).
Consider using table save with local decimal separator() to save CSV files using the decimal separator symbol applicable with your country settings (comma or point).
Consider using table save with decimal comma() to make sure all numbers use decimal commas.
The original function table save() saves numbers with decimal points, regardless of the applicable country settings.

Notice on files with fixed column widths: The best is to load these files with "new line" as separator symbol so the table contains one item per row. Then create additional columns with dedicated data items using the left() / middle() / right() functions where you can specify numeric column positions, and remove redundant white space symbos with the trim() function.