C - Validate

Prev Next

Overview

In the third step, consider further checks on the tables, e.g. that all required headers are available, the data inside the table is consistent, etc. The functions listed below are some commonly used ones:

  • table check headers(): B4P checks if all headers specified here exist in the table. Various checking options exist, e.g. you can also check that no additional column headers than those specified are in the table, or even check for complying a certain sequence. You can also identify repeated header names.
  • table keep columns(): Only the specified columns will be kept. The others will be discarded.
  • table describe(): B4P checks the contents in selected columns and provides description. E.g. you can easily identify if a particular column contains telephone numbers, simple names, dates, and not something else.
  • table check duplicates() checks repeated entries in case redundant entries need to be removed or consolidated.
  • table fill vertically() can be used to replicate contents in rows below if they are blank, assuming the same contents as above are assumed.
  • table rename headers() lets you rename column headers in order to simplify or harmonize header names across multiple tables.
  • table lookup() and related functions allows you to look up data from other tables.

If the loaded table does actually contain a collection of mutiple sub-tables, then consider the fuctions desribed in the section Explore and filter and extract sub-tables.

During the validation process, consider further preparation steps on the table.


Wikipedia Example (continued from step 2)

Do some validation on this table:

  • Check for header names and keep those columns. Discard the others.
  • Some rows contain '...' because these country names are interpreted as regions belonging to other countries. Example: 'Akrotiri and Dhekelia – See United Kingdom, The'

Simple Example
  table keep columns             ( countries, { Country name, 'Alpha-2 code', 'Alpha-3 code' } );
  table delete selected rows     ( countries, ['Alpha-2 code']=='...' );
  table delete columns           ( countries,  'Alpha-3 code' );

  // Check if the 2-character country code is always consisting of 2 characters.
  table process selected rows    ( countries, ['Alpha-2 code']{}!=2, echo("The country code ", ['Alpha-2 code'], " may be incorrect" ) );

  // Some country names need to be harmonized with a language list to combine, e.g. 'Holy See' and 'Vatican City'.

  table initialize               ( country name alignment,
  { { Country code, Aligned country name },
    { KR, South Korea },                           { KP, North Korea },
    { US, United States },                         { VN, Vietnam },
    { BN, Brunei },                                { CV, Cape Verde },
    { CZ, Czech Republic },                        { LA, Laos },
    { RU, Russia },                                { VA, Vatican City },
    { SY, Syria },                                 { CC, 'Cocos (Keeling) Islands' },
    { CD, 'Democratic Republic of the Congo' },    { CG, 'Republic of the Congo' },
    { TL, East Timor },                            { CI, Ivory Coast },
    { FM, 'Federated States of Micronesia' },      { MM, 'Myanmar (Burma)' },
    { PS, Palestine },                             { TZ, Tanzania },
    { ST, Sao Tome and Principe },                 { SO, Somaliland } } );

  table lookup                   ( countries, 'Alpha-2 code', Country name, country name alignment, Country code, Aligned country name );

  table list( countries, briefly, 5  ); // List first and last 5 rows and 2 columns only
The table has been validated.
Row #: Code Text _______________________________________________________
    1:   include ( Support Library );
    2:   table load                     ( countries, temp.csv); // Hide!  Step 3 begins - Semantic alignment
    3:   table keep columns             ( countries, { Country name, 'Alpha-2 code', 'Alpha-3 code' } );
                                                                                                      ^
Function call: table keep columns(countries,{'Country name','Alpha-2 code','Alpha-3 code'})
Table headers: ISO 3166[1] name[5],World Factbook[6] official state name[a],Sovereignty [6][7][8],ISO 3166-1[2],ISO 3166-1[2] (1),ISO 3166-1[2] (2),ISO 3166-2[3] subdivision codes link,TLD [9]

2nd function parameter (columns to keep) = {'Country name','Alpha-2 code','Alpha-3 code'}:

Header name 'Country name' not found.
Specified table column name or number is invalid.
Code execution will stop.  Going to interactive mode
-------------------------------------------------------------------------------
Type 'help' for help, 'docs' or 'toc' for online B4P documentation.
>>
Try it yourself: Open GUI_Features_Validate_and_Explore.b4p in B4P_Examples.zip. Decompress before use.