E - Augment


Overview

Data augmentation (enrichment) typically means adding further valuable information to existing data. For example, you may have a lookup table with economic and demographic data that you want to add to your analysis in order to provide better statistical weighting. Or, for stock market data, you may obtain a list of current credit ratings and key financial data for the corporations you are tracking. We suggest using the functions already described in the previous sections.
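The lookup-table idea can be illustrated outside B4P as well. The following is a minimal Python sketch, not B4P code: the function name `enrich`, the column names and all values are hypothetical and chosen only for illustration. It copies selected columns from a lookup table into each analysis row and leaves them blank where the lookup table has no matching entry.

```python
# Hypothetical analysis rows and lookup table; all names and values are invented.
analysis = [
    {"region": "North", "sales": 120},
    {"region": "South", "sales": 80},
    {"region": "West", "sales": 50},
]
lookup = {
    "North": {"population": 1000, "median_income": 40000},
    "South": {"population": 500, "median_income": 35000},
}


def enrich(rows, lookup_table, key, new_columns):
    """Return copies of rows with new_columns filled in from lookup_table.

    Rows whose key has no lookup entry get blank strings in the new columns.
    The input rows are not modified.
    """
    enriched_rows = []
    for row in rows:
        extra = lookup_table.get(row[key], {})
        new_row = dict(row)  # copy, so the original row stays untouched
        for column in new_columns:
            new_row[column] = extra.get(column, "")
        enriched_rows.append(new_row)
    return enriched_rows


enriched = enrich(analysis, lookup, "region", ["population", "median_income"])
for row in enriched:
    print(row)
```

Leaving unmatched rows blank rather than dropping them keeps the row count unchanged, so gaps can be filled later from another source.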


Wikipedia Example (continued from step 4)

Enrich this table as follows:

  • Specify plausible surface areas for China and Denmark (i.e. excluding Greenland and disputed territorial claims).
  • Add the population density and the relative variation between the inhabitant counts from two different sources.

Simple Example

[c1:Country,{China,Denmark},Area] = {9597000,42952}; // km2  (Denmark without Greenland, and China without the South China Sea)

table insert columns            ( c1, { Inhabitants Variation, Inhabitants per km2 } );
table process selected rows     ( c1, [Inhabitants] == '', [Inhabitants] = [Population]);
table process                   ( c1, [Inhabitants per km2]   = [Inhabitants] / [Area];
                                      [Inhabitants Variation] = ([Inhabitants] - [Population])/[Inhabitants] );
echo( "Table C1: ");
table list                      ( c1, briefly, 4, last col, 2 ); // List just 3 columns and first and last 4 rows

Enrichment done.
Table C1:
    0 : Country                  | Area    | Inhabitants
    1 : Afghanistan              | 652230  | 41100000   
    2 : Egypt                    | 1001450 | 103500000  
    3 : Albania                  | 28748   | 2800000    
    4 : Algeria                  | 2381741 | 44900000   
  ... :
  194 : Central African Republic | 622984  | 5600000    
  195 : Cyprus                   | 9251    | 1300000    
  196 : China                    | 9597000 | 1422584933
  197 : Denmark                  | 42952   | 5948136    
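The arithmetic performed by the three processing steps in the listing can be sketched in Python for readers who want to check the numbers by hand. This is an illustrative analogue, not B4P code: the function name `add_derived_columns` is hypothetical, the Population figure for Denmark is an assumed second-source value (the listing does not show that column), and "Exampleland" is an invented row that demonstrates the fallback for blank entries.

```python
def add_derived_columns(rows):
    """Fill blank inhabitant counts, then add density and relative variation."""
    for row in rows:
        # Fallback: where the first source has no figure, reuse the second one.
        if row["Inhabitants"] == "":
            row["Inhabitants"] = row["Population"]
        # Population density in inhabitants per square kilometre.
        row["Inhabitants per km2"] = row["Inhabitants"] / row["Area"]
        # Relative difference between the two sources.
        row["Inhabitants Variation"] = (
            row["Inhabitants"] - row["Population"]
        ) / row["Inhabitants"]
    return rows


rows = [
    # Area and Inhabitants as listed above; Population is an assumed value.
    {"Country": "Denmark", "Area": 42952,
     "Inhabitants": 5948136, "Population": 5900000},
    # Invented row with a blank Inhabitants entry to show the fallback.
    {"Country": "Exampleland", "Area": 1000,
     "Inhabitants": "", "Population": 250000},
]

for row in add_derived_columns(rows):
    print(row["Country"],
          round(row["Inhabitants per km2"], 2),
          round(row["Inhabitants Variation"], 4))
```

Rows that receive their inhabitant count through the fallback necessarily end up with a variation of zero, since both figures are then identical. The sketch does not guard against an area or inhabitant count of zero, which would raise a division error.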

Try it yourself: Open TAB_Features_Enrichment.b4p in B4P_Examples.zip. Decompress before use.