table histogram

Prev Next

Function Names

table histogram

Description

This function creates a 1- or 2-dimensional histogram from a sequence of input values located in 1 or 2 columns in the source table. A histogram is a visual representation of the distribution of the values across a defined set of intervals, also known as bins. This function allows to define intervals freely, even with different widths

For 1-dimensional histograms, one column needs to be specified for the input data. In addition, the intervals need to be specified in a set.. A total of n+2 bins will be created for n interval values. The first bin covers the values below the first interval, and the last bin counts all non-numeric and invalid values, for example blanks.

For 2-dimensional histograms, a 2nd data column and a set containing two sets of intervals needs to be provided..

For a 1-dimensional histogram, the new table contains two columns labeled with header names "Intervals" and "Count".

For a 2-dimensional histogram, the top left cell is labeld "Intervals". The first set of intervals is listed below. The second set of intervals is listed to the right.

By default, data values are counted in the first interval bin where the value is greater than or equal to the interval value. Example: Intervals list 10, 20, ... 100. If the value is 20, then it will be counted in the bin greater than or equal to 20.

You can also specify the comparison operators directly in the 4th function parameter:

  • >= Default. The first bin is labeled "Below".
  • > Comparison is based on 'greater than'. In the example above, the value 20 would be counted in the interval bin 10. The first bin is labeled "Below".
  • < Comparison is based on 'less than'. In the above example, the value 20 would be counted in the interval 'less than' 30. The last bin is labeled "Above".
  • <= Comparison is based on 'less than or equal to'. In the above example, teh value would be counted in the interval 20. The last bin is labeled "Above".

All non-numeric values, including blanks, are counted in the last bin called "Others".

Call as: procedure

Restrictions

Indirect parameter passing is disabled

Parameter count

2

Parameters

No.TypeDescription
1.
input
string Name of existing table

2.
input
string Name of new table for the histogram

3.
input
table columns Columns with input data

See table columns as function parameters for general ruling for this parameter.
Specific rules apply for this function:

  • Either 1 or 2 header names or column numbers may be specified



The data in the specified columns may contain both numeric and non-numeric contents, whereas the numeric contents are coounted in the respective histogram bins and non-numeric contents counted in the bin for invalid contents.

4.
input
set Intervals

The interval values define the distribution of the input values into the different bins. For n intervals provided, n+2 bins will be provided. The first bin covers all values below the first interval. The next bins are for the values above the intervals. And the last bin is used to count all non-numeric values, for example blanks and text. All interval values must be numeric (For example dates are allowed) and in ascending order. Violations result in error messages and execution will stop.
Valid examples:

  • { 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100 }
  • { -50 .. -40 .. +50 }
  • { 0.1, 1, 10, 100, 1000 }
  • {{ 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100 }, { 0.1, 1, 10, 100, 1000 }} (Valid for a 2-dimensional histogram)

Invalid examples:

  • { 10, 20, 30, 25, 40 } (Not all values are ascending)
  • { below, 1..10, above } (Texts are not allowed. First bin will cover the 'below' values anyway)
  • { 10, 20, 30, 'n/a' } (The last bin will cover all non-numeric values, so do no specify this last 'n/a' entry)

Opt. 5.
input
string Comparison Operator

Following operators are allowed: '>=', '>', '<=', '<'. See main description further up for details.

Default value: >=

Exceptions

Multiple sets of input values contain different number of values
Interval values are not numeric or in ascending order
Invalid comparison choice
No interval values specified
Number of input value sets differe from number of interval values (histogram dimensions)

Examples


table initialize     ( source, {{ Data 1, Data 2}} );

[source:Data 1,..]   = { 1, -2, 3,  b,  c, '',3.5, 4.5, 4.5,  5, 5.5, 5.5, 6, 7, 8, 12 };
[source:Data 2,..]   = { 5,  9, 0, 10, 11, 20, 10,  20,  20, 20, 20,  xyz,  3, 2, 1,  5 };

bins 1[]             = { 0..5 };
bins 2[]             = { 0,10,20 };


echo ("Demonstrate a 1-dimensional histogram with default '>= operator and");

table histogram                ( source, histogram 1, Data 1, bins 1[] );
table process selected rows    ( histogram 1, is numeric([Intervals]), [Intervals] = ">= " + str([Intervals], "0.0" ) );

echo ("Demonstrate a 1-dimensional histogram with '<' operator: ");

table histogram                ( source, histogram 2, Data 1, bins 1[], "<" );
table process selected rows    ( histogram 2, is numeric([Intervals]), [Intervals] = "< " + str([Intervals], "0.0" ) );
table arrange with spacing     ( histogram 1, histogram 2 );
table list                     ( histogram 1);

echo ("Demonstrate a 1-dimensional histogram with '>' operator and");

table histogram                ( source, histogram 1, Data 1, bins 1[], ">" );
table process selected rows    ( histogram 1, is numeric([Intervals]), [Intervals] = "> " + str([Intervals], "0.0" ) );

echo ("Demonstrate a 1-dimensional histogram with '<=' operator: ");

table histogram                ( source, histogram 2, Data 1, bins 1[], "<=" );
table process selected rows    ( histogram 2, is numeric([Intervals]), [Intervals] = "<= " + str([Intervals], "0.0" ) );
table arrange with spacing     ( histogram 1, histogram 2 );
table list                     ( histogram 1);



echo ("Demonstrate a 2-dimensional histogram: ");

table histogram                ( source, histogram, {Data 1, Data 2}, { bins 1[], bins 2[] } );
table process selected rows    ( histogram, is numeric([Intervals]), [Intervals] = ">= " + str([Intervals], "0.0" ) );
table process selected columns ( histogram, 0, is numeric([.]), [.] = ">= " + str([.], "0.0" ) );
table list                     ( histogram );


echo ("Demonstrate a 2-dimensional histogram with '<' operator: ");

table histogram                ( source, histogram, {Data 1, Data 2}, { bins 1[], bins 2[] }, "<" );
table process selected rows    ( histogram, is numeric([Intervals]), [Intervals] = "< " + str([Intervals], "0.0" ) );
table process selected columns ( histogram, 0, is numeric([.]), [.] = "< " + str([.], "0.0" ) );
table list                     ( histogram );
               

Output

Demonstrate a 1-dimensional histogram with default '>= operator and
Demonstrate a 1-dimensional histogram with '<' operator:
    0 : Intervals | Count |  | Intervals | Count
    1 : Below     | 1     |  | < 0.0     | 1    
    2 : >= 0.0    | 0     |  | < 1.0     | 0    
    3 : >= 1.0    | 1     |  | < 2.0     | 1    
    4 : >= 2.0    | 0     |  | < 3.0     | 0    
    5 : >= 3.0    | 2     |  | < 4.0     | 2    
    6 : >= 4.0    | 2     |  | < 5.0     | 2    
    7 : >= 5.0    | 7     |  | Above     | 7    
    8 : Others    | 3     |  | Others    | 3    

Demonstrate a 1-dimensional histogram with '>' operator and
Demonstrate a 1-dimensional histogram with '<=' operator:
    0 : Intervals | Count |  | Intervals | Count
    1 : Below     | 1     |  | <= 0.0    | 1    
    2 : > 0.0     | 1     |  | <= 1.0    | 1    
    3 : > 1.0     | 0     |  | <= 2.0    | 0    
    4 : > 2.0     | 1     |  | <= 3.0    | 1    
    5 : > 3.0     | 1     |  | <= 4.0    | 1    
    6 : > 4.0     | 3     |  | <= 5.0    | 3    
    7 : > 5.0     | 6     |  | Above     | 6    
    8 : Others    | 3     |  | Others    | 3    

Demonstrate a 2-dimensional histogram:
    0 : Intervals | Below | >= 0.0 | >= 10.0 | >= 20.0 | Others
    1 : Below     | 0     | 1      | 0       | 0       | 0     
    2 : >= 0.0    | 0     | 0      | 0       | 0       | 0     
    3 : >= 1.0    | 0     | 1      | 0       | 0       | 0     
    4 : >= 2.0    | 0     | 0      | 0       | 0       | 0     
    5 : >= 3.0    | 0     | 1      | 1       | 0       | 0     
    6 : >= 4.0    | 0     | 0      | 0       | 2       | 0     
    7 : >= 5.0    | 0     | 4      | 0       | 2       | 1     
    8 : Others    | 0     | 0      | 2       | 1       | 0     

Demonstrate a 2-dimensional histogram with '<' operator:
    0 : Intervals | < 0.0 | < 10.0 | < 20.0 | Above | Others
    1 : < 0.0     | 0     | 1      | 0      | 0     | 0     
    2 : < 1.0     | 0     | 0      | 0      | 0     | 0     
    3 : < 2.0     | 0     | 1      | 0      | 0     | 0     
    4 : < 3.0     | 0     | 0      | 0      | 0     | 0     
    5 : < 4.0     | 0     | 1      | 1      | 0     | 0     
    6 : < 5.0     | 0     | 0      | 0      | 2     | 0     
    7 : Above     | 0     | 4      | 0      | 2     | 1     
    8 : Others    | 0     | 0      | 2      | 1     | 0     

Try it yourself: Open LIB_Function_table_histogram.b4p in B4P_Examples.zip. Decompress before use.

See also

histogram