Data report overview

The dataset examined has the following dimensions:

Feature Result
Number of observations 303
Number of variables 14

Checks performed

The following variable checks were performed, depending on the data type of each variable:

                                                     character  factor  labelled  haven labelled  numeric  integer  logical  Date
Identify miscoded missing values                     ×          ×       ×         ×               ×        ×                 ×
Identify prefixed and suffixed whitespace            ×          ×       ×         ×
Identify levels with < 6 obs.                        ×          ×       ×         ×
Identify case issues                                 ×          ×       ×         ×
Identify misclassified numeric or integer variables  ×          ×       ×         ×
Identify outliers                                                                                 ×        ×                 ×

Please note that all numerical values below have been rounded to 1 decimal place.

Summary table

                     Variable class  # unique values  Missing observations  Any problems?
age                  numeric         42               4.62 %                ×
sex                  character       3                4.95 %
chest_pain_type      character       5                4.29 %
resting_bp           numeric         48               6.93 %                ×
cholesterol          numeric         150              3.96 %                ×
fasting_blood_sugar  character       3                3.63 %
resting_ecg          numeric         4                5.94 %                ×
max_heartrate        numeric         92               4.62 %                ×
exer_angina          character       3                5.61 %
old_peak             numeric         41               3.30 %                ×
slope                character       4                5.28 %
n_vessels            numeric         6                4.62 %                ×
defect               character       4                6.27 %
heart_disease        numeric         3                6.60 %
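
The class, unique-value, and missingness columns of a summary like this one can be approximated in base R. The sketch below is not dataMaid's own code: `summarize_vars` and the data frame `toy` are hypothetical names, and the toy values are not taken from the report's data. It assumes missing values are coded as `NA` and counts `NA` as a distinct value, which is one possible reading of why the summary counts are one higher than the per-variable counts in the variable list (for example 42 versus 41 for age).

```r
# Hypothetical helper: one row per variable with its class,
# number of unique values (NA counted as a value), and percent missing.
summarize_vars <- function(df) {
  data.frame(
    variable    = names(df),
    class       = vapply(df, function(v) class(v)[1], character(1)),
    n_unique    = vapply(df, function(v) length(unique(v)), integer(1)),
    pct_missing = vapply(df, function(v) round(100 * mean(is.na(v)), 2), numeric(1)),
    row.names   = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy data for illustration only.
toy <- data.frame(
  age = c(63, 37, NA, 56),
  sex = c("male", "female", "male", NA),
  stringsAsFactors = FALSE
)

res <- summarize_vars(toy)
print(res)
```

Applied to the report's own figures, the same percentage arithmetic gives `round(100 * 14 / 303, 2)`, that is 4.62, matching the entry for age.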

Variable list

age

Feature Result
Variable type numeric
Number of missing obs. 14 (4.62 %)
Number of unique values 41
Median 56
1st and 3rd quartiles 48; 61
Min. and max. 29; 77

  • Note that the following possible outlier values were detected: "70", "71", "74", "76", "77".

sex

Feature Result
Variable type character
Number of missing obs. 15 (4.95 %)
Number of unique values 2
Mode “male”


chest_pain_type

Feature Result
Variable type character
Number of missing obs. 13 (4.29 %)
Number of unique values 4
Mode “typical angina”


resting_bp

Feature Result
Variable type numeric
Number of missing obs. 21 (6.93 %)
Number of unique values 47
Median 130
1st and 3rd quartiles 120; 140
Min. and max. 94; 200

  • Note that the following possible outlier values were detected: "94", "178", "180", "192", "200".

cholesterol

Feature Result
Variable type numeric
Number of missing obs. 12 (3.96 %)
Number of unique values 149
Median 242
1st and 3rd quartiles 211.5; 275.5
Min. and max. 126; 564

  • Note that the following possible outlier values were detected: "126", "131", "141", "407", "409", "417", "564".

fasting_blood_sugar

Feature Result
Variable type character
Number of missing obs. 11 (3.63 %)
Number of unique values 2
Mode “lt_120”


resting_ecg

  • Note that this variable is treated as a factor variable below, as it only takes a few unique values.
Feature Result
Variable type numeric
Number of missing obs. 18 (5.94 %)
Number of unique values 3
Mode “1”
Reference category 0

  • Note that the following levels have at most five observations: "2".
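
A check of this kind can be sketched in a few lines of base R. This is an illustration under stated assumptions, not dataMaid's implementation: `rare_levels` and its `max_n` argument are hypothetical names, the cutoff of five is taken from the wording of the note above, and the `ecg` vector holds toy values rather than the report's data.

```r
# Hypothetical helper: return the levels observed at most `max_n` times.
rare_levels <- function(x, max_n = 5) {
  counts <- table(x, useNA = "no")           # NA is not counted as a level
  names(counts)[as.vector(counts) <= max_n]  # level labels as character strings
}

# Toy values for illustration only: level 2 occurs three times.
ecg <- c(rep(0, 10), rep(1, 12), rep(2, 3), NA)
rare <- rare_levels(ecg)
print(rare)
```

Levels flagged this way are often candidates for merging with a neighbouring category before modelling, since estimates for them rest on very few observations.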

max_heartrate

Feature Result
Variable type numeric
Number of missing obs. 14 (4.62 %)
Number of unique values 91
Median 153
1st and 3rd quartiles 134; 166
Min. and max. 71; 202

  • Note that the following possible outlier values were detected: "194", "195", "202".

exer_angina

Feature Result
Variable type character
Number of missing obs. 17 (5.61 %)
Number of unique values 2
Mode “no”


old_peak

Feature Result
Variable type numeric
Number of missing obs. 10 (3.30 %)
Number of unique values 40
Median 0.8
1st and 3rd quartiles 0; 1.6
Min. and max. 0; 6.2

  • Note that the following possible outlier values were detected: "6.2".

slope

Feature Result
Variable type character
Number of missing obs. 16 (5.28 %)
Number of unique values 3
Mode “flat”


n_vessels

  • Note that this variable is treated as a factor variable below, as it only takes a few unique values.
Feature Result
Variable type numeric
Number of missing obs. 14 (4.62 %)
Number of unique values 5
Mode “0”
Reference category 0

  • Note that the following levels have at most five observations: "4".

defect

Feature Result
Variable type character
Number of missing obs. 19 (6.27 %)
Number of unique values 3
Mode “fixed_defect”


heart_disease

  • Note that this variable is treated as a factor variable below, as it only takes a few unique values.
Feature Result
Variable type numeric
Number of missing obs. 20 (6.60 %)
Number of unique values 2
Mode “1”
Reference category 0


Report generation information:

  • Created by: Michael Clark (username: micl).

  • Report creation time: Mon Jul 06 2020 10:51:49

  • Report was run from directory: /Users/micl/Documents/Stats/Repositories/Workshops/workshops-2020/exploratory-data-analysis-tools

  • dataMaid v1.4.0 [Pkg: 2019-12-10 from CRAN (R 4.0.0)]

  • R version 4.0.0 (2020-04-24).

  • Platform: x86_64-apple-darwin17.0 (64-bit)(macOS Catalina 10.15.2).

  • Function call: dataMaid::makeDataReport(data = hd, output = "html", file = "other_docs/dataMaid_report", replace = TRUE, maxDecimals = 1)