This function creates a data profiling report.
Usage
create_report(
data,
output_format = html_document(toc = TRUE, toc_depth = 6, theme = "yeti"),
output_file = "report.html",
output_dir = getwd(),
y = NULL,
config = configure_report(),
report_title = "Data Profiling Report",
...
)
Arguments
- data
input data
- output_format
output format in render. Default is html_document(toc = TRUE, toc_depth = 6, theme = "yeti").
- output_file
output file name in render. Default is "report.html".
- output_dir
output directory for the report in render. Default is the current working directory, getwd().
- y
name of response variable if any. Response variables will be passed to appropriate plotting functions automatically.
- config
report configuration generated by configure_report.
- report_title
report title. Default is "Data Profiling Report".
- ...
other arguments to be passed to render.
Details
config is a named list to be evaluated by create_report.
Each name should exactly match a function name.
By doing so, that function and corresponding content will be added to the report.
If you do not want to include certain functions/content, do not add them to config.
configure_report generates the default template. You may customize the content using that function.
All function arguments will be passed to do.call as a list.
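The mechanism described above can be sketched in base R. This is only an illustration of evaluating a named list with do.call, not the package's actual implementation; summarise_rows and summarise_column are hypothetical stand-ins and are not DataExplorer functions.

```r
# Hypothetical stand-ins for report components (not part of any package).
summarise_rows <- function(data) nrow(data)
summarise_column <- function(data, column, digits = 2) {
  round(mean(data[[column]], na.rm = TRUE), digits)
}

# Each name matches a function name; each value holds that function's arguments.
config <- list(
  "summarise_rows" = list(),
  "summarise_column" = list("column" = "Ozone", "digits" = 1)
)

# Evaluate every entry: prepend the data, then pass the combined list to do.call.
results <- lapply(names(config), function(fn_name) {
  do.call(fn_name, c(list(data = airquality), config[[fn_name]]))
})
names(results) <- names(config)

print(results)
```

Leaving a name out of the list means the corresponding function is never called, which mirrors how omitting an entry from config omits that content from the report.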
Note
If both y and plot_prcomp are present, y will be removed from plot_prcomp.
If there are multiple options for the same function, all of them will be plotted.
For example, create_report(..., y = "a", config = list("plot_bar" = list("with" = "b"))) will create 3 bar charts:
- regular frequency bar chart
- bar chart aggregated by response variable "a"
- bar chart aggregated by `with` variable "b"
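The note about y and plot_prcomp can be illustrated outside the package. The sketch below is a base R analogue only, assuming the note means that the response column is dropped from the data before principal component analysis. It calls stats::prcomp directly rather than plot_prcomp, and the variable names are illustrative.

```r
# Assumption: "removing y" means dropping the response column before PCA.
y <- "Ozone"

# Keep every column except the response, then drop incomplete rows.
features <- airquality[, setdiff(names(airquality), y), drop = FALSE]
features <- na.omit(features)

# Principal component analysis on the remaining features only.
pca <- prcomp(features, scale. = TRUE)

# Check whether the response is among the analysed variables.
print(y %in% rownames(pca$rotation))
```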
Examples
if (FALSE) {
  # Create report
  create_report(iris)
  create_report(airquality, y = "Ozone")

  # Load library
  library(ggplot2)
  library(data.table)
  library(rmarkdown)

  # Set some missing values
  diamonds2 <- data.table(diamonds)
  for (j in 5:ncol(diamonds2)) {
    set(diamonds2,
        i = sample.int(nrow(diamonds2), sample.int(nrow(diamonds2), 1)),
        j,
        value = NA_integer_)
  }

  # Create customized report for diamonds2 dataset
  create_report(
    data = diamonds2,
    output_format = html_document(toc = TRUE, toc_depth = 6, theme = "flatly"),
    output_file = "report.html",
    output_dir = getwd(),
    y = "price",
    config = configure_report(
      add_plot_prcomp = TRUE,
      plot_qq_args = list("by" = "cut", sampled_rows = 1000L),
      plot_bar_args = list("with" = "carat"),
      plot_correlation_args = list("cor_args" = list("use" = "pairwise.complete.obs")),
      plot_boxplot_args = list("by" = "cut"),
      global_ggtheme = quote(theme_light())
    )
  )

  ## Configure report without `configure_report`
  config <- list(
    "introduce" = list(),
    "plot_intro" = list(),
    "plot_str" = list(
      "type" = "diagonal",
      "fontSize" = 35,
      "width" = 1000,
      "margin" = list("left" = 350, "right" = 250)
    ),
    "plot_missing" = list(),
    "plot_histogram" = list(),
    "plot_density" = list(),
    "plot_qq" = list(sampled_rows = 1000L),
    "plot_bar" = list(),
    "plot_correlation" = list("cor_args" = list("use" = "pairwise.complete.obs")),
    "plot_prcomp" = list(),
    "plot_boxplot" = list(),
    "plot_scatterplot" = list(sampled_rows = 1000L)
  )
}