Package 'plotBart'

Title: Diagnostic and Plotting Functions to Supplement 'bartCause'
Description: Functions to assist in diagnostics and plotting during the causal inference modeling process. Supplements the 'bartCause' package.
Authors: Joseph Marlo [aut, cre], George Perrett [aut]
Maintainer: Joseph Marlo <[email protected]>
License: MIT + file LICENSE
Version: 0.1.30
Built: 2025-02-19 05:15:43 UTC
Source: https://github.com/priism-center/plotbart

Help Index


Lalonde dataset

Description

Lalonde dataset

Usage

lalonde

Format

An object of class data.frame with 445 rows and 12 columns.

Source

https://CRAN.R-project.org/package=arm


Plot the balance

Description

Visualize balance of variables between treatment and control groups. Balance plot reflects balance in standardized units.

Usage

plot_balance(
  .data,
  treatment,
  confounders,
  compare = c("means", "variance", "covariance"),
  estimand = c("ATE", "ATT", "ATC"),
  limit_continuous = NULL,
  limit_catagorical = NULL
)

Arguments

.data

dataframe

treatment

name of the column denoting treatment. Must be binary.

confounders

character list of column names denoting the X columns of interest

compare

character, one of 'means', 'variance' or 'covariance'. Denotes what to compare balance on

estimand

character, one of 'ATE', 'ATT' or 'ATC'. The causal estimand you are making inferences about

limit_continuous

integer that can be used to limit the plot to only show the limit_continuous most imbalanced variables

limit_catagorical

integer that can be used to limit the plot to only show the limit_catagorical most imbalanced variables

Value

ggplot object

Author(s)

George Perrett & Joseph Marlo

Examples

data(lalonde)
plot_balance(lalonde, 'treat', c('re78', 'age', 'educ'),
compare = 'means', estimand = 'ATE') +
labs(title = 'My new title')
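
The phrase "standardized units" can be made concrete with a small base-R sketch. The helper std_mean_diff below is hypothetical and is not plotBart's implementation; it assumes a pooled standard deviation in the denominator, which is one common convention, and the toy data frame is invented for illustration.

```r
# Hypothetical helper (not part of plotBart): standardized difference in
# means between treated (1) and control (0) units for one covariate.
std_mean_diff <- function(x, z) {
  m1 <- mean(x[z == 1])
  m0 <- mean(x[z == 0])
  # Assumption: pool the two group variances for the denominator
  pooled_sd <- sqrt((var(x[z == 1]) + var(x[z == 0])) / 2)
  (m1 - m0) / pooled_sd
}

# Toy data, invented for illustration
toy <- data.frame(
  treat = c(1, 1, 1, 1, 0, 0, 0, 0),
  age   = c(30, 32, 34, 36, 20, 22, 24, 26),
  educ  = c(12, 12, 14, 14, 12, 12, 14, 14)
)

# One standardized difference per covariate
balance <- sapply(toy[c("age", "educ")], std_mean_diff, z = toy$treat)
print(balance)
```

In this toy example 'educ' is perfectly balanced while 'age' is not, which is the kind of contrast a balance plot is meant to surface.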

Plot the histogram or density of the Conditional Average Treatment Effect

Description

Plot the conditional average treatment effect (CATE) of a 'bartCause' model. The conditional average treatment effect is derived from taking the difference between predictions for each individual under the control condition and under the treatment condition averaged over the population. Means of the CATE distribution will resemble SATE and PATE but the CATE distribution accounts for more uncertainty than SATE and less uncertainty than PATE.

Usage

plot_CATE(
  .model,
  type = c("histogram", "density"),
  ci_80 = FALSE,
  ci_95 = FALSE,
  reference = NULL,
  .mean = FALSE,
  .median = FALSE
)

Arguments

.model

a model produced by 'bartCause::bartc()'

type

histogram or density

ci_80

TRUE/FALSE. Show the 80% credible interval?

ci_95

TRUE/FALSE. Show the 95% credible interval?

reference

numeric. Show a vertical reference line at this value

.mean

TRUE/FALSE. Show the mean reference line

.median

TRUE/FALSE. Show the median reference line

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_CATE(model_results)
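
As a rough illustration of the description above, the CATE distribution can be sketched in base R from posterior predictions under each condition. The matrices mu_0 and mu_1 are simulated stand-ins rather than output from a fitted 'bartCause' model, and the equal-tailed intervals are an assumption about how the 80% and 95% credible intervals might be formed.

```r
set.seed(1)
n_draws <- 1000  # posterior draws
n_obs   <- 50    # individuals

# Simulated stand-ins for posterior predictions (draws x individuals);
# a real analysis would take these from a fitted model instead.
mu_0 <- matrix(rnorm(n_draws * n_obs, mean = 0), nrow = n_draws)
mu_1 <- matrix(rnorm(n_draws * n_obs, mean = 2), nrow = n_draws)

# One CATE value per posterior draw: average the individual differences
cate_draws <- rowMeans(mu_1 - mu_0)

# Assumption: equal-tailed credible intervals
ci_80 <- quantile(cate_draws, probs = c(0.10, 0.90))
ci_95 <- quantile(cate_draws, probs = c(0.025, 0.975))
print(rbind(ci_80 = unname(ci_80), ci_95 = unname(ci_95)))
```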

Plot common support based on the standard deviation rule, chi squared rule, or both

Description

Plot common support based on the standard deviation rule, chi squared rule, or both.

Usage

plot_common_support(
  .model,
  .x = NULL,
  .y = NULL,
  rule = c("both", "sd", "chi")
)

Arguments

.model

a model produced by 'bartCause::bartc()'

.x

a character string denoting which covariate to use as the x axis. The default is to use a regression tree to predict the variable with the least common support.

.y

a character string denoting which covariate to use as the y axis. The default is the outcome variable y.

rule

one of c('both', 'sd', 'chi') denoting which rule to use to identify lack of support

Details

Sufficient overlap (common support) is an assumption of causal inference. BART models use the uncertainty of counterfactual predictions to assess it: when the posterior distribution of an individual's counterfactual prediction extends beyond a specified cut-point, that point likely has insufficient common support. 'bartCause' models offer the option to automatically remove points without common support from analyses; however, this must be specified during model fitting.

Cut-points are determined through one of two rules: the standard deviation (sd) rule or the chi-squared (chi) rule. Under the standard deviation rule, a point has weak common support if the posterior standard deviation of its counterfactual prediction is greater than the maximum posterior standard deviation of the predictions under the observed assignment plus 1 standard deviation of the distribution of those standard deviations across individuals. Under the chi-squared rule, a point is discarded if the ratio of the variance of its counterfactual prediction to the variance of its observed prediction is statistically significant under a chi-squared distribution with 1 degree of freedom. For more details on discard rules see Hill and Su 2013.

When called, this plot shows how many points would have been removed under the standard deviation and chi-squared rules. This plot should be used as a diagnostic for 'bartCause' models fit without a common-support rule.
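
The two rules can be sketched in base R on simulated posterior standard deviations. This is an illustration of the rules as described in Hill and Su (2013), not the code used by 'bartCause' or plotBart; the vectors sd_obs and sd_cf are invented, and the 0.95 quantile used as the chi-squared cut-off is an assumption.

```r
set.seed(2)
n_obs <- 100

# Simulated posterior standard deviations (one per individual) of the
# predictions under the observed and the counterfactual assignment.
sd_obs <- runif(n_obs, min = 0.8, max = 1.2)
sd_cf  <- runif(n_obs, min = 0.8, max = 1.2)
sd_cf[1:3] <- c(3, 4, 5)  # three units with very uncertain counterfactuals

# Standard deviation rule: flag units whose counterfactual sd exceeds
# max(sd_obs) + 1 * sd(sd_obs)
flag_sd <- sd_cf > max(sd_obs) + sd(sd_obs)

# Chi-squared rule: flag units whose variance ratio exceeds the assumed
# 0.95 quantile of a chi-squared distribution with 1 degree of freedom
flag_chi <- (sd_cf / sd_obs)^2 > qchisq(0.95, df = 1)

print(c(sd_rule = sum(flag_sd), chi_rule = sum(flag_chi)))
```

Both rules compare counterfactual uncertainty against observed uncertainty; the sd rule uses one sample-wide threshold, while the chi-squared rule compares each unit against itself.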

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

References

Hill, J., & Su, Y. S. (2013). Assessing lack of common support in causal inference using Bayesian nonparametrics: Implications for evaluating the effect of breastfeeding on children's cognitive outcomes. The Annals of Applied Statistics, 1386-1420.

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_common_support(model_results)

Plot the covariance

Description

Visualize balance of the covariance of variables between treatment and control groups. Balance plot reflects balance in standardized units.

Usage

plot_covariance(.data, treatment, confounders)

Arguments

.data

dataframe

treatment

name of the column denoting treatment. Must be binary.

confounders

character list of column names denoting the X columns of interest

Value

ggplot object

Author(s)

George Perrett

Examples

data(lalonde)
plot_covariance(lalonde, 'treat', c('re75', 're74', 'age', 'educ')) +
  labs(title = 'My new title')

Plot Individual Conditional Average Treatment effects

Description

Plots a histogram of Individual Conditional Average Treatment effects (ICATE). ICATEs are the difference in each individual's predicted outcome under the treatment and predicted outcome under the control averaged over the individual. Plots of ICATEs are useful to identify potential heterogeneous treatment effects between different individuals. ICATE plots can be grouped by discrete variables.

Usage

plot_ICATE(.model, .group_by = NULL, n_bins = 30, .alpha = 0.7)

Arguments

.model

a model produced by 'bartCause::bartc()'

.group_by

a grouping variable as a vector

n_bins

number of bins

.alpha

transparency of histograms

Value

ggplot object

Author(s)

George Perrett

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_ICATE(model_results, lalonde$married)
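
To make the ICATE definition concrete, the following base-R sketch averages simulated posterior draws per individual and then compares groups. The draws, the grouping variable and the effect sizes are all invented for illustration; a real analysis would take the draws from a fitted 'bartCause' model.

```r
set.seed(3)
n_draws <- 500
n_obs   <- 40

# Hypothetical binary grouping variable (e.g. a marital status indicator)
married <- rep(c(0, 1), each = n_obs / 2)

# Simulated posterior draws of individual effects (draws x individuals);
# the treatment effect is larger by construction when married == 1
true_effect <- ifelse(married == 1, 3, 1)
icate_draws <- sapply(true_effect, function(m) rnorm(n_draws, mean = m))

# One ICATE per individual: average over posterior draws
icate <- colMeans(icate_draws)

# Compare the ICATE distribution across groups
group_means <- tapply(icate, married, mean)
print(group_means)
```

Clearly separated group means, as constructed here, are the pattern that a grouped ICATE histogram would reveal as heterogeneity.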

Auto-Bin a plot of a continuous moderating variable into a discrete moderating variable

Description

Use a regression tree to optimally bin a continuous variable.

Usage

plot_moderator_c_bin(
  .model,
  moderator,
  type = c("density", "histogram", "errorbar"),
  .alpha = 0.7,
  facet = FALSE,
  .ncol = 1,
  .name = "bin"
)

Arguments

.model

a model produced by 'bartCause::bartc()'

moderator

the moderator as a vector

type

string to specify if you would like to plot a histogram, density or error bar plot

.alpha

transparency value [0, 1]

facet

TRUE/FALSE. Create panel plots of each moderator level?

.ncol

number of columns to use when faceting

.name

string representing the name of the moderating variable

Value

ggplot object

Author(s)

George Perrett

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_moderator_c_bin(model_results, lalonde$age, .name = 'age')
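
The idea of binning a continuous moderator with a regression tree can be sketched with the 'rpart' package. This is an assumption about the general technique, not a copy of plotBart's internals; the ICATEs are simulated, and the tree depth of 2 is an arbitrary choice.

```r
library(rpart)  # recommended package distributed with R

set.seed(4)
n_obs <- 200
age <- runif(n_obs, min = 18, max = 60)

# Simulated ICATEs with a jump in the treatment effect at age 40
icate <- ifelse(age > 40, 5, 0) + rnorm(n_obs, sd = 1)

# Shallow regression tree of the ICATEs on the moderator
tree <- rpart(icate ~ age, data = data.frame(icate, age),
              control = rpart.control(maxdepth = 2))

# Each terminal node of the tree defines one bin of the moderator
bins <- factor(tree$where)
print(tapply(age, bins, range))
```

The printed age ranges show where the tree placed its cut-points; those ranges play the role of the discrete moderator levels.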

LOESS plot of a continuous moderating variable

Description

Plot the LOESS prediction of ICATEs by a continuous covariate. This is an alternative to partial dependency plots to assess treatment effect heterogeneity by a continuous covariate. See Carnegie, Dorie and Hill 2019.

Usage

plot_moderator_c_loess(.model, moderator, line_color = "blue")

Arguments

.model

a model produced by 'bartCause::bartc()'

moderator

the moderator as a vector

line_color

the color of the loess line

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

References

Carnegie, N., Dorie, V., & Hill, J. L. (2019). Examining treatment effect heterogeneity using BART. Observational Studies, 5(2), 52-70.

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_moderator_c_loess(model_results, lalonde$age)
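
A LOESS fit of ICATEs on a continuous covariate can be sketched with stats::loess. The ICATEs below are simulated and the default smoothing span is used; how plotBart configures its smoother is not stated in this manual, so treat this only as an illustration of the technique.

```r
set.seed(5)
n_obs <- 150
age <- sort(runif(n_obs, min = 18, max = 60))

# Simulated ICATEs whose mean rises linearly with age
icate <- 0.1 * age + rnorm(n_obs, sd = 0.5)

# Local regression of the ICATEs on the moderator
fit <- loess(icate ~ age)
smoothed <- predict(fit)

# Inspect the smoothed effect alongside the raw values
print(head(data.frame(age, icate, smoothed)))
```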

Partial dependency plot of a continuous moderating variable

Description

Plot a partial dependency plot with a continuous covariate from a 'bartCause' model. Identify treatment effect variation predicted across levels of a continuous variable.

Usage

plot_moderator_c_pd(.model, moderator, n_bins = NULL)

Arguments

.model

a model produced by 'bartCause::bartc()'

moderator

the moderator as a vector

n_bins

number of bins to cut the moderator with. Defaults to the lesser of 15 and number of distinct levels of the moderator

Details

Partial dependency plots are one way to evaluate heterogeneous treatment effects that vary by values of a continuous covariate. For more information on partial dependency plots from BART causal inference models see Green and Kern 2012.

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

References

Green, D. P., & Kern, H. L. (2012). Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Public opinion quarterly, 76(3), 491-511.

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none',
 keepTrees = TRUE
)
plot_moderator_c_pd(model_results, lalonde$age)

Plot the Conditional Average Treatment Effect conditional on a discrete moderator

Description

Plot the Conditional Average Treatment Effect split by a discrete moderating variable. This plot will provide a visual test of moderation by discrete variables.

Usage

plot_moderator_d(
  .model,
  moderator,
  type = c("density", "histogram", "errorbar"),
  .alpha = 0.7,
  facet = FALSE,
  .ncol = 1
)

Arguments

.model

a model produced by 'bartCause::bartc()'

moderator

the moderator as a vector

type

string to specify if you would like to plot a histogram, density or error bar plot

.alpha

transparency value [0, 1]

facet

TRUE/FALSE. Create panel plots of each moderator level?

.ncol

number of columns to use when faceting

Value

ggplot object

Author(s)

George Perrett

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_moderator_d(model_results, lalonde$educ)

Plot the posterior interval of the Conditional Average Treatment Effect grouped by a discrete variable

Description

Plots the range of the Conditional Average Treatment Effect grouped by a discrete variable. This is analogous to plot_moderator_d with type = 'density' but is preferable for moderators with many categories. Rather than plotting the full density, the posterior range is shown.

Usage

plot_moderator_d_linerange(.model, moderator, .alpha = 0.7, horizontal = FALSE)

Arguments

.model

a model produced by 'bartCause::bartc()'

moderator

the moderator as a vector

.alpha

transparency value [0, 1]

horizontal

TRUE/FALSE. Flip the plot horizontally?

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_moderator_d_linerange(model_results, lalonde$educ)

Plot the overlap via propensity score method

Description

Plot histograms showing the overlap between propensity scores by treatment status.

Usage

plot_overlap_pScores(
  .data,
  treatment,
  confounders,
  plot_type = c("histogram", "density"),
  trim = TRUE,
  min_x = NULL,
  max_x = NULL,
  pscores = NULL,
  ...
)

Arguments

.data

dataframe

treatment

character. Name of the treatment column within .data

confounders

character list of column names denoting confounders within .data

plot_type

the plot type, one of c('histogram', 'density'). Defaults to 'histogram'

trim

a logical. If TRUE, the y axis will be trimmed to better visualize areas of overlap

min_x

numeric value specifying the minimum propensity score value to be shown on the x axis

max_x

numeric value specifying the maximum propensity score value to be shown on the x axis

pscores

propensity scores. If not provided, then propensity scores will be calculated using BART

...

additional arguments passed to 'dbarts::bart2' propensity score calculation

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

See Also

plot_overlap_vars

Examples

data(lalonde)
plot_overlap_pScores(
 .data = lalonde,
 treatment = 'treat',
 confounders = c('age', 'educ'),
 plot_type = 'histogram',
 pscores = NULL,
 seed = 44
)
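
The pscores argument accepts precomputed propensity scores. The sketch below shows one way such a vector might be produced and how the region of overlap can be read off it. It deliberately swaps in a logistic regression for the BART model that the package uses by default, and all data are simulated.

```r
set.seed(6)
n_obs <- 300
age  <- rnorm(n_obs, mean = 35, sd = 8)
educ <- rnorm(n_obs, mean = 12, sd = 2)

# Simulated treatment assignment that depends on age
treat <- rbinom(n_obs, size = 1, prob = plogis(-3.5 + 0.1 * age))

# Stand-in propensity model: logistic regression instead of BART
ps_fit  <- glm(treat ~ age + educ, family = binomial())
pscores <- fitted(ps_fit)

# Region of overlap: scores covered by both treatment groups
lower <- max(min(pscores[treat == 1]), min(pscores[treat == 0]))
upper <- min(max(pscores[treat == 1]), max(pscores[treat == 0]))
print(c(lower = lower, upper = upper))
```

Units with scores outside the printed range have no counterpart in the other treatment group, which is what the overlap plot makes visible.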

Plot the overlap of variables

Description

Plot histograms showing the overlap between variables by treatment status.

Usage

plot_overlap_vars(
  .data,
  treatment,
  confounders,
  plot_type = c("histogram", "density"),
  min_x = NULL,
  max_x = NULL
)

Arguments

.data

dataframe

treatment

character. Name of the treatment column within .data

confounders

character list of column names denoting confounders within .data

plot_type

the plot type, one of c('histogram', 'density'). Defaults to 'histogram'

min_x

numeric value specifying the minimum value to be shown on the x axis

max_x

numeric value specifying the maximum value to be shown on the x axis

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

See Also

plot_overlap_pScores

Examples

data(lalonde)
plot_overlap_vars(
 .data = lalonde,
 treatment = 'treat',
 confounders = c('age', 'educ'),
 plot_type = 'histogram'
)

Plot histogram or density of Population Average Treatment Effect

Description

Plot shows the Population Average Treatment Effect (PATE), which is derived from the posterior predictive distribution of the difference between y | z=1, X and y | z=0, X. The mean of PATE will resemble the means of CATE and SATE, but PATE will account for more uncertainty and is recommended for informing inferences on the average treatment effect.

Usage

plot_PATE(
  .model,
  type = c("histogram", "density"),
  ci_80 = FALSE,
  ci_95 = FALSE,
  reference = NULL,
  .mean = FALSE,
  .median = FALSE
)

Arguments

.model

a model produced by 'bartCause::bartc()'

type

histogram or density

ci_80

TRUE/FALSE. Show the 80% credible interval?

ci_95

TRUE/FALSE. Show the 95% credible interval?

reference

numeric. Show a vertical reference line at this value

.mean

TRUE/FALSE. Show the mean reference line

.median

TRUE/FALSE. Show the median reference line

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_PATE(model_results)

Plot a regression tree predicting variables with lack of overlap

Description

Identify variables that predict lack of overlap

Usage

plot_predicted_common_support(
  .model,
  max_depth = 3,
  rule = c("both", "sd", "chi")
)

Arguments

.model

a model produced by 'bartCause::bartc()'

max_depth

a number indicating the max depth of the tree. Higher numbers are more prone to overfitting.

rule

one of c('both', 'sd', 'chi') denoting which rule to use to identify lack of support

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_predicted_common_support(model_results)
plot_predicted_common_support(model_results, max_depth = 2, rule = 'chi')

Plot the model residual

Description

Visualize the distribution of the model residuals as a density plot.

Usage

plot_residual_density(.model)

Arguments

.model

a model produced by 'bartCause::bartc()'

Value

ggplot object

Author(s)

George Perrett & Joseph Marlo

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_residual_density(model_results)

Plot histogram or density of Sample Average Treatment Effects

Description

Plot a histogram or density of the Sample Average Treatment Effect (SATE). The Sample Average Treatment Effect is derived from taking the difference of each individual's observed outcome and a predicted counterfactual outcome from a BART model averaged over the population. The mean of SATE will resemble means of CATE and PATE but will account for the least uncertainty.

Usage

plot_SATE(
  .model,
  type = c("histogram", "density"),
  ci_80 = FALSE,
  ci_95 = FALSE,
  reference = NULL,
  .mean = FALSE,
  .median = FALSE,
  check_overlap = FALSE,
  overlap_rule = c("none", "sd", "chisq")
)

Arguments

.model

a model produced by 'bartCause::bartc()'

type

histogram or density

ci_80

TRUE/FALSE. Show the 80% credible interval?

ci_95

TRUE/FALSE. Show the 95% credible interval?

reference

numeric. Show a vertical reference line at this x-axis value

.mean

TRUE/FALSE. Show the mean reference line

.median

TRUE/FALSE. Show the median reference line

check_overlap

TRUE/FALSE. Check if any overlap rules are applicable

overlap_rule

overlap rule used to show how the different 'bartCause' removal rules would have influenced results. Only applicable if check_overlap is TRUE.

Value

ggplot object

Author(s)

George Perrett, Joseph Marlo

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_SATE(model_results)

Trace plot the estimands of a 'bartCause::bartc()' model

Description

Returns a ggplot of the estimated effect over each iteration of the model fit. This is used to visually assess the convergence of Markov chain Monte Carlo (MCMC) sampling. Chains should be well mixed such that no single color is notably separate from others.

Usage

plot_trace(.model, type = c("cate", "sate", "pate", "sigma"))

Arguments

.model

a model produced by 'bartCause::bartc()'

type

parameter to plot. Options are the average treatment effects 'cate', 'sate' and 'pate', as well as the posterior predictive uncertainty 'sigma'

Value

ggplot object

Author(s)

Joseph Marlo, George Perrett

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_trace(.model = model_results)

Trace rank plot the estimands of a 'bartCause::bartc()' model

Description

Trace plots may obscure convergence issues within Markov chains. Trace rank (trank) plots present an alternative diagnostic of MCMC convergence. Trank plots are described in detail in Vehtari et al. (2021).

Usage

plot_trank(.model, type = c("cate", "sate", "pate", "sigma"))

Arguments

.model

a model produced by 'bartCause::bartc()'

type

parameter to plot. Options are the average treatment effects 'cate', 'sate' and 'pate', as well as the posterior predictive uncertainty 'sigma'

Value

ggplot object

Author(s)

George Perrett

References

Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P. C. (2021). Rank-normalization, folding, and localization: An improved R̂ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16(2), 667-718.

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_trank(.model = model_results)
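
The computation behind a trace rank plot can be sketched in base R: pool the draws from all chains, rank them, and count how the ranks distribute within each chain. The draws are simulated and the choice of 10 bins is arbitrary; this is not plotBart's implementation.

```r
set.seed(7)
n_chains <- 4
n_iter   <- 250

# Simulated draws of one estimand: one column per chain
draws <- matrix(rnorm(n_chains * n_iter), nrow = n_iter, ncol = n_chains)

# Rank every draw across all chains pooled together
ranks <- matrix(rank(draws), nrow = n_iter, ncol = n_chains)

# Bin the ranks and count how many fall in each bin, chain by chain
breaks <- seq(0, n_chains * n_iter, length.out = 11)
rank_counts <- apply(ranks, 2, function(r) table(cut(r, breaks = breaks)))
print(rank_counts)
```

When chains mix well, each column of counts is roughly uniform across bins; a chain concentrated in the low or high rank bins suggests a convergence problem.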

Plot a waterfall of the ICATEs

Description

Plots the point and posterior intervals of each individual's ICATE ordered by the ICATE or a continuous variable. Points can be colored by a discrete variable. Waterfall plots are a useful visual diagnostic of possible treatment effect heterogeneity. A flat line implies little treatment effect heterogeneity while a steeper curve implies that the treatment effect varies across individuals in the sample. Ordering points by a continuous variable or coloring points by a discrete variable can be helpful to identify potential moderators of the treatment effect.

Usage

plot_waterfall(
  .model,
  descending = TRUE,
  .order = NULL,
  .color = NULL,
  .alpha = 0.5
)

Arguments

.model

a model produced by 'bartCause::bartc()'

descending

TRUE/FALSE. Order the ICATEs by value in descending order?

.order

a vector representing a custom order

.color

a vector representing colors

.alpha

transparency value [0, 1]

Value

ggplot object

Author(s)

George Perrett

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
plot_waterfall(model_results)
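
The data behind a waterfall plot can be sketched in base R as one point estimate and one interval per individual, sorted by the point estimate. The posterior draws are simulated, and the 95% equal-tailed interval is an assumption; the manual does not state which interval the function draws.

```r
set.seed(8)
n_draws <- 400
n_obs   <- 25

# Simulated posterior draws of individual effects (draws x individuals)
centers <- seq(-1, 3, length.out = n_obs)
icate_draws <- sapply(centers, function(m) rnorm(n_draws, mean = m))

# Point estimate and assumed 95% interval for each individual
waterfall <- data.frame(
  icate = colMeans(icate_draws),
  lower = apply(icate_draws, 2, quantile, probs = 0.025),
  upper = apply(icate_draws, 2, quantile, probs = 0.975)
)

# Order individuals from largest to smallest ICATE
waterfall <- waterfall[order(waterfall$icate, decreasing = TRUE), ]
print(head(waterfall))
```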

Auto-Bin a table of a continuous moderating variable into a discrete moderating variable

Description

Use a regression tree to optimally bin a continuous variable. This function prints a table with estimates and 95% credible intervals.

Usage

table_moderator_c_bin(.model, moderator, .name = "bin")

Arguments

.model

a model produced by 'bartCause::bartc()'

moderator

the moderator as a vector

.name

string representing the name of the moderating variable

Value

a data.frame object

Author(s)

George Perrett

Examples

data(lalonde)
confounders <- c('age', 'educ', 'black', 'hisp', 'married', 'nodegr')
model_results <- bartCause::bartc(
 response = lalonde[['re78']],
 treatment = lalonde[['treat']],
 confounders = as.matrix(lalonde[, confounders]),
 estimand = 'ate',
 commonSup.rule = 'none'
)
table_moderator_c_bin(model_results, lalonde$age, .name = 'age')