--- title: "Tutorial: Obtain an overall p-value for a factor variable" author: "Emily C. Zabor" date: "Last updated: `r format(Sys.Date(), '%B %d, %Y')`" output: rmarkdown::html_vignette vignette: > %\VignetteEncoding{UTF-8} %\VignetteIndexEntry{Tutorial: Obtain an overall p-value for a factor variable} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(riskclustr) ``` # Context After using `eh_test_subtype()` to obtain a model fit, if factor variables are involved in the analysis it will be of interest to obtain overall p-values testing for differences across subtypes across all levels of the factor variable. The `posthoc_factor_test()` function allows for post-hoc testing of a factor variable. # Example ```{r, message = FALSE} # Load needed packages library(riskclustr) library(dplyr) ``` ```{r} # create a new example dataset that contains a factor variable factor_data <- subtype_data %>% mutate( x4 = cut( x1, breaks = c(-3.4, -0.4, 0.3, 1.1, 3.8), include.lowest = T, labels = c("1st quart", "2nd quart", "3rd quart", "4th quart") ) ) ``` ```{r} # Fit the model using x4 in place of x1 mod1 <- eh_test_subtype( label = "subtype", M = 4, factors = list("x4", "x2", "x3"), data = factor_data, digits = 2 ) ``` After we have the model fit, we can obtain the p-value testing all levels of `x4` simulaneously. ```{r} mypval <- posthoc_factor_test( fit = mod1, factor = "x4", nlevels = 4 ) ``` The function returns both a formatted and unformatted p-value. The formatted p-value can be accessed as `pval`: ```{r} mypval$pval ``` The unformatted p-value can be accessed as `pval_raw`: ```{r} mypval$pval_raw ```