flowchart TD
    A[Raw Data Input] --> B[Data Quality Assessment]
    B --> C[Assumption Testing]
    C --> D[Method Selection]
    D --> E[Statistical Analysis]
    E --> F[Result Validation]
    F --> G[Interpretation Guidance]
    G --> H[Documentation Generation]

    B --> B1[Missing data patterns]
    B --> B2[Outlier detection]
    B --> B3[Distribution assessment]
    B --> B4[Sample size adequacy]

    C --> C1[Normality testing]
    C --> C2[Homogeneity of variance]
    C --> C3[Independence verification]
    C --> C4[Linearity assessment]

    D --> D1[Automatic method selection]
    D --> D2[Assumption-based routing]
    D --> D3[Robust alternatives]
    D --> D4[Sensitivity analysis]

    E --> E1[Primary analysis]
    E --> E2[Sensitivity testing]
    E --> E3[Effect size calculation]
    E --> E4[Confidence intervals]

    F --> F1[Result consistency checks]
    F --> F2[Computational validation]
    F --> F3[Cross-validation]
    F --> F4[Simulation verification]

    G --> G1[Clinical significance]
    G --> G2[Statistical interpretation]
    G --> G3[Limitation assessment]
    G --> G4[Recommendation generation]

    H --> H1[Assumption test reports]
    H --> H2[Method justification]
    H --> H3[Validation evidence]
    H --> H4[Regulatory documentation]

    style A fill:#ffebee
    style H fill:#e8f5e8
    style C fill:#fff3e0
    style E fill:#f3e5f5
    style G fill:#e3f2fd
Key Takeaways
- Automated Validation: Implement comprehensive assumption checking that automatically validates statistical prerequisites with intelligent method selection and warning systems
- Clinical Standards: Master biostatistics protocols that meet pharmaceutical and clinical research standards for regulatory submission and peer review
- Intelligent Guidance: Create interpretation systems that provide context-aware statistical guidance, effect size interpretation, and clinical significance assessment
- Regulatory Compliance: Build statistical workflows that generate validation documentation, assumption testing evidence, and audit trails for regulatory compliance
- Professional Reliability: Establish enterprise-grade statistical computing with robust error handling, reproducible results, and comprehensive diagnostic reporting
Introduction
Statistical rigor forms the foundation of credible biostatistics applications, particularly in regulated industries where statistical methodology directly impacts regulatory approval, clinical decision-making, and patient safety. Enterprise statistical applications must go beyond basic calculations to implement comprehensive assumption checking, automated validation workflows, and intelligent interpretation guidance that meets professional research standards.
This tutorial transforms our secure t-test application into a statistically rigorous system that automatically validates assumptions, provides intelligent method selection, and generates comprehensive diagnostic reports. You’ll learn to implement clinical-grade statistical workflows that support regulatory submission, peer review, and professional research standards while maintaining the accessibility and user experience that stakeholders expect.
The statistical validation patterns you’ll master apply universally to biostatistics applications while addressing specific requirements for clinical trials, pharmaceutical research, and healthcare data analysis where statistical validity directly impacts research conclusions and regulatory decisions.
Understanding Statistical Rigor Requirements
Enterprise Statistical Standards
Professional statistical applications require comprehensive validation beyond basic calculations:
Critical Statistical Validation Components:
Assumption Testing Framework:
Comprehensive validation of statistical prerequisites including normality, homogeneity of variance, independence, and linearity with automated testing and intelligent interpretation.
Method Selection Intelligence:
Automated selection of appropriate statistical methods based on data characteristics, assumption test results, and analysis objectives with fallback options for assumption violations.
Diagnostic Reporting:
Generation of comprehensive diagnostic reports that document assumption testing, method selection rationale, and validation evidence for regulatory review.
Clinical Interpretation:
Context-aware interpretation guidance that connects statistical results to clinical significance, practical importance, and regulatory implications.
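Before building the full engine, the core idea behind assumption-based method routing can be sketched in a few lines of base R. This is a minimal illustration only, assuming a two-column data frame with group and response variables (the layout used throughout this tutorial); choose_two_group_test() is a hypothetical helper, not part of the application code that follows.

# Minimal sketch of assumption-based routing for a two-group comparison.
# choose_two_group_test() is illustrative only; the engine below wraps the
# same idea in data quality checks, sensitivity analysis, and audit logging.
choose_two_group_test <- function(data, alpha = 0.05) {
  groups <- split(data$response, data$group)

  # Normality per group (Shapiro-Wilk, valid for 3-5000 observations) and
  # variance homogeneity via an F test (itself sensitive to non-normality)
  normal_ok <- all(vapply(groups, function(x) shapiro.test(x)$p.value >= alpha, logical(1)))
  var_ok    <- var.test(groups[[1]], groups[[2]])$p.value >= alpha

  if (!normal_ok) {
    list(method = "mann_whitney_u",
         result = wilcox.test(groups[[1]], groups[[2]], conf.int = TRUE))
  } else if (!var_ok) {
    list(method = "independent_ttest_welch",
         result = t.test(groups[[1]], groups[[2]], var.equal = FALSE))
  } else {
    list(method = "independent_ttest_students",
         result = t.test(groups[[1]], groups[[2]], var.equal = TRUE))
  }
}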
Regulatory Statistical Requirements
Pharmaceutical and clinical research applications must meet specific statistical standards:
FDA Guidance Compliance:
- Comprehensive assumption testing with documentation
- Method selection justification with sensitivity analysis
- Effect size reporting with clinical significance assessment
- Multiple comparison adjustments where appropriate
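As a small illustration of the multiple comparison point above, base R's p.adjust() covers the common adjustment families. The p-values below are placeholders standing in for a pre-specified family of endpoints, not real study results.

# Illustrative multiple-comparison adjustment for a family of endpoint p-values.
raw_p <- c(primary = 0.012, secondary_1 = 0.034, secondary_2 = 0.047, exploratory = 0.210)

adjusted <- data.frame(
  endpoint   = names(raw_p),
  raw        = raw_p,
  bonferroni = p.adjust(raw_p, method = "bonferroni"),
  holm       = p.adjust(raw_p, method = "holm"),
  bh_fdr     = p.adjust(raw_p, method = "BH")
)
print(adjusted, row.names = FALSE)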
ICH E9 Statistical Principles:
- Pre-specified statistical analysis plans
- Assumption validation with appropriate testing
- Robust analysis methods for assumption violations
- Comprehensive sensitivity analysis documentation
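The ICH E9 points on robust methods and sensitivity analysis reduce, in the simplest case, to running the pre-specified parametric test alongside robust alternatives and documenting whether the conclusions agree. The sensitivity_check() helper below is a minimal sketch assuming the group/response data layout used throughout this tutorial; the engine later in this tutorial implements a more complete version.

# Minimal sensitivity check across parametric and non-parametric methods.
sensitivity_check <- function(data) {
  students <- t.test(response ~ group, data = data, var.equal = TRUE)
  welch    <- t.test(response ~ group, data = data, var.equal = FALSE)
  wilcox   <- wilcox.test(response ~ group, data = data)

  p_values <- c(students$p.value, welch$p.value, wilcox$p.value)

  data.frame(
    method    = c("Student t", "Welch t", "Mann-Whitney U"),
    p_value   = p_values,
    reject_05 = p_values < 0.05
  )
}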
Good Clinical Practice (GCP):
- Audit trail of statistical decisions and methods
- Validation of statistical software and procedures
- Documentation of assumption testing and results
- Reproducible analysis with version control
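The audit-trail and reproducibility requirements can be approximated with a few lines of base R before adopting the full AuditLogger infrastructure used later in this tutorial. This is a minimal sketch: run_documented_analysis(), the seed value, and the file names are illustrative assumptions, not part of the application code.

# Minimal reproducibility scaffold in the spirit of a GCP audit trail.
run_documented_analysis <- function(data, seed = 20240101, log_file = "analysis_audit.log") {
  set.seed(seed)                                   # fixed seed for reproducible resampling
  result <- t.test(response ~ group, data = data)  # formula method defaults to Welch's test

  entry <- paste(
    format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    "| method=Welch t-test",
    "| n =", nrow(data),
    "| p =", signif(result$p.value, 4),
    "| seed =", seed,
    "| R =", getRversion()
  )
  cat(entry, "\n", file = log_file, append = TRUE)  # append-only audit trail

  # Capture the full software environment alongside the log
  writeLines(capture.output(sessionInfo()), sub("\\.log$", "_session.txt", log_file))

  result
}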
Comprehensive Assumption Testing Framework
Advanced Statistical Testing Engine
Create a robust statistical testing system that automatically validates assumptions:
# File: R/statistical_testing_engine.R
#' Enterprise Statistical Testing Engine
#'
#' @description Comprehensive statistical testing framework with automated
#' assumption checking, intelligent method selection, and regulatory compliance
#' documentation for clinical and pharmaceutical applications.
#'
#' @export
StatisticalTestingEngine <- R6::R6Class(
  "StatisticalTestingEngine",
  public = list(

    #' @field config statistical testing configuration
    config = NULL,

    #' @field audit_logger audit logging system
    audit_logger = NULL,

    #' @field assumption_tests collection of assumption testing methods
    assumption_tests = NULL,

    #' @field method_selector intelligent method selection system
    method_selector = NULL,

    #' Initialize statistical testing engine
    #'
    #' @param config list. Statistical testing configuration
    #' @param audit_logger AuditLogger. Audit logging instance
    #'
    initialize = function(config = list(), audit_logger = NULL) {
      self$config <- private$default_testing_config(config)
      self$audit_logger <- audit_logger
      self$assumption_tests <- AssumptionTestingFramework$new(self$config$assumptions)
      self$method_selector <- StatisticalMethodSelector$new(self$config$method_selection)

      if (!is.null(self$audit_logger)) {
        self$audit_logger$log_event(
          "StatisticalTestingEngine initialized",
          level = "INFO",
          category = "STATISTICAL_ANALYSIS"
        )
      }
    },
#' Conduct comprehensive statistical analysis
#'
#' @param data data.frame. Input data for analysis
#' @param analysis_type character. Type of statistical analysis
#' @param parameters list. Analysis parameters and options
#' @param context list. Analysis context for regulatory documentation
#'
#' @return StatisticalAnalysisResult. Comprehensive analysis results
#'
    conduct_analysis = function(data, analysis_type, parameters = list(),
                                context = list()) {
      analysis_id <- uuid::UUIDgenerate()
      start_time <- Sys.time()

      tryCatch({
        # Log analysis start
        if (!is.null(self$audit_logger)) {
          self$audit_logger$log_event(
            "Statistical analysis initiated",
            level = "INFO",
            category = "STATISTICAL_ANALYSIS",
            details = list(
              analysis_id = analysis_id,
              analysis_type = analysis_type,
              sample_size = nrow(data),
              parameters = parameters,
              context = context
            )
          )
        }

        # Step 1: Data quality assessment
        quality_assessment <- private$assess_data_quality(data, context)

        if (!quality_assessment$suitable_for_analysis) {
          return(StatisticalAnalysisResult$new(
            analysis_id = analysis_id,
            success = FALSE,
            errors = quality_assessment$critical_issues,
            warnings = quality_assessment$warnings,
            recommendations = quality_assessment$recommendations
          ))
        }
        # Step 2: Comprehensive assumption testing
        assumption_results <- self$assumption_tests$test_all_assumptions(
          data = data,
          analysis_type = analysis_type,
          parameters = parameters
        )

        # Step 3: Intelligent method selection
        selected_method <- self$method_selector$select_optimal_method(
          data = data,
          analysis_type = analysis_type,
          assumption_results = assumption_results,
          parameters = parameters
        )

        # Step 4: Execute primary analysis
        primary_results <- private$execute_primary_analysis(
          data = data,
          method = selected_method,
          parameters = parameters
        )

        # Step 5: Sensitivity analysis
        sensitivity_results <- private$conduct_sensitivity_analysis(
          data = data,
          primary_method = selected_method,
          primary_results = primary_results,
          parameters = parameters
        )

        # Step 6: Effect size and clinical significance
        effect_analysis <- private$analyze_effect_sizes(
          data = data,
          results = primary_results,
          context = context
        )

        # Step 7: Generate interpretation guidance
        interpretation <- private$generate_interpretation_guidance(
          primary_results = primary_results,
          assumption_results = assumption_results,
          effect_analysis = effect_analysis,
          context = context
        )

        # Step 8: Create comprehensive result object
        analysis_result <- StatisticalAnalysisResult$new(
          analysis_id = analysis_id,
          success = TRUE,
          data_quality = quality_assessment,
          assumption_tests = assumption_results,
          method_selection = selected_method,
          primary_analysis = primary_results,
          sensitivity_analysis = sensitivity_results,
          effect_analysis = effect_analysis,
          interpretation = interpretation,
          execution_time = as.numeric(Sys.time() - start_time, units = "secs"),
          parameters = parameters,
          context = context
        )
        # Log successful completion
        if (!is.null(self$audit_logger)) {
          self$audit_logger$log_event(
            "Statistical analysis completed successfully",
            level = "INFO",
            category = "STATISTICAL_ANALYSIS",
            details = list(
              analysis_id = analysis_id,
              method_used = selected_method$method_name,
              p_value = primary_results$p_value,
              effect_size = effect_analysis$primary_effect_size,
              assumptions_met = assumption_results$overall_assessment$suitable,
              execution_time = analysis_result$execution_time
            )
          )
        }

        return(analysis_result)

      }, error = function(e) {
        # Log analysis error
        if (!is.null(self$audit_logger)) {
          self$audit_logger$log_event(
            paste("Statistical analysis failed:", e$message),
            level = "ERROR",
            category = "STATISTICAL_ANALYSIS",
            details = list(
              analysis_id = analysis_id,
              error_message = e$message,
              analysis_type = analysis_type
            )
          )
        }

        return(StatisticalAnalysisResult$new(
          analysis_id = analysis_id,
          success = FALSE,
          errors = list(paste("Analysis failed:", e$message)),
          warnings = list(),
          recommendations = list("Review data quality and analysis parameters")
        ))
      })
    },
#' Generate regulatory compliance report
#'
#' @param analysis_result StatisticalAnalysisResult. Analysis results
#' @param report_type character. Type of regulatory report
#'
#' @return RegulatoryReport. Formatted compliance report
#'
generate_regulatory_report = function(analysis_result, report_type = "FDA") {
      RegulatoryReportGenerator$new()$generate_report(
        analysis_result = analysis_result,
report_type = report_type,
compliance_standards = self$config$regulatory_compliance
)
}
),
private = list(
#' Default statistical testing configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete configuration
#'
default_testing_config = function(config) {
      default_config <- list(
assumptions = list(
normality = list(
tests = c("shapiro_wilk", "anderson_darling", "kolmogorov_smirnov"),
alpha_level = 0.05,
minimum_sample_size = 3,
maximum_sample_size = 5000
),
homogeneity = list(
tests = c("levene", "bartlett", "brown_forsythe"),
alpha_level = 0.05,
robust_default = TRUE
),
independence = list(
tests = c("durbin_watson", "runs_test"),
alpha_level = 0.05,
auto_detect = TRUE
),
outliers = list(
methods = c("iqr", "zscore", "modified_zscore", "isolation_forest"),
action_threshold = 0.05,
documentation_required = TRUE
)
),
method_selection = list(
prefer_robust = TRUE,
require_assumption_testing = TRUE,
sensitivity_analysis_required = TRUE,
effect_size_mandatory = TRUE
),
regulatory_compliance = list(
standards = c("FDA", "EMA", "ICH_E9"),
documentation_level = "comprehensive",
validation_required = TRUE,
audit_trail = TRUE
),
interpretation = list(
clinical_context_required = TRUE,
practical_significance_assessment = TRUE,
limitation_documentation = TRUE,
recommendation_generation = TRUE
)
)
modifyList(default_config, config)
},
#' Assess data quality for statistical analysis
#'
#' @param data data.frame. Input data
#' @param context list. Analysis context
#'
#' @return list. Data quality assessment results
#'
assess_data_quality = function(data, context) {
      quality_issues <- list()
      warnings <- list()
      recommendations <- list()

      # Sample size assessment
      if (nrow(data) < self$config$assumptions$normality$minimum_sample_size) {
        quality_issues <- append(quality_issues,
          "Sample size below minimum for reliable analysis")
      } else if (nrow(data) < 30) {
        warnings <- append(warnings,
          "Small sample size may affect test reliability")
        recommendations <- append(recommendations,
          "Consider non-parametric alternatives")
      }

      # Missing data assessment
      missing_props <- sapply(data, function(x) sum(is.na(x)) / length(x))
      high_missing <- missing_props > 0.3

      if (any(high_missing)) {
        quality_issues <- append(quality_issues,
          paste("High missing data in variables:",
                paste(names(data)[high_missing], collapse = ", ")))
      }

      # Data type validation
      numeric_vars <- sapply(data, is.numeric)
      if (!any(numeric_vars)) {
        quality_issues <- append(quality_issues,
          "No numeric variables found for analysis")
      }

      # Extreme value detection
      for (var_name in names(data)[numeric_vars]) {
        var_data <- data[[var_name]][!is.na(data[[var_name]])]

        if (length(var_data) > 0) {
          # Check for extreme outliers (beyond 3 IQRs)
          q1 <- quantile(var_data, 0.25)
          q3 <- quantile(var_data, 0.75)
          iqr <- q3 - q1

          extreme_outliers <- sum(var_data < (q1 - 3 * iqr) | var_data > (q3 + 3 * iqr))

          if (extreme_outliers > 0) {
            warnings <- append(warnings,
              paste("Extreme outliers detected in", var_name,
                    ":", extreme_outliers, "observations"))
            recommendations <- append(recommendations,
              paste("Consider robust methods or outlier treatment for", var_name))
          }
        }
      }
list(
suitable_for_analysis = length(quality_issues) == 0,
critical_issues = quality_issues,
warnings = warnings,
recommendations = recommendations,
data_summary = list(
n_observations = nrow(data),
n_variables = ncol(data),
n_numeric = sum(numeric_vars),
missing_data_summary = missing_props
)
)
},
#' Execute primary statistical analysis
#'
#' @param data data.frame. Analysis data
#' @param method StatisticalMethod. Selected statistical method
#' @param parameters list. Analysis parameters
#'
#' @return list. Primary analysis results
#'
execute_primary_analysis = function(data, method, parameters) {
switch(method$method_name,
"independent_ttest_students" = private$execute_students_ttest(data, parameters),
"independent_ttest_welch" = private$execute_welch_ttest(data, parameters),
"mann_whitney_u" = private$execute_mann_whitney(data, parameters),
"bootstrap_ttest" = private$execute_bootstrap_ttest(data, parameters),
# Default fallback
stop(paste("Unknown method:", method$method_name))
)
},
#' Execute Student's t-test
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Test parameters
#'
#' @return list. T-test results
#'
execute_students_ttest = function(data, parameters) {
# Validate data structure
if (!"group" %in% names(data) || !"response" %in% names(data)) {
stop("Data must contain 'group' and 'response' variables")
}
# Execute t-test
      test_result <- t.test(
        response ~ group,
        data = data,
        var.equal = TRUE,
        alternative = parameters$alternative %||% "two.sided",
        conf.level = parameters$conf_level %||% 0.95
      )

      # Calculate additional statistics
      group_data <- split(data$response, data$group)
      group_stats <- lapply(group_data, function(x) {
        list(
          n = length(x[!is.na(x)]),
          mean = mean(x, na.rm = TRUE),
          sd = sd(x, na.rm = TRUE),
          se = sd(x, na.rm = TRUE) / sqrt(length(x[!is.na(x)])),
          median = median(x, na.rm = TRUE),
          iqr = IQR(x, na.rm = TRUE)
        )
      })

      # Effect size calculation (Cohen's d)
      pooled_sd <- sqrt(((group_stats[[1]]$n - 1) * group_stats[[1]]$sd^2 +
                         (group_stats[[2]]$n - 1) * group_stats[[2]]$sd^2) /
                        (group_stats[[1]]$n + group_stats[[2]]$n - 2))

      cohens_d <- abs(group_stats[[1]]$mean - group_stats[[2]]$mean) / pooled_sd

      list(
        method_name = "Student's Independent Samples t-Test",
        test_statistic = test_result$statistic,
        degrees_freedom = test_result$parameter,
        p_value = test_result$p.value,
        confidence_interval = test_result$conf.int,
        mean_difference = diff(test_result$estimate),
        group_statistics = group_stats,
        effect_size = list(
          cohens_d = cohens_d,
          interpretation = private$interpret_cohens_d(cohens_d)
        ),
        assumptions_required = c("normality", "homogeneity", "independence"),
        raw_result = test_result
      )
},
#' Execute Welch's t-test
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Test parameters
#'
#' @return list. Welch t-test results
#'
execute_welch_ttest = function(data, parameters) {
# Execute Welch t-test
      test_result <- t.test(
        response ~ group,
        data = data,
        var.equal = FALSE,  # Key difference from Student's t-test
        alternative = parameters$alternative %||% "two.sided",
        conf.level = parameters$conf_level %||% 0.95
      )

      # Calculate group statistics
      group_data <- split(data$response, data$group)
      group_stats <- lapply(group_data, function(x) {
        list(
          n = length(x[!is.na(x)]),
          mean = mean(x, na.rm = TRUE),
          sd = sd(x, na.rm = TRUE),
          se = sd(x, na.rm = TRUE) / sqrt(length(x[!is.na(x)])),
          median = median(x, na.rm = TRUE),
          iqr = IQR(x, na.rm = TRUE)
        )
      })

      # Effect size calculation (using pooled SD for comparability)
      pooled_sd <- sqrt((group_stats[[1]]$sd^2 + group_stats[[2]]$sd^2) / 2)
      cohens_d <- abs(group_stats[[1]]$mean - group_stats[[2]]$mean) / pooled_sd

      list(
        method_name = "Welch's Independent Samples t-Test",
        test_statistic = test_result$statistic,
        degrees_freedom = test_result$parameter,
        p_value = test_result$p.value,
        confidence_interval = test_result$conf.int,
        mean_difference = diff(test_result$estimate),
        group_statistics = group_stats,
        effect_size = list(
          cohens_d = cohens_d,
          interpretation = private$interpret_cohens_d(cohens_d)
        ),
        assumptions_required = c("normality", "independence"),
        raw_result = test_result
      )
},
#' Execute Mann-Whitney U test
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Test parameters
#'
#' @return list. Mann-Whitney U test results
#'
execute_mann_whitney = function(data, parameters) {
      group_data <- split(data$response, data$group)

      # Execute Mann-Whitney U test
      test_result <- wilcox.test(
        group_data[[1]],
        group_data[[2]],
        alternative = parameters$alternative %||% "two.sided",
        conf.int = TRUE,
        conf.level = parameters$conf_level %||% 0.95
      )

      # Calculate group statistics
      group_stats <- lapply(group_data, function(x) {
        list(
          n = length(x[!is.na(x)]),
          median = median(x, na.rm = TRUE),
          iqr = IQR(x, na.rm = TRUE),
          mean_rank = mean(rank(c(group_data[[1]], group_data[[2]]))[seq_along(x)]),
          q1 = quantile(x, 0.25, na.rm = TRUE),
          q3 = quantile(x, 0.75, na.rm = TRUE)
        )
      })

      # Effect size calculation (rank-biserial correlation)
      n1 <- group_stats[[1]]$n
      n2 <- group_stats[[2]]$n
      U <- test_result$statistic
      r_rb <- 1 - (2 * U) / (n1 * n2)  # Rank-biserial correlation

      list(
        method_name = "Mann-Whitney U Test",
        test_statistic = test_result$statistic,
        p_value = test_result$p.value,
        confidence_interval = test_result$conf.int,
        median_difference = test_result$estimate,
        group_statistics = group_stats,
        effect_size = list(
          rank_biserial_r = r_rb,
          interpretation = private$interpret_rank_biserial(abs(r_rb))
        ),
        assumptions_required = c("independence", "ordinal_data"),
        raw_result = test_result
      )
},
#' Conduct sensitivity analysis
#'
#' @param data data.frame. Analysis data
#' @param primary_method StatisticalMethod. Primary analysis method
#' @param primary_results list. Primary analysis results
#' @param parameters list. Analysis parameters
#'
#' @return list. Sensitivity analysis results
#'
conduct_sensitivity_analysis = function(data, primary_method, primary_results, parameters) {
      sensitivity_results <- list()

      # Alternative methods sensitivity
      alternative_methods <- private$get_alternative_methods(primary_method$method_name)

      for (alt_method in alternative_methods) {
        tryCatch({
          alt_result <- private$execute_primary_analysis(data,
                                                         list(method_name = alt_method),
                                                         parameters)
          sensitivity_results[[alt_method]] <- list(
            p_value = alt_result$p_value,
            effect_size = alt_result$effect_size,
            conclusion_consistent = abs(alt_result$p_value - primary_results$p_value) < 0.05
          )
        }, error = function(e) {
          sensitivity_results[[alt_method]] <- list(
            error = e$message,
            conclusion_consistent = NA
          )
        })
      }

      # Outlier removal sensitivity
      if (nrow(data) > 10) {
        outlier_removed_data <- private$remove_extreme_outliers(data)

        if (nrow(outlier_removed_data) >= nrow(data) * 0.8) {  # If less than 20% removed
          tryCatch({
            outlier_result <- private$execute_primary_analysis(outlier_removed_data,
                                                               primary_method,
                                                               parameters)
            sensitivity_results$outlier_removal <- list(
              n_removed = nrow(data) - nrow(outlier_removed_data),
              p_value = outlier_result$p_value,
              effect_size = outlier_result$effect_size,
              conclusion_consistent = (primary_results$p_value < 0.05) == (outlier_result$p_value < 0.05)
            )
          }, error = function(e) {
            sensitivity_results$outlier_removal <- list(error = e$message)
          })
        }
      }

      # Bootstrap sensitivity (if applicable)
      if (primary_method$method_name %in% c("independent_ttest_students", "independent_ttest_welch")) {
        tryCatch({
          bootstrap_result <- private$bootstrap_ttest_sensitivity(data, parameters)
          sensitivity_results$bootstrap <- bootstrap_result
        }, error = function(e) {
          sensitivity_results$bootstrap <- list(error = e$message)
        })
      }

      # Overall sensitivity assessment
      consistent_results <- sapply(sensitivity_results, function(x) {
        if ("conclusion_consistent" %in% names(x)) x$conclusion_consistent else NA
      })

      sensitivity_summary <- list(
        methods_tested = length(sensitivity_results),
        consistent_conclusions = sum(consistent_results, na.rm = TRUE),
        consistency_rate = mean(consistent_results, na.rm = TRUE),
        robust_conclusion = mean(consistent_results, na.rm = TRUE) >= 0.75,
        detailed_results = sensitivity_results
      )
return(sensitivity_summary)
},
#' Analyze effect sizes and practical significance
#'
#' @param data data.frame. Analysis data
#' @param results list. Primary analysis results
#' @param context list. Analysis context
#'
#' @return list. Effect size analysis
#'
analyze_effect_sizes = function(data, results, context) {
      effect_analysis <- list()

      # Primary effect size from results
      primary_effect <- if ("cohens_d" %in% names(results$effect_size)) {
        list(
          name = "Cohen's d",
          value = results$effect_size$cohens_d,
          interpretation = results$effect_size$interpretation,
          type = "standardized_mean_difference"
        )
      } else if ("rank_biserial_r" %in% names(results$effect_size)) {
        list(
          name = "Rank-biserial correlation",
          value = results$effect_size$rank_biserial_r,
          interpretation = results$effect_size$interpretation,
          type = "rank_based"
        )
      } else {
        list(
          name = "Unknown",
          value = NA,
          interpretation = "Not calculated",
          type = "undefined"
        )
      }

      effect_analysis$primary_effect_size <- primary_effect

      # Additional effect size calculations
      if ("group_statistics" %in% names(results)) {
        group_stats <- results$group_statistics

        # Glass's delta (using control group SD)
        if (length(group_stats) == 2 && all(sapply(group_stats, function(x) "sd" %in% names(x)))) {
          # Assume first group is control
          glass_delta <- abs(group_stats[[1]]$mean - group_stats[[2]]$mean) / group_stats[[1]]$sd

          effect_analysis$glass_delta <- list(
            value = glass_delta,
            interpretation = private$interpret_cohens_d(glass_delta),  # Same interpretation
            description = "Effect size using control group standard deviation"
          )
        }

        # Overlap statistics
        if (all(sapply(group_stats, function(x) all(c("mean", "sd") %in% names(x))))) {
          overlap_stats <- private$calculate_distribution_overlap(group_stats)
          effect_analysis$overlap_statistics <- overlap_stats
        }
      }

      # Clinical significance assessment
      if ("clinical_threshold" %in% names(context)) {
        clinical_assessment <- private$assess_clinical_significance(
          effect_size = primary_effect$value,
          threshold = context$clinical_threshold,
          context = context
        )
        effect_analysis$clinical_significance <- clinical_assessment
      }

      # Confidence intervals for effect sizes
      if (!is.null(context$effect_size_ci) && context$effect_size_ci) {
        effect_ci <- private$calculate_effect_size_ci(data, results, primary_effect)
        effect_analysis$confidence_intervals <- effect_ci
      }
return(effect_analysis)
},
#' Generate comprehensive interpretation guidance
#'
#' @param primary_results list. Primary analysis results
#' @param assumption_results list. Assumption test results
#' @param effect_analysis list. Effect size analysis
#' @param context list. Analysis context
#'
#' @return list. Interpretation guidance
#'
generate_interpretation_guidance = function(primary_results, assumption_results,
effect_analysis, context) {
      interpretation <- list()

      # Statistical significance interpretation
      alpha_level <- context$alpha_level %||% 0.05
      is_significant <- primary_results$p_value < alpha_level

      interpretation$statistical_significance <- list(
        is_significant = is_significant,
        p_value = primary_results$p_value,
        alpha_level = alpha_level,
        interpretation = if (is_significant) {
          paste0("The result is statistically significant at the ",
                 alpha_level, " level (p = ", round(primary_results$p_value, 4), ").")
        } else {
          paste0("The result is not statistically significant at the ",
                 alpha_level, " level (p = ", round(primary_results$p_value, 4), ").")
        }
      )

      # Effect size interpretation
      effect_size_value <- effect_analysis$primary_effect_size$value
      interpretation$effect_size <- list(
        magnitude = effect_analysis$primary_effect_size$interpretation,
        value = effect_size_value,
        practical_importance = private$assess_practical_importance(
          effect_size_value,
          effect_analysis$primary_effect_size$type,
          context
        )
      )

      # Assumption validity assessment
      interpretation$assumption_validity <- list(
        overall_assessment = assumption_results$overall_assessment$suitable,
        violated_assumptions = assumption_results$overall_assessment$violations,
        impact_on_results = private$assess_assumption_impact(
          assumption_results,
          primary_results$method_name
        ),
        recommendations = assumption_results$overall_assessment$recommendations
      )

      # Clinical significance (if applicable)
      if ("clinical_significance" %in% names(effect_analysis)) {
        interpretation$clinical_significance <- effect_analysis$clinical_significance
      }

      # Confidence interval interpretation
      if ("confidence_interval" %in% names(primary_results)) {
        interpretation$confidence_interval <- list(
          interval = primary_results$confidence_interval,
          level = context$conf_level %||% 0.95,
          interpretation = private$interpret_confidence_interval(
            primary_results$confidence_interval,
            primary_results$method_name
          )
        )
      }

      # Overall conclusion and recommendations
      interpretation$conclusion <- private$generate_overall_conclusion(
        statistical_significance = interpretation$statistical_significance,
        effect_size = interpretation$effect_size,
        assumption_validity = interpretation$assumption_validity,
        clinical_significance = interpretation$clinical_significance %||% NULL,
        context = context
      )

      # Limitations and caveats
      interpretation$limitations <- private$identify_analysis_limitations(
        primary_results,
        assumption_results,
        effect_analysis,
        context
      )
return(interpretation)
},
#' Helper function to interpret Cohen's d
#'
#' @param d numeric. Cohen's d value
#'
#' @return character. Interpretation
#'
interpret_cohens_d = function(d) {
      abs_d <- abs(d)
      if (abs_d < 0.2) {
        "Very small effect"
      } else if (abs_d < 0.5) {
        "Small effect"
      } else if (abs_d < 0.8) {
        "Medium effect"
      } else {
        "Large effect"
      }
},
#' Helper function to interpret rank-biserial correlation
#'
#' @param r numeric. Rank-biserial correlation value
#'
#' @return character. Interpretation
#'
interpret_rank_biserial = function(r) {
      abs_r <- abs(r)
      if (abs_r < 0.1) {
        "Very small effect"
      } else if (abs_r < 0.3) {
        "Small effect"
      } else if (abs_r < 0.5) {
        "Medium effect"
      } else {
        "Large effect"
      }
},
#' Get alternative methods for sensitivity analysis
#'
#' @param primary_method character. Primary method name
#'
#' @return character vector. Alternative method names
#'
get_alternative_methods = function(primary_method) {
switch(primary_method,
"independent_ttest_students" = c("independent_ttest_welch", "mann_whitney_u"),
"independent_ttest_welch" = c("independent_ttest_students", "mann_whitney_u"),
"mann_whitney_u" = c("independent_ttest_welch", "bootstrap_ttest"),
"bootstrap_ttest" = c("independent_ttest_welch", "mann_whitney_u"),
c() # Default: no alternatives
)
},
#' Remove extreme outliers from data
#'
#' @param data data.frame. Input data
#'
#' @return data.frame. Data with extreme outliers removed
#'
remove_extreme_outliers = function(data) {
if (!"response" %in% names(data)) {
return(data)
}
      response_data <- data$response[!is.na(data$response)]

      if (length(response_data) < 5) {
        return(data)  # Don't remove outliers for very small samples
      }

      # Use 3 IQR rule for extreme outliers
      q1 <- quantile(response_data, 0.25)
      q3 <- quantile(response_data, 0.75)
      iqr <- q3 - q1

      lower_bound <- q1 - 3 * iqr
      upper_bound <- q3 + 3 * iqr

      # Keep only non-extreme values
      keep_indices <- !is.na(data$response) &
        data$response >= lower_bound &
        data$response <= upper_bound
return(data[keep_indices, ])
},
#' Bootstrap t-test sensitivity analysis
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Analysis parameters
#' @param n_bootstrap integer. Number of bootstrap samples
#'
#' @return list. Bootstrap sensitivity results
#'
bootstrap_ttest_sensitivity = function(data, parameters, n_bootstrap = 1000) {
      group_data <- split(data$response, data$group)
      group1 <- group_data[[1]][!is.na(group_data[[1]])]
      group2 <- group_data[[2]][!is.na(group_data[[2]])]

      # Bootstrap sampling
      bootstrap_results <- replicate(n_bootstrap, {
        # Resample with replacement
        boot_group1 <- sample(group1, length(group1), replace = TRUE)
        boot_group2 <- sample(group2, length(group2), replace = TRUE)

        # Calculate statistics
        mean_diff <- mean(boot_group1) - mean(boot_group2)
        pooled_sd <- sqrt((var(boot_group1) + var(boot_group2)) / 2)
        cohens_d <- mean_diff / pooled_sd

        # Bootstrap t-test
        t_result <- t.test(boot_group1, boot_group2)

        list(
          mean_difference = mean_diff,
          cohens_d = cohens_d,
          p_value = t_result$p.value,
          t_statistic = t_result$statistic
        )
      }, simplify = FALSE)

      # Summarize bootstrap results
      p_values <- sapply(bootstrap_results, function(x) x$p_value)
      cohens_d_values <- sapply(bootstrap_results, function(x) x$cohens_d)

      list(
        n_bootstrap = n_bootstrap,
        p_value_median = median(p_values, na.rm = TRUE),
        p_value_ci = quantile(p_values, c(0.025, 0.975), na.rm = TRUE),
        cohens_d_median = median(cohens_d_values, na.rm = TRUE),
        cohens_d_ci = quantile(cohens_d_values, c(0.025, 0.975), na.rm = TRUE),
        proportion_significant = mean(p_values < 0.05, na.rm = TRUE)
      )
},
#' Calculate distribution overlap statistics
#'
#' @param group_stats list. Group statistics
#'
#' @return list. Overlap statistics
#'
calculate_distribution_overlap = function(group_stats) {
if (length(group_stats) != 2) {
return(list(error = "Overlap calculation requires exactly 2 groups"))
}
# Extract means and SDs
      mean1 <- group_stats[[1]]$mean
      mean2 <- group_stats[[2]]$mean
      sd1 <- group_stats[[1]]$sd
      sd2 <- group_stats[[2]]$sd

      # Cohen's U3 (percentage of group 2 below mean of group 1)
      u3 <- pnorm(mean1, mean2, sd2)

      # Probability of superiority
      prob_superiority <- pnorm((mean1 - mean2) / sqrt(sd1^2 + sd2^2))

      # Overlap coefficient (approximate)
      if (sd1 > 0 && sd2 > 0) {
        overlap_coef <- 2 * pnorm(-abs(mean1 - mean2) / (2 * sqrt((sd1^2 + sd2^2) / 2)))
      } else {
        overlap_coef <- NA
      }
list(
cohens_u3 = u3,
probability_superiority = prob_superiority,
overlap_coefficient = overlap_coef,
interpretation = list(
u3_meaning = paste0(round(u3 * 100, 1), "% of group 2 scores below group 1 mean"),
superiority_meaning = paste0(round(prob_superiority * 100, 1),
"% probability that group 1 score > group 2 score")
)
)
},
#' Assess clinical significance
#'
#' @param effect_size numeric. Effect size value
#' @param threshold numeric. Clinical significance threshold
#' @param context list. Analysis context
#'
#' @return list. Clinical significance assessment
#'
assess_clinical_significance = function(effect_size, threshold, context) {
      is_clinically_significant <- abs(effect_size) >= threshold

      list(
        is_clinically_significant = is_clinically_significant,
        effect_size = effect_size,
        threshold = threshold,
        interpretation = if (is_clinically_significant) {
          paste0("The effect size (", round(effect_size, 3),
                 ") meets or exceeds the clinical significance threshold (",
                 threshold, ").")
        } else {
          paste0("The effect size (", round(effect_size, 3),
                 ") does not meet the clinical significance threshold (",
                 threshold, ").")
        },
        clinical_context = context$clinical_context %||% "No specific context provided"
)
},
#' Assess practical importance of effect size
#'
#' @param effect_size numeric. Effect size value
#' @param effect_type character. Type of effect size
#' @param context list. Analysis context
#'
#' @return list. Practical importance assessment
#'
assess_practical_importance = function(effect_size, effect_type, context) {
# Domain-specific thresholds
      domain_thresholds <- context$domain_thresholds %||% list(
        clinical = list(small = 0.2, medium = 0.5, large = 0.8),
        educational = list(small = 0.25, medium = 0.6, large = 1.0),
        psychological = list(small = 0.2, medium = 0.5, large = 0.8)
      )

      domain <- context$domain %||% "clinical"
      thresholds <- domain_thresholds[[domain]] %||% domain_thresholds$clinical

      abs_effect <- abs(effect_size)

      practical_magnitude <- if (abs_effect < thresholds$small) {
        "Minimal practical importance"
      } else if (abs_effect < thresholds$medium) {
        "Small practical importance"
      } else if (abs_effect < thresholds$large) {
        "Moderate practical importance"
      } else {
        "Large practical importance"
      }
list(
magnitude = practical_magnitude,
domain_context = domain,
thresholds_used = thresholds,
recommendation = private$generate_practical_recommendation(
practical_magnitude,
context
)
)
},
#' Generate practical recommendation based on effect size
#'
#' @param magnitude character. Practical magnitude assessment
#' @param context list. Analysis context
#'
#' @return character. Practical recommendation
#'
generate_practical_recommendation = function(magnitude, context) {
switch(magnitude,
"Minimal practical importance" =
"The observed difference may have limited practical significance. Consider whether the effect is meaningful in the specific context.",
"Small practical importance" =
"The observed difference suggests a small but potentially meaningful effect. Evaluate cost-benefit considerations for implementation.",
"Moderate practical importance" =
"The observed difference indicates a moderate effect that is likely to be practically meaningful. Strong consideration for implementation is warranted.",
"Large practical importance" =
"The observed difference indicates a large effect with substantial practical importance. Implementation should be strongly considered.",
"Unknown practical importance"
)
}
  )
)
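To see how the engine is intended to be driven, here is a hedged usage sketch. It assumes the companion classes referenced above (AssumptionTestingFramework, StatisticalMethodSelector, StatisticalAnalysisResult, and optionally AuditLogger) are available from the other files in this tutorial's package, and it uses simulated data purely for illustration.

# Illustrative usage sketch (assumes the tutorial's R files have been sourced
# or installed as a package so the companion classes resolve at runtime).
set.seed(42)
trial_data <- data.frame(
  group    = rep(c("control", "treatment"), each = 30),
  response = c(rnorm(30, mean = 100, sd = 15), rnorm(30, mean = 108, sd = 15))
)

engine <- StatisticalTestingEngine$new()

result <- engine$conduct_analysis(
  data          = trial_data,
  analysis_type = "independent_ttest",
  parameters    = list(alternative = "two.sided", conf_level = 0.95),
  context       = list(clinical_threshold = 0.5, domain = "clinical")
)

# The result object bundles data quality checks, assumption tests, the selected
# method, sensitivity analyses, effect sizes, and interpretation guidance.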
Assumption Testing Framework
Comprehensive Assumption Validation
Create a sophisticated assumption testing system for statistical rigor:
# File: R/assumption_testing_framework.R
#' Comprehensive Assumption Testing Framework
#'
#' @description Advanced assumption testing system with automated validation,
#' intelligent interpretation, and regulatory compliance documentation.
#'
#' @export
AssumptionTestingFramework <- R6::R6Class(
  "AssumptionTestingFramework",
public = list(
#' @field config assumption testing configuration
config = NULL,
#' @field test_registry collection of available assumption tests
test_registry = NULL,
#' Initialize assumption testing framework
#'
#' @param config list. Assumption testing configuration
#'
initialize = function(config = list()) {
      self$config <- private$default_assumption_config(config)
      self$test_registry <- private$initialize_test_registry()
},
#' Test all relevant assumptions for analysis
#'
#' @param data data.frame. Analysis data
#' @param analysis_type character. Type of statistical analysis
#' @param parameters list. Analysis parameters
#'
#' @return AssumptionTestResults. Comprehensive assumption test results
#'
test_all_assumptions = function(data, analysis_type, parameters = list()) {
# Determine required assumptions for analysis type
      required_assumptions <- private$get_required_assumptions(analysis_type)

      assumption_results <- list()

      # Test each required assumption
      for (assumption in required_assumptions) {
        test_result <- switch(assumption,
          "normality" = self$test_normality(data, parameters),
          "homogeneity" = self$test_homogeneity(data, parameters),
          "independence" = self$test_independence(data, parameters),
          "linearity" = self$test_linearity(data, parameters),
          "outliers" = self$detect_outliers(data, parameters),
          list(error = paste("Unknown assumption:", assumption))
        )

        assumption_results[[assumption]] <- test_result
      }

      # Generate overall assessment
      overall_assessment <- private$assess_overall_assumptions(
        assumption_results,
        analysis_type,
        parameters
      )

      AssumptionTestResults$new(
        analysis_type = analysis_type,
        assumption_results = assumption_results,
        overall_assessment = overall_assessment,
        parameters = parameters
      )
},
#' Test normality assumption
#'
#' @param data data.frame. Input data
#' @param parameters list. Test parameters
#'
#' @return list. Normality test results
#'
test_normality = function(data, parameters = list()) {
<- list()
normality_results
# Identify numeric variables to test
if ("response" %in% names(data)) {
<- "response"
test_vars else {
} <- names(data)[sapply(data, is.numeric)]
numeric_vars <- numeric_vars[1:min(3, length(numeric_vars))] # Limit to 3 variables
test_vars
}
for (var_name in test_vars) {
<- data[[var_name]][!is.na(data[[var_name]])]
var_data
if (length(var_data) < 3) {
<- list(
normality_results[[var_name]] error = "Insufficient data for normality testing",
n = length(var_data)
)next
}
<- list()
var_results
# Shapiro-Wilk test (for n <= 5000)
if (length(var_data) <= 5000) {
tryCatch({
<- shapiro.test(var_data)
shapiro_result $shapiro_wilk <- list(
var_resultsstatistic = shapiro_result$statistic,
p_value = shapiro_result$p.value,
method = "Shapiro-Wilk",
conclusion = shapiro_result$p.value >= self$config$normality$alpha_level
)error = function(e) {
}, $shapiro_wilk <- list(error = e$message)
var_results
})
}
# Anderson-Darling test
if (requireNamespace("nortest", quietly = TRUE)) {
tryCatch({
<- nortest::ad.test(var_data)
ad_result $anderson_darling <- list(
var_resultsstatistic = ad_result$statistic,
p_value = ad_result$p.value,
method = "Anderson-Darling",
conclusion = ad_result$p.value >= self$config$normality$alpha_level
)error = function(e) {
}, $anderson_darling <- list(error = e$message)
var_results
})
}
# Kolmogorov-Smirnov test
tryCatch({
# Compare to normal distribution with sample mean and sd
<- ks.test(var_data, "pnorm", mean(var_data), sd(var_data))
ks_result $kolmogorov_smirnov <- list(
var_resultsstatistic = ks_result$statistic,
p_value = ks_result$p.value,
method = "Kolmogorov-Smirnov",
conclusion = ks_result$p.value >= self$config$normality$alpha_level
)error = function(e) {
}, $kolmogorov_smirnov <- list(error = e$message)
var_results
})
# Descriptive statistics for normality assessment
$descriptive_assessment <- private$assess_normality_descriptive(var_data)
var_results
# Overall normality conclusion for this variable
<- var_results[!sapply(var_results, function(x) "error" %in% names(x))]
valid_tests if (length(valid_tests) > 0) {
<- sapply(valid_tests, function(x) {
test_conclusions if ("conclusion" %in% names(x)) x$conclusion else NA
})
# Conservative approach: assume non-normal if any test rejects
$overall_conclusion <- list(
var_resultsassumes_normal = all(test_conclusions, na.rm = TRUE),
n_tests = sum(!is.na(test_conclusions)),
consensus = mean(test_conclusions, na.rm = TRUE)
)
}
<- var_results
normality_results[[var_name]]
}
return(normality_results)
},
#' Test homogeneity of variance assumption
#'
#' @param data data.frame. Input data with group and response variables
#' @param parameters list. Test parameters
#'
#' @return list. Homogeneity test results
#'
test_homogeneity = function(data, parameters = list()) {
if (!"group" %in% names(data) || !"response" %in% names(data)) {
return(list(error = "Data must contain 'group' and 'response' variables"))
}
# Remove missing values
<- data[!is.na(data$group) & !is.na(data$response), ]
complete_data
if (nrow(complete_data) < 6) {
return(list(error = "Insufficient data for homogeneity testing"))
}
<- table(complete_data$group)
group_counts if (any(group_counts < 2)) {
return(list(error = "Each group must have at least 2 observations"))
}
<- list()
homogeneity_results
# Levene's test (robust)
tryCatch({
if (requireNamespace("car", quietly = TRUE)) {
<- car::leveneTest(response ~ group, data = complete_data)
levene_result $levene <- list(
homogeneity_resultsstatistic = levene_result$`F value`[1],
p_value = levene_result$`Pr(>F)`[1],
df = levene_result$Df,
method = "Levene's Test",
conclusion = levene_result$`Pr(>F)`[1] >= self$config$homogeneity$alpha_level
)else {
} # Manual Levene's test implementation
<- private$manual_levene_test(complete_data)
levene_manual $levene <- levene_manual
homogeneity_results
}error = function(e) {
}, $levene <- list(error = e$message)
homogeneity_results
})
# Bartlett's test (assumes normality)
tryCatch({
<- bartlett.test(response ~ group, data = complete_data)
bartlett_result $bartlett <- list(
homogeneity_resultsstatistic = bartlett_result$statistic,
p_value = bartlett_result$p.value,
df = bartlett_result$parameter,
method = "Bartlett's Test",
conclusion = bartlett_result$p.value >= self$config$homogeneity$alpha_level,
note = "Assumes normality within groups"
)error = function(e) {
}, $bartlett <- list(error = e$message)
homogeneity_results
})
# Brown-Forsythe test (robust alternative to Levene's)
tryCatch({
<- private$brown_forsythe_test(complete_data)
bf_result $brown_forsythe <- bf_result
homogeneity_resultserror = function(e) {
}, $brown_forsythe <- list(error = e$message)
homogeneity_results
})
# Descriptive assessment
<- by(complete_data$response, complete_data$group, var, na.rm = TRUE)
group_variances $descriptive_assessment <- list(
homogeneity_resultsgroup_variances = as.list(group_variances),
variance_ratio = max(group_variances) / min(group_variances),
variance_ratio_interpretation = private$interpret_variance_ratio(
max(group_variances) / min(group_variances)
)
)
# Overall homogeneity conclusion
<- homogeneity_results[!sapply(homogeneity_results, function(x) "error" %in% names(x))]
valid_tests <- sapply(valid_tests, function(x) {
test_conclusions if ("conclusion" %in% names(x)) x$conclusion else NA
})
if (length(test_conclusions) > 0) {
$overall_conclusion <- list(
homogeneity_resultsassumes_homogeneity = mean(test_conclusions, na.rm = TRUE) >= 0.5,
n_tests = sum(!is.na(test_conclusions)),
consensus = mean(test_conclusions, na.rm = TRUE),
recommendation = if (mean(test_conclusions, na.rm = TRUE) >= 0.5) {
"Homogeneity assumption appears reasonable"
else {
} "Consider using methods that don't assume equal variances (e.g., Welch's t-test)"
}
)
}
return(homogeneity_results)
},
#' Test independence assumption
#'
#' @param data data.frame. Input data
#' @param parameters list. Test parameters
#'
#' @return list. Independence test results
#'
test_independence = function(data, parameters = list()) {
<- list()
independence_results
# Note about independence assumption
$note <- paste(
independence_results"Independence is primarily a design issue that should be ensured",
"through proper randomization and data collection procedures.",
"Statistical tests can only detect certain types of dependence."
)
# For time series or ordered data, test for autocorrelation
if ("response" %in% names(data) && nrow(data) >= 10) {
<- data$response[!is.na(data$response)]
response_data
# Durbin-Watson test for autocorrelation
if (requireNamespace("lmtest", quietly = TRUE)) {
tryCatch({
# Simple linear model for Durbin-Watson
<- data.frame(
temp_data y = response_data,
x = seq_along(response_data)
)<- lm(y ~ x, data = temp_data)
temp_model <- lmtest::dwtest(temp_model)
dw_result
$durbin_watson <- list(
independence_resultsstatistic = dw_result$statistic,
p_value = dw_result$p.value,
method = "Durbin-Watson Test",
conclusion = dw_result$p.value >= self$config$independence$alpha_level,
interpretation = "Tests for first-order autocorrelation"
)error = function(e) {
}, $durbin_watson <- list(error = e$message)
independence_results
})
}
# Runs test for randomness
tryCatch({
<- private$runs_test(response_data)
runs_result $runs_test <- runs_result
independence_resultserror = function(e) {
}, $runs_test <- list(error = e$message)
independence_results
})
}
# Design-based independence assessment
$design_assessment <- list(
independence_resultsdata_collection_method = parameters$data_collection_method %||% "Not specified",
randomization_used = parameters$randomization %||% "Not specified",
temporal_ordering = "order" %in% names(data) || "time" %in% names(data),
clustering_present = parameters$clustering %||% "Not specified",
repeated_measures = parameters$repeated_measures %||% "Not specified"
)
# Overall independence assessment
if (any(sapply(independence_results, function(x) "conclusion" %in% names(x)))) {
<- sapply(independence_results, function(x) {
test_conclusions if ("conclusion" %in% names(x)) x$conclusion else NA
})
$overall_conclusion <- list(
independence_resultsstatistical_evidence_for_independence = mean(test_conclusions, na.rm = TRUE) >= 0.5,
primary_concern = "Ensure independence through study design",
recommendation = paste(
"Independence is best ensured through proper study design.",
"If data collection involved clustering, temporal dependence,",
"or repeated measures, consider appropriate analytical methods."
)
)
}
return(independence_results)
},
#' Detect outliers in data
#'
#' @param data data.frame. Input data
#' @param parameters list. Detection parameters
#'
#' @return list. Outlier detection results
#'
detect_outliers = function(data, parameters = list()) {
<- list()
outlier_results
# Identify numeric variables for outlier detection
if ("response" %in% names(data)) {
<- "response"
test_vars else {
} <- names(data)[sapply(data, is.numeric)]
numeric_vars <- numeric_vars[1:min(3, length(numeric_vars))]
test_vars
}
for (var_name in test_vars) {
<- data[[var_name]][!is.na(data[[var_name]])]
var_data
if (length(var_data) < 5) {
<- list(
outlier_results[[var_name]] error = "Insufficient data for outlier detection",
n = length(var_data)
)next
}
<- list()
var_outlier_results
# IQR method (1.5 * IQR rule)
<- quantile(var_data, 0.25)
q1 <- quantile(var_data, 0.75)
q3 <- q3 - q1
iqr
<- q1 - 1.5 * iqr
iqr_lower <- q3 + 1.5 * iqr
iqr_upper <- which(var_data < iqr_lower | var_data > iqr_upper)
iqr_outliers
$iqr_method <- list(
var_outlier_resultsmethod = "IQR (1.5 * IQR rule)",
outlier_indices = iqr_outliers,
outlier_count = length(iqr_outliers),
outlier_proportion = length(iqr_outliers) / length(var_data),
bounds = c(iqr_lower, iqr_upper),
outlier_values = var_data[iqr_outliers]
)
# Modified Z-score method
<- median(var_data)
median_val <- mad(var_data)
mad_val
if (mad_val > 0) {
<- 0.6745 * (var_data - median_val) / mad_val
modified_z <- which(abs(modified_z) > 3.5)
mz_outliers
$modified_zscore <- list(
var_outlier_resultsmethod = "Modified Z-score",
outlier_indices = mz_outliers,
outlier_count = length(mz_outliers),
outlier_proportion = length(mz_outliers) / length(var_data),
threshold = 3.5,
outlier_values = var_data[mz_outliers],
modified_z_scores = modified_z[mz_outliers]
)
}
# Isolation Forest (if sufficient data)
if (length(var_data) >= 20 && requireNamespace("isotree", quietly = TRUE)) {
tryCatch({
<- isotree::isolation.forest(matrix(var_data, ncol = 1))
iso_model <- isotree::predict.isolation_forest(iso_model, matrix(var_data, ncol = 1))
anomaly_scores <- quantile(anomaly_scores, 0.95) # Top 5% as outliers
iso_threshold <- which(anomaly_scores > iso_threshold)
iso_outliers
$isolation_forest <- list(
var_outlier_resultsmethod = "Isolation Forest",
outlier_indices = iso_outliers,
outlier_count = length(iso_outliers),
outlier_proportion = length(iso_outliers) / length(var_data),
threshold = iso_threshold,
outlier_values = var_data[iso_outliers],
anomaly_scores = anomaly_scores[iso_outliers]
)error = function(e) {
}, $isolation_forest <- list(error = e$message)
var_outlier_results
})
}
# Consensus outlier identification
<- unique(c(
all_outlier_indices
iqr_outliers,if (exists("mz_outliers")) mz_outliers else c(),
if ("isolation_forest" %in% names(var_outlier_results) &&
!"error" %in% names(var_outlier_results$isolation_forest)) {
$isolation_forest$outlier_indices
var_outlier_resultselse c()
}
))
# Count how many methods identify each outlier
<- sapply(all_outlier_indices, function(idx) {
outlier_consensus <- 0
count if (idx %in% iqr_outliers) count <- count + 1
if (exists("mz_outliers") && idx %in% mz_outliers) count <- count + 1
if ("isolation_forest" %in% names(var_outlier_results) &&
!"error" %in% names(var_outlier_results$isolation_forest) &&
%in% var_outlier_results$isolation_forest$outlier_indices) count <- count + 1
idx
count
})
# High-confidence outliers (identified by multiple methods)
<- all_outlier_indices[outlier_consensus >= 2]
high_confidence_outliers
$consensus <- list(
var_outlier_resultsall_potential_outliers = all_outlier_indices,
high_confidence_outliers = high_confidence_outliers,
outlier_consensus_scores = outlier_consensus,
recommendation = if (length(high_confidence_outliers) > 0) {
paste("Consider investigating", length(high_confidence_outliers),
"high-confidence outliers")
else if (length(all_outlier_indices) > 0) {
} paste("Potential outliers detected but with low consensus.",
"Manual review recommended")
else {
} "No significant outliers detected"
}
)
<- var_outlier_results
outlier_results[[var_name]]
}
return(outlier_results)
}
),
private = list(
#' Default assumption testing configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete configuration
#'
default_assumption_config = function(config) {
      default_config <- list(
        normality = list(
tests = c("shapiro_wilk", "anderson_darling", "kolmogorov_smirnov"),
alpha_level = 0.05,
min_sample_size = 3,
max_sample_size_shapiro = 5000,
descriptive_thresholds = list(
skewness = 2.0,
kurtosis = 7.0
)
),
homogeneity = list(
tests = c("levene", "bartlett", "brown_forsythe"),
alpha_level = 0.05,
variance_ratio_concern = 4.0,
prefer_robust = TRUE
),
independence = list(
tests = c("durbin_watson", "runs_test"),
alpha_level = 0.05,
design_based_assessment = TRUE
),
outliers = list(
methods = c("iqr", "modified_zscore", "isolation_forest"),
iqr_multiplier = 1.5,
modified_z_threshold = 3.5,
isolation_forest_threshold = 0.95,
consensus_threshold = 2
)
)
modifyList(default_config, config)
},
#' Initialize test registry
#'
#' @return list. Available statistical tests
#'
initialize_test_registry = function() {
list(
normality = list(
shapiro_wilk = list(
function_name = "shapiro.test",
package = "stats",
sample_size_limits = c(3, 5000),
description = "Shapiro-Wilk test for normality"
          ),
          anderson_darling = list(
function_name = "ad.test",
package = "nortest",
sample_size_limits = c(7, Inf),
description = "Anderson-Darling test for normality"
          ),
          kolmogorov_smirnov = list(
function_name = "ks.test",
package = "stats",
sample_size_limits = c(3, Inf),
description = "Kolmogorov-Smirnov test for normality"
)
),
homogeneity = list(
levene = list(
function_name = "leveneTest",
package = "car",
description = "Levene's test for homogeneity of variance"
          ),
          bartlett = list(
function_name = "bartlett.test",
package = "stats",
description = "Bartlett's test for homogeneity of variance"
)
)
)
},
#' Get required assumptions for analysis type
#'
#' @param analysis_type character. Type of analysis
#'
#' @return character vector. Required assumptions
#'
get_required_assumptions = function(analysis_type) {
switch(analysis_type,
"independent_ttest" = c("normality", "homogeneity", "independence", "outliers"),
"paired_ttest" = c("normality", "independence", "outliers"),
"one_sample_ttest" = c("normality", "independence", "outliers"),
"anova" = c("normality", "homogeneity", "independence", "outliers"),
"correlation" = c("normality", "linearity", "independence", "outliers"),
"regression" = c("normality", "linearity", "independence", "homogeneity", "outliers"),
# Default assumptions for unknown analysis types
c("normality", "independence", "outliers")
)
},
#' Assess overall assumption validity
#'
#' @param assumption_results list. Results from individual assumption tests
#' @param analysis_type character. Type of analysis
#' @param parameters list. Analysis parameters
#'
#' @return list. Overall assessment
#'
assess_overall_assumptions = function(assumption_results, analysis_type, parameters) {
<- list()
violations <- list()
warnings <- list()
recommendations
# Check each assumption
      for (assumption_name in names(assumption_results)) {
        assumption_result <- assumption_results[[assumption_name]]

        if ("error" %in% names(assumption_result)) {
          warnings <- append(warnings,
                             paste("Could not test", assumption_name, "assumption:",
                                   assumption_result$error))
          next
        }

        assumption_met <- switch(assumption_name,
          "normality" = private$assess_normality_overall(assumption_result),
          "homogeneity" = private$assess_homogeneity_overall(assumption_result),
          "independence" = private$assess_independence_overall(assumption_result),
          "outliers" = private$assess_outliers_overall(assumption_result),
          list(met = TRUE) # Default: assume met if unknown
        )

        if (!assumption_met$met) {
          violations <- append(violations, assumption_name)
          warnings <- append(warnings, assumption_met$warning)
          recommendations <- append(recommendations, assumption_met$recommendation)
        }
      }

      # Overall suitability assessment
      critical_violations <- intersect(violations, c("normality", "independence"))
      suitable_for_analysis <- length(critical_violations) == 0

      # Generate method recommendations
      method_recommendations <- private$generate_method_recommendations(
        violations, analysis_type, assumption_results
      )

      list(
        suitable = suitable_for_analysis,
        violations = violations,
        critical_violations = critical_violations,
        warnings = warnings,
        recommendations = c(recommendations, method_recommendations),
        assumption_summary = private$create_assumption_summary(assumption_results),
        suggested_alternatives = if (!suitable_for_analysis) {
          private$suggest_alternative_methods(violations, analysis_type)
        } else NULL
      )
    },

    #' Assess normality assumption overall
    #'
    #' @param normality_result list. Normality test results
    #'
    #' @return list. Assessment result
    #'
    assess_normality_overall = function(normality_result) {
      if (length(normality_result) == 0) {
        return(list(met = FALSE, warning = "No normality tests completed"))
      }

      # Look for overall conclusions in each variable
      overall_conclusions <- sapply(normality_result, function(var_result) {
        if ("overall_conclusion" %in% names(var_result)) {
          var_result$overall_conclusion$assumes_normal
        } else {
          NA
        }
      })

      if (all(is.na(overall_conclusions))) {
        return(list(
          met = FALSE,
          warning = "Could not determine normality status",
          recommendation = "Manual inspection of data distribution recommended"
        ))
      }

      normality_met <- all(overall_conclusions, na.rm = TRUE)

      list(
        met = normality_met,
        warning = if (!normality_met) "Normality assumption appears violated" else NULL,
        recommendation = if (!normality_met) {
          "Consider non-parametric alternatives or data transformation"
        } else NULL
      )
    },

    #' Assess homogeneity assumption overall
    #'
    #' @param homogeneity_result list. Homogeneity test results
    #'
    #' @return list. Assessment result
    #'
    assess_homogeneity_overall = function(homogeneity_result) {
      if ("overall_conclusion" %in% names(homogeneity_result)) {
        homogeneity_met <- homogeneity_result$overall_conclusion$assumes_homogeneity

        list(
          met = homogeneity_met,
          warning = if (!homogeneity_met) "Homogeneity of variance assumption appears violated" else NULL,
          recommendation = if (!homogeneity_met) {
            "Consider Welch's t-test or other methods that don't assume equal variances"
          } else NULL
        )
      } else {
        list(
          met = FALSE,
          warning = "Could not assess homogeneity of variance",
          recommendation = "Manual inspection of group variances recommended"
        )
      }
    },

    #' Assess independence assumption overall
    #'
    #' @param independence_result list. Independence test results
    #'
    #' @return list. Assessment result
    #'
    assess_independence_overall = function(independence_result) {
      if ("overall_conclusion" %in% names(independence_result)) {
        independence_met <- independence_result$overall_conclusion$statistical_evidence_for_independence

        list(
          met = independence_met,
          warning = if (!independence_met) {
            "Statistical evidence suggests potential dependence in data"
          } else {
            "Independence primarily depends on study design"
          },
          recommendation = "Ensure independence through proper study design and data collection"
        )
      } else {
        list(
          met = TRUE, # Assume met if cannot test
          warning = "Independence assumption not statistically testable",
          recommendation = "Ensure independence through study design"
        )
      }
    },

    #' Assess outliers overall
    #'
    #' @param outlier_result list. Outlier detection results
    #'
    #' @return list. Assessment result
    #'
    assess_outliers_overall = function(outlier_result) {
      # Count high-confidence outliers across all variables
      total_high_confidence <- 0
      total_observations <- 0

      for (var_result in outlier_result) {
        if ("consensus" %in% names(var_result)) {
          total_high_confidence <- total_high_confidence +
            length(var_result$consensus$high_confidence_outliers)

          # Estimate total observations from first method
          if ("iqr_method" %in% names(var_result)) {
            total_observations <- total_observations +
              length(var_result$iqr_method$outlier_values) +
              var_result$iqr_method$outlier_count
          }
        }
      }

      outlier_proportion <- if (total_observations > 0) {
        total_high_confidence / total_observations
      } else 0

      # Consider problematic if >5% are high-confidence outliers
      outliers_problematic <- outlier_proportion > 0.05

      list(
        met = !outliers_problematic,
        warning = if (outliers_problematic) {
          paste("High proportion of outliers detected:",
                round(outlier_proportion * 100, 1), "%")
        } else if (total_high_confidence > 0) {
          "Some outliers detected but within acceptable range"
        } else NULL,
        recommendation = if (outliers_problematic) {
          "Consider outlier treatment or robust statistical methods"
        } else if (total_high_confidence > 0) {
          "Review identified outliers for data entry errors or valid extreme values"
        } else NULL
      )
    },

    #' Manual implementation of Levene's test
    #'
    #' @param data data.frame. Data with group and response variables
    #'
    #' @return list. Levene's test results
    #'
    manual_levene_test = function(data) {
      # Calculate group medians (named vector indexed by group label)
      group_medians <- tapply(data$response, data$group, median, na.rm = TRUE)

      # Calculate absolute deviations from group medians
      data$abs_dev <- abs(data$response - group_medians[as.character(data$group)])

      # Perform ANOVA on absolute deviations
      anova_result <- anova(lm(abs_dev ~ group, data = data))

      list(
        statistic = anova_result$`F value`[1],
        p_value = anova_result$`Pr(>F)`[1],
        df = anova_result$Df,
        method = "Levene's Test (manual implementation)",
        conclusion = anova_result$`Pr(>F)`[1] >= self$config$homogeneity$alpha_level
      )
    },

    #' Brown-Forsythe test implementation
    #'
    #' @param data data.frame. Data with group and response variables
    #'
    #' @return list. Brown-Forsythe test results
    #'
    brown_forsythe_test = function(data) {
      # Similar to Levene's test but uses group medians instead of means;
      # this is what our manual Levene's implementation already does
      levene_result <- private$manual_levene_test(data)
      levene_result$method <- "Brown-Forsythe Test"

      return(levene_result)
    },

    #' Runs test for randomness
    #'
    #' @param x numeric vector. Data series
    #'
    #' @return list. Runs test results
    #'
    runs_test = function(x) {
      if (length(x) < 10) {
        return(list(error = "Insufficient data for runs test"))
      }

      # Convert to binary sequence (above/below median)
      median_x <- median(x, na.rm = TRUE)
      binary_seq <- as.numeric(x > median_x)

      # Count runs
      runs <- 1
      for (i in 2:length(binary_seq)) {
        if (binary_seq[i] != binary_seq[i - 1]) {
          runs <- runs + 1
        }
      }

      # Calculate expected runs and variance
      n1 <- sum(binary_seq)
      n2 <- length(binary_seq) - n1

      if (n1 == 0 || n2 == 0) {
        return(list(error = "All values are on one side of median"))
      }

      expected_runs <- (2 * n1 * n2) / (n1 + n2) + 1
      var_runs <- (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) /
        ((n1 + n2)^2 * (n1 + n2 - 1))

      if (var_runs <= 0) {
        return(list(error = "Cannot calculate variance for runs test"))
      }

      # Z-score and two-sided p-value
      z_score <- (runs - expected_runs) / sqrt(var_runs)
      p_value <- 2 * (1 - pnorm(abs(z_score)))

      list(
        statistic = runs,
        expected_runs = expected_runs,
        z_score = z_score,
        p_value = p_value,
        method = "Runs Test for Randomness",
        conclusion = p_value >= self$config$independence$alpha_level,
        interpretation = "Tests whether sequence is random"
      )
    },

    #' Assess normality using descriptive statistics
    #'
    #' @param x numeric vector. Data to assess
    #'
    #' @return list. Descriptive normality assessment
    #'
    assess_normality_descriptive = function(x) {
      if (length(x) < 4) {
        return(list(error = "Insufficient data for descriptive assessment"))
      }

      mean_x <- mean(x)
      sd_x <- sd(x)
      n <- length(x)

      # Skewness
      skewness <- sum((x - mean_x)^3) / (n * sd_x^3)

      # Kurtosis (excess kurtosis)
      kurtosis <- sum((x - mean_x)^4) / (n * sd_x^4) - 3

      # Interpret skewness and kurtosis
      skew_interpretation <- if (abs(skewness) < 0.5) {
        "Approximately symmetric"
      } else if (abs(skewness) < 1.0) {
        "Moderately skewed"
      } else {
        "Highly skewed"
      }

      kurt_interpretation <- if (abs(kurtosis) < 0.5) {
        "Normal kurtosis"
      } else if (kurtosis > 0.5) {
        "Heavy-tailed (leptokurtic)"
      } else {
        "Light-tailed (platykurtic)"
      }

      # Overall assessment against configured thresholds
      likely_normal <- abs(skewness) < self$config$normality$descriptive_thresholds$skewness &&
        abs(kurtosis) < self$config$normality$descriptive_thresholds$kurtosis

      list(
        skewness = skewness,
        kurtosis = kurtosis,
        skew_interpretation = skew_interpretation,
        kurt_interpretation = kurt_interpretation,
        likely_normal = likely_normal,
        sample_size = n
      )
    },

    #' Interpret variance ratio
    #'
    #' @param ratio numeric. Variance ratio (max/min)
    #'
    #' @return character. Interpretation
    #'
    interpret_variance_ratio = function(ratio) {
      if (ratio <= 2) {
        "Variances are quite similar"
      } else if (ratio <= 4) {
        "Moderate difference in variances"
      } else if (ratio <= 10) {
        "Large difference in variances"
      } else {
        "Very large difference in variances"
      }
    }
  )
)
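Before relying on these helpers inside the engine, it can help to sanity-check the descriptive normality screen on simulated data. The short sketch below reuses the same skewness and kurtosis formulas outside the class; the threshold values are illustrative assumptions rather than the engine's configured defaults.

# Standalone sanity check of the descriptive normality screen (illustrative thresholds)
set.seed(123)
x <- rexp(200, rate = 1)  # deliberately right-skewed data

n <- length(x)
skewness <- sum((x - mean(x))^3) / (n * sd(x)^3)
kurtosis <- sum((x - mean(x))^4) / (n * sd(x)^4) - 3  # excess kurtosis

# Assumed screening thresholds (the engine reads these from its config)
likely_normal <- abs(skewness) < 1 && abs(kurtosis) < 2

cat("Skewness:", round(skewness, 2),
    "| Excess kurtosis:", round(kurtosis, 2),
    "| Likely normal:", likely_normal, "\n")

For exponential data such as this, the screen should flag clear skew and heavy tails, which is exactly the situation that routes the analysis toward robust or non-parametric alternatives.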
Intelligent Method Selection
Statistical Method Selector
Create an intelligent system for selecting optimal statistical methods:
# File: R/statistical_method_selector.R

#' Statistical Method Selector
#'
#' @description Intelligent method selection system that chooses optimal
#' statistical approaches based on data characteristics and assumption test results.
#'
#' @export
StatisticalMethodSelector <- R6::R6Class(
  "StatisticalMethodSelector",

  public = list(
    #' @field config method selection configuration
    config = NULL,

    #' @field method_registry available statistical methods
    method_registry = NULL,

    #' Initialize method selector
    #'
    #' @param config list. Method selection configuration
    #'
    initialize = function(config = list()) {
      self$config <- private$default_selector_config(config)
      self$method_registry <- private$initialize_method_registry()
    },

    #' Select optimal statistical method
    #'
    #' @param data data.frame. Analysis data
    #' @param analysis_type character. Type of analysis requested
    #' @param assumption_results AssumptionTestResults. Assumption test results
    #' @param parameters list. Analysis parameters
    #'
    #' @return StatisticalMethod. Selected method with justification
    #'
    select_optimal_method = function(data, analysis_type, assumption_results, parameters = list()) {
      # Get available methods for analysis type
      available_methods <- private$get_available_methods(analysis_type)

      if (length(available_methods) == 0) {
        stop(paste("No methods available for analysis type:", analysis_type))
      }

      # Score each method based on assumption compliance and other factors
      method_scores <- list()

      for (method_name in available_methods) {
        method_info <- self$method_registry[[analysis_type]][[method_name]]

        score <- private$score_method(
          method_info = method_info,
          data = data,
          assumption_results = assumption_results,
          parameters = parameters
        )

        method_scores[[method_name]] <- score
      }

      # Select best method
      best_method_name <- names(method_scores)[which.max(sapply(method_scores, function(x) x$total_score))]
      best_score <- method_scores[[best_method_name]]

      # Create method object
      StatisticalMethod$new(
        method_name = best_method_name,
        analysis_type = analysis_type,
        method_info = self$method_registry[[analysis_type]][[best_method_name]],
        selection_score = best_score,
        assumption_compliance = best_score$assumption_compliance,
        selection_rationale = private$generate_selection_rationale(
          best_method_name,
          best_score,
          method_scores,
          assumption_results
        ),
        alternative_methods = private$get_alternative_recommendations(
          method_scores,
          best_method_name
        )
      )
    }
  ),

  private = list(
    #' Default method selector configuration
    #'
    #' @param config list. User configuration
    #'
    #' @return list. Complete configuration
    #'
    default_selector_config = function(config) {
      default_config <- list(
        preference_weights = list(
          assumption_compliance = 0.4,
          robustness = 0.3,
          statistical_power = 0.2,
          interpretability = 0.1
        ),
        assumption_penalties = list(
          normality_violation = -20,
          homogeneity_violation = -15,
          independence_violation = -30,
          outlier_presence = -10
        ),
        robustness_bonuses = list(
          nonparametric = 15,
          robust_parametric = 10,
          bootstrap_based = 12
        ),
        sample_size_considerations = list(
          small_sample_threshold = 30,
          very_small_threshold = 10,
          small_sample_preference = "nonparametric"
        )
      )

      modifyList(default_config, config)
    },

    #' Initialize method registry
    #'
    #' @return list. Available methods by analysis type
    #'
    initialize_method_registry = function() {
      list(
        independent_ttest = list(
          independent_ttest_students = list(
            name = "Student's Independent Samples t-Test",
            type = "parametric",
            required_assumptions = c("normality", "homogeneity", "independence"),
            statistical_power = "high",
            interpretability = "high",
            robustness = "low",
            sample_size_requirements = list(min = 6, recommended = 30),
            effect_size_measures = c("cohens_d", "glass_delta"),
            description = "Classic parametric test assuming equal variances"
          ),
          independent_ttest_welch = list(
            name = "Welch's Independent Samples t-Test",
            type = "parametric_robust",
            required_assumptions = c("normality", "independence"),
            statistical_power = "high",
            interpretability = "high",
            robustness = "medium",
            sample_size_requirements = list(min = 6, recommended = 30),
            effect_size_measures = c("cohens_d", "glass_delta"),
            description = "Robust parametric test not assuming equal variances"
          ),
          mann_whitney_u = list(
            name = "Mann-Whitney U Test",
            type = "nonparametric",
            required_assumptions = c("independence"),
            statistical_power = "medium",
            interpretability = "medium",
            robustness = "high",
            sample_size_requirements = list(min = 6, recommended = 20),
            effect_size_measures = c("rank_biserial_r", "common_language_effect"),
            description = "Distribution-free test comparing medians"
          ),
          bootstrap_ttest = list(
            name = "Bootstrap t-Test",
            type = "resampling",
            required_assumptions = c("independence"),
            statistical_power = "high",
            interpretability = "medium",
            robustness = "high",
            sample_size_requirements = list(min = 10, recommended = 50),
            effect_size_measures = c("cohens_d", "bootstrap_ci"),
            description = "Resampling-based test robust to distributional assumptions"
          )
        )
      )
    },

    #' Get available methods for analysis type
    #'
    #' @param analysis_type character. Type of analysis
    #'
    #' @return character vector. Available method names
    #'
    get_available_methods = function(analysis_type) {
      if (analysis_type %in% names(self$method_registry)) {
        names(self$method_registry[[analysis_type]])
      } else {
        character(0)
      }
    },

    #' Score a statistical method
    #'
    #' @param method_info list. Method information
    #' @param data data.frame. Analysis data
    #' @param assumption_results AssumptionTestResults. Assumption test results
    #' @param parameters list. Analysis parameters
    #'
    #' @return list. Method scoring results
    #'
    score_method = function(method_info, data, assumption_results, parameters) {
      scores <- list()

      # Assumption compliance score
      assumption_score <- private$score_assumption_compliance(
        method_info$required_assumptions,
        assumption_results
      )
      scores$assumption_compliance <- assumption_score

      # Robustness score
      robustness_score <- private$score_robustness(method_info, data)
      scores$robustness <- robustness_score

      # Statistical power score
      power_score <- private$score_statistical_power(method_info, data, parameters)
      scores$statistical_power <- power_score

      # Interpretability score
      interpretability_score <- private$score_interpretability(method_info, parameters)
      scores$interpretability <- interpretability_score

      # Sample size appropriateness
      sample_size_score <- private$score_sample_size_appropriateness(method_info, data)
      scores$sample_size_appropriateness <- sample_size_score

      # Calculate weighted total score
      weights <- self$config$preference_weights

      total_score <- (
        assumption_score * weights$assumption_compliance +
        robustness_score * weights$robustness +
        power_score * weights$statistical_power +
        interpretability_score * weights$interpretability
      )

      list(
        assumption_compliance = assumption_score,
        robustness = robustness_score,
        statistical_power = power_score,
        interpretability = interpretability_score,
        sample_size_appropriateness = sample_size_score,
        total_score = total_score,
        method_info = method_info
      )
    },

    #' Score assumption compliance
    #'
    #' @param required_assumptions character vector. Required assumptions
    #' @param assumption_results AssumptionTestResults. Test results
    #'
    #' @return numeric. Assumption compliance score
    #'
    score_assumption_compliance = function(required_assumptions, assumption_results) {
      base_score <- 100
      violated_assumptions <- assumption_results$overall_assessment$violations

      for (assumption in required_assumptions) {
        if (assumption %in% violated_assumptions) {
          penalty <- self$config$assumption_penalties[[paste0(assumption, "_violation")]] %||% -10
          base_score <- base_score + penalty
        }
      }

      # Bonus for methods that don't require problematic assumptions
      if (!"normality" %in% required_assumptions && "normality" %in% violated_assumptions) {
        base_score <- base_score + 15
      }

      if (!"homogeneity" %in% required_assumptions && "homogeneity" %in% violated_assumptions) {
        base_score <- base_score + 10
      }

      max(0, base_score)
    },

    #' Score robustness
    #'
    #' @param method_info list. Method information
    #' @param data data.frame. Analysis data
    #'
    #' @return numeric. Robustness score
    #'
    score_robustness = function(method_info, data) {
      base_score <- 50

      # Robustness bonuses
      if (method_info$type == "nonparametric") {
        base_score <- base_score + self$config$robustness_bonuses$nonparametric
      } else if (method_info$type == "parametric_robust") {
        base_score <- base_score + self$config$robustness_bonuses$robust_parametric
      } else if (method_info$type == "resampling") {
        base_score <- base_score + self$config$robustness_bonuses$bootstrap_based
      }

      # Additional robustness considerations
      if (method_info$robustness == "high") {
        base_score <- base_score + 10
      } else if (method_info$robustness == "low") {
        base_score <- base_score - 10
      }

      max(0, base_score)
    },

    #' Score statistical power
    #'
    #' @param method_info list. Method information
    #' @param data data.frame. Analysis data
    #' @param parameters list. Analysis parameters
    #'
    #' @return numeric. Statistical power score
    #'
    score_statistical_power = function(method_info, data, parameters) {
      base_score <- switch(method_info$statistical_power,
        "high" = 80,
        "medium" = 60,
        "low" = 40,
        50 # default
      )

      # Adjust for sample size
      n <- nrow(data)
      if (n < self$config$sample_size_considerations$very_small_threshold) {
        base_score <- base_score - 20
      } else if (n < self$config$sample_size_considerations$small_sample_threshold) {
        base_score <- base_score - 10
      }

      max(0, base_score)
    },

    #' Score interpretability
    #'
    #' @param method_info list. Method information
    #' @param parameters list. Analysis parameters
    #'
    #' @return numeric. Interpretability score
    #'
    score_interpretability = function(method_info, parameters) {
      base_score <- switch(method_info$interpretability,
        "high" = 80,
        "medium" = 60,
        "low" = 40,
        50 # default
      )

      # Bonus for familiar methods in clinical contexts
      if ("clinical_context" %in% names(parameters) && parameters$clinical_context) {
        if (grepl("t-test", method_info$name, ignore.case = TRUE)) {
          base_score <- base_score + 10
        }
      }

      max(0, base_score)
    },

    #' Score sample size appropriateness
    #'
    #' @param method_info list. Method information
    #' @param data data.frame. Analysis data
    #'
    #' @return numeric. Sample size appropriateness score
    #'
    score_sample_size_appropriateness = function(method_info, data) {
      n <- nrow(data)
      min_required <- method_info$sample_size_requirements$min %||% 6
      recommended <- method_info$sample_size_requirements$recommended %||% 30

      if (n < min_required) {
        return(0)  # Method not suitable
      } else if (n < recommended) {
        return(50) # Below recommended but acceptable
      } else {
        return(80) # Adequate sample size
      }
    },

    #' Generate selection rationale
    #'
    #' @param selected_method character. Selected method name
    #' @param selected_score list. Selected method score details
    #' @param all_scores list. All method scores
    #' @param assumption_results AssumptionTestResults. Assumption results
    #'
    #' @return character. Selection rationale
    #'
    generate_selection_rationale = function(selected_method, selected_score, all_scores, assumption_results) {
      method_info <- selected_score$method_info
      violations <- assumption_results$overall_assessment$violations

      rationale_parts <- c()

      # Primary selection reason
      if (selected_score$assumption_compliance >= 80) {
        rationale_parts <- c(rationale_parts,
          paste("Selected", method_info$name, "because all required assumptions are met"))
      } else if (method_info$type == "nonparametric" && "normality" %in% violations) {
        rationale_parts <- c(rationale_parts,
          paste("Selected", method_info$name, "because it doesn't require normality assumptions"))
      } else if (method_info$type == "parametric_robust" && "homogeneity" %in% violations) {
        rationale_parts <- c(rationale_parts,
          paste("Selected", method_info$name, "because it's robust to unequal variances"))
      } else {
        rationale_parts <- c(rationale_parts,
          paste("Selected", method_info$name, "as the best available option"))
      }

      # Additional justification
      if (selected_score$robustness >= 70) {
        rationale_parts <- c(rationale_parts, "This method provides good robustness to assumption violations")
      }

      if (selected_score$statistical_power >= 70) {
        rationale_parts <- c(rationale_parts, "This method maintains good statistical power")
      }

      # Warnings if needed
      if (length(violations) > 0) {
        violated_text <- paste(violations, collapse = ", ")
        rationale_parts <- c(rationale_parts,
          paste("Note: Some assumptions are violated (", violated_text, "), but this method is relatively robust"))
      }

      paste(rationale_parts, collapse = ". ")
    },

    #' Get alternative method recommendations
    #'
    #' @param all_scores list. All method scores
    #' @param selected_method character. Selected method name
    #'
    #' @return list. Alternative method recommendations
    #'
    get_alternative_recommendations = function(all_scores, selected_method) {
      # Remove selected method and sort by score
      alternative_scores <- all_scores[names(all_scores) != selected_method]
      sorted_alternatives <- alternative_scores[order(sapply(alternative_scores, function(x) x$total_score), decreasing = TRUE)]

      # Return top 2 alternatives with brief justification
      alternatives <- list()

      for (i in 1:min(2, length(sorted_alternatives))) {
        alt_name <- names(sorted_alternatives)[i]
        alt_score <- sorted_alternatives[[alt_name]]
        alt_info <- alt_score$method_info

        # Brief justification for alternative
        justification <- if (alt_info$type == "nonparametric") {
          "Distribution-free alternative"
        } else if (alt_info$type == "resampling") {
          "Bootstrap-based robust alternative"
        } else if (alt_info$type == "parametric_robust") {
          "Parametric robust alternative"
        } else {
          "Classical parametric alternative"
        }

        alternatives[[alt_name]] <- list(
          name = alt_info$name,
          type = alt_info$type,
          score = alt_score$total_score,
          justification = justification
        )
      }

      alternatives
    }
  )
)
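The weighted total computed in score_method() is a plain linear combination of the component scores. A quick standalone check of that arithmetic, using made-up component scores and the default preference weights, looks like this:

# Illustrative component scores for one candidate method (made-up values)
scores <- list(
  assumption_compliance = 85,
  robustness = 60,
  statistical_power = 70,
  interpretability = 90
)

# Default preference weights from default_selector_config()
weights <- list(
  assumption_compliance = 0.4,
  robustness = 0.3,
  statistical_power = 0.2,
  interpretability = 0.1
)

total_score <- scores$assumption_compliance * weights$assumption_compliance +
  scores$robustness * weights$robustness +
  scores$statistical_power * weights$statistical_power +
  scores$interpretability * weights$interpretability

total_score
#> [1] 75  (85*0.4 + 60*0.3 + 70*0.2 + 90*0.1)

Because assumption compliance carries the largest weight, a method that avoids a violated assumption can outrank a nominally more powerful method even before the robustness bonuses are applied.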
Implementing Statistical Rigor in Practice
Integration with Enhanced t-Test Application
Transform the existing sophisticated t-test application to leverage the statistical rigor framework:
# Enhanced server logic integrating statistical rigor
enhanced_ttest_server <- function(id, audit_logger = NULL) {
  moduleServer(id, function(input, output, session) {

    # Reactive state for analysis results
    values <- reactiveValues(analysis_result = NULL)

    # Initialize statistical testing engine
    testing_engine <- StatisticalTestingEngine$new(
      config = list(
        regulatory_compliance = list(
          standards = c("FDA", "ICH_E9"),
          documentation_level = "comprehensive"
        )
      ),
      audit_logger = audit_logger
    )

    # Enhanced analysis execution
    observeEvent(input$run_test, {
      req(dataset()) # dataset() is the reactive data source provided by the application's data module

      showNotification("Conducting rigorous statistical analysis...",
                       type = "message", id = "analyzing")

      # Run the comprehensive analysis
      analysis_result <- testing_engine$conduct_analysis(
        data = dataset(),
        analysis_type = "independent_ttest",
        parameters = list(
          alternative = input$alternative,
          conf_level = input$conf_level,
          clinical_context = TRUE
        ),
        context = list(
          application = "enhanced_ttest",
          user_id = session$userData$user_id %||% "anonymous"
        )
      )

      values$analysis_result <- analysis_result
      removeNotification("analyzing")
    })

    # Display comprehensive results
    output$statistical_results <- renderUI({
      req(values$analysis_result)
      result <- values$analysis_result

      if (!result$success) {
        return(div(class = "alert alert-danger",
                   h4("Analysis Issues"),
                   p(paste(result$errors, collapse = "; "))))
      }

      tagList(
        # Executive Summary
        card(
          card_header("Statistical Analysis Summary"),
          card_body(
            div(class = "alert alert-info",
                h5("Executive Summary"),
                p(result$get_executive_summary())
            ),

            # Key Results Dashboard
            div(class = "row text-center",
                div(class = "col-md-3",
                    div(class = "card bg-light",
                        div(class = "card-body",
                            h6("p-value"),
                            h4(format.pval(result$primary_analysis$p_value, digits = 4)),
                            tags$small(class = "text-muted",
                                       if (result$primary_analysis$p_value < 0.05) "Significant" else "Not Significant")
                        )
                    )
                ),
                div(class = "col-md-3",
                    div(class = "card bg-light",
                        div(class = "card-body",
                            h6("Effect Size"),
                            h4(round(result$effect_analysis$primary_effect_size$value, 3)),
                            tags$small(class = "text-muted", result$effect_analysis$primary_effect_size$interpretation)
                        )
                    )
                ),
                div(class = "col-md-3",
                    div(class = "card bg-light",
                        div(class = "card-body",
                            h6("Method Used"),
                            h4(tags$abbr(result$method_selection$method_info$type,
                                         title = result$method_selection$method_info$name)),
                            tags$small(class = "text-muted", "Selected Method")
                        )
                    )
                ),
                div(class = "col-md-3",
                    div(class = "card bg-light",
                        div(class = "card-body",
                            h6("Assumptions"),
                            h4(if (result$assumption_tests$overall_assessment$suitable) {
                              icon("check-circle", class = "text-success")
                            } else {
                              icon("exclamation-triangle", class = "text-warning")
                            }),
                            tags$small(class = "text-muted",
                                       if (result$assumption_tests$overall_assessment$suitable) "Met" else "Issues")
                        )
                    )
                )
            )
          )
        ),

        # Method Selection and Justification
        card(
          class = "mt-3",
          card_header("Method Selection and Validation"),
          card_body(
            h5(result$method_selection$method_info$name),
            p(class = "text-muted", result$method_selection$method_info$description),

            div(class = "alert alert-light",
                strong("Selection Rationale: "),
                result$method_selection$selection_rationale
            ),

            # Assumption Test Summary
            if (length(result$assumption_tests$overall_assessment$violations) > 0) {
              div(class = "alert alert-warning",
                  strong("Assumption Violations Detected: "),
                  paste(result$assumption_tests$overall_assessment$violations, collapse = ", "),
                  br(),
                  "The selected method accounts for these violations and provides robust results."
              )
            } else {
              div(class = "alert alert-success",
                  strong("All Required Assumptions Met: "),
                  "Statistical prerequisites are satisfied for reliable analysis."
              )
            }
          )
        )
      )
    })

    # Enhanced assumption testing display
    output$assumption_results <- renderPrint({
      req(values$analysis_result)
      result <- values$analysis_result

      if (!result$success) return("Analysis failed - assumption testing unavailable")

      assumption_tests <- result$assumption_tests

      cat("COMPREHENSIVE ASSUMPTION TESTING REPORT\n")
      cat("=====================================\n\n")

      # Overall Assessment
      cat("OVERALL ASSESSMENT:\n")
      cat("------------------\n")
      if (assumption_tests$overall_assessment$suitable) {
        cat("✓ All required assumptions are met for reliable analysis\n\n")
      } else {
        cat("⚠ Some assumptions are violated:\n")
        for (violation in assumption_tests$overall_assessment$violations) {
          cat("  •", violation, "\n")
        }
        cat("\nHowever, the selected method is robust to these violations.\n\n")
      }

      # Detailed Results by Assumption
      for (assumption_name in names(assumption_tests$assumption_results)) {
        assumption_result <- assumption_tests$assumption_results[[assumption_name]]

        cat(toupper(assumption_name), " TESTING:\n")
        cat(paste(rep("-", nchar(assumption_name) + 9), collapse = ""), "\n")

        if ("error" %in% names(assumption_result)) {
          cat("Error:", assumption_result$error, "\n\n")
          next
        }

        # Print assumption-specific results
        if (assumption_name == "normality") {
          for (var_name in names(assumption_result)) {
            if (var_name == "error") next

            var_result <- assumption_result[[var_name]]
            cat("Variable:", var_name, "\n")

            if ("shapiro_wilk" %in% names(var_result)) {
              sw <- var_result$shapiro_wilk
              cat("  Shapiro-Wilk: W =", round(sw$statistic, 4),
                  ", p =", round(sw$p_value, 4))
              if (sw$conclusion) cat(" ✓") else cat(" ✗")
              cat("\n")
            }

            if ("overall_conclusion" %in% names(var_result)) {
              conclusion <- var_result$overall_conclusion
              cat("  Overall conclusion:",
                  if (conclusion$assumes_normal) "Normal distribution" else "Non-normal distribution")
              cat(" (", conclusion$n_tests, "tests)\n")
            }
            cat("\n")
          }
        }

        if (assumption_name == "homogeneity") {
          if ("levene" %in% names(assumption_result)) {
            levene <- assumption_result$levene
            cat("Levene's Test: F =", round(levene$statistic, 4),
                ", p =", round(levene$p_value, 4))
            if (levene$conclusion) cat(" ✓") else cat(" ✗")
            cat("\n")
          }

          if ("overall_conclusion" %in% names(assumption_result)) {
            conclusion <- assumption_result$overall_conclusion
            cat("Overall conclusion:", conclusion$recommendation, "\n")
          }
          cat("\n")
        }

        if (assumption_name == "outliers") {
          total_outliers <- 0
          for (var_name in names(assumption_result)) {
            var_result <- assumption_result[[var_name]]
            if ("consensus" %in% names(var_result)) {
              n_outliers <- length(var_result$consensus$high_confidence_outliers)
              total_outliers <- total_outliers + n_outliers
              if (n_outliers > 0) {
                cat("Variable", var_name, ":", n_outliers, "high-confidence outliers detected\n")
              }
            }
          }
          if (total_outliers == 0) {
            cat("No significant outliers detected across variables\n")
          }
          cat("\n")
        }
      }

      # Recommendations
      if (length(assumption_tests$overall_assessment$recommendations) > 0) {
        cat("RECOMMENDATIONS:\n")
        cat("---------------\n")
        for (rec in assumption_tests$overall_assessment$recommendations) {
          cat("•", rec, "\n")
        }
        cat("\n")
      }
    })

    # Enhanced interpretation guidance
    output$interpretation_guidance <- renderUI({
      req(values$analysis_result)
      result <- values$analysis_result

      if (!result$success) return(NULL)

      interpretation <- result$interpretation

      tagList(
        card(
          card_header("Statistical Interpretation and Clinical Guidance"),
          card_body(
            # Statistical Significance Section
            div(class = "mb-4",
                h5("Statistical Significance"),
                div(
                  class = if (interpretation$statistical_significance$is_significant) {
                    "alert alert-success"
                  } else {
                    "alert alert-secondary"
                  },
                  p(interpretation$statistical_significance$interpretation)
                )
            ),

            # Effect Size and Practical Importance
            div(class = "mb-4",
                h5("Effect Size and Practical Importance"),
                div(class = "alert alert-info",
                    p(strong("Effect Size: "),
                      interpretation$effect_size$magnitude,
                      " (", round(result$effect_analysis$primary_effect_size$value, 3), ")"),
                    p(strong("Practical Importance: "),
                      interpretation$effect_size$practical_importance$magnitude),
                    p(interpretation$effect_size$practical_importance$recommendation)
                )
            ),

            # Overall Conclusion
            div(class = "mb-4",
                h5("Overall Conclusion"),
                div(class = "alert alert-primary",
                    p(interpretation$conclusion$summary)
                )
            ),

            # Limitations and Considerations
            if (length(interpretation$limitations) > 0) {
              div(class = "mb-4",
                  h5("Limitations and Considerations"),
                  div(class = "alert alert-light",
                      tags$ul(
                        lapply(interpretation$limitations, function(lim) tags$li(lim))
                      )
                  )
              )
            }
          )
        )
      )
    })
  })
}
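Both the selector and this server logic rely on the %||% null-coalescing operator (for example, session$userData$user_id %||% "anonymous"). In R 4.4.0 and later it ships with base R; for older R versions, a one-line definition keeps the snippets self-contained:

# Null-coalescing helper: return the left-hand side unless it is NULL
`%||%` <- function(x, y) if (is.null(x)) y else x

# Example: fall back to "anonymous" when no user ID is set
user_id <- NULL
user_id %||% "anonymous"
#> [1] "anonymous"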
Common Questions About Statistical Rigor
Enterprise applications require statistical validity that can withstand regulatory scrutiny, peer review, and legal challenges. Comprehensive assumption testing provides documented evidence that statistical methods are appropriate for the data and analysis context. This documentation becomes crucial for regulatory submissions, clinical trial reports, and publications where statistical methodology must be transparent and defensible. Additionally, assumption violations can lead to incorrect conclusions with significant business or clinical consequences, making thorough validation essential for responsible decision-making.
Intelligent method selection eliminates human bias and inconsistency in statistical method choice while ensuring optimal approaches for specific data characteristics. Manual selection often relies on default methods or analyst preferences, potentially missing robust alternatives when assumptions are violated. Automated selection considers multiple factors simultaneously - assumption compliance, statistical power, robustness, and interpretability - using consistent, documented criteria. This approach also provides audit trails and justification required for regulatory compliance while reducing analysis time and improving reproducibility across different analysts and studies.
This framework addresses specific requirements of regulated industries, including FDA guidance compliance, ICH E9 statistical principles, and Good Clinical Practice standards. The automatically generated documentation meets regulatory expectations for statistical analysis plans, assumption testing evidence, and method selection justification. The audit trail capabilities support 21 CFR Part 11 requirements for electronic records, while the sensitivity analysis features address regulatory concerns about analytical robustness. Additionally, the clinical significance assessment and effect size interpretation provide the context essential for regulatory decision-making and clinical interpretation.
The system recognizes that real-world data often violates theoretical assumptions and provides intelligent responses rather than simply flagging violations. When assumptions are violated, the method selector automatically identifies robust alternatives that remain valid under the observed data conditions. The sensitivity analysis quantifies how assumption violations affect conclusions, while the interpretation guidance explains the practical implications. This approach enables confident analysis of imperfect data while maintaining statistical rigor and providing transparent documentation of analytical decisions and their justifications.
The framework is designed for users with basic statistical knowledge who need enterprise-grade rigor without deep statistical programming expertise. Users should understand fundamental concepts like p-values, effect sizes, and basic assumptions, but the system provides interpretation guidance and context-aware explanations. For advanced users, the framework offers complete customization and detailed diagnostic information. The key advantage is that it elevates the statistical rigor of any analysis regardless of user expertise level, while providing educational value through comprehensive explanations and best practice guidance.
Test Your Understanding
Which components are essential for enterprise-grade statistical rigor in clinical applications? Select all that apply:
- A) Comprehensive assumption testing with multiple methods
- B) Intelligent method selection based on data characteristics
- C) Sensitivity analysis across alternative approaches
- D) Regulatory compliance documentation
- E) User-friendly interface design
- F) Automated interpretation guidance
- Consider what regulatory bodies require for statistical submissions
- Think about what makes analysis defensible under scrutiny
- Focus on components that ensure analytical validity
A, B, C, D, and F are all essential components.
Essential components explained:
- A) Comprehensive assumption testing: Required to validate that statistical methods are appropriate for the data and provide documented evidence for regulatory review
- B) Intelligent method selection: Ensures optimal analytical approaches while eliminating bias and providing audit trails for decision rationale
- C) Sensitivity analysis: Demonstrates analytical robustness and addresses regulatory concerns about conclusion stability across methods
- D) Regulatory compliance documentation: Essential for FDA submissions, clinical trials, and peer review processes
- F) Automated interpretation guidance: Provides context-aware explanations essential for clinical significance assessment and decision-making
E) User-friendly interface design is important for usability but not essential for statistical rigor itself. The rigor comes from the analytical framework, not the presentation layer.
Your data violates normality assumptions but meets independence and has equal variances. The intelligent method selector chooses Welch’s t-test over Student’s t-test. Why might this decision be suboptimal, and what would be a better choice?
- A) Welch’s t-test is always preferred over Student’s t-test
- B) Should use Mann-Whitney U test since normality is violated
- C) Student’s t-test would be better since variances are equal
- D) Should use bootstrap methods for any assumption violations
- Consider which assumptions each method requires
- Think about the trade-offs between different approaches when assumptions are partially met
- Remember that robustness vs. power is a key consideration
B) Should use Mann-Whitney U test since normality is violated
Detailed reasoning:
When normality is violated, the most appropriate response is to use a method that doesn’t assume normality, regardless of whether other assumptions are met. Here’s why:
- Welch’s t-test still assumes normality - it only relaxes the equal variance assumption
- Student’s t-test assumes both normality and equal variances - having equal variances doesn’t help if normality fails
- Mann-Whitney U test only requires independence and ordinal data - it’s robust to non-normality
- Bootstrap methods are also appropriate but may be unnecessarily complex for this scenario
The optimal method selection should prioritize assumption compliance first, then consider power and interpretability among valid options. A properly configured intelligent selector would weight assumption violations heavily and select the Mann-Whitney U test in this scenario.
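For reference, base R exposes all three candidate tests directly, so this trade-off can be explored by hand on skewed data. The comparison below is purely illustrative and independent of the selector framework:

# Simulated skewed two-group data (illustrative)
set.seed(42)
dat <- data.frame(
  group = rep(c("A", "B"), each = 30),
  response = c(rexp(30, rate = 1), rexp(30, rate = 0.7))
)

t.test(response ~ group, data = dat)                    # Welch's t-test (default, var.equal = FALSE)
t.test(response ~ group, data = dat, var.equal = TRUE)  # Student's t-test
wilcox.test(response ~ group, data = dat)               # Mann-Whitney U test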
You’re implementing this statistical rigor framework in a pharmaceutical company. The statisticians are concerned about losing control over method selection, while regulatory affairs wants comprehensive documentation. How should you configure the system to address both concerns?
- Consider how to balance automation with expert oversight
- Think about documentation requirements for regulatory compliance
- Consider different user roles and their needs
Implement a hybrid approach with expert oversight and comprehensive audit trails:
Configuration strategy:
Expert Review Mode: Allow statisticians to review and override automated method selection with required justification that becomes part of the audit trail
Tiered Automation:
- Fully automated for routine analyses with clear data characteristics
- Expert consultation required for edge cases or conflicting indicators
- Manual override available with documented rationale
Comprehensive Documentation: Generate regulatory-compliant reports regardless of selection method (automated vs. manual) including:
- Complete assumption testing evidence
- Method selection rationale (whether automated or expert-driven)
- Sensitivity analysis results
- Expert review comments and overrides
Role-Based Configuration:
- Statisticians: Full access to detailed diagnostics, override capabilities, and custom configuration
- Regulatory Affairs: Focus on compliance documentation and audit trail completeness
- Clinical Teams: Simplified interface with comprehensive interpretation guidance
This approach maintains statistical expertise while ensuring consistent rigor and documentation standards required for regulatory compliance.
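One way to express such a policy in code is through the selector's configuration plus an application-level governance layer. In the sketch below, only preference_weights exists in the selector shown earlier; the governance fields (allow_manual_override, require_override_justification, expert_review_triggers) are hypothetical names used purely for illustration.

# Hypothetical configuration sketch for a hybrid automated/expert workflow
rigor_config <- list(
  # Existing selector option: weight assumption compliance heavily
  preference_weights = list(
    assumption_compliance = 0.5,
    robustness = 0.3,
    statistical_power = 0.15,
    interpretability = 0.05
  ),
  # Hypothetical governance options handled at the application level
  governance = list(
    allow_manual_override = TRUE,            # statisticians may override the automated choice
    require_override_justification = TRUE,   # justification recorded in the audit trail
    expert_review_triggers = c("assumption_conflict", "very_small_sample")
  )
)

# Only the selector-recognized options are passed to the class itself
selector <- StatisticalMethodSelector$new(config = rigor_config["preference_weights"])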
Conclusion
Statistical rigor forms the cornerstone of credible biostatistics applications, transforming routine analyses into enterprise-grade systems that meet the demanding standards of regulated industries. This comprehensive framework elevates your t-test application from a basic statistical tool to a sophisticated analytical platform capable of supporting regulatory submissions, clinical decision-making, and peer-reviewed research.
The automated assumption testing, intelligent method selection, and comprehensive documentation systems you’ve implemented provide the foundation for any statistical application requiring professional reliability. These patterns extend far beyond t-tests to encompass the full spectrum of biostatistical analyses while maintaining the accessibility and user experience that stakeholders expect from modern analytical tools.
Your enhanced application now demonstrates how enterprise development transcends basic functionality to address the complex requirements of real-world statistical practice, where analytical decisions must be defensible, reproducible, and aligned with regulatory standards while remaining accessible to clinical and research teams.
Next Steps
Based on your mastery of statistical rigor principles, here are the recommended paths for continuing your enterprise development journey:
Immediate Next Steps (Complete These First)
- Advanced Visualizations for Statistical Results - Transform your rigorous analyses into publication-ready visualizations that communicate statistical findings effectively
- Testing Framework and Validation - Implement comprehensive testing strategies that validate your statistical computations and ensure reliability across scenarios
- Practice Exercise: Extend your statistical rigor framework to include ANOVA or regression analysis, implementing the same comprehensive validation and intelligent method selection patterns
Building Your Enterprise Platform (Choose Your Focus)
For Visualization Excellence:
- Professional Statistical Plot Generation
- Building Interactive Dashboards
For Production Reliability:
For Clinical Integration:
Long-term Goals (2-4 Weeks)
- Develop a complete biostatistics platform with multiple statistical modules using your rigor framework
- Create enterprise deployment strategies for regulated environments with full validation documentation
- Build custom statistical method libraries that extend the framework for specialized clinical applications
- Contribute to the biostatistics community by sharing validated, enterprise-grade statistical tools
Citation
@online{kassambara2025,
author = {Kassambara, Alboukadel},
title = {Statistical {Rigor} and {Assumption} {Testing:} {Enterprise}
{Statistical} {Validation}},
date = {2025-05-23},
url = {https://www.datanovia.com/learn/tools/shiny-apps/enterprise-development/statistical-rigor.html},
langid = {en}
}