flowchart TD
A[Raw Data Input] --> B[Data Quality Assessment]
B --> C[Assumption Testing]
C --> D[Method Selection]
D --> E[Statistical Analysis]
E --> F[Result Validation]
F --> G[Interpretation Guidance]
G --> H[Documentation Generation]
B --> B1[Missing data patterns]
B --> B2[Outlier detection]
B --> B3[Distribution assessment]
B --> B4[Sample size adequacy]
C --> C1[Normality testing]
C --> C2[Homogeneity of variance]
C --> C3[Independence verification]
C --> C4[Linearity assessment]
D --> D1[Automatic method selection]
D --> D2[Assumption-based routing]
D --> D3[Robust alternatives]
D --> D4[Sensitivity analysis]
E --> E1[Primary analysis]
E --> E2[Sensitivity testing]
E --> E3[Effect size calculation]
E --> E4[Confidence intervals]
F --> F1[Result consistency checks]
F --> F2[Computational validation]
F --> F3[Cross-validation]
F --> F4[Simulation verification]
G --> G1[Clinical significance]
G --> G2[Statistical interpretation]
G --> G3[Limitation assessment]
G --> G4[Recommendation generation]
H --> H1[Assumption test reports]
H --> H2[Method justification]
H --> H3[Validation evidence]
H --> H4[Regulatory documentation]
style A fill:#ffebee
style H fill:#e8f5e8
style C fill:#fff3e0
style E fill:#f3e5f5
style G fill:#e3f2fd
Key Takeaways
- Automated Validation: Implement comprehensive assumption checking that automatically validates statistical prerequisites with intelligent method selection and warning systems
- Clinical Standards: Master biostatistics protocols that meet pharmaceutical and clinical research standards for regulatory submission and peer review
- Intelligent Guidance: Create interpretation systems that provide context-aware statistical guidance, effect size interpretation, and clinical significance assessment
- Regulatory Compliance: Build statistical workflows that generate validation documentation, assumption testing evidence, and audit trails for regulatory compliance
- Professional Reliability: Establish enterprise-grade statistical computing with robust error handling, reproducible results, and comprehensive diagnostic reporting
Introduction
Statistical rigor forms the foundation of credible biostatistics applications, particularly in regulated industries where statistical methodology directly impacts regulatory approval, clinical decision-making, and patient safety. Enterprise statistical applications must go beyond basic calculations to implement comprehensive assumption checking, automated validation workflows, and intelligent interpretation guidance that meets professional research standards.
This tutorial transforms our secure t-test application into a statistically rigorous system that automatically validates assumptions, provides intelligent method selection, and generates comprehensive diagnostic reports. You’ll learn to implement clinical-grade statistical workflows that support regulatory submission, peer review, and professional research standards while maintaining the accessibility and user experience that stakeholders expect.
The statistical validation patterns you’ll master apply universally to biostatistics applications while addressing specific requirements for clinical trials, pharmaceutical research, and healthcare data analysis where statistical validity directly impacts research conclusions and regulatory decisions.
Understanding Statistical Rigor Requirements
Enterprise Statistical Standards
Professional statistical applications require comprehensive validation beyond basic calculations:
Critical Statistical Validation Components:
Assumption Testing Framework:
Comprehensive validation of statistical prerequisites including normality, homogeneity of variance, independence, and linearity with automated testing and intelligent interpretation.
Method Selection Intelligence:
Automated selection of appropriate statistical methods based on data characteristics, assumption test results, and analysis objectives with fallback options for assumption violations.
Diagnostic Reporting:
Generation of comprehensive diagnostic reports that document assumption testing, method selection rationale, and validation evidence for regulatory review.
Clinical Interpretation:
Context-aware interpretation guidance that connects statistical results to clinical significance, practical importance, and regulatory implications.
Regulatory Statistical Requirements
Pharmaceutical and clinical research applications must meet specific statistical standards:
FDA Guidance Compliance:
- Comprehensive assumption testing with documentation
- Method selection justification with sensitivity analysis
- Effect size reporting with clinical significance assessment
- Multiple comparison adjustments where appropriate
ICH E9 Statistical Principles:
- Pre-specified statistical analysis plans
- Assumption validation with appropriate testing
- Robust analysis methods for assumption violations
- Comprehensive sensitivity analysis documentation
Good Clinical Practice (GCP):
- Audit trail of statistical decisions and methods
- Validation of statistical software and procedures
- Documentation of assumption testing and results
- Reproducible analysis with version control
Comprehensive Assumption Testing Framework
Advanced Statistical Testing Engine
Create a robust statistical testing system that automatically validates assumptions:
# File: R/statistical_testing_engine.R
#' Enterprise Statistical Testing Engine
#'
#' @description Comprehensive statistical testing framework with automated
#' assumption checking, intelligent method selection, and regulatory compliance
#' documentation for clinical and pharmaceutical applications.
#'
#' @export
StatisticalTestingEngine <- R6::R6Class(
"StatisticalTestingEngine",
public = list(
#' @field config statistical testing configuration
config = NULL,
#' @field audit_logger audit logging system
audit_logger = NULL,
#' @field assumption_tests collection of assumption testing methods
assumption_tests = NULL,
#' @field method_selector intelligent method selection system
method_selector = NULL,
#' Initialize statistical testing engine
#'
#' @param config list. Statistical testing configuration
#' @param audit_logger AuditLogger. Audit logging instance
#'
initialize = function(config = list(), audit_logger = NULL) {
self$config <- private$default_testing_config(config)
self$audit_logger <- audit_logger
self$assumption_tests <- AssumptionTestingFramework$new(self$config$assumptions)
self$method_selector <- StatisticalMethodSelector$new(self$config$method_selection)
if (!is.null(self$audit_logger)) {
self$audit_logger$log_event(
"StatisticalTestingEngine initialized",
level = "INFO",
category = "STATISTICAL_ANALYSIS"
)
}
},
#' Conduct comprehensive statistical analysis
#'
#' @param data data.frame. Input data for analysis
#' @param analysis_type character. Type of statistical analysis
#' @param parameters list. Analysis parameters and options
#' @param context list. Analysis context for regulatory documentation
#'
#' @return StatisticalAnalysisResult. Comprehensive analysis results
#'
conduct_analysis = function(data, analysis_type, parameters = list(),
context = list()) {
analysis_id <- uuid::UUIDgenerate()
start_time <- Sys.time()
tryCatch({
# Log analysis start
if (!is.null(self$audit_logger)) {
self$audit_logger$log_event(
"Statistical analysis initiated",
level = "INFO",
category = "STATISTICAL_ANALYSIS",
details = list(
analysis_id = analysis_id,
analysis_type = analysis_type,
sample_size = nrow(data),
parameters = parameters,
context = context
)
)
}
# Step 1: Data quality assessment
quality_assessment <- private$assess_data_quality(data, context)
if (!quality_assessment$suitable_for_analysis) {
return(StatisticalAnalysisResult$new(
analysis_id = analysis_id,
success = FALSE,
errors = quality_assessment$critical_issues,
warnings = quality_assessment$warnings,
recommendations = quality_assessment$recommendations
))
}
# Step 2: Comprehensive assumption testing
assumption_results <- self$assumption_tests$test_all_assumptions(
data = data,
analysis_type = analysis_type,
parameters = parameters
)
# Step 3: Intelligent method selection
selected_method <- self$method_selector$select_optimal_method(
data = data,
analysis_type = analysis_type,
assumption_results = assumption_results,
parameters = parameters
)
# Step 4: Execute primary analysis
primary_results <- private$execute_primary_analysis(
data = data,
method = selected_method,
parameters = parameters
)
# Step 5: Sensitivity analysis
sensitivity_results <- private$conduct_sensitivity_analysis(
data = data,
primary_method = selected_method,
primary_results = primary_results,
parameters = parameters
)
# Step 6: Effect size and clinical significance
effect_analysis <- private$analyze_effect_sizes(
data = data,
results = primary_results,
context = context
)
# Step 7: Generate interpretation guidance
interpretation <- private$generate_interpretation_guidance(
primary_results = primary_results,
assumption_results = assumption_results,
effect_analysis = effect_analysis,
context = context
)
# Step 8: Create comprehensive result object
analysis_result <- StatisticalAnalysisResult$new(
analysis_id = analysis_id,
success = TRUE,
data_quality = quality_assessment,
assumption_tests = assumption_results,
method_selection = selected_method,
primary_analysis = primary_results,
sensitivity_analysis = sensitivity_results,
effect_analysis = effect_analysis,
interpretation = interpretation,
execution_time = as.numeric(Sys.time() - start_time, units = "secs"),
parameters = parameters,
context = context
)
# Log successful completion
if (!is.null(self$audit_logger)) {
self$audit_logger$log_event(
"Statistical analysis completed successfully",
level = "INFO",
category = "STATISTICAL_ANALYSIS",
details = list(
analysis_id = analysis_id,
method_used = selected_method$method_name,
p_value = primary_results$p_value,
effect_size = effect_analysis$primary_effect_size,
assumptions_met = assumption_results$overall_assessment$suitable,
execution_time = analysis_result$execution_time
)
)
}
return(analysis_result)
}, error = function(e) {
# Log analysis error
if (!is.null(self$audit_logger)) {
self$audit_logger$log_event(
paste("Statistical analysis failed:", e$message),
level = "ERROR",
category = "STATISTICAL_ANALYSIS",
details = list(
analysis_id = analysis_id,
error_message = e$message,
analysis_type = analysis_type
)
)
}
return(StatisticalAnalysisResult$new(
analysis_id = analysis_id,
success = FALSE,
errors = list(paste("Analysis failed:", e$message)),
warnings = list(),
recommendations = list("Review data quality and analysis parameters")
))
})
},
#' Generate regulatory compliance report
#'
#' @param analysis_result StatisticalAnalysisResult. Analysis results
#' @param report_type character. Type of regulatory report
#'
#' @return RegulatoryReport. Formatted compliance report
#'
generate_regulatory_report = function(analysis_result, report_type = "FDA") {
RegulatoryReportGenerator$new()$generate_report(
analysis_result = analysis_result,
report_type = report_type,
compliance_standards = self$config$regulatory_compliance
)
}
),
private = list(
#' Default statistical testing configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete configuration
#'
default_testing_config = function(config) {
default_config <- list(
assumptions = list(
normality = list(
tests = c("shapiro_wilk", "anderson_darling", "kolmogorov_smirnov"),
alpha_level = 0.05,
minimum_sample_size = 3,
maximum_sample_size = 5000
),
homogeneity = list(
tests = c("levene", "bartlett", "brown_forsythe"),
alpha_level = 0.05,
robust_default = TRUE
),
independence = list(
tests = c("durbin_watson", "runs_test"),
alpha_level = 0.05,
auto_detect = TRUE
),
outliers = list(
methods = c("iqr", "zscore", "modified_zscore", "isolation_forest"),
action_threshold = 0.05,
documentation_required = TRUE
)
),
method_selection = list(
prefer_robust = TRUE,
require_assumption_testing = TRUE,
sensitivity_analysis_required = TRUE,
effect_size_mandatory = TRUE
),
regulatory_compliance = list(
standards = c("FDA", "EMA", "ICH_E9"),
documentation_level = "comprehensive",
validation_required = TRUE,
audit_trail = TRUE
),
interpretation = list(
clinical_context_required = TRUE,
practical_significance_assessment = TRUE,
limitation_documentation = TRUE,
recommendation_generation = TRUE
)
)
modifyList(default_config, config)
},
#' Assess data quality for statistical analysis
#'
#' @param data data.frame. Input data
#' @param context list. Analysis context
#'
#' @return list. Data quality assessment results
#'
assess_data_quality = function(data, context) {
quality_issues <- list()
warnings <- list()
recommendations <- list()
# Sample size assessment
if (nrow(data) < self$config$assumptions$normality$minimum_sample_size) {
quality_issues <- append(quality_issues,
"Sample size below minimum for reliable analysis")
} else if (nrow(data) < 30) {
warnings <- append(warnings,
"Small sample size may affect test reliability")
recommendations <- append(recommendations,
"Consider non-parametric alternatives")
}
# Missing data assessment
missing_props <- sapply(data, function(x) sum(is.na(x)) / length(x))
high_missing <- missing_props > 0.3
if (any(high_missing)) {
quality_issues <- append(quality_issues,
paste("High missing data in variables:",
paste(names(data)[high_missing], collapse = ", ")))
}
# Data type validation
numeric_vars <- sapply(data, is.numeric)
if (!any(numeric_vars)) {
quality_issues <- append(quality_issues,
"No numeric variables found for analysis")
}
# Extreme value detection
for (var_name in names(data)[numeric_vars]) {
var_data <- data[[var_name]][!is.na(data[[var_name]])]
if (length(var_data) > 0) {
# Check for extreme outliers (beyond 3 IQRs)
q1 <- quantile(var_data, 0.25)
q3 <- quantile(var_data, 0.75)
iqr <- q3 - q1
extreme_outliers <- sum(var_data < (q1 - 3*iqr) | var_data > (q3 + 3*iqr))
if (extreme_outliers > 0) {
warnings <- append(warnings,
paste("Extreme outliers detected in", var_name,
":", extreme_outliers, "observations"))
recommendations <- append(recommendations,
paste("Consider robust methods or outlier treatment for", var_name))
}
}
}
list(
suitable_for_analysis = length(quality_issues) == 0,
critical_issues = quality_issues,
warnings = warnings,
recommendations = recommendations,
data_summary = list(
n_observations = nrow(data),
n_variables = ncol(data),
n_numeric = sum(numeric_vars),
missing_data_summary = missing_props
)
)
},
#' Execute primary statistical analysis
#'
#' @param data data.frame. Analysis data
#' @param method StatisticalMethod. Selected statistical method
#' @param parameters list. Analysis parameters
#'
#' @return list. Primary analysis results
#'
execute_primary_analysis = function(data, method, parameters) {
switch(method$method_name,
"independent_ttest_students" = private$execute_students_ttest(data, parameters),
"independent_ttest_welch" = private$execute_welch_ttest(data, parameters),
"mann_whitney_u" = private$execute_mann_whitney(data, parameters),
"bootstrap_ttest" = private$execute_bootstrap_ttest(data, parameters),
# Default fallback
stop(paste("Unknown method:", method$method_name))
)
},
#' Execute Student's t-test
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Test parameters
#'
#' @return list. T-test results
#'
execute_students_ttest = function(data, parameters) {
# Validate data structure
if (!"group" %in% names(data) || !"response" %in% names(data)) {
stop("Data must contain 'group' and 'response' variables")
}
# Execute t-test
test_result <- t.test(
response ~ group,
data = data,
var.equal = TRUE,
alternative = parameters$alternative %||% "two.sided",
conf.level = parameters$conf_level %||% 0.95
)
# Calculate additional statistics
group_data <- split(data$response, data$group)
group_stats <- lapply(group_data, function(x) {
list(
n = length(x[!is.na(x)]),
mean = mean(x, na.rm = TRUE),
sd = sd(x, na.rm = TRUE),
se = sd(x, na.rm = TRUE) / sqrt(length(x[!is.na(x)])),
median = median(x, na.rm = TRUE),
iqr = IQR(x, na.rm = TRUE)
)
})
# Effect size calculation (Cohen's d)
pooled_sd <- sqrt(((group_stats[[1]]$n - 1) * group_stats[[1]]$sd^2 +
(group_stats[[2]]$n - 1) * group_stats[[2]]$sd^2) /
(group_stats[[1]]$n + group_stats[[2]]$n - 2))
cohens_d <- abs(group_stats[[1]]$mean - group_stats[[2]]$mean) / pooled_sd
list(
method_name = "Student's Independent Samples t-Test",
test_statistic = test_result$statistic,
degrees_freedom = test_result$parameter,
p_value = test_result$p.value,
confidence_interval = test_result$conf.int,
mean_difference = diff(test_result$estimate),
group_statistics = group_stats,
effect_size = list(
cohens_d = cohens_d,
interpretation = private$interpret_cohens_d(cohens_d)
),
assumptions_required = c("normality", "homogeneity", "independence"),
raw_result = test_result
)
},
#' Execute Welch's t-test
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Test parameters
#'
#' @return list. Welch t-test results
#'
execute_welch_ttest = function(data, parameters) {
# Execute Welch t-test
test_result <- t.test(
response ~ group,
data = data,
var.equal = FALSE, # Key difference from Student's t-test
alternative = parameters$alternative %||% "two.sided",
conf.level = parameters$conf_level %||% 0.95
)
# Calculate group statistics
group_data <- split(data$response, data$group)
group_stats <- lapply(group_data, function(x) {
list(
n = length(x[!is.na(x)]),
mean = mean(x, na.rm = TRUE),
sd = sd(x, na.rm = TRUE),
se = sd(x, na.rm = TRUE) / sqrt(length(x[!is.na(x)])),
median = median(x, na.rm = TRUE),
iqr = IQR(x, na.rm = TRUE)
)
})
# Effect size calculation (using pooled SD for comparability)
pooled_sd <- sqrt((group_stats[[1]]$sd^2 + group_stats[[2]]$sd^2) / 2)
cohens_d <- abs(group_stats[[1]]$mean - group_stats[[2]]$mean) / pooled_sd
list(
method_name = "Welch's Independent Samples t-Test",
test_statistic = test_result$statistic,
degrees_freedom = test_result$parameter,
p_value = test_result$p.value,
confidence_interval = test_result$conf.int,
mean_difference = diff(test_result$estimate),
group_statistics = group_stats,
effect_size = list(
cohens_d = cohens_d,
interpretation = private$interpret_cohens_d(cohens_d)
),
assumptions_required = c("normality", "independence"),
raw_result = test_result
)
},
#' Execute Mann-Whitney U test
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Test parameters
#'
#' @return list. Mann-Whitney U test results
#'
execute_mann_whitney = function(data, parameters) {
group_data <- split(data$response, data$group)
# Execute Mann-Whitney U test
test_result <- wilcox.test(
group_data[[1]],
group_data[[2]],
alternative = parameters$alternative %||% "two.sided",
conf.int = TRUE,
conf.level = parameters$conf_level %||% 0.95
)
# Calculate group statistics
group_stats <- lapply(group_data, function(x) {
list(
n = length(x[!is.na(x)]),
median = median(x, na.rm = TRUE),
iqr = IQR(x, na.rm = TRUE),
mean_rank = mean(rank(c(group_data[[1]], group_data[[2]]))[seq_along(x)]),
q1 = quantile(x, 0.25, na.rm = TRUE),
q3 = quantile(x, 0.75, na.rm = TRUE)
)
})
# Effect size calculation (rank-biserial correlation)
n1 <- group_stats[[1]]$n
n2 <- group_stats[[2]]$n
U <- test_result$statistic
r_rb <- 1 - (2 * U) / (n1 * n2) # Rank-biserial correlation
list(
method_name = "Mann-Whitney U Test",
test_statistic = test_result$statistic,
p_value = test_result$p.value,
confidence_interval = test_result$conf.int,
median_difference = test_result$estimate,
group_statistics = group_stats,
effect_size = list(
rank_biserial_r = r_rb,
interpretation = private$interpret_rank_biserial(abs(r_rb))
),
assumptions_required = c("independence", "ordinal_data"),
raw_result = test_result
)
},
#' Conduct sensitivity analysis
#'
#' @param data data.frame. Analysis data
#' @param primary_method StatisticalMethod. Primary analysis method
#' @param primary_results list. Primary analysis results
#' @param parameters list. Analysis parameters
#'
#' @return list. Sensitivity analysis results
#'
conduct_sensitivity_analysis = function(data, primary_method, primary_results, parameters) {
sensitivity_results <- list()
# Alternative methods sensitivity
alternative_methods <- private$get_alternative_methods(primary_method$method_name)
for (alt_method in alternative_methods) {
tryCatch({
alt_result <- private$execute_primary_analysis(data,
list(method_name = alt_method),
parameters)
sensitivity_results[[alt_method]] <- list(
p_value = alt_result$p_value,
effect_size = alt_result$effect_size,
conclusion_consistent = abs(alt_result$p_value - primary_results$p_value) < 0.05
)
}, error = function(e) {
sensitivity_results[[alt_method]] <- list(
error = e$message,
conclusion_consistent = NA
)
})
}
# Outlier removal sensitivity
if (nrow(data) > 10) {
outlier_removed_data <- private$remove_extreme_outliers(data)
if (nrow(outlier_removed_data) >= nrow(data) * 0.8) { # If less than 20% removed
tryCatch({
outlier_result <- private$execute_primary_analysis(outlier_removed_data,
primary_method,
parameters)
sensitivity_results$outlier_removal <- list(
n_removed = nrow(data) - nrow(outlier_removed_data),
p_value = outlier_result$p_value,
effect_size = outlier_result$effect_size,
conclusion_consistent = (primary_results$p_value < 0.05) == (outlier_result$p_value < 0.05)
)
}, error = function(e) {
sensitivity_results$outlier_removal <- list(error = e$message)
})
}
}
# Bootstrap sensitivity (if applicable)
if (primary_method$method_name %in% c("independent_ttest_students", "independent_ttest_welch")) {
tryCatch({
bootstrap_result <- private$bootstrap_ttest_sensitivity(data, parameters)
sensitivity_results$bootstrap <- bootstrap_result
}, error = function(e) {
sensitivity_results$bootstrap <- list(error = e$message)
})
}
# Overall sensitivity assessment
consistent_results <- sapply(sensitivity_results, function(x) {
if ("conclusion_consistent" %in% names(x)) x$conclusion_consistent else NA
})
sensitivity_summary <- list(
methods_tested = length(sensitivity_results),
consistent_conclusions = sum(consistent_results, na.rm = TRUE),
consistency_rate = mean(consistent_results, na.rm = TRUE),
robust_conclusion = mean(consistent_results, na.rm = TRUE) >= 0.75,
detailed_results = sensitivity_results
)
return(sensitivity_summary)
},
#' Analyze effect sizes and practical significance
#'
#' @param data data.frame. Analysis data
#' @param results list. Primary analysis results
#' @param context list. Analysis context
#'
#' @return list. Effect size analysis
#'
analyze_effect_sizes = function(data, results, context) {
effect_analysis <- list()
# Primary effect size from results
primary_effect <- if ("cohens_d" %in% names(results$effect_size)) {
list(
name = "Cohen's d",
value = results$effect_size$cohens_d,
interpretation = results$effect_size$interpretation,
type = "standardized_mean_difference"
)
} else if ("rank_biserial_r" %in% names(results$effect_size)) {
list(
name = "Rank-biserial correlation",
value = results$effect_size$rank_biserial_r,
interpretation = results$effect_size$interpretation,
type = "rank_based"
)
} else {
list(
name = "Unknown",
value = NA,
interpretation = "Not calculated",
type = "undefined"
)
}
effect_analysis$primary_effect_size <- primary_effect
# Additional effect size calculations
if ("group_statistics" %in% names(results)) {
group_stats <- results$group_statistics
# Glass's delta (using control group SD)
if (length(group_stats) == 2 && all(sapply(group_stats, function(x) "sd" %in% names(x)))) {
# Assume first group is control
glass_delta <- abs(group_stats[[1]]$mean - group_stats[[2]]$mean) / group_stats[[1]]$sd
effect_analysis$glass_delta <- list(
value = glass_delta,
interpretation = private$interpret_cohens_d(glass_delta), # Same interpretation
description = "Effect size using control group standard deviation"
)
}
# Overlap statistics
if (all(sapply(group_stats, function(x) all(c("mean", "sd") %in% names(x))))) {
overlap_stats <- private$calculate_distribution_overlap(group_stats)
effect_analysis$overlap_statistics <- overlap_stats
}
}
# Clinical significance assessment
if ("clinical_threshold" %in% names(context)) {
clinical_assessment <- private$assess_clinical_significance(
effect_size = primary_effect$value,
threshold = context$clinical_threshold,
context = context
)
effect_analysis$clinical_significance <- clinical_assessment
}
# Confidence intervals for effect sizes
if (!is.null(context$effect_size_ci) && context$effect_size_ci) {
effect_ci <- private$calculate_effect_size_ci(data, results, primary_effect)
effect_analysis$confidence_intervals <- effect_ci
}
return(effect_analysis)
},
#' Generate comprehensive interpretation guidance
#'
#' @param primary_results list. Primary analysis results
#' @param assumption_results list. Assumption test results
#' @param effect_analysis list. Effect size analysis
#' @param context list. Analysis context
#'
#' @return list. Interpretation guidance
#'
generate_interpretation_guidance = function(primary_results, assumption_results,
effect_analysis, context) {
interpretation <- list()
# Statistical significance interpretation
alpha_level <- context$alpha_level %||% 0.05
is_significant <- primary_results$p_value < alpha_level
interpretation$statistical_significance <- list(
is_significant = is_significant,
p_value = primary_results$p_value,
alpha_level = alpha_level,
interpretation = if (is_significant) {
paste0("The result is statistically significant at the ",
alpha_level, " level (p = ", round(primary_results$p_value, 4), ").")
} else {
paste0("The result is not statistically significant at the ",
alpha_level, " level (p = ", round(primary_results$p_value, 4), ").")
}
)
# Effect size interpretation
effect_size_value <- effect_analysis$primary_effect_size$value
interpretation$effect_size <- list(
magnitude = effect_analysis$primary_effect_size$interpretation,
value = effect_size_value,
practical_importance = private$assess_practical_importance(
effect_size_value,
effect_analysis$primary_effect_size$type,
context
)
)
# Assumption validity assessment
interpretation$assumption_validity <- list(
overall_assessment = assumption_results$overall_assessment$suitable,
violated_assumptions = assumption_results$overall_assessment$violations,
impact_on_results = private$assess_assumption_impact(
assumption_results,
primary_results$method_name
),
recommendations = assumption_results$overall_assessment$recommendations
)
# Clinical significance (if applicable)
if ("clinical_significance" %in% names(effect_analysis)) {
interpretation$clinical_significance <- effect_analysis$clinical_significance
}
# Confidence interval interpretation
if ("confidence_interval" %in% names(primary_results)) {
interpretation$confidence_interval <- list(
interval = primary_results$confidence_interval,
level = context$conf_level %||% 0.95,
interpretation = private$interpret_confidence_interval(
primary_results$confidence_interval,
primary_results$method_name
)
)
}
# Overall conclusion and recommendations
interpretation$conclusion <- private$generate_overall_conclusion(
statistical_significance = interpretation$statistical_significance,
effect_size = interpretation$effect_size,
assumption_validity = interpretation$assumption_validity,
clinical_significance = interpretation$clinical_significance %||% NULL,
context = context
)
# Limitations and caveats
interpretation$limitations <- private$identify_analysis_limitations(
primary_results,
assumption_results,
effect_analysis,
context
)
return(interpretation)
},
#' Helper function to interpret Cohen's d
#'
#' @param d numeric. Cohen's d value
#'
#' @return character. Interpretation
#'
interpret_cohens_d = function(d) {
abs_d <- abs(d)
if (abs_d < 0.2) {
"Very small effect"
} else if (abs_d < 0.5) {
"Small effect"
} else if (abs_d < 0.8) {
"Medium effect"
} else {
"Large effect"
}
},
#' Helper function to interpret rank-biserial correlation
#'
#' @param r numeric. Rank-biserial correlation value
#'
#' @return character. Interpretation
#'
interpret_rank_biserial = function(r) {
abs_r <- abs(r)
if (abs_r < 0.1) {
"Very small effect"
} else if (abs_r < 0.3) {
"Small effect"
} else if (abs_r < 0.5) {
"Medium effect"
} else {
"Large effect"
}
},
#' Get alternative methods for sensitivity analysis
#'
#' @param primary_method character. Primary method name
#'
#' @return character vector. Alternative method names
#'
get_alternative_methods = function(primary_method) {
switch(primary_method,
"independent_ttest_students" = c("independent_ttest_welch", "mann_whitney_u"),
"independent_ttest_welch" = c("independent_ttest_students", "mann_whitney_u"),
"mann_whitney_u" = c("independent_ttest_welch", "bootstrap_ttest"),
"bootstrap_ttest" = c("independent_ttest_welch", "mann_whitney_u"),
c() # Default: no alternatives
)
},
#' Remove extreme outliers from data
#'
#' @param data data.frame. Input data
#'
#' @return data.frame. Data with extreme outliers removed
#'
remove_extreme_outliers = function(data) {
if (!"response" %in% names(data)) {
return(data)
}
response_data <- data$response[!is.na(data$response)]
if (length(response_data) < 5) {
return(data) # Don't remove outliers for very small samples
}
# Use 3 IQR rule for extreme outliers
q1 <- quantile(response_data, 0.25)
q3 <- quantile(response_data, 0.75)
iqr <- q3 - q1
lower_bound <- q1 - 3 * iqr
upper_bound <- q3 + 3 * iqr
# Keep only non-extreme values
keep_indices <- !is.na(data$response) &
data$response >= lower_bound &
data$response <= upper_bound
return(data[keep_indices, ])
},
#' Bootstrap t-test sensitivity analysis
#'
#' @param data data.frame. Analysis data
#' @param parameters list. Analysis parameters
#' @param n_bootstrap integer. Number of bootstrap samples
#'
#' @return list. Bootstrap sensitivity results
#'
bootstrap_ttest_sensitivity = function(data, parameters, n_bootstrap = 1000) {
group_data <- split(data$response, data$group)
group1 <- group_data[[1]][!is.na(group_data[[1]])]
group2 <- group_data[[2]][!is.na(group_data[[2]])]
# Bootstrap sampling
bootstrap_results <- replicate(n_bootstrap, {
# Resample with replacement
boot_group1 <- sample(group1, length(group1), replace = TRUE)
boot_group2 <- sample(group2, length(group2), replace = TRUE)
# Calculate statistics
mean_diff <- mean(boot_group1) - mean(boot_group2)
pooled_sd <- sqrt((var(boot_group1) + var(boot_group2)) / 2)
cohens_d <- mean_diff / pooled_sd
# Bootstrap t-test
t_result <- t.test(boot_group1, boot_group2)
list(
mean_difference = mean_diff,
cohens_d = cohens_d,
p_value = t_result$p.value,
t_statistic = t_result$statistic
)
}, simplify = FALSE)
# Summarize bootstrap results
p_values <- sapply(bootstrap_results, function(x) x$p_value)
cohens_d_values <- sapply(bootstrap_results, function(x) x$cohens_d)
list(
n_bootstrap = n_bootstrap,
p_value_median = median(p_values, na.rm = TRUE),
p_value_ci = quantile(p_values, c(0.025, 0.975), na.rm = TRUE),
cohens_d_median = median(cohens_d_values, na.rm = TRUE),
cohens_d_ci = quantile(cohens_d_values, c(0.025, 0.975), na.rm = TRUE),
proportion_significant = mean(p_values < 0.05, na.rm = TRUE)
)
},
#' Calculate distribution overlap statistics
#'
#' @param group_stats list. Group statistics
#'
#' @return list. Overlap statistics
#'
calculate_distribution_overlap = function(group_stats) {
if (length(group_stats) != 2) {
return(list(error = "Overlap calculation requires exactly 2 groups"))
}
# Extract means and SDs
mean1 <- group_stats[[1]]$mean
mean2 <- group_stats[[2]]$mean
sd1 <- group_stats[[1]]$sd
sd2 <- group_stats[[2]]$sd
# Cohen's U3 (percentage of group 2 below mean of group 1)
u3 <- pnorm(mean1, mean2, sd2)
# Probability of superiority
prob_superiority <- pnorm((mean1 - mean2) / sqrt(sd1^2 + sd2^2))
# Overlap coefficient (approximate)
if (sd1 > 0 && sd2 > 0) {
overlap_coef <- 2 * pnorm(-abs(mean1 - mean2) / (2 * sqrt((sd1^2 + sd2^2) / 2)))
} else {
overlap_coef <- NA
}
list(
cohens_u3 = u3,
probability_superiority = prob_superiority,
overlap_coefficient = overlap_coef,
interpretation = list(
u3_meaning = paste0(round(u3 * 100, 1), "% of group 2 scores below group 1 mean"),
superiority_meaning = paste0(round(prob_superiority * 100, 1),
"% probability that group 1 score > group 2 score")
)
)
},
#' Assess clinical significance
#'
#' @param effect_size numeric. Effect size value
#' @param threshold numeric. Clinical significance threshold
#' @param context list. Analysis context
#'
#' @return list. Clinical significance assessment
#'
assess_clinical_significance = function(effect_size, threshold, context) {
is_clinically_significant <- abs(effect_size) >= threshold
list(
is_clinically_significant = is_clinically_significant,
effect_size = effect_size,
threshold = threshold,
interpretation = if (is_clinically_significant) {
paste0("The effect size (", round(effect_size, 3),
") meets or exceeds the clinical significance threshold (",
threshold, ").")
} else {
paste0("The effect size (", round(effect_size, 3),
") does not meet the clinical significance threshold (",
threshold, ").")
},
clinical_context = context$clinical_context %||% "No specific context provided"
)
},
#' Assess practical importance of effect size
#'
#' @param effect_size numeric. Effect size value
#' @param effect_type character. Type of effect size
#' @param context list. Analysis context
#'
#' @return list. Practical importance assessment
#'
assess_practical_importance = function(effect_size, effect_type, context) {
# Domain-specific thresholds
domain_thresholds <- context$domain_thresholds %||% list(
clinical = list(small = 0.2, medium = 0.5, large = 0.8),
educational = list(small = 0.25, medium = 0.6, large = 1.0),
psychological = list(small = 0.2, medium = 0.5, large = 0.8)
)
domain <- context$domain %||% "clinical"
thresholds <- domain_thresholds[[domain]] %||% domain_thresholds$clinical
abs_effect <- abs(effect_size)
practical_magnitude <- if (abs_effect < thresholds$small) {
"Minimal practical importance"
} else if (abs_effect < thresholds$medium) {
"Small practical importance"
} else if (abs_effect < thresholds$large) {
"Moderate practical importance"
} else {
"Large practical importance"
}
list(
magnitude = practical_magnitude,
domain_context = domain,
thresholds_used = thresholds,
recommendation = private$generate_practical_recommendation(
practical_magnitude,
context
)
)
},
#' Generate practical recommendation based on effect size
#'
#' @param magnitude character. Practical magnitude assessment
#' @param context list. Analysis context
#'
#' @return character. Practical recommendation
#'
generate_practical_recommendation = function(magnitude, context) {
switch(magnitude,
"Minimal practical importance" =
"The observed difference may have limited practical significance. Consider whether the effect is meaningful in the specific context.",
"Small practical importance" =
"The observed difference suggests a small but potentially meaningful effect. Evaluate cost-benefit considerations for implementation.",
"Moderate practical importance" =
"The observed difference indicates a moderate effect that is likely to be practically meaningful. Strong consideration for implementation is warranted.",
"Large practical importance" =
"The observed difference indicates a large effect with substantial practical importance. Implementation should be strongly considered.",
"Unknown practical importance"
)
}
)
)Assumption Testing Framework
Comprehensive Assumption Validation
Create a sophisticated assumption testing system for statistical rigor:
# File: R/assumption_testing_framework.R
#' Comprehensive Assumption Testing Framework
#'
#' @description Advanced assumption testing system with automated validation,
#' intelligent interpretation, and regulatory compliance documentation.
#'
#' @export
AssumptionTestingFramework <- R6::R6Class(
"AssumptionTestingFramework",
public = list(
#' @field config assumption testing configuration
config = NULL,
#' @field test_registry collection of available assumption tests
test_registry = NULL,
#' Initialize assumption testing framework
#'
#' @param config list. Assumption testing configuration
#'
initialize = function(config = list()) {
self$config <- private$default_assumption_config(config)
self$test_registry <- private$initialize_test_registry()
},
#' Test all relevant assumptions for analysis
#'
#' @param data data.frame. Analysis data
#' @param analysis_type character. Type of statistical analysis
#' @param parameters list. Analysis parameters
#'
#' @return AssumptionTestResults. Comprehensive assumption test results
#'
test_all_assumptions = function(data, analysis_type, parameters = list()) {
# Determine required assumptions for analysis type
required_assumptions <- private$get_required_assumptions(analysis_type)
assumption_results <- list()
# Test each required assumption
for (assumption in required_assumptions) {
test_result <- switch(assumption,
"normality" = self$test_normality(data, parameters),
"homogeneity" = self$test_homogeneity(data, parameters),
"independence" = self$test_independence(data, parameters),
"linearity" = self$test_linearity(data, parameters),
"outliers" = self$detect_outliers(data, parameters),
list(error = paste("Unknown assumption:", assumption))
)
assumption_results[[assumption]] <- test_result
}
# Generate overall assessment
overall_assessment <- private$assess_overall_assumptions(
assumption_results,
analysis_type,
parameters
)
AssumptionTestResults$new(
analysis_type = analysis_type,
assumption_results = assumption_results,
overall_assessment = overall_assessment,
parameters = parameters
)
},
#' Test normality assumption
#'
#' @param data data.frame. Input data
#' @param parameters list. Test parameters
#'
#' @return list. Normality test results
#'
test_normality = function(data, parameters = list()) {
normality_results <- list()
# Identify numeric variables to test
if ("response" %in% names(data)) {
test_vars <- "response"
} else {
numeric_vars <- names(data)[sapply(data, is.numeric)]
test_vars <- numeric_vars[1:min(3, length(numeric_vars))] # Limit to 3 variables
}
for (var_name in test_vars) {
var_data <- data[[var_name]][!is.na(data[[var_name]])]
if (length(var_data) < 3) {
normality_results[[var_name]] <- list(
error = "Insufficient data for normality testing",
n = length(var_data)
)
next
}
var_results <- list()
# Shapiro-Wilk test (for n <= 5000)
if (length(var_data) <= 5000) {
tryCatch({
shapiro_result <- shapiro.test(var_data)
var_results$shapiro_wilk <- list(
statistic = shapiro_result$statistic,
p_value = shapiro_result$p.value,
method = "Shapiro-Wilk",
conclusion = shapiro_result$p.value >= self$config$normality$alpha_level
)
}, error = function(e) {
var_results$shapiro_wilk <- list(error = e$message)
})
}
# Anderson-Darling test
if (requireNamespace("nortest", quietly = TRUE)) {
tryCatch({
ad_result <- nortest::ad.test(var_data)
var_results$anderson_darling <- list(
statistic = ad_result$statistic,
p_value = ad_result$p.value,
method = "Anderson-Darling",
conclusion = ad_result$p.value >= self$config$normality$alpha_level
)
}, error = function(e) {
var_results$anderson_darling <- list(error = e$message)
})
}
# Kolmogorov-Smirnov test
tryCatch({
# Compare to normal distribution with sample mean and sd
ks_result <- ks.test(var_data, "pnorm", mean(var_data), sd(var_data))
var_results$kolmogorov_smirnov <- list(
statistic = ks_result$statistic,
p_value = ks_result$p.value,
method = "Kolmogorov-Smirnov",
conclusion = ks_result$p.value >= self$config$normality$alpha_level
)
}, error = function(e) {
var_results$kolmogorov_smirnov <- list(error = e$message)
})
# Descriptive statistics for normality assessment
var_results$descriptive_assessment <- private$assess_normality_descriptive(var_data)
# Overall normality conclusion for this variable
valid_tests <- var_results[!sapply(var_results, function(x) "error" %in% names(x))]
if (length(valid_tests) > 0) {
test_conclusions <- sapply(valid_tests, function(x) {
if ("conclusion" %in% names(x)) x$conclusion else NA
})
# Conservative approach: assume non-normal if any test rejects
var_results$overall_conclusion <- list(
assumes_normal = all(test_conclusions, na.rm = TRUE),
n_tests = sum(!is.na(test_conclusions)),
consensus = mean(test_conclusions, na.rm = TRUE)
)
}
normality_results[[var_name]] <- var_results
}
return(normality_results)
},
#' Test homogeneity of variance assumption
#'
#' @param data data.frame. Input data with group and response variables
#' @param parameters list. Test parameters
#'
#' @return list. Homogeneity test results
#'
test_homogeneity = function(data, parameters = list()) {
if (!"group" %in% names(data) || !"response" %in% names(data)) {
return(list(error = "Data must contain 'group' and 'response' variables"))
}
# Remove missing values
complete_data <- data[!is.na(data$group) & !is.na(data$response), ]
if (nrow(complete_data) < 6) {
return(list(error = "Insufficient data for homogeneity testing"))
}
group_counts <- table(complete_data$group)
if (any(group_counts < 2)) {
return(list(error = "Each group must have at least 2 observations"))
}
homogeneity_results <- list()
# Levene's test (robust)
tryCatch({
if (requireNamespace("car", quietly = TRUE)) {
levene_result <- car::leveneTest(response ~ group, data = complete_data)
homogeneity_results$levene <- list(
statistic = levene_result$`F value`[1],
p_value = levene_result$`Pr(>F)`[1],
df = levene_result$Df,
method = "Levene's Test",
conclusion = levene_result$`Pr(>F)`[1] >= self$config$homogeneity$alpha_level
)
} else {
# Manual Levene's test implementation
levene_manual <- private$manual_levene_test(complete_data)
homogeneity_results$levene <- levene_manual
}
}, error = function(e) {
homogeneity_results$levene <- list(error = e$message)
})
# Bartlett's test (assumes normality)
tryCatch({
bartlett_result <- bartlett.test(response ~ group, data = complete_data)
homogeneity_results$bartlett <- list(
statistic = bartlett_result$statistic,
p_value = bartlett_result$p.value,
df = bartlett_result$parameter,
method = "Bartlett's Test",
conclusion = bartlett_result$p.value >= self$config$homogeneity$alpha_level,
note = "Assumes normality within groups"
)
}, error = function(e) {
homogeneity_results$bartlett <- list(error = e$message)
})
# Brown-Forsythe test (robust alternative to Levene's)
tryCatch({
bf_result <- private$brown_forsythe_test(complete_data)
homogeneity_results$brown_forsythe <- bf_result
}, error = function(e) {
homogeneity_results$brown_forsythe <- list(error = e$message)
})
# Descriptive assessment
group_variances <- by(complete_data$response, complete_data$group, var, na.rm = TRUE)
homogeneity_results$descriptive_assessment <- list(
group_variances = as.list(group_variances),
variance_ratio = max(group_variances) / min(group_variances),
variance_ratio_interpretation = private$interpret_variance_ratio(
max(group_variances) / min(group_variances)
)
)
# Overall homogeneity conclusion
valid_tests <- homogeneity_results[!sapply(homogeneity_results, function(x) "error" %in% names(x))]
test_conclusions <- sapply(valid_tests, function(x) {
if ("conclusion" %in% names(x)) x$conclusion else NA
})
if (length(test_conclusions) > 0) {
homogeneity_results$overall_conclusion <- list(
assumes_homogeneity = mean(test_conclusions, na.rm = TRUE) >= 0.5,
n_tests = sum(!is.na(test_conclusions)),
consensus = mean(test_conclusions, na.rm = TRUE),
recommendation = if (mean(test_conclusions, na.rm = TRUE) >= 0.5) {
"Homogeneity assumption appears reasonable"
} else {
"Consider using methods that don't assume equal variances (e.g., Welch's t-test)"
}
)
}
return(homogeneity_results)
},
#' Test independence assumption
#'
#' @param data data.frame. Input data
#' @param parameters list. Test parameters
#'
#' @return list. Independence test results
#'
test_independence = function(data, parameters = list()) {
independence_results <- list()
# Note about independence assumption
independence_results$note <- paste(
"Independence is primarily a design issue that should be ensured",
"through proper randomization and data collection procedures.",
"Statistical tests can only detect certain types of dependence."
)
# For time series or ordered data, test for autocorrelation
if ("response" %in% names(data) && nrow(data) >= 10) {
response_data <- data$response[!is.na(data$response)]
# Durbin-Watson test for autocorrelation
if (requireNamespace("lmtest", quietly = TRUE)) {
tryCatch({
# Simple linear model for Durbin-Watson
temp_data <- data.frame(
y = response_data,
x = seq_along(response_data)
)
temp_model <- lm(y ~ x, data = temp_data)
dw_result <- lmtest::dwtest(temp_model)
independence_results$durbin_watson <- list(
statistic = dw_result$statistic,
p_value = dw_result$p.value,
method = "Durbin-Watson Test",
conclusion = dw_result$p.value >= self$config$independence$alpha_level,
interpretation = "Tests for first-order autocorrelation"
)
}, error = function(e) {
independence_results$durbin_watson <- list(error = e$message)
})
}
# Runs test for randomness
tryCatch({
runs_result <- private$runs_test(response_data)
independence_results$runs_test <- runs_result
}, error = function(e) {
independence_results$runs_test <- list(error = e$message)
})
}
# Design-based independence assessment
independence_results$design_assessment <- list(
data_collection_method = parameters$data_collection_method %||% "Not specified",
randomization_used = parameters$randomization %||% "Not specified",
temporal_ordering = "order" %in% names(data) || "time" %in% names(data),
clustering_present = parameters$clustering %||% "Not specified",
repeated_measures = parameters$repeated_measures %||% "Not specified"
)
# Overall independence assessment
if (any(sapply(independence_results, function(x) "conclusion" %in% names(x)))) {
test_conclusions <- sapply(independence_results, function(x) {
if ("conclusion" %in% names(x)) x$conclusion else NA
})
independence_results$overall_conclusion <- list(
statistical_evidence_for_independence = mean(test_conclusions, na.rm = TRUE) >= 0.5,
primary_concern = "Ensure independence through study design",
recommendation = paste(
"Independence is best ensured through proper study design.",
"If data collection involved clustering, temporal dependence,",
"or repeated measures, consider appropriate analytical methods."
)
)
}
return(independence_results)
},
#' Detect outliers in data
#'
#' @param data data.frame. Input data
#' @param parameters list. Detection parameters
#'
#' @return list. Outlier detection results
#'
detect_outliers = function(data, parameters = list()) {
outlier_results <- list()
# Identify numeric variables for outlier detection
if ("response" %in% names(data)) {
test_vars <- "response"
} else {
numeric_vars <- names(data)[sapply(data, is.numeric)]
test_vars <- numeric_vars[1:min(3, length(numeric_vars))]
}
for (var_name in test_vars) {
var_data <- data[[var_name]][!is.na(data[[var_name]])]
if (length(var_data) < 5) {
outlier_results[[var_name]] <- list(
error = "Insufficient data for outlier detection",
n = length(var_data)
)
next
}
var_outlier_results <- list()
# IQR method (1.5 * IQR rule)
q1 <- quantile(var_data, 0.25)
q3 <- quantile(var_data, 0.75)
iqr <- q3 - q1
iqr_lower <- q1 - 1.5 * iqr
iqr_upper <- q3 + 1.5 * iqr
iqr_outliers <- which(var_data < iqr_lower | var_data > iqr_upper)
var_outlier_results$iqr_method <- list(
method = "IQR (1.5 * IQR rule)",
outlier_indices = iqr_outliers,
outlier_count = length(iqr_outliers),
outlier_proportion = length(iqr_outliers) / length(var_data),
bounds = c(iqr_lower, iqr_upper),
outlier_values = var_data[iqr_outliers]
)
# Modified Z-score method
median_val <- median(var_data)
mad_val <- mad(var_data)
if (mad_val > 0) {
modified_z <- 0.6745 * (var_data - median_val) / mad_val
mz_outliers <- which(abs(modified_z) > 3.5)
var_outlier_results$modified_zscore <- list(
method = "Modified Z-score",
outlier_indices = mz_outliers,
outlier_count = length(mz_outliers),
outlier_proportion = length(mz_outliers) / length(var_data),
threshold = 3.5,
outlier_values = var_data[mz_outliers],
modified_z_scores = modified_z[mz_outliers]
)
}
# Isolation Forest (if sufficient data)
if (length(var_data) >= 20 && requireNamespace("isotree", quietly = TRUE)) {
tryCatch({
iso_model <- isotree::isolation.forest(matrix(var_data, ncol = 1))
anomaly_scores <- isotree::predict.isolation_forest(iso_model, matrix(var_data, ncol = 1))
iso_threshold <- quantile(anomaly_scores, 0.95) # Top 5% as outliers
iso_outliers <- which(anomaly_scores > iso_threshold)
var_outlier_results$isolation_forest <- list(
method = "Isolation Forest",
outlier_indices = iso_outliers,
outlier_count = length(iso_outliers),
outlier_proportion = length(iso_outliers) / length(var_data),
threshold = iso_threshold,
outlier_values = var_data[iso_outliers],
anomaly_scores = anomaly_scores[iso_outliers]
)
}, error = function(e) {
var_outlier_results$isolation_forest <- list(error = e$message)
})
}
# Consensus outlier identification
all_outlier_indices <- unique(c(
iqr_outliers,
if (exists("mz_outliers")) mz_outliers else c(),
if ("isolation_forest" %in% names(var_outlier_results) &&
!"error" %in% names(var_outlier_results$isolation_forest)) {
var_outlier_results$isolation_forest$outlier_indices
} else c()
))
# Count how many methods identify each outlier
outlier_consensus <- sapply(all_outlier_indices, function(idx) {
count <- 0
if (idx %in% iqr_outliers) count <- count + 1
if (exists("mz_outliers") && idx %in% mz_outliers) count <- count + 1
if ("isolation_forest" %in% names(var_outlier_results) &&
!"error" %in% names(var_outlier_results$isolation_forest) &&
idx %in% var_outlier_results$isolation_forest$outlier_indices) count <- count + 1
count
})
# High-confidence outliers (identified by multiple methods)
high_confidence_outliers <- all_outlier_indices[outlier_consensus >= 2]
var_outlier_results$consensus <- list(
all_potential_outliers = all_outlier_indices,
high_confidence_outliers = high_confidence_outliers,
outlier_consensus_scores = outlier_consensus,
recommendation = if (length(high_confidence_outliers) > 0) {
paste("Consider investigating", length(high_confidence_outliers),
"high-confidence outliers")
} else if (length(all_outlier_indices) > 0) {
paste("Potential outliers detected but with low consensus.",
"Manual review recommended")
} else {
"No significant outliers detected"
}
)
outlier_results[[var_name]] <- var_outlier_results
}
return(outlier_results)
}
),
private = list(
#' Default assumption testing configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete configuration
#'
default_assumption_config = function(config) {
default_config <- list(
normality = list(
tests = c("shapiro_wilk", "anderson_darling", "kolmogorov_smirnov"),
alpha_level = 0.05,
min_sample_size = 3,
max_sample_size_shapiro = 5000,
descriptive_thresholds = list(
skewness = 2.0,
kurtosis = 7.0
)
),
homogeneity = list(
tests = c("levene", "bartlett", "brown_forsythe"),
alpha_level = 0.05,
variance_ratio_concern = 4.0,
prefer_robust = TRUE
),
independence = list(
tests = c("durbin_watson", "runs_test"),
alpha_level = 0.05,
design_based_assessment = TRUE
),
outliers = list(
methods = c("iqr", "modified_zscore", "isolation_forest"),
iqr_multiplier = 1.5,
modified_z_threshold = 3.5,
isolation_forest_threshold = 0.95,
consensus_threshold = 2
)
)
modifyList(default_config, config)
},
#' Initialize test registry
#'
#' @return list. Available statistical tests
#'
initialize_test_registry = function() {
list(
normality = list(
shapiro_wilk = list(
function_name = "shapiro.test",
package = "stats",
sample_size_limits = c(3, 5000),
description = "Shapiro-Wilk test for normality"
),
anderson_darling = list(
function_name = "ad.test",
package = "nortest",
sample_size_limits = c(7, Inf),
description = "Anderson-Darling test for normality"
),
kolmogorov_smirnov = list(
function_name = "ks.test",
package = "stats",
sample_size_limits = c(3, Inf),
description = "Kolmogorov-Smirnov test for normality"
)
),
homogeneity = list(
levene = list(
function_name = "leveneTest",
package = "car",
description = "Levene's test for homogeneity of variance"
),
bartlett = list(
function_name = "bartlett.test",
package = "stats",
description = "Bartlett's test for homogeneity of variance"
)
)
)
},
#' Get required assumptions for analysis type
#'
#' @param analysis_type character. Type of analysis
#'
#' @return character vector. Required assumptions
#'
get_required_assumptions = function(analysis_type) {
switch(analysis_type,
"independent_ttest" = c("normality", "homogeneity", "independence", "outliers"),
"paired_ttest" = c("normality", "independence", "outliers"),
"one_sample_ttest" = c("normality", "independence", "outliers"),
"anova" = c("normality", "homogeneity", "independence", "outliers"),
"correlation" = c("normality", "linearity", "independence", "outliers"),
"regression" = c("normality", "linearity", "independence", "homogeneity", "outliers"),
# Default assumptions for unknown analysis types
c("normality", "independence", "outliers")
)
},
#' Assess overall assumption validity
#'
#' @param assumption_results list. Results from individual assumption tests
#' @param analysis_type character. Type of analysis
#' @param parameters list. Analysis parameters
#'
#' @return list. Overall assessment
#'
assess_overall_assumptions = function(assumption_results, analysis_type, parameters) {
violations <- list()
warnings <- list()
recommendations <- list()
# Check each assumption
for (assumption_name in names(assumption_results)) {
assumption_result <- assumption_results[[assumption_name]]
if ("error" %in% names(assumption_result)) {
warnings <- append(warnings,
paste("Could not test", assumption_name, "assumption:",
assumption_result$error))
next
}
assumption_met <- switch(assumption_name,
"normality" = private$assess_normality_overall(assumption_result),
"homogeneity" = private$assess_homogeneity_overall(assumption_result),
"independence" = private$assess_independence_overall(assumption_result),
"outliers" = private$assess_outliers_overall(assumption_result),
TRUE # Default: assume met if unknown
)
if (!assumption_met$met) {
violations <- append(violations, assumption_name)
warnings <- append(warnings, assumption_met$warning)
recommendations <- append(recommendations, assumption_met$recommendation)
}
}
# Overall suitability assessment
critical_violations <- intersect(violations, c("normality", "independence"))
suitable_for_analysis <- length(critical_violations) == 0
# Generate method recommendations
method_recommendations <- private$generate_method_recommendations(
violations, analysis_type, assumption_results
)
list(
suitable = suitable_for_analysis,
violations = violations,
critical_violations = critical_violations,
warnings = warnings,
recommendations = c(recommendations, method_recommendations),
assumption_summary = private$create_assumption_summary(assumption_results),
suggested_alternatives = if (!suitable_for_analysis) {
private$suggest_alternative_methods(violations, analysis_type)
} else NULL
)
},
#' Assess normality assumption overall
#'
#' @param normality_result list. Normality test results
#'
#' @return list. Assessment result
#'
assess_normality_overall = function(normality_result) {
if (length(normality_result) == 0) {
return(list(met = FALSE, warning = "No normality tests completed"))
}
# Look for overall conclusions in each variable
overall_conclusions <- sapply(normality_result, function(var_result) {
if ("overall_conclusion" %in% names(var_result)) {
var_result$overall_conclusion$assumes_normal
} else {
NA
}
})
if (all(is.na(overall_conclusions))) {
return(list(
met = FALSE,
warning = "Could not determine normality status",
recommendation = "Manual inspection of data distribution recommended"
))
}
normality_met <- all(overall_conclusions, na.rm = TRUE)
list(
met = normality_met,
warning = if (!normality_met) "Normality assumption appears violated" else NULL,
recommendation = if (!normality_met) {
"Consider non-parametric alternatives or data transformation"
} else NULL
)
},
#' Assess homogeneity assumption overall
#'
#' @param homogeneity_result list. Homogeneity test results
#'
#' @return list. Assessment result
#'
assess_homogeneity_overall = function(homogeneity_result) {
if ("overall_conclusion" %in% names(homogeneity_result)) {
homogeneity_met <- homogeneity_result$overall_conclusion$assumes_homogeneity
list(
met = homogeneity_met,
warning = if (!homogeneity_met) "Homogeneity of variance assumption appears violated" else NULL,
recommendation = if (!homogeneity_met) {
"Consider Welch's t-test or other methods that don't assume equal variances"
} else NULL
)
} else {
list(
met = FALSE,
warning = "Could not assess homogeneity of variance",
recommendation = "Manual inspection of group variances recommended"
)
}
},
#' Assess independence assumption overall
#'
#' @param independence_result list. Independence test results
#'
#' @return list. Assessment result
#'
assess_independence_overall = function(independence_result) {
if ("overall_conclusion" %in% names(independence_result)) {
independence_met <- independence_result$overall_conclusion$statistical_evidence_for_independence
list(
met = independence_met,
warning = if (!independence_met) {
"Statistical evidence suggests potential dependence in data"
} else {
"Independence primarily depends on study design"
},
recommendation = "Ensure independence through proper study design and data collection"
)
} else {
list(
met = TRUE, # Assume met if cannot test
warning = "Independence assumption not statistically testable",
recommendation = "Ensure independence through study design"
)
}
},
#' Assess outliers overall
#'
#' @param outlier_result list. Outlier detection results
#'
#' @return list. Assessment result
#'
assess_outliers_overall = function(outlier_result) {
# Count high-confidence outliers across all variables
total_high_confidence <- 0
total_observations <- 0
for (var_result in outlier_result) {
if ("consensus" %in% names(var_result)) {
total_high_confidence <- total_high_confidence +
length(var_result$consensus$high_confidence_outliers)
# Estimate total observations from first method
if ("iqr_method" %in% names(var_result)) {
total_observations <- total_observations +
length(var_result$iqr_method$outlier_values) +
var_result$iqr_method$outlier_count
}
}
}
outlier_proportion <- if (total_observations > 0) {
total_high_confidence / total_observations
} else 0
# Consider problematic if >5% are high-confidence outliers
outliers_problematic <- outlier_proportion > 0.05
list(
met = !outliers_problematic,
warning = if (outliers_problematic) {
paste("High proportion of outliers detected:",
round(outlier_proportion * 100, 1), "%")
} else if (total_high_confidence > 0) {
paste("Some outliers detected but within acceptable range")
} else NULL,
recommendation = if (outliers_problematic) {
"Consider outlier treatment or robust statistical methods"
} else if (total_high_confidence > 0) {
"Review identified outliers for data entry errors or valid extreme values"
} else NULL
)
},
#' Manual implementation of Levene's test
#'
#' @param data data.frame. Data with group and response variables
#'
#' @return list. Levene's test results
#'
manual_levene_test = function(data) {
# Calculate group medians
group_medians <- by(data$response, data$group, median, na.rm = TRUE)
# Calculate absolute deviations from group medians
data$abs_dev <- abs(data$response - group_medians[data$group])
# Perform ANOVA on absolute deviations
anova_result <- anova(lm(abs_dev ~ group, data = data))
list(
statistic = anova_result$`F value`[1],
p_value = anova_result$`Pr(>F)`[1],
df = anova_result$Df,
method = "Levene's Test (manual implementation)",
conclusion = anova_result$`Pr(>F)`[1] >= self$config$homogeneity$alpha_level
)
},
#' Brown-Forsythe test implementation
#'
#' @param data data.frame. Data with group and response variables
#'
#' @return list. Brown-Forsythe test results
#'
brown_forsythe_test = function(data) {
# Similar to Levene's but uses group medians instead of means
# This is actually what our manual Levene's test implements
levene_result <- private$manual_levene_test(data)
levene_result$method <- "Brown-Forsythe Test"
return(levene_result)
},
#' Runs test for randomness
#'
#' @param x numeric vector. Data series
#'
#' @return list. Runs test results
#'
runs_test = function(x) {
if (length(x) < 10) {
return(list(error = "Insufficient data for runs test"))
}
# Convert to binary sequence (above/below median)
median_x <- median(x, na.rm = TRUE)
binary_seq <- as.numeric(x > median_x)
# Count runs
runs <- 1
for (i in 2:length(binary_seq)) {
if (binary_seq[i] != binary_seq[i-1]) {
runs <- runs + 1
}
}
# Calculate expected runs and standard deviation
n1 <- sum(binary_seq)
n2 <- length(binary_seq) - n1
if (n1 == 0 || n2 == 0) {
return(list(error = "All values are on one side of median"))
}
expected_runs <- (2 * n1 * n2) / (n1 + n2) + 1
var_runs <- (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) /
((n1 + n2)^2 * (n1 + n2 - 1))
if (var_runs <= 0) {
return(list(error = "Cannot calculate variance for runs test"))
}
# Z-score and p-value
z_score <- (runs - expected_runs) / sqrt(var_runs)
p_value <- 2 * (1 - pnorm(abs(z_score)))
list(
statistic = runs,
expected_runs = expected_runs,
z_score = z_score,
p_value = p_value,
method = "Runs Test for Randomness",
conclusion = p_value >= self$config$independence$alpha_level,
interpretation = "Tests whether sequence is random"
)
},
#' Assess normality using descriptive statistics
#'
#' @param x numeric vector. Data to assess
#'
#' @return list. Descriptive normality assessment
#'
assess_normality_descriptive = function(x) {
if (length(x) < 4) {
return(list(error = "Insufficient data for descriptive assessment"))
}
# Calculate skewness and kurtosis
mean_x <- mean(x)
sd_x <- sd(x)
n <- length(x)
# Skewness
skewness <- sum((x - mean_x)^3) / (n * sd_x^3)
# Kurtosis (excess kurtosis)
kurtosis <- sum((x - mean_x)^4) / (n * sd_x^4) - 3
# Interpret skewness and kurtosis
skew_interpretation <- if (abs(skewness) < 0.5) {
"Approximately symmetric"
} else if (abs(skewness) < 1.0) {
"Moderately skewed"
} else {
"Highly skewed"
}
kurt_interpretation <- if (abs(kurtosis) < 0.5) {
"Normal kurtosis"
} else if (kurtosis > 0.5) {
"Heavy-tailed (leptokurtic)"
} else {
"Light-tailed (platykurtic)"
}
# Overall assessment
likely_normal <- abs(skewness) < self$config$normality$descriptive_thresholds$skewness &&
abs(kurtosis) < self$config$normality$descriptive_thresholds$kurtosis
list(
skewness = skewness,
kurtosis = kurtosis,
skew_interpretation = skew_interpretation,
kurt_interpretation = kurt_interpretation,
likely_normal = likely_normal,
sample_size = n
)
},
#' Interpret variance ratio
#'
#' @param ratio numeric. Variance ratio (max/min)
#'
#' @return character. Interpretation
#'
interpret_variance_ratio = function(ratio) {
if (ratio <= 2) {
"Variances are quite similar"
} else if (ratio <= 4) {
"Moderate difference in variances"
} else if (ratio <= 10) {
"Large difference in variances"
} else {
"Very large difference in variances"
}
}
)
)Intelligent Method Selection
Statistical Method Selector
Create an intelligent system for selecting optimal statistical methods:
# File: R/statistical_method_selector.R
#' Statistical Method Selector
#'
#' @description Intelligent method selection system that chooses optimal
#' statistical approaches based on data characteristics and assumption test results.
#'
#' @export
StatisticalMethodSelector <- R6::R6Class(
"StatisticalMethodSelector",
public = list(
#' @field config method selection configuration
config = NULL,
#' @field method_registry available statistical methods
method_registry = NULL,
#' Initialize method selector
#'
#' @param config list. Method selection configuration
#'
initialize = function(config = list()) {
self$config <- private$default_selector_config(config)
self$method_registry <- private$initialize_method_registry()
},
#' Select optimal statistical method
#'
#' @param data data.frame. Analysis data
#' @param analysis_type character. Type of analysis requested
#' @param assumption_results AssumptionTestResults. Assumption test results
#' @param parameters list. Analysis parameters
#'
#' @return StatisticalMethod. Selected method with justification
#'
select_optimal_method = function(data, analysis_type, assumption_results, parameters = list()) {
# Get available methods for analysis type
available_methods <- private$get_available_methods(analysis_type)
if (length(available_methods) == 0) {
stop(paste("No methods available for analysis type:", analysis_type))
}
# Score each method based on assumption compliance and other factors
method_scores <- list()
for (method_name in available_methods) {
method_info <- self$method_registry[[analysis_type]][[method_name]]
score <- private$score_method(
method_info = method_info,
data = data,
assumption_results = assumption_results,
parameters = parameters
)
method_scores[[method_name]] <- score
}
# Select best method
best_method_name <- names(method_scores)[which.max(sapply(method_scores, function(x) x$total_score))]
best_score <- method_scores[[best_method_name]]
# Create method object
StatisticalMethod$new(
method_name = best_method_name,
analysis_type = analysis_type,
method_info = self$method_registry[[analysis_type]][[best_method_name]],
selection_score = best_score,
assumption_compliance = best_score$assumption_compliance,
selection_rationale = private$generate_selection_rationale(
best_method_name,
best_score,
method_scores,
assumption_results
),
alternative_methods = private$get_alternative_recommendations(
method_scores,
best_method_name
)
)
}
),
private = list(
#' Default method selector configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete configuration
#'
default_selector_config = function(config) {
default_config <- list(
preference_weights = list(
assumption_compliance = 0.4,
robustness = 0.3,
statistical_power = 0.2,
interpretability = 0.1
),
assumption_penalties = list(
normality_violation = -20,
homogeneity_violation = -15,
independence_violation = -30,
outlier_presence = -10
),
robustness_bonuses = list(
nonparametric = 15,
robust_parametric = 10,
bootstrap_based = 12
),
sample_size_considerations = list(
small_sample_threshold = 30,
very_small_threshold = 10,
small_sample_preference = "nonparametric"
)
)
modifyList(default_config, config)
},
#' Initialize method registry
#'
#' @return list. Available methods by analysis type
#'
initialize_method_registry = function() {
list(
independent_ttest = list(
independent_ttest_students = list(
name = "Student's Independent Samples t-Test",
type = "parametric",
required_assumptions = c("normality", "homogeneity", "independence"),
statistical_power = "high",
interpretability = "high",
robustness = "low",
sample_size_requirements = list(min = 6, recommended = 30),
effect_size_measures = c("cohens_d", "glass_delta"),
description = "Classic parametric test assuming equal variances"
),
independent_ttest_welch = list(
name = "Welch's Independent Samples t-Test",
type = "parametric_robust",
required_assumptions = c("normality", "independence"),
statistical_power = "high",
interpretability = "high",
robustness = "medium",
sample_size_requirements = list(min = 6, recommended = 30),
effect_size_measures = c("cohens_d", "glass_delta"),
description = "Robust parametric test not assuming equal variances"
),
mann_whitney_u = list(
name = "Mann-Whitney U Test",
type = "nonparametric",
required_assumptions = c("independence"),
statistical_power = "medium",
interpretability = "medium",
robustness = "high",
sample_size_requirements = list(min = 6, recommended = 20),
effect_size_measures = c("rank_biserial_r", "common_language_effect"),
description = "Distribution-free test comparing medians"
),
bootstrap_ttest = list(
name = "Bootstrap t-Test",
type = "resampling",
required_assumptions = c("independence"),
statistical_power = "high",
interpretability = "medium",
robustness = "high",
sample_size_requirements = list(min = 10, recommended = 50),
effect_size_measures = c("cohens_d", "bootstrap_ci"),
description = "Resampling-based test robust to distributional assumptions"
)
)
)
},
#' Get available methods for analysis type
#'
#' @param analysis_type character. Type of analysis
#'
#' @return character vector. Available method names
#'
get_available_methods = function(analysis_type) {
if (analysis_type %in% names(self$method_registry)) {
names(self$method_registry[[analysis_type]])
} else {
character(0)
}
},
#' Score a statistical method
#'
#' @param method_info list. Method information
#' @param data data.frame. Analysis data
#' @param assumption_results AssumptionTestResults. Assumption test results
#' @param parameters list. Analysis parameters
#'
#' @return list. Method scoring results
#'
score_method = function(method_info, data, assumption_results, parameters) {
scores <- list()
# Assumption compliance score
assumption_score <- private$score_assumption_compliance(
method_info$required_assumptions,
assumption_results
)
scores$assumption_compliance <- assumption_score
# Robustness score
robustness_score <- private$score_robustness(method_info, data)
scores$robustness <- robustness_score
# Statistical power score
power_score <- private$score_statistical_power(method_info, data, parameters)
scores$statistical_power <- power_score
# Interpretability score
interpretability_score <- private$score_interpretability(method_info, parameters)
scores$interpretability <- interpretability_score
# Sample size appropriateness
sample_size_score <- private$score_sample_size_appropriateness(method_info, data)
scores$sample_size_appropriateness <- sample_size_score
# Calculate weighted total score
weights <- self$config$preference_weights
total_score <- (
assumption_score * weights$assumption_compliance +
robustness_score * weights$robustness +
power_score * weights$statistical_power +
interpretability_score * weights$interpretability
)
list(
assumption_compliance = assumption_score,
robustness = robustness_score,
statistical_power = power_score,
interpretability = interpretability_score,
sample_size_appropriateness = sample_size_score,
total_score = total_score,
method_info = method_info
)
},
#' Score assumption compliance
#'
#' @param required_assumptions character vector. Required assumptions
#' @param assumption_results AssumptionTestResults. Test results
#'
#' @return numeric. Assumption compliance score
#'
score_assumption_compliance = function(required_assumptions, assumption_results) {
base_score <- 100
violated_assumptions <- assumption_results$overall_assessment$violations
for (assumption in required_assumptions) {
if (assumption %in% violated_assumptions) {
penalty <- self$config$assumption_penalties[[paste0(assumption, "_violation")]] %||% -10
base_score <- base_score + penalty
}
}
# Bonus for methods that don't require problematic assumptions
if (!"normality" %in% required_assumptions && "normality" %in% violated_assumptions) {
base_score <- base_score + 15
}
if (!"homogeneity" %in% required_assumptions && "homogeneity" %in% violated_assumptions) {
base_score <- base_score + 10
}
max(0, base_score)
},
#' Score robustness
#'
#' @param method_info list. Method information
#' @param data data.frame. Analysis data
#'
#' @return numeric. Robustness score
#'
score_robustness = function(method_info, data) {
base_score <- 50
# Robustness bonuses
if (method_info$type == "nonparametric") {
base_score <- base_score + self$config$robustness_bonuses$nonparametric
} else if (method_info$type == "parametric_robust") {
base_score <- base_score + self$config$robustness_bonuses$robust_parametric
} else if (method_info$type == "resampling") {
base_score <- base_score + self$config$robustness_bonuses$bootstrap_based
}
# Additional robustness considerations
if (method_info$robustness == "high") {
base_score <- base_score + 10
} else if (method_info$robustness == "low") {
base_score <- base_score - 10
}
max(0, base_score)
},
#' Score statistical power
#'
#' @param method_info list. Method information
#' @param data data.frame. Analysis data
#' @param parameters list. Analysis parameters
#'
#' @return numeric. Statistical power score
#'
score_statistical_power = function(method_info, data, parameters) {
base_score <- switch(method_info$statistical_power,
"high" = 80,
"medium" = 60,
"low" = 40,
50 # default
)
# Adjust for sample size
n <- nrow(data)
if (n < self$config$sample_size_considerations$very_small_threshold) {
base_score <- base_score - 20
} else if (n < self$config$sample_size_considerations$small_sample_threshold) {
base_score <- base_score - 10
}
max(0, base_score)
},
#' Score interpretability
#'
#' @param method_info list. Method information
#' @param parameters list. Analysis parameters
#'
#' @return numeric. Interpretability score
#'
score_interpretability = function(method_info, parameters) {
base_score <- switch(method_info$interpretability,
"high" = 80,
"medium" = 60,
"low" = 40,
50 # default
)
# Bonus for familiar methods in clinical contexts
if ("clinical_context" %in% names(parameters) && parameters$clinical_context) {
if (grepl("t-test", method_info$name, ignore.case = TRUE)) {
base_score <- base_score + 10
}
}
max(0, base_score)
},
#' Score sample size appropriateness
#'
#' @param method_info list. Method information
#' @param data data.frame. Analysis data
#'
#' @return numeric. Sample size appropriateness score
#'
score_sample_size_appropriateness = function(method_info, data) {
n <- nrow(data)
min_required <- method_info$sample_size_requirements$min %||% 6
recommended <- method_info$sample_size_requirements$recommended %||% 30
if (n < min_required) {
return(0) # Method not suitable
} else if (n < recommended) {
return(50) # Below recommended but acceptable
} else {
return(80) # Adequate sample size
}
},
#' Generate selection rationale
#'
#' @param selected_method character. Selected method name
#' @param selected_score list. Selected method score details
#' @param all_scores list. All method scores
#' @param assumption_results AssumptionTestResults. Assumption results
#'
#' @return character. Selection rationale
#'
generate_selection_rationale = function(selected_method, selected_score, all_scores, assumption_results) {
method_info <- selected_score$method_info
violations <- assumption_results$overall_assessment$violations
rationale_parts <- c()
# Primary selection reason
if (selected_score$assumption_compliance >= 80) {
rationale_parts <- c(rationale_parts,
paste("Selected", method_info$name, "because all required assumptions are met"))
} else if (method_info$type == "nonparametric" && "normality" %in% violations) {
rationale_parts <- c(rationale_parts,
paste("Selected", method_info$name, "because it doesn't require normality assumptions"))
} else if (method_info$type == "parametric_robust" && "homogeneity" %in% violations) {
rationale_parts <- c(rationale_parts,
paste("Selected", method_info$name, "because it's robust to unequal variances"))
} else {
rationale_parts <- c(rationale_parts,
paste("Selected", method_info$name, "as the best available option"))
}
# Additional justification
if (selected_score$robustness >= 70) {
rationale_parts <- c(rationale_parts, "This method provides good robustness to assumption violations")
}
if (selected_score$statistical_power >= 70) {
rationale_parts <- c(rationale_parts, "This method maintains good statistical power")
}
# Warnings if needed
if (length(violations) > 0) {
violated_text <- paste(violations, collapse = ", ")
rationale_parts <- c(rationale_parts,
paste("Note: Some assumptions are violated (", violated_text, "), but this method is relatively robust"))
}
paste(rationale_parts, collapse = ". ")
},
#' Get alternative method recommendations
#'
#' @param all_scores list. All method scores
#' @param selected_method character. Selected method name
#'
#' @return list. Alternative method recommendations
#'
get_alternative_recommendations = function(all_scores, selected_method) {
# Remove selected method and sort by score
alternative_scores <- all_scores[names(all_scores) != selected_method]
sorted_alternatives <- alternative_scores[order(sapply(alternative_scores, function(x) x$total_score), decreasing = TRUE)]
# Return top 2 alternatives with brief justification
alternatives <- list()
for (i in 1:min(2, length(sorted_alternatives))) {
alt_name <- names(sorted_alternatives)[i]
alt_score <- sorted_alternatives[[alt_name]]
alt_info <- alt_score$method_info
# Brief justification for alternative
justification <- if (alt_info$type == "nonparametric") {
"Distribution-free alternative"
} else if (alt_info$type == "resampling") {
"Bootstrap-based robust alternative"
} else if (alt_info$type == "parametric_robust") {
"Parametric robust alternative"
} else {
"Classical parametric alternative"
}
alternatives[[alt_name]] <- list(
name = alt_info$name,
type = alt_info$type,
score = alt_score$total_score,
justification = justification
)
}
alternatives
}
)
)Implementing Statistical Rigor in Practice
Integration with Enhanced t-Test Application
Transform the existing sophisticated t-test application to leverage the statistical rigor framework:
# Enhanced server logic integrating statistical rigor
enhanced_ttest_server <- function(id, audit_logger = NULL) {
moduleServer(id, function(input, output, session) {
# Initialize statistical testing engine
testing_engine <- StatisticalTestingEngine$new(
config = list(
regulatory_compliance = list(
standards = c("FDA", "ICH_E9"),
documentation_level = "comprehensive"
)
),
audit_logger = audit_logger
)
# Enhanced analysis execution
observeEvent(input$run_test, {
req(dataset())
showNotification("Conducting rigorous statistical analysis...",
type = "message", id = "analyzing")
# Prepare comprehensive analysis
analysis_result <- testing_engine$conduct_analysis(
data = dataset(),
analysis_type = "independent_ttest",
parameters = list(
alternative = input$alternative,
conf_level = input$conf_level,
clinical_context = TRUE
),
context = list(
application = "enhanced_ttest",
user_id = session$userData$user_id %||% "anonymous"
)
)
values$analysis_result <- analysis_result
removeNotification("analyzing")
})
# Display comprehensive results
output$statistical_results <- renderUI({
req(values$analysis_result)
result <- values$analysis_result
if (!result$success) {
return(div(class = "alert alert-danger",
h4("Analysis Issues"),
p(paste(result$errors, collapse = "; "))))
}
tagList(
# Executive Summary
card(
card_header("Statistical Analysis Summary"),
card_body(
div(class = "alert alert-info",
h5("Executive Summary"),
p(result$get_executive_summary())
),
# Key Results Dashboard
div(class = "row text-center",
div(class = "col-md-3",
div(class = "card bg-light",
div(class = "card-body",
h6("p-value"),
h4(format.pval(result$primary_analysis$p_value, digits = 4)),
small(class = "text-muted",
if (result$primary_analysis$p_value < 0.05) "Significant" else "Not Significant")
)
)
),
div(class = "col-md-3",
div(class = "card bg-light",
div(class = "card-body",
h6("Effect Size"),
h4(round(result$effect_analysis$primary_effect_size$value, 3)),
small(class = "text-muted", result$effect_analysis$primary_effect_size$interpretation)
)
)
),
div(class = "col-md-3",
div(class = "card bg-light",
div(class = "card-body",
h6("Method Used"),
h4(abbr(result$method_selection$method_info$type,
title = result$method_selection$method_info$name)),
small(class = "text-muted", "Selected Method")
)
)
),
div(class = "col-md-3",
div(class = "card bg-light",
div(class = "card-body",
h6("Assumptions"),
h4(if (result$assumption_tests$overall_assessment$suitable) {
icon("check-circle", class = "text-success")
} else {
icon("exclamation-triangle", class = "text-warning")
}),
small(class = "text-muted",
if (result$assumption_tests$overall_assessment$suitable) "Met" else "Issues")
)
)
)
)
)
),
# Method Selection and Justification
card(
class = "mt-3",
card_header("Method Selection and Validation"),
card_body(
h5(result$method_selection$method_info$name),
p(class = "text-muted", result$method_selection$method_info$description),
div(class = "alert alert-light",
strong("Selection Rationale: "),
result$method_selection$selection_rationale
),
# Assumption Test Summary
if (length(result$assumption_tests$overall_assessment$violations) > 0) {
div(class = "alert alert-warning",
strong("Assumption Violations Detected: "),
paste(result$assumption_tests$overall_assessment$violations, collapse = ", "),
br(),
"The selected method accounts for these violations and provides robust results."
)
} else {
div(class = "alert alert-success",
strong("All Required Assumptions Met: "),
"Statistical prerequisites are satisfied for reliable analysis."
)
}
)
)
)
})
# Enhanced assumption testing display
output$assumption_results <- renderPrint({
req(values$analysis_result)
result <- values$analysis_result
if (!result$success) return("Analysis failed - assumption testing unavailable")
assumption_tests <- result$assumption_tests
cat("COMPREHENSIVE ASSUMPTION TESTING REPORT\n")
cat("=====================================\n\n")
# Overall Assessment
cat("OVERALL ASSESSMENT:\n")
cat("------------------\n")
if (assumption_tests$overall_assessment$suitable) {
cat("✓ All required assumptions are met for reliable analysis\n\n")
} else {
cat("⚠ Some assumptions are violated:\n")
for (violation in assumption_tests$overall_assessment$violations) {
cat(" • ", violation, "\n")
}
cat("\nHowever, the selected method is robust to these violations.\n\n")
}
# Detailed Results by Assumption
for (assumption_name in names(assumption_tests$assumption_results)) {
assumption_result <- assumption_tests$assumption_results[[assumption_name]]
cat(toupper(assumption_name), " TESTING:\n")
cat(paste(rep("-", nchar(assumption_name) + 9), collapse = ""), "\n")
if ("error" %in% names(assumption_result)) {
cat("Error:", assumption_result$error, "\n\n")
next
}
# Print assumption-specific results
if (assumption_name == "normality") {
for (var_name in names(assumption_result)) {
if (var_name == "error") next
var_result <- assumption_result[[var_name]]
cat("Variable:", var_name, "\n")
if ("shapiro_wilk" %in% names(var_result)) {
sw <- var_result$shapiro_wilk
cat(" Shapiro-Wilk: W =", round(sw$statistic, 4),
", p =", round(sw$p_value, 4))
if (sw$conclusion) cat(" ✓") else cat(" ✗")
cat("\n")
}
if ("overall_conclusion" %in% names(var_result)) {
conclusion <- var_result$overall_conclusion
cat(" Overall conclusion:",
if (conclusion$assumes_normal) "Normal distribution" else "Non-normal distribution")
cat(" (", conclusion$n_tests, " tests)\n")
}
cat("\n")
}
}
if (assumption_name == "homogeneity") {
if ("levene" %in% names(assumption_result)) {
levene <- assumption_result$levene
cat("Levene's Test: F =", round(levene$statistic, 4),
", p =", round(levene$p_value, 4))
if (levene$conclusion) cat(" ✓") else cat(" ✗")
cat("\n")
}
if ("overall_conclusion" %in% names(assumption_result)) {
conclusion <- assumption_result$overall_conclusion
cat("Overall conclusion:", conclusion$recommendation, "\n")
}
cat("\n")
}
if (assumption_name == "outliers") {
total_outliers <- 0
for (var_name in names(assumption_result)) {
var_result <- assumption_result[[var_name]]
if ("consensus" %in% names(var_result)) {
n_outliers <- length(var_result$consensus$high_confidence_outliers)
total_outliers <- total_outliers + n_outliers
if (n_outliers > 0) {
cat("Variable", var_name, ":", n_outliers, "high-confidence outliers detected\n")
}
}
}
if (total_outliers == 0) {
cat("No significant outliers detected across variables\n")
}
cat("\n")
}
}
# Recommendations
if (length(assumption_tests$overall_assessment$recommendations) > 0) {
cat("RECOMMENDATIONS:\n")
cat("---------------\n")
for (rec in assumption_tests$overall_assessment$recommendations) {
cat("•", rec, "\n")
}
cat("\n")
}
})
# Enhanced interpretation guidance
output$interpretation_guidance <- renderUI({
req(values$analysis_result)
result <- values$analysis_result
if (!result$success) return(NULL)
interpretation <- result$interpretation
tagList(
card(
card_header("Statistical Interpretation and Clinical Guidance"),
card_body(
# Statistical Significance Section
div(class = "mb-4",
h5("Statistical Significance"),
div(
class = if (interpretation$statistical_significance$is_significant) {
"alert alert-success"
} else {
"alert alert-secondary"
},
p(interpretation$statistical_significance$interpretation)
)
),
# Effect Size and Practical Importance
div(class = "mb-4",
h5("Effect Size and Practical Importance"),
div(class = "alert alert-info",
p(strong("Effect Size: "),
interpretation$effect_size$magnitude,
" (", round(result$effect_analysis$primary_effect_size$value, 3), ")"),
p(strong("Practical Importance: "),
interpretation$effect_size$practical_importance$magnitude),
p(interpretation$effect_size$practical_importance$recommendation)
)
),
# Overall Conclusion
div(class = "mb-4",
h5("Overall Conclusion"),
div(class = "alert alert-primary",
p(interpretation$conclusion$summary)
)
),
# Limitations and Considerations
if (length(interpretation$limitations) > 0) {
div(class = "mb-4",
h5("Limitations and Considerations"),
div(class = "alert alert-light",
tags$ul(
lapply(interpretation$limitations, function(lim) tags$li(lim))
)
)
)
}
)
)
)
})
})
}Common Questions About Statistical Rigor
Enterprise applications require statistical validity that can withstand regulatory scrutiny, peer review, and legal challenges. Comprehensive assumption testing provides documented evidence that statistical methods are appropriate for the data and analysis context. This documentation becomes crucial for regulatory submissions, clinical trial reports, and publications where statistical methodology must be transparent and defensible. Additionally, assumption violations can lead to incorrect conclusions with significant business or clinical consequences, making thorough validation essential for responsible decision-making.
Intelligent method selection eliminates human bias and inconsistency in statistical method choice while ensuring optimal approaches for specific data characteristics. Manual selection often relies on default methods or analyst preferences, potentially missing robust alternatives when assumptions are violated. Automated selection considers multiple factors simultaneously - assumption compliance, statistical power, robustness, and interpretability - using consistent, documented criteria. This approach also provides audit trails and justification required for regulatory compliance while reducing analysis time and improving reproducibility across different analysts and studies.
This framework addresses specific requirements of regulated industries including FDA guidance compliance, ICH E9 statistical principles, and Good Clinical Practice standards. The comprehensive documentation automatically generated meets regulatory expectations for statistical analysis plans, assumption testing evidence, and method selection justification. The audit trail capabilities support 21 CFR Part 11 requirements for electronic records, while the sensitivity analysis features address regulatory concerns about analytical robustness. Additionally, the clinical significance assessment and effect size interpretation provide context essential for regulatory decision-making and clinical interpretation.
The system recognizes that real-world data often violates theoretical assumptions and provides intelligent responses rather than simply flagging violations. When assumptions are violated, the method selector automatically identifies robust alternatives that remain valid under the observed data conditions. The sensitivity analysis quantifies how assumption violations affect conclusions, while the interpretation guidance explains the practical implications. This approach enables confident analysis of imperfect data while maintaining statistical rigor and providing transparent documentation of analytical decisions and their justifications.
The framework is designed for users with basic statistical knowledge who need enterprise-grade rigor without deep statistical programming expertise. Users should understand fundamental concepts like p-values, effect sizes, and basic assumptions, but the system provides interpretation guidance and context-aware explanations. For advanced users, the framework offers complete customization and detailed diagnostic information. The key advantage is that it elevates the statistical rigor of any analysis regardless of user expertise level, while providing educational value through comprehensive explanations and best practice guidance.
Test Your Understanding
Which components are essential for enterprise-grade statistical rigor in clinical applications? Select all that apply:
- Comprehensive assumption testing with multiple methods
- Intelligent method selection based on data characteristics
- Sensitivity analysis across alternative approaches
- Regulatory compliance documentation
- User-friendly interface design
- Automated interpretation guidance
- Consider what regulatory bodies require for statistical submissions
- Think about what makes analysis defensible under scrutiny
- Focus on components that ensure analytical validity
A, B, C, D, and F are all essential components.
Essential components explained:
- A) Comprehensive assumption testing: Required to validate that statistical methods are appropriate for the data and provide documented evidence for regulatory review
- B) Intelligent method selection: Ensures optimal analytical approaches while eliminating bias and providing audit trails for decision rationale
- C) Sensitivity analysis: Demonstrates analytical robustness and addresses regulatory concerns about conclusion stability across methods
- D) Regulatory compliance documentation: Essential for FDA submissions, clinical trials, and peer review processes
- F) Automated interpretation guidance: Provides context-aware explanations essential for clinical significance assessment and decision-making
E) User-friendly interface design is important for usability but not essential for statistical rigor itself. The rigor comes from the analytical framework, not the presentation layer.
Your data violates normality assumptions but meets independence and has equal variances. The intelligent method selector chooses Welch’s t-test over Student’s t-test. Why might this decision be suboptimal, and what would be a better choice?
- Welch’s t-test is always preferred over Student’s t-test
- Should use Mann-Whitney U test since normality is violated
- Student’s t-test would be better since variances are equal
- Should use bootstrap methods for any assumption violations
- Consider which assumptions each method requires
- Think about the trade-offs between different approaches when assumptions are partially met
- Remember that robustness vs. power is a key consideration
B) Should use Mann-Whitney U test since normality is violated
Detailed reasoning:
When normality is violated, the most appropriate response is to use a method that doesn’t assume normality, regardless of whether other assumptions are met. Here’s why:
- Welch’s t-test still assumes normality - it only relaxes the equal variance assumption
- Student’s t-test assumes both normality and equal variances - having equal variances doesn’t help if normality fails
- Mann-Whitney U test only requires independence and ordinal data - it’s robust to non-normality
- Bootstrap methods are also appropriate but may be unnecessarily complex for this scenario
The optimal method selection should prioritize assumption compliance first, then consider power and interpretability among valid options. A properly configured intelligent selector would weight assumption violations heavily and select the Mann-Whitney U test in this scenario.
You’re implementing this statistical rigor framework in a pharmaceutical company. The statisticians are concerned about losing control over method selection, while regulatory affairs wants comprehensive documentation. How should you configure the system to address both concerns?
- Consider how to balance automation with expert oversight
- Think about documentation requirements for regulatory compliance
- Consider different user roles and their needs
Implement a hybrid approach with expert oversight and comprehensive audit trails:
Configuration strategy:
Expert Review Mode: Allow statisticians to review and override automated method selection with required justification that becomes part of the audit trail
Tiered Automation:
- Fully automated for routine analyses with clear data characteristics
- Expert consultation required for edge cases or conflicting indicators
- Manual override available with documented rationale
Comprehensive Documentation: Generate regulatory-compliant reports regardless of selection method (automated vs. manual) including:
- Complete assumption testing evidence
- Method selection rationale (whether automated or expert-driven)
- Sensitivity analysis results
- Expert review comments and overrides
Role-Based Configuration:
- Statisticians: Full access to detailed diagnostics, override capabilities, and custom configuration
- Regulatory Affairs: Focus on compliance documentation and audit trail completeness
- Clinical Teams: Simplified interface with comprehensive interpretation guidance
This approach maintains statistical expertise while ensuring consistent rigor and documentation standards required for regulatory compliance.
Conclusion
Statistical rigor forms the cornerstone of credible biostatistics applications, transforming routine analyses into enterprise-grade systems that meet the demanding standards of regulated industries. This comprehensive framework elevates your t-test application from a basic statistical tool to a sophisticated analytical platform capable of supporting regulatory submissions, clinical decision-making, and peer-reviewed research.
The automated assumption testing, intelligent method selection, and comprehensive documentation systems you’ve implemented provide the foundation for any statistical application requiring professional reliability. These patterns extend far beyond t-tests to encompass the full spectrum of biostatistical analyses while maintaining the accessibility and user experience that stakeholders expect from modern analytical tools.
Your enhanced application now demonstrates how enterprise development transcends basic functionality to address the complex requirements of real-world statistical practice, where analytical decisions must be defensible, reproducible, and aligned with regulatory standards while remaining accessible to clinical and research teams.
Next Steps
Based on your mastery of statistical rigor principles, here are the recommended paths for continuing your enterprise development journey:
Immediate Next Steps (Complete These First)
- Advanced Visualizations for Statistical Results - Transform your rigorous analyses into publication-ready visualizations that communicate statistical findings effectively
- Testing Framework and Validation - Implement comprehensive testing strategies that validate your statistical computations and ensure reliability across scenarios
- Practice Exercise: Extend your statistical rigor framework to include ANOVA or regression analysis, implementing the same comprehensive validation and intelligent method selection patterns
Building Your Enterprise Platform (Choose Your Focus)
For Visualization Excellence: - Professional Statistical Plot Generation - Building Interactive Dashboards
For Production Reliability:
For Clinical Integration:
Long-term Goals (2-4 Weeks)
- Develop a complete biostatistics platform with multiple statistical modules using your rigor framework
- Create enterprise deployment strategies for regulated environments with full validation documentation
- Build custom statistical method libraries that extend the framework for specialized clinical applications
- Contribute to the biostatistics community by sharing validated, enterprise-grade statistical tools
Explore More Articles
Here are more articles from the same category to help you dive deeper into enterprise Shiny development.
Reuse
Citation
@online{kassambara2025,
author = {Kassambara, Alboukadel},
title = {Statistical {Rigor} and {Assumption} {Testing:} {Enterprise}
{Statistical} {Validation}},
date = {2025-05-23},
url = {https://www.datanovia.com/learn/tools/shiny-apps/enterprise-development/statistical-rigor.html},
langid = {en}
}
