flowchart TD
A[User Input] --> B[Client-Side Validation]
B --> C[Transport Security]
C --> D[Server-Side Validation]
D --> E[Business Logic Validation]
E --> F[Database Validation]
F --> G[Output Validation]
G --> H[Audit Logging]
B --> B1[Format checking]
B --> B2[Range validation]
B --> B3[Required fields]
D --> D1[Input sanitization]
D --> D2[Type checking]
D --> D3[Size limits]
D --> D4[Injection prevention]
E --> E1[Statistical validity]
E --> E2[Domain constraints]
E --> E3[Cross-field validation]
E --> E4[Business rules]
F --> F1[Data type constraints]
F --> F2[Referential integrity]
F --> F3[Unique constraints]
G --> G1[Output sanitization]
G --> G2[Format validation]
G --> G3[Content filtering]
H --> H1[Access logging]
H --> H2[Change tracking]
H --> H3[Error recording]
H --> H4[Compliance documentation]
style A fill:#ffebee
style H fill:#e8f5e8
style D fill:#fff3e0
style E fill:#f3e5f5
Key Takeaways
- Comprehensive Validation: Implement multi-layered validation systems that prevent data corruption, injection attacks, and user errors at input, processing, and output stages
- Security Protocols: Master enterprise security standards including input sanitization, session management, and audit logging required for regulated clinical research environments
- Error Boundaries: Create robust error handling systems that gracefully manage failures while maintaining application stability and user trust
- Regulatory Compliance: Build validation frameworks that support 21 CFR Part 11 compliance, audit trails, and documentation requirements for pharmaceutical applications
- Performance Optimization: Implement efficient validation strategies that maintain security without compromising application responsiveness or user experience
Introduction
Data validation and security form the foundation of enterprise statistical applications, particularly in regulated industries where data integrity directly impacts patient safety, regulatory compliance, and business operations. Unlike basic applications that rely on simple input checking, enterprise systems require comprehensive validation frameworks that prevent corruption, detect anomalies, and maintain audit trails for regulatory review.
This tutorial transforms our professionally designed t-test application into a secure, validated system that meets pharmaceutical and clinical research standards. You’ll learn to implement multi-layered validation, comprehensive error handling, security protocols, and audit logging systems that protect sensitive data while maintaining user productivity and regulatory compliance.
The validation and security patterns you’ll master apply universally to statistical applications while addressing specific requirements for biostatistics, clinical trials, and healthcare data where validation failures can have serious consequences for research integrity and patient safety.
Understanding Enterprise Validation Requirements
Multi-Layered Validation Architecture
Enterprise applications require comprehensive validation at multiple levels to ensure data integrity and security:
Critical Validation Layers:
Client-Side Validation (User Experience):
Immediate feedback for users to prevent basic errors and improve usability, while understanding that client-side validation alone is insufficient for security.
Server-Side Validation (Security Foundation):
Comprehensive validation on the server that cannot be bypassed, including input sanitization, type checking, and injection attack prevention.
Business Logic Validation (Domain Integrity):
Statistical and domain-specific validation that ensures data meets analytical requirements and business rules.
Audit and Compliance (Regulatory Requirements):
Comprehensive logging and documentation that supports regulatory validation and audit trail requirements.
Security Threat Model for Statistical Applications
Understanding common security threats helps design appropriate protection strategies:
Data Injection Attacks:
- SQL injection through database queries
- Script injection through user inputs
- File path manipulation through upload features
- Cross-site scripting (XSS) through output display
Data Integrity Threats:
- Unauthorized data modification
- Concurrent access conflicts
- Data corruption during processing
- Incomplete transaction processing
Privacy and Confidentiality Risks:
- Unauthorized data access
- Session hijacking and impersonation
- Data leakage through error messages
- Insufficient access controls
Compliance Violations:
- Inadequate audit trails
- Missing change documentation
- Insufficient validation evidence
- Non-compliant data handling
Comprehensive Input Validation Framework
Core Validation Infrastructure
Create a robust validation system that handles all input types with enterprise-grade security:
# File: R/validation_framework.R
#' Enterprise Validation Framework
#'
#' @description Comprehensive validation system for enterprise statistical
#' applications with security, audit logging, and regulatory compliance support.
#'
#' @details This framework provides multi-layered validation including input
#' sanitization, type checking, business rule validation, and audit logging.
#'
#' @export
ValidationFramework <- R6::R6Class(
"ValidationFramework",
public = list(
#' @field validators list of registered validators
validators = NULL,
#' @field audit_logger audit logging instance
audit_logger = NULL,
#' @field security_config security configuration settings
security_config = NULL,
#' Initialize validation framework
#'
#' @param config list. Configuration settings for validation and security
#'
initialize = function(config = list()) {
self$validators <- list()
self$security_config <- private$default_security_config(config)
self$audit_logger <- AuditLogger$new(self$security_config$audit)
# Register default validators
private$register_default_validators()
self$audit_logger$log_event("ValidationFramework initialized",
level = "INFO", category = "SYSTEM")
},
#' Validate input data with comprehensive checking
#'
#' @param data any. Data to validate
#' @param validator_name character. Name of validator to use
#' @param context list. Additional context for validation
#'
#' @return ValidationResult object with validation outcome
#'
validate = function(data, validator_name, context = list()) {
start_time <- Sys.time()
tryCatch({
# Input sanitization
sanitized_data <- private$sanitize_input(data)
# Get validator
validator <- self$validators[[validator_name]]
if (is.null(validator)) {
stop(paste("Validator not found:", validator_name))
}
# Perform validation
result <- validator$validate(sanitized_data, context)
# Log validation attempt
self$audit_logger$log_validation(
validator_name = validator_name,
input_hash = private$hash_input(data),
result = result,
duration = as.numeric(Sys.time() - start_time, units = "secs"),
context = context
)
return(result)
}, error = function(e) {
# Log validation error
self$audit_logger$log_event(
paste("Validation error:", e$message),
level = "ERROR",
category = "VALIDATION",
details = list(
validator_name = validator_name,
error = e$message,
context = context
)
)
# Return error result
ValidationResult$new(
valid = FALSE,
errors = list(paste("Validation failed:", e$message)),
warnings = list(),
data = NULL
)
})
},
#' Register custom validator
#'
#' @param name character. Validator name
#' @param validator Validator. Validator instance
#'
register_validator = function(name, validator) {
self$validators[[name]] <- validator
self$audit_logger$log_event(
paste("Validator registered:", name),
level = "INFO",
category = "CONFIGURATION"
)
}
),
private = list(
#' Sanitize input data
#'
#' @param data any. Input data to sanitize
#'
#' @return Sanitized data
#'
sanitize_input = function(data) {
if (is.character(data)) {
# Remove potentially dangerous characters
data <- gsub("[<>\"'&]", "", data)
# Limit length to prevent buffer overflow
max_length <- self$security_config$max_string_length
if (nchar(data) > max_length) {
data <- substr(data, 1, max_length)
}
# Remove control characters
data <- gsub("[[:cntrl:]]", "", data)
}
if (is.numeric(data)) {
# Check for infinite or NaN values
if (any(is.infinite(data)) || any(is.nan(data))) {
stop("Invalid numeric values detected (Inf/NaN)")
}
# Check range limits
max_value <- self$security_config$max_numeric_value
if (any(abs(data) > max_value, na.rm = TRUE)) {
stop("Numeric values exceed allowed range")
}
}
return(data)
},
#' Generate hash of input for audit trail
#'
#' @param data any. Input data
#'
#' @return character. Hash of input data
#'
hash_input = function(data) {
digest::digest(data, algo = "sha256")
},
#' Default security configuration
#'
#' @param config list. User-provided configuration
#'
#' @return list. Complete security configuration
#'
default_security_config = function(config) {
default_config <- list(
max_string_length = 10000,
max_numeric_value = 1e10,
max_file_size = 50 * 1024 * 1024, # 50MB
allowed_file_types = c("csv", "txt", "xlsx"),
session_timeout = 3600, # 1 hour
max_login_attempts = 5,
audit = list(
enabled = TRUE,
log_level = "INFO",
log_file = "audit.log",
retention_days = 365
)
)
# Merge with user configuration
modifyList(default_config, config)
},
#' Register default validators
#'
register_default_validators = function() {
# Numeric data validator
self$register_validator(
"numeric_data",
NumericDataValidator$new()
)
# Text data validator
self$register_validator(
"text_data",
TextDataValidator$new()
)
# File upload validator
self$register_validator(
"file_upload",
FileUploadValidator$new()
)
# Statistical data validator
self$register_validator(
"statistical_data",
StatisticalDataValidator$new()
)
# T-test specific validator
self$register_validator(
"ttest_data",
TTestDataValidator$new()
)
}
)
)
#' Validation Result Class
#'
#' @description Container for validation results with comprehensive information
#' about validation outcome, errors, warnings, and processed data.
#'
#' @export
ValidationResult <- R6::R6Class(
"ValidationResult",
public = list(
#' @field valid logical. Whether validation passed
valid = NULL,
#' @field errors list. Validation errors
errors = NULL,
#' @field warnings list. Validation warnings
warnings = NULL,
#' @field data any. Validated and processed data
data = NULL,
#' @field metadata list. Additional validation metadata
metadata = NULL,
#' Initialize validation result
#'
#' @param valid logical. Validation success status
#' @param errors list. List of error messages
#' @param warnings list. List of warning messages
#' @param data any. Validated data
#' @param metadata list. Additional metadata
#'
initialize = function(valid, errors = list(), warnings = list(),
data = NULL, metadata = list()) {
self$valid <- valid
self$errors <- errors
self$warnings <- warnings
self$data <- data
self$metadata <- modifyList(
list(timestamp = Sys.time(),
validation_id = uuid::UUIDgenerate()),
metadata
)
},
#' Get formatted error message
#'
#' @return character. Formatted error message
#'
get_error_message = function() {
if (length(self$errors) == 0) {
return("No errors")
}
paste(self$errors, collapse = "; ")
},
#' Get formatted warning message
#'
#' @return character. Formatted warning message
#'
get_warning_message = function() {
if (length(self$warnings) == 0) {
return("No warnings")
}
paste(self$warnings, collapse = "; ")
},
#' Check if result has errors
#'
#' @return logical. TRUE if errors exist
#'
has_errors = function() {
length(self$errors) > 0
},
#' Check if result has warnings
#'
#' @return logical. TRUE if warnings exist
#'
has_warnings = function() {
length(self$warnings) > 0
}
)
)Specialized Validators
Create domain-specific validators for statistical applications:
# File: R/validators_statistical.R
#' Statistical Data Validator
#'
#' @description Comprehensive validator for statistical data with checks for
#' data quality, statistical assumptions, and analytical requirements.
#'
#' @export
StatisticalDataValidator <- R6::R6Class(
"StatisticalDataValidator",
public = list(
#' Validate statistical data
#'
#' @param data data.frame or numeric vector. Data to validate
#' @param context list. Validation context and requirements
#'
#' @return ValidationResult. Validation outcome
#'
validate = function(data, context = list()) {
errors <- list()
warnings <- list()
metadata <- list()
# Basic data structure validation
structure_result <- private$validate_data_structure(data)
errors <- c(errors, structure_result$errors)
warnings <- c(warnings, structure_result$warnings)
# Statistical quality checks
quality_result <- private$validate_data_quality(data)
errors <- c(errors, quality_result$errors)
warnings <- c(warnings, quality_result$warnings)
# Sample size validation
sample_result <- private$validate_sample_size(data, context)
errors <- c(errors, sample_result$errors)
warnings <- c(warnings, sample_result$warnings)
# Missing data validation
missing_result <- private$validate_missing_data(data, context)
errors <- c(errors, missing_result$errors)
warnings <- c(warnings, missing_result$warnings)
# Outlier detection
outlier_result <- private$detect_outliers(data)
warnings <- c(warnings, outlier_result$warnings)
metadata$outliers <- outlier_result$metadata
# Compile validation result
ValidationResult$new(
valid = length(errors) == 0,
errors = errors,
warnings = warnings,
data = if (length(errors) == 0) data else NULL,
metadata = metadata
)
}
),
private = list(
#' Validate data structure
#'
#' @param data any. Input data
#'
#' @return list. Validation results
#'
validate_data_structure = function(data) {
errors <- list()
warnings <- list()
# Check if data exists
if (is.null(data)) {
errors <- append(errors, "Data is null")
return(list(errors = errors, warnings = warnings))
}
# Check data type
if (!is.data.frame(data) && !is.numeric(data)) {
errors <- append(errors, "Data must be a data.frame or numeric vector")
}
# Check for empty data
if (is.data.frame(data)) {
if (nrow(data) == 0) {
errors <- append(errors, "Data frame is empty")
}
if (ncol(data) == 0) {
errors <- append(errors, "Data frame has no columns")
}
} else if (is.numeric(data)) {
if (length(data) == 0) {
errors <- append(errors, "Numeric vector is empty")
}
}
list(errors = errors, warnings = warnings)
},
#' Validate data quality
#'
#' @param data data.frame or numeric. Data to check
#'
#' @return list. Quality validation results
#'
validate_data_quality = function(data) {
errors <- list()
warnings <- list()
if (is.data.frame(data)) {
# Check for numeric columns
numeric_cols <- sapply(data, is.numeric)
if (!any(numeric_cols)) {
errors <- append(errors, "Data frame contains no numeric columns")
}
# Check for constant columns
constant_cols <- sapply(data[numeric_cols], function(x) {
if (length(unique(x[!is.na(x)])) <= 1) TRUE else FALSE
})
if (any(constant_cols)) {
const_names <- names(data)[numeric_cols][constant_cols]
warnings <- append(warnings,
paste("Constant columns detected:",
paste(const_names, collapse = ", ")))
}
} else if (is.numeric(data)) {
# Check for infinite values
if (any(is.infinite(data))) {
errors <- append(errors, "Data contains infinite values")
}
# Check for all identical values
if (length(unique(data[!is.na(data)])) <= 1) {
warnings <- append(warnings, "All non-missing values are identical")
}
}
list(errors = errors, warnings = warnings)
},
#' Validate sample size requirements
#'
#' @param data data.frame or numeric. Data to check
#' @param context list. Validation context
#'
#' @return list. Sample size validation results
#'
validate_sample_size = function(data, context) {
errors <- list()
warnings <- list()
min_sample_size <- context$min_sample_size %||% 3
recommended_size <- context$recommended_sample_size %||% 30
if (is.data.frame(data)) {
n <- nrow(data)
} else {
n <- length(data)
}
if (n < min_sample_size) {
errors <- append(errors,
paste("Sample size", n, "is below minimum required:",
min_sample_size))
} else if (n < recommended_size) {
warnings <- append(warnings,
paste("Sample size", n, "is below recommended size:",
recommended_size, "for reliable statistical inference"))
}
list(errors = errors, warnings = warnings)
},
#' Validate missing data patterns
#'
#' @param data data.frame or numeric. Data to check
#' @param context list. Validation context
#'
#' @return list. Missing data validation results
#'
validate_missing_data = function(data, context) {
errors <- list()
warnings <- list()
max_missing_prop <- context$max_missing_proportion %||% 0.5
if (is.data.frame(data)) {
missing_props <- sapply(data, function(x) sum(is.na(x)) / length(x))
if (any(missing_props > max_missing_prop)) {
high_missing <- names(missing_props)[missing_props > max_missing_prop]
errors <- append(errors,
paste("High missing data proportion in columns:",
paste(high_missing, collapse = ", ")))
}
if (any(missing_props > 0.1)) {
some_missing <- names(missing_props)[missing_props > 0.1]
warnings <- append(warnings,
paste("Notable missing data in columns:",
paste(some_missing, collapse = ", ")))
}
} else if (is.numeric(data)) {
missing_prop <- sum(is.na(data)) / length(data)
if (missing_prop > max_missing_prop) {
errors <- append(errors,
paste("High missing data proportion:",
round(missing_prop * 100, 1), "%"))
} else if (missing_prop > 0.1) {
warnings <- append(warnings,
paste("Notable missing data:",
round(missing_prop * 100, 1), "%"))
}
}
list(errors = errors, warnings = warnings)
},
#' Detect statistical outliers
#'
#' @param data data.frame or numeric. Data to check
#'
#' @return list. Outlier detection results
#'
detect_outliers = function(data) {
warnings <- list()
metadata <- list()
if (is.data.frame(data)) {
numeric_cols <- sapply(data, is.numeric)
outlier_info <- list()
for (col_name in names(data)[numeric_cols]) {
col_data <- data[[col_name]]
outliers <- private$detect_univariate_outliers(col_data)
if (outliers$count > 0) {
outlier_info[[col_name]] <- outliers
warnings <- append(warnings,
paste("Potential outliers detected in", col_name,
":", outliers$count, "observations"))
}
}
metadata$outlier_details <- outlier_info
} else if (is.numeric(data)) {
outliers <- private$detect_univariate_outliers(data)
if (outliers$count > 0) {
warnings <- append(warnings,
paste("Potential outliers detected:",
outliers$count, "observations"))
metadata$outlier_details <- outliers
}
}
list(warnings = warnings, metadata = metadata)
},
#' Detect univariate outliers using IQR method
#'
#' @param x numeric. Numeric vector
#'
#' @return list. Outlier information
#'
detect_univariate_outliers = function(x) {
if (length(x) < 4 || all(is.na(x))) {
return(list(count = 0, indices = integer(0), method = "insufficient_data"))
}
# Remove missing values for calculation
x_clean <- x[!is.na(x)]
# Calculate IQR-based outlier bounds
q1 <- quantile(x_clean, 0.25, na.rm = TRUE)
q3 <- quantile(x_clean, 0.75, na.rm = TRUE)
iqr <- q3 - q1
lower_bound <- q1 - 1.5 * iqr
upper_bound <- q3 + 1.5 * iqr
# Find outliers
outlier_logical <- x < lower_bound | x > upper_bound
outlier_indices <- which(outlier_logical)
list(
count = length(outlier_indices),
indices = outlier_indices,
method = "IQR",
bounds = c(lower_bound, upper_bound),
values = x[outlier_indices]
)
}
)
)
#' T-Test Specific Data Validator
#'
#' @description Specialized validator for independent samples t-test data
#' with checks for grouping variables, sample sizes, and statistical assumptions.
#'
#' @export
TTestDataValidator <- R6::R6Class(
"TTestDataValidator",
inherit = StatisticalDataValidator,
public = list(
#' Validate t-test specific requirements
#'
#' @param data data.frame. Data with group and response variables
#' @param context list. Validation context
#'
#' @return ValidationResult. Validation outcome
#'
validate = function(data, context = list()) {
# Call parent validation first
parent_result <- super$validate(data, context)
if (!parent_result$valid) {
return(parent_result)
}
errors <- parent_result$errors
warnings <- parent_result$warnings
metadata <- parent_result$metadata
# T-test specific validations
ttest_result <- private$validate_ttest_structure(data, context)
errors <- c(errors, ttest_result$errors)
warnings <- c(warnings, ttest_result$warnings)
metadata <- modifyList(metadata, ttest_result$metadata)
# Group-specific validations
if (length(errors) == 0) {
group_result <- private$validate_group_requirements(data, context)
errors <- c(errors, group_result$errors)
warnings <- c(warnings, group_result$warnings)
metadata <- modifyList(metadata, group_result$metadata)
}
ValidationResult$new(
valid = length(errors) == 0,
errors = errors,
warnings = warnings,
data = if (length(errors) == 0) data else NULL,
metadata = metadata
)
}
),
private = list(
#' Validate t-test data structure
#'
#' @param data data.frame. Input data
#' @param context list. Validation context
#'
#' @return list. Validation results
#'
validate_ttest_structure = function(data, context) {
errors <- list()
warnings <- list()
metadata <- list()
# Check for required columns
required_cols <- c("group", "response")
missing_cols <- setdiff(required_cols, names(data))
if (length(missing_cols) > 0) {
errors <- append(errors,
paste("Missing required columns:",
paste(missing_cols, collapse = ", ")))
return(list(errors = errors, warnings = warnings, metadata = metadata))
}
# Validate response variable
if (!is.numeric(data$response)) {
errors <- append(errors, "Response variable must be numeric")
}
# Validate group variable
unique_groups <- unique(data$group[!is.na(data$group)])
if (length(unique_groups) != 2) {
errors <- append(errors,
paste("Group variable must have exactly 2 levels, found:",
length(unique_groups)))
}
metadata$group_levels <- unique_groups
metadata$total_observations <- nrow(data)
list(errors = errors, warnings = warnings, metadata = metadata)
},
#' Validate group-specific requirements
#'
#' @param data data.frame. Input data
#' @param context list. Validation context
#'
#' @return list. Validation results
#'
validate_group_requirements = function(data, context) {
errors <- list()
warnings <- list()
metadata <- list()
min_group_size <- context$min_group_size %||% 3
recommended_group_size <- context$recommended_group_size %||% 15
# Calculate group sizes
group_sizes <- table(data$group)
metadata$group_sizes <- as.list(group_sizes)
# Check minimum group sizes
# Warning for recommended group sizes
small_recommended <- group_sizes < recommended_group_size
if (any(small_recommended)) {
small_rec_names <- names(group_sizes)[small_recommended]
warnings <- append(warnings,
paste("Groups below recommended sample size:",
paste(small_rec_names, collapse = ", "),
"(recommended:", recommended_group_size, ")"))
}
# Check for balanced design
size_ratio <- max(group_sizes) / min(group_sizes)
if (size_ratio > 3) {
warnings <- append(warnings,
paste("Unbalanced group sizes detected (ratio:",
round(size_ratio, 2), "). Consider balanced design for optimal power."))
}
# Validate within-group data quality
for (group_name in names(group_sizes)) {
group_data <- data$response[data$group == group_name]
# Check for constant values within group
if (length(unique(group_data[!is.na(group_data)])) <= 1) {
errors <- append(errors,
paste("Group", group_name, "has constant values"))
}
# Check for excessive missing data within group
missing_prop <- sum(is.na(group_data)) / length(group_data)
if (missing_prop > 0.3) {
warnings <- append(warnings,
paste("High missing data in group", group_name,
":", round(missing_prop * 100, 1), "%"))
}
}
list(errors = errors, warnings = warnings, metadata = metadata)
}
)
)Security Protocol Implementation
Comprehensive Security Framework
Implement enterprise-grade security measures for statistical applications:
# File: R/security_framework.R
#' Enterprise Security Manager
#'
#' @description Comprehensive security framework for Shiny applications with
#' session management, access control, and threat protection.
#'
#' @export
SecurityManager <- R6::R6Class(
"SecurityManager",
public = list(
#' @field config security configuration
config = NULL,
#' @field audit_logger audit logging instance
audit_logger = NULL,
#' @field session_manager session management instance
session_manager = NULL,
#' Initialize security manager
#'
#' @param config list. Security configuration
#'
initialize = function(config = list()) {
self$config <- private$default_security_config(config)
self$audit_logger <- AuditLogger$new(self$config$audit)
self$session_manager <- SessionManager$new(self$config$session)
self$audit_logger$log_event("SecurityManager initialized",
level = "INFO", category = "SECURITY")
},
#' Sanitize user input to prevent injection attacks
#'
#' @param input any. User input to sanitize
#' @param type character. Type of input (text, numeric, file, etc.)
#'
#' @return Sanitized input or error
#'
sanitize_input = function(input, type = "text") {
tryCatch({
if (is.null(input) || length(input) == 0) {
return(input)
}
switch(type,
"text" = private$sanitize_text_input(input),
"numeric" = private$sanitize_numeric_input(input),
"file" = private$sanitize_file_input(input),
"sql" = private$sanitize_sql_input(input),
{
self$audit_logger$log_event(
paste("Unknown input type:", type),
level = "WARNING", category = "SECURITY"
)
private$sanitize_text_input(input)
}
)
}, error = function(e) {
self$audit_logger$log_event(
paste("Input sanitization failed:", e$message),
level = "ERROR", category = "SECURITY"
)
stop("Input sanitization failed")
})
},
#' Validate session security
#'
#' @param session shiny session object
#'
#' @return logical. TRUE if session is valid and secure
#'
validate_session = function(session) {
session_id <- session$token
# Check session existence and validity
if (!self$session_manager$is_valid_session(session_id)) {
self$audit_logger$log_event(
"Invalid session access attempt",
level = "WARNING",
category = "SECURITY",
details = list(session_id = session_id)
)
return(FALSE)
}
# Check session timeout
if (self$session_manager$is_session_expired(session_id)) {
self$audit_logger$log_event(
"Expired session access attempt",
level = "INFO",
category = "SECURITY",
details = list(session_id = session_id)
)
self$session_manager$invalidate_session(session_id)
return(FALSE)
}
# Update session activity
self$session_manager$update_session_activity(session_id)
return(TRUE)
},
#' Check rate limiting for requests
#'
#' @param identifier character. User/session identifier
#' @param action character. Action being performed
#'
#' @return logical. TRUE if request is allowed
#'
check_rate_limit = function(identifier, action = "general") {
current_time <- Sys.time()
# Get rate limit configuration for action
rate_config <- self$config$rate_limits[[action]] %||%
self$config$rate_limits$default
# Check if rate limit exceeded
if (private$is_rate_limited(identifier, action, current_time, rate_config)) {
self$audit_logger$log_event(
paste("Rate limit exceeded for", action),
level = "WARNING",
category = "SECURITY",
details = list(
identifier = identifier,
action = action,
limit = rate_config
)
)
return(FALSE)
}
# Record request
private$record_request(identifier, action, current_time)
return(TRUE)
},
#' Generate secure random token
#'
#' @param length integer. Token length
#'
#' @return character. Secure random token
#'
generate_secure_token = function(length = 32) {
# Use cryptographically secure random number generation
raw_bytes <- openssl::rand_bytes(length)
openssl::base64_encode(raw_bytes)
}
),
private = list(
#' Default security configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete security configuration
#'
default_security_config = function(config) {
default_config <- list(
input_limits = list(
max_string_length = 10000,
max_numeric_value = 1e10,
max_file_size = 50 * 1024 * 1024, # 50MB
allowed_file_types = c("csv", "txt", "xlsx", "rds")
),
session = list(
timeout_minutes = 60,
max_concurrent_sessions = 100,
require_csrf_token = TRUE
),
rate_limits = list(
default = list(requests = 100, window_minutes = 15),
file_upload = list(requests = 10, window_minutes = 60),
analysis = list(requests = 50, window_minutes = 10)
),
encryption = list(
algorithm = "AES-256-GCM",
key_rotation_days = 90
),
audit = list(
enabled = TRUE,
log_level = "INFO",
retention_days = 365,
encrypt_logs = TRUE
)
)
modifyList(default_config, config)
},
#' Sanitize text input
#'
#' @param input character. Text input
#'
#' @return character. Sanitized text
#'
sanitize_text_input = function(input) {
if (!is.character(input)) {
input <- as.character(input)
}
# Remove HTML tags and potentially dangerous characters
input <- gsub("<[^>]*>", "", input)
input <- gsub("[<>\"'&]", "", input)
# Remove script-related content
input <- gsub("(?i)javascript:", "", input, perl = TRUE)
input <- gsub("(?i)<script", "", input, perl = TRUE)
input <- gsub("(?i)on\\w+\\s*=", "", input, perl = TRUE)
# Limit length
max_length <- self$config$input_limits$max_string_length
if (nchar(input) > max_length) {
input <- substr(input, 1, max_length)
}
# Remove control characters except tabs and newlines
input <- gsub("[[:cntrl:]&&[^\t\n]]", "", input)
return(input)
},
#' Sanitize numeric input
#'
#' @param input numeric. Numeric input
#'
#' @return numeric. Validated numeric input
#'
sanitize_numeric_input = function(input) {
if (!is.numeric(input)) {
# Try to convert to numeric
input <- suppressWarnings(as.numeric(input))
if (any(is.na(input))) {
stop("Invalid numeric input")
}
}
# Check for infinite or NaN values
if (any(is.infinite(input)) || any(is.nan(input))) {
stop("Invalid numeric values (Inf/NaN) detected")
}
# Check value limits
max_value <- self$config$input_limits$max_numeric_value
if (any(abs(input) > max_value, na.rm = TRUE)) {
stop("Numeric values exceed allowed range")
}
return(input)
},
#' Sanitize file input
#'
#' @param file_info list. File information from Shiny
#'
#' @return list. Validated file information
#'
sanitize_file_input = function(file_info) {
if (is.null(file_info) || is.null(file_info$datapath)) {
stop("Invalid file input")
}
# Check file size
file_size <- file.info(file_info$datapath)$size
max_size <- self$config$input_limits$max_file_size
if (file_size > max_size) {
stop(paste("File size", file_size, "exceeds maximum allowed:", max_size))
}
# Check file extension
file_ext <- tolower(tools::file_ext(file_info$name))
allowed_types <- self$config$input_limits$allowed_file_types
if (!file_ext %in% allowed_types) {
stop(paste("File type", file_ext, "not allowed"))
}
# Sanitize filename
safe_name <- gsub("[^a-zA-Z0-9._-]", "", file_info$name)
file_info$name <- safe_name
return(file_info)
},
#' Sanitize SQL input to prevent injection
#'
#' @param input character. SQL input
#'
#' @return character. Sanitized SQL
#'
sanitize_sql_input = function(input) {
# Remove dangerous SQL keywords and characters
dangerous_patterns <- c(
"(?i)\\b(DROP|DELETE|UPDATE|INSERT|ALTER|CREATE|EXEC|EXECUTE)\\b",
"(?i)\\b(UNION|SELECT.*FROM)\\b",
"[';\"\\\\]",
"--",
"/\\*",
"\\*/"
)
for (pattern in dangerous_patterns) {
input <- gsub(pattern, "", input, perl = TRUE)
}
return(input)
},
#' Check if rate limit is exceeded
#'
#' @param identifier character. Request identifier
#' @param action character. Action type
#' @param current_time POSIXct. Current timestamp
#' @param rate_config list. Rate limit configuration
#'
#' @return logical. TRUE if rate limited
#'
is_rate_limited = function(identifier, action, current_time, rate_config) {
# Get request history for this identifier and action
request_key <- paste(identifier, action, sep = "_")
# This would typically use Redis or similar in production
# For demo purposes, using environment variable storage
request_history <- get(request_key, envir = private$.request_cache,
inherits = FALSE)
if (is.null(request_history)) {
return(FALSE)
}
# Filter requests within the time window
window_start <- current_time - (rate_config$window_minutes * 60)
recent_requests <- request_history[request_history >= window_start]
# Check if limit exceeded
return(length(recent_requests) >= rate_config$requests)
},
#' Record request for rate limiting
#'
#' @param identifier character. Request identifier
#' @param action character. Action type
#' @param timestamp POSIXct. Request timestamp
#'
record_request = function(identifier, action, timestamp) {
request_key <- paste(identifier, action, sep = "_")
# Get existing history
existing_history <- get(request_key, envir = private$.request_cache,
inherits = FALSE)
if (is.null(existing_history)) {
existing_history <- c()
}
# Add new request
updated_history <- c(existing_history, timestamp)
# Keep only recent requests (last 24 hours)
cutoff_time <- timestamp - (24 * 3600)
updated_history <- updated_history[updated_history >= cutoff_time]
# Store updated history
assign(request_key, updated_history, envir = private$.request_cache)
},
#' Request cache environment
.request_cache = new.env(parent = emptyenv())
)
)
#' Session Manager for Security
#'
#' @description Manages user sessions with security features including
#' timeout handling, concurrent session limits, and session validation.
#'
#' @export
SessionManager <- R6::R6Class(
"SessionManager",
public = list(
#' @field config session configuration
config = NULL,
#' @field sessions session storage
sessions = NULL,
#' Initialize session manager
#'
#' @param config list. Session configuration
#'
initialize = function(config = list()) {
self$config <- config
self$sessions <- new.env(parent = emptyenv())
},
#' Create new session
#'
#' @param session_id character. Session identifier
#' @param user_info list. User information
#'
#' @return logical. TRUE if session created successfully
#'
create_session = function(session_id, user_info = list()) {
# Check concurrent session limits
if (private$count_active_sessions() >= self$config$max_concurrent_sessions) {
return(FALSE)
}
session_data <- list(
id = session_id,
created = Sys.time(),
last_activity = Sys.time(),
user_info = user_info,
csrf_token = private$generate_csrf_token()
)
assign(session_id, session_data, envir = self$sessions)
return(TRUE)
},
#' Check if session is valid
#'
#' @param session_id character. Session identifier
#'
#' @return logical. TRUE if session is valid
#'
is_valid_session = function(session_id) {
exists(session_id, envir = self$sessions)
},
#' Check if session is expired
#'
#' @param session_id character. Session identifier
#'
#' @return logical. TRUE if session is expired
#'
is_session_expired = function(session_id) {
if (!self$is_valid_session(session_id)) {
return(TRUE)
}
session_data <- get(session_id, envir = self$sessions)
timeout_seconds <- self$config$timeout_minutes * 60
(Sys.time() - session_data$last_activity) > timeout_seconds
},
#' Update session activity
#'
#' @param session_id character. Session identifier
#'
update_session_activity = function(session_id) {
if (self$is_valid_session(session_id)) {
session_data <- get(session_id, envir = self$sessions)
session_data$last_activity <- Sys.time()
assign(session_id, session_data, envir = self$sessions)
}
},
#' Invalidate session
#'
#' @param session_id character. Session identifier
#'
invalidate_session = function(session_id) {
if (exists(session_id, envir = self$sessions)) {
rm(list = session_id, envir = self$sessions)
}
},
#' Get CSRF token for session
#'
#' @param session_id character. Session identifier
#'
#' @return character. CSRF token or NULL
#'
get_csrf_token = function(session_id) {
if (self$is_valid_session(session_id)) {
session_data <- get(session_id, envir = self$sessions)
return(session_data$csrf_token)
}
return(NULL)
}
),
private = list(
#' Count active sessions
#'
#' @return integer. Number of active sessions
#'
count_active_sessions = function() {
length(ls(envir = self$sessions))
},
#' Generate CSRF token
#'
#' @return character. CSRF token
#'
generate_csrf_token = function() {
openssl::base64_encode(openssl::rand_bytes(32))
}
)
)Audit Logging and Compliance
Comprehensive Audit System
Implement enterprise-grade audit logging for regulatory compliance:
# File: R/audit_logging.R
#' Enterprise Audit Logger
#'
#' @description Comprehensive audit logging system for regulatory compliance
#' with encrypted storage, retention management, and compliance reporting.
#'
#' @export
AuditLogger <- R6::R6Class(
"AuditLogger",
public = list(
#' @field config audit configuration
config = NULL,
#' @field log_file path to log file
log_file = NULL,
#' Initialize audit logger
#'
#' @param config list. Audit configuration
#'
initialize = function(config = list()) {
self$config <- private$default_audit_config(config)
self$log_file <- self$config$log_file
# Ensure log directory exists
log_dir <- dirname(self$log_file)
if (!dir.exists(log_dir)) {
dir.create(log_dir, recursive = TRUE)
}
# Initialize log file with header
private$initialize_log_file()
},
#' Log general event
#'
#' @param message character. Log message
#' @param level character. Log level (DEBUG, INFO, WARNING, ERROR)
#' @param category character. Event category
#' @param details list. Additional event details
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
log_event = function(message, level = "INFO", category = "GENERAL",
details = NULL, user_id = NULL, session_id = NULL) {
# Check if logging is enabled for this level
if (!private$should_log(level)) {
return(invisible(NULL))
}
log_entry <- list(
timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z"),
level = level,
category = category,
message = message,
user_id = user_id,
session_id = session_id,
ip_address = private$get_client_ip(),
user_agent = private$get_user_agent(),
details = details,
entry_id = uuid::UUIDgenerate()
)
private$write_log_entry(log_entry)
},
#' Log validation event
#'
#' @param validator_name character. Name of validator
#' @param input_hash character. Hash of input data
#' @param result ValidationResult. Validation result
#' @param duration numeric. Validation duration in seconds
#' @param context list. Validation context
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
log_validation = function(validator_name, input_hash, result, duration,
context = NULL, user_id = NULL, session_id = NULL) {
details <- list(
validator_name = validator_name,
input_hash = input_hash,
validation_successful = result$valid,
error_count = length(result$errors),
warning_count = length(result$warnings),
duration_seconds = duration,
context = context
)
level <- if (result$valid) "INFO" else "WARNING"
message <- paste("Validation", if (result$valid) "passed" else "failed",
"for", validator_name)
self$log_event(
message = message,
level = level,
category = "VALIDATION",
details = details,
user_id = user_id,
session_id = session_id
)
},
#' Log security event
#'
#' @param event_type character. Type of security event
#' @param severity character. Event severity (LOW, MEDIUM, HIGH, CRITICAL)
#' @param description character. Event description
#' @param threat_indicators list. Threat indicators
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
log_security_event = function(event_type, severity, description,
threat_indicators = NULL, user_id = NULL,
session_id = NULL) {
details <- list(
event_type = event_type,
severity = severity,
threat_indicators = threat_indicators,
remediation_required = severity %in% c("HIGH", "CRITICAL")
)
level <- switch(severity,
"LOW" = "INFO",
"MEDIUM" = "WARNING",
"HIGH" = "ERROR",
"CRITICAL" = "ERROR"
)
self$log_event(
message = paste("Security event:", description),
level = level,
category = "SECURITY",
details = details,
user_id = user_id,
session_id = session_id
)
# Send alerts for high severity events
if (severity %in% c("HIGH", "CRITICAL")) {
private$send_security_alert(event_type, severity, description, details)
}
},
#' Log data access event
#'
#' @param operation character. Type of operation (READ, WRITE, DELETE)
#' @param resource character. Resource accessed
#' @param success logical. Whether operation was successful
#' @param user_id character. User identifier
#' @param session_id character. Session identifier
#' @param details list. Additional operation details
#'
log_data_access = function(operation, resource, success, user_id,
session_id, details = NULL) {
access_details <- list(
operation = operation,
resource = resource,
success = success,
operation_details = details
)
level <- if (success) "INFO" else "WARNING"
message <- paste("Data access:", operation, "on", resource,
if (success) "succeeded" else "failed")
self$log_event(
message = message,
level = level,
category = "DATA_ACCESS",
details = access_details,
user_id = user_id,
session_id = session_id
)
},
#' Generate compliance report
#'
#' @param start_date Date. Report start date
#' @param end_date Date. Report end date
#' @param categories character vector. Event categories to include
#'
#' @return data.frame. Compliance report data
#'
generate_compliance_report = function(start_date, end_date,
categories = NULL) {
# Read and parse log entries
log_entries <- private$read_log_entries(start_date, end_date)
if (!is.null(categories)) {
log_entries <- log_entries[log_entries$category %in% categories, ]
}
# Generate summary statistics
report_data <- list(
period = list(start = start_date, end = end_date),
total_events = nrow(log_entries),
events_by_category = table(log_entries$category),
events_by_level = table(log_entries$level),
unique_users = length(unique(log_entries$user_id[!is.na(log_entries$user_id)])),
unique_sessions = length(unique(log_entries$session_id[!is.na(log_entries$session_id)])),
security_events = sum(log_entries$category == "SECURITY"),
validation_failures = sum(log_entries$category == "VALIDATION" &
log_entries$level %in% c("WARNING", "ERROR")),
data_access_events = sum(log_entries$category == "DATA_ACCESS")
)
return(report_data)
}
),
private = list(
#' Default audit configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete audit configuration
#'
default_audit_config = function(config) {
default_config <- list(
log_file = "logs/audit.log",
log_level = "INFO",
retention_days = 365,
encrypt_logs = FALSE,
max_file_size = 100 * 1024 * 1024, # 100MB
rotation_enabled = TRUE,
compression_enabled = TRUE,
real_time_alerts = TRUE,
alert_endpoints = list()
)
modifyList(default_config, config)
},
#' Check if message should be logged based on level
#'
#' @param level character. Message level
#'
#' @return logical. TRUE if should log
#'
should_log = function(level) {
level_hierarchy <- c("DEBUG" = 1, "INFO" = 2, "WARNING" = 3, "ERROR" = 4)
message_level <- level_hierarchy[level] %||% 1
config_level <- level_hierarchy[self$config$log_level] %||% 2
message_level >= config_level
},
#' Initialize log file with header
#'
initialize_log_file = function() {
if (!file.exists(self$log_file)) {
header <- paste(
"# Audit Log File",
paste("# Created:", Sys.time()),
paste("# Application: Enterprise Statistical Analysis Platform"),
paste("# Version: 1.0"),
"# Format: JSON Lines",
"",
sep = "\n"
)
writeLines(header, self$log_file)
}
},
#' Write log entry to file
#'
#' @param log_entry list. Log entry data
#'
write_log_entry = function(log_entry) {
# Convert to JSON
json_entry <- jsonlite::toJSON(log_entry, auto_unbox = TRUE)
# Encrypt if configured
if (self$config$encrypt_logs) {
json_entry <- private$encrypt_log_entry(json_entry)
}
# Write to file with file locking
tryCatch({
write(json_entry, file = self$log_file, append = TRUE)
}, error = function(e) {
# Fallback logging to console if file write fails
cat("LOG WRITE ERROR:", e$message, "\n")
cat("ENTRY:", json_entry, "\n")
})
# Check for log rotation
private$check_log_rotation()
},
#' Check if log rotation is needed
#'
check_log_rotation = function() {
if (!self$config$rotation_enabled) {
return(invisible(NULL))
}
file_info <- file.info(self$log_file)
if (file_info$size > self$config$max_file_size) {
private$rotate_log_file()
}
},
#' Rotate log file
#'
rotate_log_file = function() {
timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
rotated_name <- paste0(self$log_file, ".", timestamp)
# Move current log to rotated name
file.rename(self$log_file, rotated_name)
# Compress if enabled
if (self$config$compression_enabled) {
R.utils::gzip(rotated_name)
file.remove(rotated_name)
}
# Initialize new log file
private$initialize_log_file()
},
#' Get client IP address
#'
#' @return character. Client IP address
#'
get_client_ip = function() {
# In a real Shiny application, this would extract from session
Sys.getenv("HTTP_X_FORWARDED_FOR", "127.0.0.1")
},
#' Get user agent string
#'
#' @return character. User agent string
#'
get_user_agent = function() {
# In a real Shiny application, this would extract from session
Sys.getenv("HTTP_USER_AGENT", "Unknown")
},
#' Encrypt log entry
#'
#' @param entry character. Log entry JSON
#'
#' @return character. Encrypted log entry
#'
encrypt_log_entry = function(entry) {
# In production, use proper key management
# This is a simplified example
key <- Sys.getenv("AUDIT_LOG_KEY", "default_key_change_in_production")
tryCatch({
# Simple encryption for demonstration
# In production, use proper encryption libraries
encrypted <- openssl::aes_cbc_encrypt(charToRaw(entry), key = charToRaw(key))
openssl::base64_encode(encrypted)
}, error = function(e) {
# Return unencrypted if encryption fails
entry
})
},
#' Read log entries for reporting
#'
#' @param start_date Date. Start date for entries
#' @param end_date Date. End date for entries
#'
#' @return data.frame. Log entries
#'
read_log_entries = function(start_date, end_date) {
if (!file.exists(self$log_file)) {
return(data.frame())
}
# Read all lines from log file
lines <- readLines(self$log_file)
# Filter out comment lines
json_lines <- lines[!grepl("^#", lines) & nchar(lines) > 0]
if (length(json_lines) == 0) {
return(data.frame())
}
# Parse JSON entries
entries <- list()
for (i in seq_along(json_lines)) {
tryCatch({
entry <- jsonlite::fromJSON(json_lines[i])
entry$timestamp <- as.POSIXct(entry$timestamp)
entries[[i]] <- entry
}, error = function(e) {
# Skip malformed entries
NULL
})
}
# Convert to data frame
if (length(entries) > 0) {
# Filter by date range
entry_df <- do.call(rbind.data.frame, entries)
entry_df <- entry_df[entry_df$timestamp >= start_date &
entry_df$timestamp <= end_date, ]
return(entry_df)
}
return(data.frame())
},
#' Send security alert
#'
#' @param event_type character. Event type
#' @param severity character. Event severity
#' @param description character. Event description
#' @param details list. Event details
#'
send_security_alert = function(event_type, severity, description, details) {
if (!self$config$real_time_alerts) {
return(invisible(NULL))
}
alert_data <- list(
timestamp = Sys.time(),
event_type = event_type,
severity = severity,
description = description,
details = details,
application = "Enterprise Statistical Analysis Platform"
)
# In production, this would integrate with alerting systems
# like PagerDuty, Slack, email, or SIEM systems
cat("SECURITY ALERT:", jsonlite::toJSON(alert_data, auto_unbox = TRUE), "\n")
}
)
)Error Handling and Recovery
Comprehensive Error Management
Implement robust error handling that maintains application stability:
# File: R/error_handling.R
#' Enterprise Error Handler
#'
#' @description Comprehensive error handling system with graceful degradation,
#' user-friendly messages, and detailed logging for enterprise applications.
#'
#' @export
ErrorHandler <- R6::R6Class(
"ErrorHandler",
public = list(
#' @field config error handling configuration
config = NULL,
#' @field audit_logger audit logger instance
audit_logger = NULL,
#' Initialize error handler
#'
#' @param config list. Error handling configuration
#' @param audit_logger AuditLogger. Audit logger instance
#'
initialize = function(config = list(), audit_logger = NULL) {
self$config <- private$default_error_config(config)
self$audit_logger <- audit_logger
},
#' Handle validation errors
#'
#' @param validation_result ValidationResult. Validation result object
#' @param context list. Error context information
#'
#' @return list. User-friendly error response
#'
handle_validation_error = function(validation_result, context = list()) {
if (validation_result$valid) {
return(list(
type = "success",
message = "Validation successful",
data = validation_result$data
))
}
# Log validation errors
if (!is.null(self$audit_logger)) {
self$audit_logger$log_event(
message = "Validation failed",
level = "WARNING",
category = "VALIDATION",
details = list(
errors = validation_result$errors,
warnings = validation_result$warnings,
context = context
)
)
}
# Generate user-friendly error message
user_message <- private$format_validation_errors(validation_result$errors)
list(
type = "validation_error",
message = user_message,
errors = validation_result$errors,
warnings = validation_result$warnings,
show_details = self$config$show_error_details
)
},
#' Handle system errors
#'
#' @param error condition. Error condition object
#' @param context list. Error context information
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
#' @return list. Error response for user
#'
handle_system_error = function(error, context = list(), user_id = NULL,
session_id = NULL) {
error_id <- uuid::UUIDgenerate()
# Log detailed error information
if (!is.null(self$audit_logger)) {
self$audit_logger$log_event(
message = paste("System error:", error$message),
level = "ERROR",
category = "SYSTEM",
details = list(
error_id = error_id,
error_class = class(error)[1],
error_message = error$message,
error_call = deparse(error$call),
stack_trace = private$get_stack_trace(error),
context = context
),
user_id = user_id,
session_id = session_id
)
}
# Generate user-friendly response
user_message <- if (self$config$show_error_details) {
paste("An error occurred:", error$message, "\nError ID:", error_id)
} else {
paste("An unexpected error occurred. Please try again.",
"If the problem persists, contact support with Error ID:", error_id)
}
list(
type = "system_error",
message = user_message,
error_id = error_id,
recoverable = private$is_recoverable_error(error),
retry_suggested = private$should_suggest_retry(error)
)
},
#' Handle security errors
#'
#' @param security_event character. Type of security event
#' @param details list. Security event details
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
#' @return list. Security error response
#'
handle_security_error = function(security_event, details = list(),
user_id = NULL, session_id = NULL) {
# Log security incident
if (!is.null(self$audit_logger)) {
self$audit_logger$log_security_event(
event_type = security_event,
severity = "HIGH",
description = paste("Security violation:", security_event),
threat_indicators = details,
user_id = user_id,
session_id = session_id
)
}
# Generic security message (don't reveal specific attack vectors)
user_message <- "Access denied. This incident has been logged."
list(
type = "security_error",
message = user_message,
action_required = "contact_administrator",
session_terminated = TRUE
)
},
#' Create error boundary for Shiny modules
#'
#' @param expr expression. Code to execute with error handling
#' @param context list. Execution context
#' @param fallback_ui UI. Fallback UI to show on error
#'
#' @return UI or error result
#'
error_boundary = function(expr, context = list(), fallback_ui = NULL) {
tryCatch({
expr
}, error = function(e) {
error_response <- self$handle_system_error(e, context)
if (is.null(fallback_ui)) {
fallback_ui <- private$create_error_ui(error_response)
}
return(fallback_ui)
})
},
#' Async error handler for long-running operations
#'
#' @param future_expr future. Future expression to monitor
#' @param progress_callback function. Progress update callback
#' @param error_callback function. Error handling callback
#'
#' @return promise. Promise with error handling
#'
async_error_handler = function(future_expr, progress_callback = NULL,
error_callback = NULL) {
future_expr %>%
promises::catch(function(error) {
error_response <- self$handle_system_error(
error,
context = list(operation = "async_operation")
)
if (!is.null(error_callback)) {
error_callback(error_response)
}
return(error_response)
})
}
),
private = list(
#' Default error handling configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete configuration
#'
default_error_config = function(config) {
default_config <- list(
show_error_details = FALSE, # Set to TRUE only in development
log_all_errors = TRUE,
user_friendly_messages = TRUE,
max_error_length = 500,
retry_attempts = 3,
retry_delay_seconds = 2,
error_page_template = "default",
notification_duration = 5000 # milliseconds
)
modifyList(default_config, config)
},
#' Format validation errors for users
#'
#' @param errors list. List of error messages
#'
#' @return character. Formatted error message
#'
format_validation_errors = function(errors) {
if (length(errors) == 0) {
return("No errors found.")
}
if (length(errors) == 1) {
return(paste("Validation error:", errors[[1]]))
}
# Multiple errors
error_list <- paste("-", errors, collapse = "\n")
paste("Multiple validation errors found:\n", error_list)
},
#' Get stack trace from error
#'
#' @param error condition. Error object
#'
#' @return character. Stack trace
#'
get_stack_trace = function(error) {
tryCatch({
# Get call stack
calls <- sys.calls()
if (length(calls) > 0) {
stack_trace <- sapply(calls, function(call) {
paste(deparse(call), collapse = " ")
})
return(paste(stack_trace, collapse = "\n"))
}
}, error = function(e) {
return("Stack trace unavailable")
})
return("No stack trace available")
},
#' Check if error is recoverable
#'
#' @param error condition. Error object
#'
#' @return logical. TRUE if recoverable
#'
is_recoverable_error = function(error) {
# Define recoverable error patterns
recoverable_patterns <- c(
"connection.*timeout",
"network.*error",
"temporary.*unavailable",
"service.*busy"
)
error_message <- tolower(error$message)
any(sapply(recoverable_patterns, function(pattern) {
grepl(pattern, error_message, perl = TRUE)
}))
},
#' Check if retry should be suggested
#'
#' @param error condition. Error object
#'
#' @return logical. TRUE if retry suggested
#'
should_suggest_retry = function(error) {
# Suggest retry for certain types of errors
retry_patterns <- c(
"timeout",
"connection",
"network",
"temporary",
"busy"
)
error_message <- tolower(error$message)
any(sapply(retry_patterns, function(pattern) {
grepl(pattern, error_message)
}))
},
#' Create error UI component
#'
#' @param error_response list. Error response data
#'
#' @return UI. Error display UI
#'
create_error_ui = function(error_response) {
alert_class <- switch(error_response$type,
"validation_error" = "alert-warning",
"system_error" = "alert-danger",
"security_error" = "alert-danger",
"alert-info"
)
div(
class = paste("alert", alert_class, "alert-dismissible fade show"),
role = "alert",
# Error icon
icon("exclamation-triangle", class = "me-2"),
# Error message
strong("Error: "), error_response$message,
# Additional information
if (!is.null(error_response$error_id)) {
div(
class = "mt-2 small",
"Reference ID: ",
code(error_response$error_id)
)
},
# Retry button if applicable
if (isTRUE(error_response$retry_suggested)) {
div(
class = "mt-3",
actionButton(
"retry_operation",
"Retry",
class = "btn btn-sm btn-secondary"
)
)
},
# Dismiss button
tags$button(
type = "button",
class = "btn-close",
`data-bs-dismiss` = "alert",
`aria-label` = "Close"
)
)
}
)
)Integration with t-Test Application
Implementing Validation in the t-Test Module
Apply comprehensive validation to the enterprise t-test application:
# File: R/mod_ttest_validated.R
#' Validated t-Test Module Server
#'
#' @description Enhanced t-test module with comprehensive validation,
#' security, and error handling for enterprise deployment.
#'
#' @param id character. Module identifier
#'
#' @export
mod_ttest_validated_server <- function(id) {
moduleServer(id, function(input, output, session) {
# Initialize enterprise frameworks
validation_framework <- ValidationFramework$new()
security_manager <- SecurityManager$new()
error_handler <- ErrorHandler$new(
config = list(show_error_details = FALSE),
audit_logger = validation_framework$audit_logger
)
# Create session and validate
session_created <- security_manager$session_manager$create_session(
session$token,
user_info = list(ip_address = session$clientData$url_hostname)
)
if (!session_created) {
showNotification("Session could not be created. Please try again.",
type = "error")
return()
}
# Reactive values with validation
values <- reactiveValues(
raw_data = NULL,
validated_data = NULL,
test_results = NULL,
validation_results = NULL,
last_validation = NULL
)
# Input validation with rate limiting
observeEvent(input$group_input, {
# Check rate limiting
if (!security_manager$check_rate_limit(session$token, "data_input")) {
showNotification("Too many requests. Please wait before trying again.",
type = "warning")
return()
}
# Validate session
if (!security_manager$validate_session(session)) {
showNotification("Session expired. Please refresh the page.",
type = "error")
return()
}
# Sanitize input
sanitized_input <- tryCatch({
security_manager$sanitize_input(input$group_input, type = "text")
}, error = function(e) {
error_response <- error_handler$handle_system_error(
e,
context = list(operation = "input_sanitization", input_type = "group")
)
showNotification(error_response$message, type = "error")
return(NULL)
})
if (is.null(sanitized_input)) return()
# Parse and validate group input
tryCatch({
group_data <- parse_group_input(sanitized_input)
if (!is.null(group_data)) {
values$raw_data$group <- group_data
# Trigger validation if both inputs are available
if (!is.null(values$raw_data$response)) {
validate_combined_data()
}
}
}, error = function(e) {
error_response <- error_handler$handle_system_error(
e,
context = list(operation = "group_parsing")
)
showNotification(error_response$message, type = "error")
})
})
observeEvent(input$response_input, {
# Rate limiting and session validation
if (!security_manager$check_rate_limit(session$token, "data_input")) {
showNotification("Too many requests. Please wait before trying again.",
type = "warning")
return()
}
if (!security_manager$validate_session(session)) {
showNotification("Session expired. Please refresh the page.",
type = "error")
return()
}
# Sanitize and validate response input
sanitized_input <- tryCatch({
security_manager$sanitize_input(input$response_input, type = "text")
}, error = function(e) {
error_response <- error_handler$handle_system_error(
e,
context = list(operation = "input_sanitization", input_type = "response")
)
showNotification(error_response$message, type = "error")
return(NULL)
})
if (is.null(sanitized_input)) return()
tryCatch({
response_data <- parse_response_input(sanitized_input)
if (!is.null(response_data)) {
values$raw_data$response <- response_data
# Trigger validation if both inputs are available
if (!is.null(values$raw_data$group)) {
validate_combined_data()
}
}
}, error = function(e) {
error_response <- error_handler$handle_system_error(
e,
context = list(operation = "response_parsing")
)
showNotification(error_response$message, type = "error")
})
})
# Combined data validation function
validate_combined_data <- function() {
if (is.null(values$raw_data$group) || is.null(values$raw_data$response)) {
return()
}
# Create data frame for validation
combined_data <- data.frame(
group = values$raw_data$group,
response = values$raw_data$response,
stringsAsFactors = FALSE
)
# Validate with t-test specific validator
validation_result <- validation_framework$validate(
data = combined_data,
validator_name = "ttest_data",
context = list(
min_group_size = 3,
recommended_group_size = 15,
max_missing_proportion = 0.3
)
)
values$validation_results <- validation_result
values$last_validation <- Sys.time()
# Handle validation results
if (validation_result$valid) {
values$validated_data <- validation_result$data
# Show success with any warnings
if (validation_result$has_warnings()) {
showNotification(
paste("Data validated successfully.",
"Warnings:", validation_result$get_warning_message()),
type = "warning",
duration = 8000
)
} else {
showNotification("Data validated successfully!", type = "success")
}
} else {
values$validated_data <- NULL
# Handle validation errors
error_response <- error_handler$handle_validation_error(
validation_result,
context = list(operation = "data_validation")
)
showNotification(error_response$message, type = "error", duration = 10000)
}
}
# Analysis execution with comprehensive error handling
observeEvent(input$run_analysis, {
# Rate limiting for analysis operations
if (!security_manager$check_rate_limit(session$token, "analysis")) {
showNotification("Analysis rate limit exceeded. Please wait before running another analysis.",
type = "warning")
return()
}
# Session validation
if (!security_manager$validate_session(session)) {
showNotification("Session expired. Please refresh the page.",
type = "error")
return()
}
# Data validation check
if (is.null(values$validated_data)) {
showNotification("Please provide valid data before running analysis.",
type = "error")
return()
}
# Show progress indicator
showNotification("Running statistical analysis...",
type = "message", id = "analysis_progress")
# Execute analysis with error boundary
analysis_result <- error_handler$error_boundary(
expr = {
# Log analysis start
validation_framework$audit_logger$log_event(
message = "Statistical analysis started",
level = "INFO",
category = "ANALYSIS",
details = list(
test_type = "independent_ttest",
sample_size = nrow(values$validated_data),
user_options = list(
alternative = input$alternative,
conf_level = input$conf_level,
var_equal = input$var_equal
)
),
user_id = session$user,
session_id = session$token
)
# Run comprehensive t-test analysis
calculate_ttest_comprehensive(
data = values$validated_data,
options = list(
alternative = input$alternative,
conf_level = input$conf_level,
var_equal = as.logical(input$var_equal),
auto_method = input$auto_method
)
)
},
context = list(
operation = "statistical_analysis",
test_type = "ttest"
),
fallback_ui = div(
class = "alert alert-danger",
"Analysis failed. Please check your data and try again."
)
)
# Remove progress indicator
removeNotification("analysis_progress")
# Handle analysis results
if (is.list(analysis_result) && !inherits(analysis_result, "shiny.tag")) {
values$test_results <- analysis_result
# Log successful analysis
validation_framework$audit_logger$log_event(
message = "Statistical analysis completed successfully",
level = "INFO",
category = "ANALYSIS",
details = list(
p_value = analysis_result$test_result$p.value,
effect_size = analysis_result$effect_size$cohens_d,
method_used = analysis_result$method_used
),
user_id = session$user,
session_id = session$token
)
showNotification("Analysis completed successfully!", type = "success")
} else {
values$test_results <- NULL
showNotification("Analysis failed. Please try again.", type = "error")
}
})
# Validation status display
output$validation_status <- renderUI({
if (is.null(values$validation_results)) {
return(
div(
class = "alert alert-info",
icon("info-circle", class = "me-2"),
"Waiting for data input..."
)
)
}
if (values$validation_results$valid) {
status_content <- list(
div(
class = "alert alert-success",
icon("check-circle", class = "me-2"),
"Data validation passed"
)
)
# Add warnings if present
if (values$validation_results$has_warnings()) {
status_content <- append(status_content, list(
div(
class = "alert alert-warning mt-2",
icon("exclamation-triangle", class = "me-2"),
strong("Warnings: "),
values$validation_results$get_warning_message()
)
))
}
return(status_content)
} else {
return(
div(
class = "alert alert-danger",
icon("times-circle", class = "me-2"),
strong("Validation failed: "),
values$validation_results$get_error_message()
)
)
}
})
# Security monitoring
observe({
# Monitor for suspicious activity patterns
if (!is.null(input$group_input) && !is.null(input$response_input)) {
# Check for potential injection attempts
suspicious_patterns <- c(
"<script",
"javascript:",
"eval\\s*\\(",
"exec\\s*\\(",
"system\\s*\\("
)
combined_input <- paste(input$group_input, input$response_input)
for (pattern in suspicious_patterns) {
if (grepl(pattern, combined_input, ignore.case = TRUE, perl = TRUE)) {
error_handler$handle_security_error(
security_event = "injection_attempt",
details = list(
pattern_detected = pattern,
input_source = "user_interface"
),
user_id = session$user,
session_id = session$token
)
# Terminate session for security
session$close()
return()
}
}
}
})
# Session cleanup on disconnect
session$onSessionEnded(function() {
# Log session end
validation_framework$audit_logger$log_event(
message = "User session ended",
level = "INFO",
category = "SESSION",
user_id = session$user,
session_id = session$token
)
# Clean up session data
security_manager$session_manager$invalidate_session(session$token)
})
# Return reactive values for testing and integration
return(values)
})
}Common Questions About Data Validation and Security
Implement multi-layered validation with progressive disclosure of errors. Start with client-side validation for immediate user feedback, but always validate on the server for security. Use clear, actionable error messages that guide users toward correct input without revealing system vulnerabilities. The ValidationFramework provided implements this pattern with user-friendly messages while maintaining comprehensive security checks behind the scenes.
Input sanitization removes or modifies potentially dangerous content (like HTML tags or SQL injection attempts), while input validation checks if data meets expected criteria (like data types, ranges, or business rules). Both are essential: sanitization prevents attacks, while validation ensures data quality. Always sanitize first to remove threats, then validate to ensure data meets your application’s requirements. The SecurityManager handles sanitization while ValidationFramework manages validation.
Create comprehensive audit trails that capture who did what, when, and from where. Log all user actions, system events, validation results, and security incidents with immutable timestamps and unique identifiers. The AuditLogger provided implements 21 CFR Part 11 compliant logging with encryption, retention management, and compliance reporting. Ensure logs cannot be modified post-creation and implement proper access controls for log viewing and analysis.
Prioritize input sanitization (prevent injection attacks), session management (prevent unauthorized access), rate limiting (prevent abuse), and comprehensive audit logging (regulatory compliance). For clinical data, implement encryption at rest and in transit, proper access controls, and data anonymization techniques. The SecurityManager framework provides these enterprise-grade protections specifically designed for statistical applications in regulated environments.
Implement graceful degradation where errors don’t crash the application but provide helpful guidance. Show users clear, actionable error messages while logging detailed technical information for developers. Use progressive disclosure - show simple messages initially with options to view more details if needed. The ErrorHandler framework demonstrates this approach with user-friendly messages, retry suggestions, and comprehensive background logging for troubleshooting.
Test Your Understanding
You’re implementing validation for a clinical trial data entry application. Users report that validation is too slow, but security requirements mandate comprehensive checking. Which validation strategy optimizes both performance and security?
- Move all validation to client-side JavaScript for speed
- Implement asynchronous server-side validation with immediate client-side feedback
- Cache validation results and skip re-validation for similar inputs
- Reduce validation checks to improve performance
- Consider the security implications of client-side only validation
- Think about how to provide immediate feedback while maintaining security
- Clinical applications cannot compromise on validation thoroughness
- Performance and security don’t have to be mutually exclusive
B) Implement asynchronous server-side validation with immediate client-side feedback
Strategic Implementation:
Client-Side Layer (Immediate Feedback):
// Immediate basic validation for user experience
function validateInputFormat(input) {
// Basic format checking only
// Range validation, required fields
// NOT security validation
return {
valid: basicChecks(input),
message: "Please check your input format"
};
}Server-Side Layer (Security & Compliance):
# Comprehensive validation with security
validation_result <- validation_framework$validate(
data = sanitized_data,
validator_name = "clinical_data",
context = list(regulatory_compliance = TRUE)
)Performance Optimization Techniques:
- Debounce validation to avoid excessive server calls
- Progressive validation - validate fields as completed
- Batch validation for multiple fields simultaneously
- Async processing to avoid blocking UI interactions
Security Maintenance:
- Server-side validation remains authoritative and cannot be bypassed
- Input sanitization prevents injection attacks before validation
- Audit logging captures all validation attempts for compliance
- Rate limiting prevents validation endpoint abuse
Clinical Compliance Benefits:
- Complete audit trail of all validation events
- Immutable server-side checks for regulatory review
- User experience optimization without compromising data integrity
- Error documentation supports validation evidence requirements
This approach satisfies both user experience demands and regulatory security requirements by layering fast client feedback over comprehensive server validation.
Your pharmaceutical Shiny application handles sensitive clinical trial data. You have limited development time to implement security measures. Which security implementation sequence provides the best protection against the most critical threats?
- Authentication → Input sanitization → Audit logging → Rate limiting
- Input sanitization → Audit logging → Authentication → Rate limiting
- Audit logging → Authentication → Input sanitization → Rate limiting
- Rate limiting → Authentication → Input sanitization → Audit logging
- Consider which threats can cause immediate and severe damage
- Think about the dependencies between security measures
- Pharmaceutical applications have specific regulatory requirements
- Some security measures enable effective implementation of others
B) Input sanitization → Audit logging → Authentication → Rate limiting
Threat-Based Prioritization Logic:
1. Input Sanitization (Immediate Critical Protection):
# Prevents catastrophic attacks that could compromise entire system
sanitized_input <- security_manager$sanitize_input(
user_input,
type = "clinical_data"
)
# Blocks: SQL injection, script injection, file system attacks2. Audit Logging (Regulatory Compliance Foundation):
# Required for regulatory compliance and incident investigation
audit_logger$log_event(
message = "Data access attempt",
category = "DATA_ACCESS",
details = list(sanitized_input_hash = hash(sanitized_input))
)
# Enables: Compliance evidence, threat detection, incident response3. Authentication (Access Control):
# Controls who can access the system
if (!authenticate_user(credentials)) {
audit_logger$log_security_event("unauthorized_access_attempt")
return(access_denied())
}
# Provides: User identity verification, access control4. Rate Limiting (Abuse Prevention):
# Prevents system abuse and DoS attacks
if (!check_rate_limit(user_id, "clinical_data_access")) {
audit_logger$log_event("rate_limit_exceeded")
return(rate_limited())
}
# Protects: System availability, prevents brute force attacksStrategic Reasoning:
- Input sanitization prevents immediate system compromise
- Audit logging provides regulatory compliance foundation and attack detection
- Authentication controls access but relies on sanitized inputs for security
- Rate limiting prevents abuse but requires authenticated context for effectiveness
This sequence protects against the most severe threats first while building the foundation for comprehensive security.
During a critical clinical trial analysis, your Shiny application encounters a statistical calculation error. The application must maintain data integrity, provide user guidance, ensure regulatory compliance, and enable quick recovery. Which error handling approach best addresses all these requirements?
- Display detailed error message and allow users to continue with partial results
- Log error details, show user-friendly message, preserve data state, and offer recovery options
- Restart the application automatically and lose current analysis state
- Show generic error message and require complete data re-entry
- Consider regulatory requirements for error documentation
- Think about user productivity and data preservation
- Clinical applications cannot afford data loss
- Error recovery should be guided and safe
B) Log error details, show user-friendly message, preserve data state, and offer recovery options
Comprehensive Error Response Implementation:
1. Detailed Logging (Regulatory Compliance):
error_handler$handle_system_error(
error = calculation_error,
context = list(
operation = "clinical_statistical_analysis",
data_hash = digest(analysis_data),
analysis_parameters = list(
test_type = "survival_analysis",
confidence_level = 0.95,
patient_count = nrow(data)
),
recovery_state = serialize_analysis_state()
),
user_id = session$user,
session_id = session$token
)2. User-Friendly Communication:
showNotification(
"Statistical calculation encountered an issue. Your data has been preserved. Please review the suggested actions below.",
type = "warning",
duration = NULL # Persistent until addressed
)3. Data State Preservation:
# Automatic state preservation
recovery_data <- list(
validated_data = values$validated_data,
analysis_parameters = get_analysis_parameters(),
partial_results = values$partial_results,
timestamp = Sys.time(),
error_context = error_context
)
# Secure storage for recovery
save_recovery_state(recovery_data, session$token)4. Guided Recovery Options:
# Recovery UI component
recovery_ui <- div(
class = "alert alert-info mt-3",
h5("Recovery Options:"),
actionButton("retry_calculation", "Retry Analysis", class = "btn-primary me-2"),
actionButton("modify_parameters", "Adjust Parameters", class = "btn-secondary me-2"),
actionButton("export_data", "Export Current Data", class = "btn-outline-primary"),
hr(),
p("If problems persist, contact support with Error ID: ", code(error_id))
)Clinical Trial Benefits:
- No data loss - analysis can continue without re-entering data
- Audit compliance - complete error documentation for regulatory review
- User productivity - clear recovery path without starting over
- Data integrity - preserved state ensures consistency
- Support enablement - detailed logs help technical teams resolve issues
This approach balances regulatory requirements, user experience, and system reliability for mission-critical clinical applications.
Conclusion
Comprehensive data validation and security implementation transforms statistical applications from functional tools into enterprise-grade systems that protect sensitive data, ensure regulatory compliance, and maintain user trust. Through multi-layered validation frameworks, robust security protocols, and sophisticated error handling, you’ve learned to create applications that meet the rigorous standards required for pharmaceutical and clinical research environments.
The validation and security patterns implemented in this tutorial directly support career advancement by demonstrating advanced understanding of enterprise software engineering, regulatory compliance, and professional security practices. These skills distinguish developers who can create production-ready applications from those limited to prototype development.
Your mastery of input sanitization, audit logging, error boundaries, and recovery mechanisms creates applications that maintain data integrity and user productivity even under adverse conditions, ensuring that critical statistical insights remain accessible when stakeholders need them most.
Next Steps
Based on your comprehensive validation and security foundation, here are the recommended paths for continuing your enterprise development journey:
Immediate Next Steps (Complete These First)
- Statistical Rigor and Assumption Testing - Apply validation frameworks to ensure statistical method reliability
- Professional Statistical Visualizations - Secure and validate advanced plotting capabilities
- Practice Exercise: Implement the complete validation framework in your t-test application with security monitoring and audit logging
Building on Your Security Foundation (Choose Your Path)
For Quality Assurance Excellence:
For Production Readiness:
For Regulatory Excellence:
Long-term Goals (3-4 Months)
- Establish organizational security standards for statistical applications
- Create validation frameworks that support regulatory submission requirements
- Build security monitoring and incident response procedures for production applications
- Develop comprehensive audit systems that meet pharmaceutical compliance standards
Explore More Enterprise Development
Ready to enhance your validated and secured application with statistical excellence? Continue with comprehensive statistical rigor and assumption testing.
Reuse
Citation
@online{kassambara2025,
author = {Kassambara, Alboukadel},
title = {Bulletproof {Data} {Validation:} {Enterprise} {Security} for
{Statistical} {Applications}},
date = {2025-05-23},
url = {https://www.datanovia.com/learn/tools/shiny-apps/enterprise-development/data-validation-security.html},
langid = {en}
}
