flowchart TD A[User Input] --> B[Client-Side Validation] B --> C[Transport Security] C --> D[Server-Side Validation] D --> E[Business Logic Validation] E --> F[Database Validation] F --> G[Output Validation] G --> H[Audit Logging] B --> B1[Format checking] B --> B2[Range validation] B --> B3[Required fields] D --> D1[Input sanitization] D --> D2[Type checking] D --> D3[Size limits] D --> D4[Injection prevention] E --> E1[Statistical validity] E --> E2[Domain constraints] E --> E3[Cross-field validation] E --> E4[Business rules] F --> F1[Data type constraints] F --> F2[Referential integrity] F --> F3[Unique constraints] G --> G1[Output sanitization] G --> G2[Format validation] G --> G3[Content filtering] H --> H1[Access logging] H --> H2[Change tracking] H --> H3[Error recording] H --> H4[Compliance documentation] style A fill:#ffebee style H fill:#e8f5e8 style D fill:#fff3e0 style E fill:#f3e5f5
Key Takeaways
- Comprehensive Validation: Implement multi-layered validation systems that prevent data corruption, injection attacks, and user errors at input, processing, and output stages
- Security Protocols: Master enterprise security standards including input sanitization, session management, and audit logging required for regulated clinical research environments
- Error Boundaries: Create robust error handling systems that gracefully manage failures while maintaining application stability and user trust
- Regulatory Compliance: Build validation frameworks that support 21 CFR Part 11 compliance, audit trails, and documentation requirements for pharmaceutical applications
- Performance Optimization: Implement efficient validation strategies that maintain security without compromising application responsiveness or user experience
Introduction
Data validation and security form the foundation of enterprise statistical applications, particularly in regulated industries where data integrity directly impacts patient safety, regulatory compliance, and business operations. Unlike basic applications that rely on simple input checking, enterprise systems require comprehensive validation frameworks that prevent corruption, detect anomalies, and maintain audit trails for regulatory review.
This tutorial transforms our professionally designed t-test application into a secure, validated system that meets pharmaceutical and clinical research standards. You’ll learn to implement multi-layered validation, comprehensive error handling, security protocols, and audit logging systems that protect sensitive data while maintaining user productivity and regulatory compliance.
The validation and security patterns you’ll master apply universally to statistical applications while addressing specific requirements for biostatistics, clinical trials, and healthcare data where validation failures can have serious consequences for research integrity and patient safety.
Understanding Enterprise Validation Requirements
Multi-Layered Validation Architecture
Enterprise applications require comprehensive validation at multiple levels to ensure data integrity and security:
Critical Validation Layers:
Client-Side Validation (User Experience):
Immediate feedback for users to prevent basic errors and improve usability, while understanding that client-side validation alone is insufficient for security.
Server-Side Validation (Security Foundation):
Comprehensive validation on the server that cannot be bypassed, including input sanitization, type checking, and injection attack prevention.
Business Logic Validation (Domain Integrity):
Statistical and domain-specific validation that ensures data meets analytical requirements and business rules.
Audit and Compliance (Regulatory Requirements):
Comprehensive logging and documentation that supports regulatory validation and audit trail requirements.
Security Threat Model for Statistical Applications
Understanding common security threats helps design appropriate protection strategies:
Data Injection Attacks:
- SQL injection through database queries
- Script injection through user inputs
- File path manipulation through upload features
- Cross-site scripting (XSS) through output display
Data Integrity Threats:
- Unauthorized data modification
- Concurrent access conflicts
- Data corruption during processing
- Incomplete transaction processing
Privacy and Confidentiality Risks:
- Unauthorized data access
- Session hijacking and impersonation
- Data leakage through error messages
- Insufficient access controls
Compliance Violations:
- Inadequate audit trails
- Missing change documentation
- Insufficient validation evidence
- Non-compliant data handling
Comprehensive Input Validation Framework
Core Validation Infrastructure
Create a robust validation system that handles all input types with enterprise-grade security:
# File: R/validation_framework.R
#' Enterprise Validation Framework
#'
#' @description Comprehensive validation system for enterprise statistical
#' applications with security, audit logging, and regulatory compliance support.
#'
#' @details This framework provides multi-layered validation including input
#' sanitization, type checking, business rule validation, and audit logging.
#'
#' @export
<- R6::R6Class(
ValidationFramework "ValidationFramework",
public = list(
#' @field validators list of registered validators
validators = NULL,
#' @field audit_logger audit logging instance
audit_logger = NULL,
#' @field security_config security configuration settings
security_config = NULL,
#' Initialize validation framework
#'
#' @param config list. Configuration settings for validation and security
#'
initialize = function(config = list()) {
$validators <- list()
self$security_config <- private$default_security_config(config)
self$audit_logger <- AuditLogger$new(self$security_config$audit)
self
# Register default validators
$register_default_validators()
private
$audit_logger$log_event("ValidationFramework initialized",
selflevel = "INFO", category = "SYSTEM")
},
#' Validate input data with comprehensive checking
#'
#' @param data any. Data to validate
#' @param validator_name character. Name of validator to use
#' @param context list. Additional context for validation
#'
#' @return ValidationResult object with validation outcome
#'
validate = function(data, validator_name, context = list()) {
<- Sys.time()
start_time
tryCatch({
# Input sanitization
<- private$sanitize_input(data)
sanitized_data
# Get validator
<- self$validators[[validator_name]]
validator if (is.null(validator)) {
stop(paste("Validator not found:", validator_name))
}
# Perform validation
<- validator$validate(sanitized_data, context)
result
# Log validation attempt
$audit_logger$log_validation(
selfvalidator_name = validator_name,
input_hash = private$hash_input(data),
result = result,
duration = as.numeric(Sys.time() - start_time, units = "secs"),
context = context
)
return(result)
error = function(e) {
},
# Log validation error
$audit_logger$log_event(
selfpaste("Validation error:", e$message),
level = "ERROR",
category = "VALIDATION",
details = list(
validator_name = validator_name,
error = e$message,
context = context
)
)
# Return error result
$new(
ValidationResultvalid = FALSE,
errors = list(paste("Validation failed:", e$message)),
warnings = list(),
data = NULL
)
})
},
#' Register custom validator
#'
#' @param name character. Validator name
#' @param validator Validator. Validator instance
#'
register_validator = function(name, validator) {
$validators[[name]] <- validator
self$audit_logger$log_event(
selfpaste("Validator registered:", name),
level = "INFO",
category = "CONFIGURATION"
)
}
),
private = list(
#' Sanitize input data
#'
#' @param data any. Input data to sanitize
#'
#' @return Sanitized data
#'
sanitize_input = function(data) {
if (is.character(data)) {
# Remove potentially dangerous characters
<- gsub("[<>\"'&]", "", data)
data
# Limit length to prevent buffer overflow
<- self$security_config$max_string_length
max_length if (nchar(data) > max_length) {
<- substr(data, 1, max_length)
data
}
# Remove control characters
<- gsub("[[:cntrl:]]", "", data)
data
}
if (is.numeric(data)) {
# Check for infinite or NaN values
if (any(is.infinite(data)) || any(is.nan(data))) {
stop("Invalid numeric values detected (Inf/NaN)")
}
# Check range limits
<- self$security_config$max_numeric_value
max_value if (any(abs(data) > max_value, na.rm = TRUE)) {
stop("Numeric values exceed allowed range")
}
}
return(data)
},
#' Generate hash of input for audit trail
#'
#' @param data any. Input data
#'
#' @return character. Hash of input data
#'
hash_input = function(data) {
::digest(data, algo = "sha256")
digest
},
#' Default security configuration
#'
#' @param config list. User-provided configuration
#'
#' @return list. Complete security configuration
#'
default_security_config = function(config) {
<- list(
default_config max_string_length = 10000,
max_numeric_value = 1e10,
max_file_size = 50 * 1024 * 1024, # 50MB
allowed_file_types = c("csv", "txt", "xlsx"),
session_timeout = 3600, # 1 hour
max_login_attempts = 5,
audit = list(
enabled = TRUE,
log_level = "INFO",
log_file = "audit.log",
retention_days = 365
)
)
# Merge with user configuration
modifyList(default_config, config)
},
#' Register default validators
#'
register_default_validators = function() {
# Numeric data validator
$register_validator(
self"numeric_data",
$new()
NumericDataValidator
)
# Text data validator
$register_validator(
self"text_data",
$new()
TextDataValidator
)
# File upload validator
$register_validator(
self"file_upload",
$new()
FileUploadValidator
)
# Statistical data validator
$register_validator(
self"statistical_data",
$new()
StatisticalDataValidator
)
# T-test specific validator
$register_validator(
self"ttest_data",
$new()
TTestDataValidator
)
}
)
)
#' Validation Result Class
#'
#' @description Container for validation results with comprehensive information
#' about validation outcome, errors, warnings, and processed data.
#'
#' @export
<- R6::R6Class(
ValidationResult "ValidationResult",
public = list(
#' @field valid logical. Whether validation passed
valid = NULL,
#' @field errors list. Validation errors
errors = NULL,
#' @field warnings list. Validation warnings
warnings = NULL,
#' @field data any. Validated and processed data
data = NULL,
#' @field metadata list. Additional validation metadata
metadata = NULL,
#' Initialize validation result
#'
#' @param valid logical. Validation success status
#' @param errors list. List of error messages
#' @param warnings list. List of warning messages
#' @param data any. Validated data
#' @param metadata list. Additional metadata
#'
initialize = function(valid, errors = list(), warnings = list(),
data = NULL, metadata = list()) {
$valid <- valid
self$errors <- errors
self$warnings <- warnings
self$data <- data
self$metadata <- modifyList(
selflist(timestamp = Sys.time(),
validation_id = uuid::UUIDgenerate()),
metadata
)
},
#' Get formatted error message
#'
#' @return character. Formatted error message
#'
get_error_message = function() {
if (length(self$errors) == 0) {
return("No errors")
}
paste(self$errors, collapse = "; ")
},
#' Get formatted warning message
#'
#' @return character. Formatted warning message
#'
get_warning_message = function() {
if (length(self$warnings) == 0) {
return("No warnings")
}
paste(self$warnings, collapse = "; ")
},
#' Check if result has errors
#'
#' @return logical. TRUE if errors exist
#'
has_errors = function() {
length(self$errors) > 0
},
#' Check if result has warnings
#'
#' @return logical. TRUE if warnings exist
#'
has_warnings = function() {
length(self$warnings) > 0
}
) )
Specialized Validators
Create domain-specific validators for statistical applications:
# File: R/validators_statistical.R
#' Statistical Data Validator
#'
#' @description Comprehensive validator for statistical data with checks for
#' data quality, statistical assumptions, and analytical requirements.
#'
#' @export
<- R6::R6Class(
StatisticalDataValidator "StatisticalDataValidator",
public = list(
#' Validate statistical data
#'
#' @param data data.frame or numeric vector. Data to validate
#' @param context list. Validation context and requirements
#'
#' @return ValidationResult. Validation outcome
#'
validate = function(data, context = list()) {
<- list()
errors <- list()
warnings <- list()
metadata
# Basic data structure validation
<- private$validate_data_structure(data)
structure_result <- c(errors, structure_result$errors)
errors <- c(warnings, structure_result$warnings)
warnings
# Statistical quality checks
<- private$validate_data_quality(data)
quality_result <- c(errors, quality_result$errors)
errors <- c(warnings, quality_result$warnings)
warnings
# Sample size validation
<- private$validate_sample_size(data, context)
sample_result <- c(errors, sample_result$errors)
errors <- c(warnings, sample_result$warnings)
warnings
# Missing data validation
<- private$validate_missing_data(data, context)
missing_result <- c(errors, missing_result$errors)
errors <- c(warnings, missing_result$warnings)
warnings
# Outlier detection
<- private$detect_outliers(data)
outlier_result <- c(warnings, outlier_result$warnings)
warnings $outliers <- outlier_result$metadata
metadata
# Compile validation result
$new(
ValidationResultvalid = length(errors) == 0,
errors = errors,
warnings = warnings,
data = if (length(errors) == 0) data else NULL,
metadata = metadata
)
}
),
private = list(
#' Validate data structure
#'
#' @param data any. Input data
#'
#' @return list. Validation results
#'
validate_data_structure = function(data) {
<- list()
errors <- list()
warnings
# Check if data exists
if (is.null(data)) {
<- append(errors, "Data is null")
errors return(list(errors = errors, warnings = warnings))
}
# Check data type
if (!is.data.frame(data) && !is.numeric(data)) {
<- append(errors, "Data must be a data.frame or numeric vector")
errors
}
# Check for empty data
if (is.data.frame(data)) {
if (nrow(data) == 0) {
<- append(errors, "Data frame is empty")
errors
}if (ncol(data) == 0) {
<- append(errors, "Data frame has no columns")
errors
}else if (is.numeric(data)) {
} if (length(data) == 0) {
<- append(errors, "Numeric vector is empty")
errors
}
}
list(errors = errors, warnings = warnings)
},
#' Validate data quality
#'
#' @param data data.frame or numeric. Data to check
#'
#' @return list. Quality validation results
#'
validate_data_quality = function(data) {
<- list()
errors <- list()
warnings
if (is.data.frame(data)) {
# Check for numeric columns
<- sapply(data, is.numeric)
numeric_cols if (!any(numeric_cols)) {
<- append(errors, "Data frame contains no numeric columns")
errors
}
# Check for constant columns
<- sapply(data[numeric_cols], function(x) {
constant_cols if (length(unique(x[!is.na(x)])) <= 1) TRUE else FALSE
})
if (any(constant_cols)) {
<- names(data)[numeric_cols][constant_cols]
const_names <- append(warnings,
warnings paste("Constant columns detected:",
paste(const_names, collapse = ", ")))
}
else if (is.numeric(data)) {
} # Check for infinite values
if (any(is.infinite(data))) {
<- append(errors, "Data contains infinite values")
errors
}
# Check for all identical values
if (length(unique(data[!is.na(data)])) <= 1) {
<- append(warnings, "All non-missing values are identical")
warnings
}
}
list(errors = errors, warnings = warnings)
},
#' Validate sample size requirements
#'
#' @param data data.frame or numeric. Data to check
#' @param context list. Validation context
#'
#' @return list. Sample size validation results
#'
validate_sample_size = function(data, context) {
<- list()
errors <- list()
warnings
<- context$min_sample_size %||% 3
min_sample_size <- context$recommended_sample_size %||% 30
recommended_size
if (is.data.frame(data)) {
<- nrow(data)
n else {
} <- length(data)
n
}
if (n < min_sample_size) {
<- append(errors,
errors paste("Sample size", n, "is below minimum required:",
min_sample_size))else if (n < recommended_size) {
} <- append(warnings,
warnings paste("Sample size", n, "is below recommended size:",
"for reliable statistical inference"))
recommended_size,
}
list(errors = errors, warnings = warnings)
},
#' Validate missing data patterns
#'
#' @param data data.frame or numeric. Data to check
#' @param context list. Validation context
#'
#' @return list. Missing data validation results
#'
validate_missing_data = function(data, context) {
<- list()
errors <- list()
warnings
<- context$max_missing_proportion %||% 0.5
max_missing_prop
if (is.data.frame(data)) {
<- sapply(data, function(x) sum(is.na(x)) / length(x))
missing_props
if (any(missing_props > max_missing_prop)) {
<- names(missing_props)[missing_props > max_missing_prop]
high_missing <- append(errors,
errors paste("High missing data proportion in columns:",
paste(high_missing, collapse = ", ")))
}
if (any(missing_props > 0.1)) {
<- names(missing_props)[missing_props > 0.1]
some_missing <- append(warnings,
warnings paste("Notable missing data in columns:",
paste(some_missing, collapse = ", ")))
}
else if (is.numeric(data)) {
} <- sum(is.na(data)) / length(data)
missing_prop
if (missing_prop > max_missing_prop) {
<- append(errors,
errors paste("High missing data proportion:",
round(missing_prop * 100, 1), "%"))
else if (missing_prop > 0.1) {
} <- append(warnings,
warnings paste("Notable missing data:",
round(missing_prop * 100, 1), "%"))
}
}
list(errors = errors, warnings = warnings)
},
#' Detect statistical outliers
#'
#' @param data data.frame or numeric. Data to check
#'
#' @return list. Outlier detection results
#'
detect_outliers = function(data) {
<- list()
warnings <- list()
metadata
if (is.data.frame(data)) {
<- sapply(data, is.numeric)
numeric_cols <- list()
outlier_info
for (col_name in names(data)[numeric_cols]) {
<- data[[col_name]]
col_data <- private$detect_univariate_outliers(col_data)
outliers
if (outliers$count > 0) {
<- outliers
outlier_info[[col_name]] <- append(warnings,
warnings paste("Potential outliers detected in", col_name,
":", outliers$count, "observations"))
}
}
$outlier_details <- outlier_info
metadata
else if (is.numeric(data)) {
} <- private$detect_univariate_outliers(data)
outliers
if (outliers$count > 0) {
<- append(warnings,
warnings paste("Potential outliers detected:",
$count, "observations"))
outliers$outlier_details <- outliers
metadata
}
}
list(warnings = warnings, metadata = metadata)
},
#' Detect univariate outliers using IQR method
#'
#' @param x numeric. Numeric vector
#'
#' @return list. Outlier information
#'
detect_univariate_outliers = function(x) {
if (length(x) < 4 || all(is.na(x))) {
return(list(count = 0, indices = integer(0), method = "insufficient_data"))
}
# Remove missing values for calculation
<- x[!is.na(x)]
x_clean
# Calculate IQR-based outlier bounds
<- quantile(x_clean, 0.25, na.rm = TRUE)
q1 <- quantile(x_clean, 0.75, na.rm = TRUE)
q3 <- q3 - q1
iqr
<- q1 - 1.5 * iqr
lower_bound <- q3 + 1.5 * iqr
upper_bound
# Find outliers
<- x < lower_bound | x > upper_bound
outlier_logical <- which(outlier_logical)
outlier_indices
list(
count = length(outlier_indices),
indices = outlier_indices,
method = "IQR",
bounds = c(lower_bound, upper_bound),
values = x[outlier_indices]
)
}
)
)
#' T-Test Specific Data Validator
#'
#' @description Specialized validator for independent samples t-test data
#' with checks for grouping variables, sample sizes, and statistical assumptions.
#'
#' @export
<- R6::R6Class(
TTestDataValidator "TTestDataValidator",
inherit = StatisticalDataValidator,
public = list(
#' Validate t-test specific requirements
#'
#' @param data data.frame. Data with group and response variables
#' @param context list. Validation context
#'
#' @return ValidationResult. Validation outcome
#'
validate = function(data, context = list()) {
# Call parent validation first
<- super$validate(data, context)
parent_result
if (!parent_result$valid) {
return(parent_result)
}
<- parent_result$errors
errors <- parent_result$warnings
warnings <- parent_result$metadata
metadata
# T-test specific validations
<- private$validate_ttest_structure(data, context)
ttest_result <- c(errors, ttest_result$errors)
errors <- c(warnings, ttest_result$warnings)
warnings <- modifyList(metadata, ttest_result$metadata)
metadata
# Group-specific validations
if (length(errors) == 0) {
<- private$validate_group_requirements(data, context)
group_result <- c(errors, group_result$errors)
errors <- c(warnings, group_result$warnings)
warnings <- modifyList(metadata, group_result$metadata)
metadata
}
$new(
ValidationResultvalid = length(errors) == 0,
errors = errors,
warnings = warnings,
data = if (length(errors) == 0) data else NULL,
metadata = metadata
)
}
),
private = list(
#' Validate t-test data structure
#'
#' @param data data.frame. Input data
#' @param context list. Validation context
#'
#' @return list. Validation results
#'
validate_ttest_structure = function(data, context) {
<- list()
errors <- list()
warnings <- list()
metadata
# Check for required columns
<- c("group", "response")
required_cols <- setdiff(required_cols, names(data))
missing_cols
if (length(missing_cols) > 0) {
<- append(errors,
errors paste("Missing required columns:",
paste(missing_cols, collapse = ", ")))
return(list(errors = errors, warnings = warnings, metadata = metadata))
}
# Validate response variable
if (!is.numeric(data$response)) {
<- append(errors, "Response variable must be numeric")
errors
}
# Validate group variable
<- unique(data$group[!is.na(data$group)])
unique_groups
if (length(unique_groups) != 2) {
<- append(errors,
errors paste("Group variable must have exactly 2 levels, found:",
length(unique_groups)))
}
$group_levels <- unique_groups
metadata$total_observations <- nrow(data)
metadata
list(errors = errors, warnings = warnings, metadata = metadata)
},
#' Validate group-specific requirements
#'
#' @param data data.frame. Input data
#' @param context list. Validation context
#'
#' @return list. Validation results
#'
validate_group_requirements = function(data, context) {
<- list()
errors <- list()
warnings <- list()
metadata
<- context$min_group_size %||% 3
min_group_size <- context$recommended_group_size %||% 15
recommended_group_size
# Calculate group sizes
<- table(data$group)
group_sizes $group_sizes <- as.list(group_sizes)
metadata
# Check minimum group sizes
# Warning for recommended group sizes
<- group_sizes < recommended_group_size
small_recommended if (any(small_recommended)) {
<- names(group_sizes)[small_recommended]
small_rec_names <- append(warnings,
warnings paste("Groups below recommended sample size:",
paste(small_rec_names, collapse = ", "),
"(recommended:", recommended_group_size, ")"))
}
# Check for balanced design
<- max(group_sizes) / min(group_sizes)
size_ratio if (size_ratio > 3) {
<- append(warnings,
warnings paste("Unbalanced group sizes detected (ratio:",
round(size_ratio, 2), "). Consider balanced design for optimal power."))
}
# Validate within-group data quality
for (group_name in names(group_sizes)) {
<- data$response[data$group == group_name]
group_data
# Check for constant values within group
if (length(unique(group_data[!is.na(group_data)])) <= 1) {
<- append(errors,
errors paste("Group", group_name, "has constant values"))
}
# Check for excessive missing data within group
<- sum(is.na(group_data)) / length(group_data)
missing_prop if (missing_prop > 0.3) {
<- append(warnings,
warnings paste("High missing data in group", group_name,
":", round(missing_prop * 100, 1), "%"))
}
}
list(errors = errors, warnings = warnings, metadata = metadata)
}
) )
Security Protocol Implementation
Comprehensive Security Framework
Implement enterprise-grade security measures for statistical applications:
# File: R/security_framework.R
#' Enterprise Security Manager
#'
#' @description Comprehensive security framework for Shiny applications with
#' session management, access control, and threat protection.
#'
#' @export
<- R6::R6Class(
SecurityManager "SecurityManager",
public = list(
#' @field config security configuration
config = NULL,
#' @field audit_logger audit logging instance
audit_logger = NULL,
#' @field session_manager session management instance
session_manager = NULL,
#' Initialize security manager
#'
#' @param config list. Security configuration
#'
initialize = function(config = list()) {
$config <- private$default_security_config(config)
self$audit_logger <- AuditLogger$new(self$config$audit)
self$session_manager <- SessionManager$new(self$config$session)
self
$audit_logger$log_event("SecurityManager initialized",
selflevel = "INFO", category = "SECURITY")
},
#' Sanitize user input to prevent injection attacks
#'
#' @param input any. User input to sanitize
#' @param type character. Type of input (text, numeric, file, etc.)
#'
#' @return Sanitized input or error
#'
sanitize_input = function(input, type = "text") {
tryCatch({
if (is.null(input) || length(input) == 0) {
return(input)
}
switch(type,
"text" = private$sanitize_text_input(input),
"numeric" = private$sanitize_numeric_input(input),
"file" = private$sanitize_file_input(input),
"sql" = private$sanitize_sql_input(input),
{$audit_logger$log_event(
selfpaste("Unknown input type:", type),
level = "WARNING", category = "SECURITY"
)$sanitize_text_input(input)
private
}
)
error = function(e) {
}, $audit_logger$log_event(
selfpaste("Input sanitization failed:", e$message),
level = "ERROR", category = "SECURITY"
)stop("Input sanitization failed")
})
},
#' Validate session security
#'
#' @param session shiny session object
#'
#' @return logical. TRUE if session is valid and secure
#'
validate_session = function(session) {
<- session$token
session_id
# Check session existence and validity
if (!self$session_manager$is_valid_session(session_id)) {
$audit_logger$log_event(
self"Invalid session access attempt",
level = "WARNING",
category = "SECURITY",
details = list(session_id = session_id)
)return(FALSE)
}
# Check session timeout
if (self$session_manager$is_session_expired(session_id)) {
$audit_logger$log_event(
self"Expired session access attempt",
level = "INFO",
category = "SECURITY",
details = list(session_id = session_id)
)$session_manager$invalidate_session(session_id)
selfreturn(FALSE)
}
# Update session activity
$session_manager$update_session_activity(session_id)
self
return(TRUE)
},
#' Check rate limiting for requests
#'
#' @param identifier character. User/session identifier
#' @param action character. Action being performed
#'
#' @return logical. TRUE if request is allowed
#'
check_rate_limit = function(identifier, action = "general") {
<- Sys.time()
current_time
# Get rate limit configuration for action
<- self$config$rate_limits[[action]] %||%
rate_config $config$rate_limits$default
self
# Check if rate limit exceeded
if (private$is_rate_limited(identifier, action, current_time, rate_config)) {
$audit_logger$log_event(
selfpaste("Rate limit exceeded for", action),
level = "WARNING",
category = "SECURITY",
details = list(
identifier = identifier,
action = action,
limit = rate_config
)
)
return(FALSE)
}
# Record request
$record_request(identifier, action, current_time)
private
return(TRUE)
},
#' Generate secure random token
#'
#' @param length integer. Token length
#'
#' @return character. Secure random token
#'
generate_secure_token = function(length = 32) {
# Use cryptographically secure random number generation
<- openssl::rand_bytes(length)
raw_bytes ::base64_encode(raw_bytes)
openssl
}
),
private = list(
#' Default security configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete security configuration
#'
default_security_config = function(config) {
<- list(
default_config input_limits = list(
max_string_length = 10000,
max_numeric_value = 1e10,
max_file_size = 50 * 1024 * 1024, # 50MB
allowed_file_types = c("csv", "txt", "xlsx", "rds")
),
session = list(
timeout_minutes = 60,
max_concurrent_sessions = 100,
require_csrf_token = TRUE
),
rate_limits = list(
default = list(requests = 100, window_minutes = 15),
file_upload = list(requests = 10, window_minutes = 60),
analysis = list(requests = 50, window_minutes = 10)
),
encryption = list(
algorithm = "AES-256-GCM",
key_rotation_days = 90
),
audit = list(
enabled = TRUE,
log_level = "INFO",
retention_days = 365,
encrypt_logs = TRUE
)
)
modifyList(default_config, config)
},
#' Sanitize text input
#'
#' @param input character. Text input
#'
#' @return character. Sanitized text
#'
sanitize_text_input = function(input) {
if (!is.character(input)) {
<- as.character(input)
input
}
# Remove HTML tags and potentially dangerous characters
<- gsub("<[^>]*>", "", input)
input <- gsub("[<>\"'&]", "", input)
input
# Remove script-related content
<- gsub("(?i)javascript:", "", input, perl = TRUE)
input <- gsub("(?i)<script", "", input, perl = TRUE)
input <- gsub("(?i)on\\w+\\s*=", "", input, perl = TRUE)
input
# Limit length
<- self$config$input_limits$max_string_length
max_length if (nchar(input) > max_length) {
<- substr(input, 1, max_length)
input
}
# Remove control characters except tabs and newlines
<- gsub("[[:cntrl:]&&[^\t\n]]", "", input)
input
return(input)
},
#' Sanitize numeric input
#'
#' @param input numeric. Numeric input
#'
#' @return numeric. Validated numeric input
#'
sanitize_numeric_input = function(input) {
if (!is.numeric(input)) {
# Try to convert to numeric
<- suppressWarnings(as.numeric(input))
input if (any(is.na(input))) {
stop("Invalid numeric input")
}
}
# Check for infinite or NaN values
if (any(is.infinite(input)) || any(is.nan(input))) {
stop("Invalid numeric values (Inf/NaN) detected")
}
# Check value limits
<- self$config$input_limits$max_numeric_value
max_value if (any(abs(input) > max_value, na.rm = TRUE)) {
stop("Numeric values exceed allowed range")
}
return(input)
},
#' Sanitize file input
#'
#' @param file_info list. File information from Shiny
#'
#' @return list. Validated file information
#'
sanitize_file_input = function(file_info) {
if (is.null(file_info) || is.null(file_info$datapath)) {
stop("Invalid file input")
}
# Check file size
<- file.info(file_info$datapath)$size
file_size <- self$config$input_limits$max_file_size
max_size
if (file_size > max_size) {
stop(paste("File size", file_size, "exceeds maximum allowed:", max_size))
}
# Check file extension
<- tolower(tools::file_ext(file_info$name))
file_ext <- self$config$input_limits$allowed_file_types
allowed_types
if (!file_ext %in% allowed_types) {
stop(paste("File type", file_ext, "not allowed"))
}
# Sanitize filename
<- gsub("[^a-zA-Z0-9._-]", "", file_info$name)
safe_name $name <- safe_name
file_info
return(file_info)
},
#' Sanitize SQL input to prevent injection
#'
#' @param input character. SQL input
#'
#' @return character. Sanitized SQL
#'
sanitize_sql_input = function(input) {
# Remove dangerous SQL keywords and characters
<- c(
dangerous_patterns "(?i)\\b(DROP|DELETE|UPDATE|INSERT|ALTER|CREATE|EXEC|EXECUTE)\\b",
"(?i)\\b(UNION|SELECT.*FROM)\\b",
"[';\"\\\\]",
"--",
"/\\*",
"\\*/"
)
for (pattern in dangerous_patterns) {
<- gsub(pattern, "", input, perl = TRUE)
input
}
return(input)
},
#' Check if rate limit is exceeded
#'
#' @param identifier character. Request identifier
#' @param action character. Action type
#' @param current_time POSIXct. Current timestamp
#' @param rate_config list. Rate limit configuration
#'
#' @return logical. TRUE if rate limited
#'
is_rate_limited = function(identifier, action, current_time, rate_config) {
# Get request history for this identifier and action
<- paste(identifier, action, sep = "_")
request_key
# This would typically use Redis or similar in production
# For demo purposes, using environment variable storage
<- get(request_key, envir = private$.request_cache,
request_history inherits = FALSE)
if (is.null(request_history)) {
return(FALSE)
}
# Filter requests within the time window
<- current_time - (rate_config$window_minutes * 60)
window_start <- request_history[request_history >= window_start]
recent_requests
# Check if limit exceeded
return(length(recent_requests) >= rate_config$requests)
},
#' Record request for rate limiting
#'
#' @param identifier character. Request identifier
#' @param action character. Action type
#' @param timestamp POSIXct. Request timestamp
#'
record_request = function(identifier, action, timestamp) {
<- paste(identifier, action, sep = "_")
request_key
# Get existing history
<- get(request_key, envir = private$.request_cache,
existing_history inherits = FALSE)
if (is.null(existing_history)) {
<- c()
existing_history
}
# Add new request
<- c(existing_history, timestamp)
updated_history
# Keep only recent requests (last 24 hours)
<- timestamp - (24 * 3600)
cutoff_time <- updated_history[updated_history >= cutoff_time]
updated_history
# Store updated history
assign(request_key, updated_history, envir = private$.request_cache)
},
#' Request cache environment
.request_cache = new.env(parent = emptyenv())
)
)
#' Session Manager for Security
#'
#' @description Manages user sessions with security features including
#' timeout handling, concurrent session limits, and session validation.
#'
#' @export
<- R6::R6Class(
SessionManager "SessionManager",
public = list(
#' @field config session configuration
config = NULL,
#' @field sessions session storage
sessions = NULL,
#' Initialize session manager
#'
#' @param config list. Session configuration
#'
initialize = function(config = list()) {
$config <- config
self$sessions <- new.env(parent = emptyenv())
self
},
#' Create new session
#'
#' @param session_id character. Session identifier
#' @param user_info list. User information
#'
#' @return logical. TRUE if session created successfully
#'
create_session = function(session_id, user_info = list()) {
# Check concurrent session limits
if (private$count_active_sessions() >= self$config$max_concurrent_sessions) {
return(FALSE)
}
<- list(
session_data id = session_id,
created = Sys.time(),
last_activity = Sys.time(),
user_info = user_info,
csrf_token = private$generate_csrf_token()
)
assign(session_id, session_data, envir = self$sessions)
return(TRUE)
},
#' Check if session is valid
#'
#' @param session_id character. Session identifier
#'
#' @return logical. TRUE if session is valid
#'
is_valid_session = function(session_id) {
exists(session_id, envir = self$sessions)
},
#' Check if session is expired
#'
#' @param session_id character. Session identifier
#'
#' @return logical. TRUE if session is expired
#'
is_session_expired = function(session_id) {
if (!self$is_valid_session(session_id)) {
return(TRUE)
}
<- get(session_id, envir = self$sessions)
session_data <- self$config$timeout_minutes * 60
timeout_seconds
Sys.time() - session_data$last_activity) > timeout_seconds
(
},
#' Update session activity
#'
#' @param session_id character. Session identifier
#'
update_session_activity = function(session_id) {
if (self$is_valid_session(session_id)) {
<- get(session_id, envir = self$sessions)
session_data $last_activity <- Sys.time()
session_dataassign(session_id, session_data, envir = self$sessions)
}
},
#' Invalidate session
#'
#' @param session_id character. Session identifier
#'
invalidate_session = function(session_id) {
if (exists(session_id, envir = self$sessions)) {
rm(list = session_id, envir = self$sessions)
}
},
#' Get CSRF token for session
#'
#' @param session_id character. Session identifier
#'
#' @return character. CSRF token or NULL
#'
get_csrf_token = function(session_id) {
if (self$is_valid_session(session_id)) {
<- get(session_id, envir = self$sessions)
session_data return(session_data$csrf_token)
}
return(NULL)
}
),
private = list(
#' Count active sessions
#'
#' @return integer. Number of active sessions
#'
count_active_sessions = function() {
length(ls(envir = self$sessions))
},
#' Generate CSRF token
#'
#' @return character. CSRF token
#'
generate_csrf_token = function() {
::base64_encode(openssl::rand_bytes(32))
openssl
}
) )
Audit Logging and Compliance
Comprehensive Audit System
Implement enterprise-grade audit logging for regulatory compliance:
# File: R/audit_logging.R
#' Enterprise Audit Logger
#'
#' @description Comprehensive audit logging system for regulatory compliance
#' with encrypted storage, retention management, and compliance reporting.
#'
#' @export
<- R6::R6Class(
AuditLogger "AuditLogger",
public = list(
#' @field config audit configuration
config = NULL,
#' @field log_file path to log file
log_file = NULL,
#' Initialize audit logger
#'
#' @param config list. Audit configuration
#'
initialize = function(config = list()) {
$config <- private$default_audit_config(config)
self$log_file <- self$config$log_file
self
# Ensure log directory exists
<- dirname(self$log_file)
log_dir if (!dir.exists(log_dir)) {
dir.create(log_dir, recursive = TRUE)
}
# Initialize log file with header
$initialize_log_file()
private
},
#' Log general event
#'
#' @param message character. Log message
#' @param level character. Log level (DEBUG, INFO, WARNING, ERROR)
#' @param category character. Event category
#' @param details list. Additional event details
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
log_event = function(message, level = "INFO", category = "GENERAL",
details = NULL, user_id = NULL, session_id = NULL) {
# Check if logging is enabled for this level
if (!private$should_log(level)) {
return(invisible(NULL))
}
<- list(
log_entry timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z"),
level = level,
category = category,
message = message,
user_id = user_id,
session_id = session_id,
ip_address = private$get_client_ip(),
user_agent = private$get_user_agent(),
details = details,
entry_id = uuid::UUIDgenerate()
)
$write_log_entry(log_entry)
private
},
#' Log validation event
#'
#' @param validator_name character. Name of validator
#' @param input_hash character. Hash of input data
#' @param result ValidationResult. Validation result
#' @param duration numeric. Validation duration in seconds
#' @param context list. Validation context
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
log_validation = function(validator_name, input_hash, result, duration,
context = NULL, user_id = NULL, session_id = NULL) {
<- list(
details validator_name = validator_name,
input_hash = input_hash,
validation_successful = result$valid,
error_count = length(result$errors),
warning_count = length(result$warnings),
duration_seconds = duration,
context = context
)
<- if (result$valid) "INFO" else "WARNING"
level <- paste("Validation", if (result$valid) "passed" else "failed",
message "for", validator_name)
$log_event(
selfmessage = message,
level = level,
category = "VALIDATION",
details = details,
user_id = user_id,
session_id = session_id
)
},
#' Log security event
#'
#' @param event_type character. Type of security event
#' @param severity character. Event severity (LOW, MEDIUM, HIGH, CRITICAL)
#' @param description character. Event description
#' @param threat_indicators list. Threat indicators
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
log_security_event = function(event_type, severity, description,
threat_indicators = NULL, user_id = NULL,
session_id = NULL) {
<- list(
details event_type = event_type,
severity = severity,
threat_indicators = threat_indicators,
remediation_required = severity %in% c("HIGH", "CRITICAL")
)
<- switch(severity,
level "LOW" = "INFO",
"MEDIUM" = "WARNING",
"HIGH" = "ERROR",
"CRITICAL" = "ERROR"
)
$log_event(
selfmessage = paste("Security event:", description),
level = level,
category = "SECURITY",
details = details,
user_id = user_id,
session_id = session_id
)
# Send alerts for high severity events
if (severity %in% c("HIGH", "CRITICAL")) {
$send_security_alert(event_type, severity, description, details)
private
}
},
#' Log data access event
#'
#' @param operation character. Type of operation (READ, WRITE, DELETE)
#' @param resource character. Resource accessed
#' @param success logical. Whether operation was successful
#' @param user_id character. User identifier
#' @param session_id character. Session identifier
#' @param details list. Additional operation details
#'
log_data_access = function(operation, resource, success, user_id,
details = NULL) {
session_id,
<- list(
access_details operation = operation,
resource = resource,
success = success,
operation_details = details
)
<- if (success) "INFO" else "WARNING"
level <- paste("Data access:", operation, "on", resource,
message if (success) "succeeded" else "failed")
$log_event(
selfmessage = message,
level = level,
category = "DATA_ACCESS",
details = access_details,
user_id = user_id,
session_id = session_id
)
},
#' Generate compliance report
#'
#' @param start_date Date. Report start date
#' @param end_date Date. Report end date
#' @param categories character vector. Event categories to include
#'
#' @return data.frame. Compliance report data
#'
generate_compliance_report = function(start_date, end_date,
categories = NULL) {
# Read and parse log entries
<- private$read_log_entries(start_date, end_date)
log_entries
if (!is.null(categories)) {
<- log_entries[log_entries$category %in% categories, ]
log_entries
}
# Generate summary statistics
<- list(
report_data period = list(start = start_date, end = end_date),
total_events = nrow(log_entries),
events_by_category = table(log_entries$category),
events_by_level = table(log_entries$level),
unique_users = length(unique(log_entries$user_id[!is.na(log_entries$user_id)])),
unique_sessions = length(unique(log_entries$session_id[!is.na(log_entries$session_id)])),
security_events = sum(log_entries$category == "SECURITY"),
validation_failures = sum(log_entries$category == "VALIDATION" &
$level %in% c("WARNING", "ERROR")),
log_entriesdata_access_events = sum(log_entries$category == "DATA_ACCESS")
)
return(report_data)
}
),
private = list(
#' Default audit configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete audit configuration
#'
default_audit_config = function(config) {
<- list(
default_config log_file = "logs/audit.log",
log_level = "INFO",
retention_days = 365,
encrypt_logs = FALSE,
max_file_size = 100 * 1024 * 1024, # 100MB
rotation_enabled = TRUE,
compression_enabled = TRUE,
real_time_alerts = TRUE,
alert_endpoints = list()
)
modifyList(default_config, config)
},
#' Check if message should be logged based on level
#'
#' @param level character. Message level
#'
#' @return logical. TRUE if should log
#'
should_log = function(level) {
<- c("DEBUG" = 1, "INFO" = 2, "WARNING" = 3, "ERROR" = 4)
level_hierarchy
<- level_hierarchy[level] %||% 1
message_level <- level_hierarchy[self$config$log_level] %||% 2
config_level
>= config_level
message_level
},
#' Initialize log file with header
#'
initialize_log_file = function() {
if (!file.exists(self$log_file)) {
<- paste(
header "# Audit Log File",
paste("# Created:", Sys.time()),
paste("# Application: Enterprise Statistical Analysis Platform"),
paste("# Version: 1.0"),
"# Format: JSON Lines",
"",
sep = "\n"
)
writeLines(header, self$log_file)
}
},
#' Write log entry to file
#'
#' @param log_entry list. Log entry data
#'
write_log_entry = function(log_entry) {
# Convert to JSON
<- jsonlite::toJSON(log_entry, auto_unbox = TRUE)
json_entry
# Encrypt if configured
if (self$config$encrypt_logs) {
<- private$encrypt_log_entry(json_entry)
json_entry
}
# Write to file with file locking
tryCatch({
write(json_entry, file = self$log_file, append = TRUE)
error = function(e) {
}, # Fallback logging to console if file write fails
cat("LOG WRITE ERROR:", e$message, "\n")
cat("ENTRY:", json_entry, "\n")
})
# Check for log rotation
$check_log_rotation()
private
},
#' Check if log rotation is needed
#'
check_log_rotation = function() {
if (!self$config$rotation_enabled) {
return(invisible(NULL))
}
<- file.info(self$log_file)
file_info
if (file_info$size > self$config$max_file_size) {
$rotate_log_file()
private
}
},
#' Rotate log file
#'
rotate_log_file = function() {
<- format(Sys.time(), "%Y%m%d_%H%M%S")
timestamp <- paste0(self$log_file, ".", timestamp)
rotated_name
# Move current log to rotated name
file.rename(self$log_file, rotated_name)
# Compress if enabled
if (self$config$compression_enabled) {
::gzip(rotated_name)
R.utilsfile.remove(rotated_name)
}
# Initialize new log file
$initialize_log_file()
private
},
#' Get client IP address
#'
#' @return character. Client IP address
#'
get_client_ip = function() {
# In a real Shiny application, this would extract from session
Sys.getenv("HTTP_X_FORWARDED_FOR", "127.0.0.1")
},
#' Get user agent string
#'
#' @return character. User agent string
#'
get_user_agent = function() {
# In a real Shiny application, this would extract from session
Sys.getenv("HTTP_USER_AGENT", "Unknown")
},
#' Encrypt log entry
#'
#' @param entry character. Log entry JSON
#'
#' @return character. Encrypted log entry
#'
encrypt_log_entry = function(entry) {
# In production, use proper key management
# This is a simplified example
<- Sys.getenv("AUDIT_LOG_KEY", "default_key_change_in_production")
key
tryCatch({
# Simple encryption for demonstration
# In production, use proper encryption libraries
<- openssl::aes_cbc_encrypt(charToRaw(entry), key = charToRaw(key))
encrypted ::base64_encode(encrypted)
opensslerror = function(e) {
}, # Return unencrypted if encryption fails
entry
})
},
#' Read log entries for reporting
#'
#' @param start_date Date. Start date for entries
#' @param end_date Date. End date for entries
#'
#' @return data.frame. Log entries
#'
read_log_entries = function(start_date, end_date) {
if (!file.exists(self$log_file)) {
return(data.frame())
}
# Read all lines from log file
<- readLines(self$log_file)
lines
# Filter out comment lines
<- lines[!grepl("^#", lines) & nchar(lines) > 0]
json_lines
if (length(json_lines) == 0) {
return(data.frame())
}
# Parse JSON entries
<- list()
entries for (i in seq_along(json_lines)) {
tryCatch({
<- jsonlite::fromJSON(json_lines[i])
entry $timestamp <- as.POSIXct(entry$timestamp)
entry<- entry
entries[[i]] error = function(e) {
}, # Skip malformed entries
NULL
})
}
# Convert to data frame
if (length(entries) > 0) {
# Filter by date range
<- do.call(rbind.data.frame, entries)
entry_df <- entry_df[entry_df$timestamp >= start_date &
entry_df $timestamp <= end_date, ]
entry_dfreturn(entry_df)
}
return(data.frame())
},
#' Send security alert
#'
#' @param event_type character. Event type
#' @param severity character. Event severity
#' @param description character. Event description
#' @param details list. Event details
#'
send_security_alert = function(event_type, severity, description, details) {
if (!self$config$real_time_alerts) {
return(invisible(NULL))
}
<- list(
alert_data timestamp = Sys.time(),
event_type = event_type,
severity = severity,
description = description,
details = details,
application = "Enterprise Statistical Analysis Platform"
)
# In production, this would integrate with alerting systems
# like PagerDuty, Slack, email, or SIEM systems
cat("SECURITY ALERT:", jsonlite::toJSON(alert_data, auto_unbox = TRUE), "\n")
}
) )
Error Handling and Recovery
Comprehensive Error Management
Implement robust error handling that maintains application stability:
# File: R/error_handling.R
#' Enterprise Error Handler
#'
#' @description Comprehensive error handling system with graceful degradation,
#' user-friendly messages, and detailed logging for enterprise applications.
#'
#' @export
<- R6::R6Class(
ErrorHandler "ErrorHandler",
public = list(
#' @field config error handling configuration
config = NULL,
#' @field audit_logger audit logger instance
audit_logger = NULL,
#' Initialize error handler
#'
#' @param config list. Error handling configuration
#' @param audit_logger AuditLogger. Audit logger instance
#'
initialize = function(config = list(), audit_logger = NULL) {
$config <- private$default_error_config(config)
self$audit_logger <- audit_logger
self
},
#' Handle validation errors
#'
#' @param validation_result ValidationResult. Validation result object
#' @param context list. Error context information
#'
#' @return list. User-friendly error response
#'
handle_validation_error = function(validation_result, context = list()) {
if (validation_result$valid) {
return(list(
type = "success",
message = "Validation successful",
data = validation_result$data
))
}
# Log validation errors
if (!is.null(self$audit_logger)) {
$audit_logger$log_event(
selfmessage = "Validation failed",
level = "WARNING",
category = "VALIDATION",
details = list(
errors = validation_result$errors,
warnings = validation_result$warnings,
context = context
)
)
}
# Generate user-friendly error message
<- private$format_validation_errors(validation_result$errors)
user_message
list(
type = "validation_error",
message = user_message,
errors = validation_result$errors,
warnings = validation_result$warnings,
show_details = self$config$show_error_details
)
},
#' Handle system errors
#'
#' @param error condition. Error condition object
#' @param context list. Error context information
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
#' @return list. Error response for user
#'
handle_system_error = function(error, context = list(), user_id = NULL,
session_id = NULL) {
<- uuid::UUIDgenerate()
error_id
# Log detailed error information
if (!is.null(self$audit_logger)) {
$audit_logger$log_event(
selfmessage = paste("System error:", error$message),
level = "ERROR",
category = "SYSTEM",
details = list(
error_id = error_id,
error_class = class(error)[1],
error_message = error$message,
error_call = deparse(error$call),
stack_trace = private$get_stack_trace(error),
context = context
),user_id = user_id,
session_id = session_id
)
}
# Generate user-friendly response
<- if (self$config$show_error_details) {
user_message paste("An error occurred:", error$message, "\nError ID:", error_id)
else {
} paste("An unexpected error occurred. Please try again.",
"If the problem persists, contact support with Error ID:", error_id)
}
list(
type = "system_error",
message = user_message,
error_id = error_id,
recoverable = private$is_recoverable_error(error),
retry_suggested = private$should_suggest_retry(error)
)
},
#' Handle security errors
#'
#' @param security_event character. Type of security event
#' @param details list. Security event details
#' @param user_id character. User identifier (optional)
#' @param session_id character. Session identifier (optional)
#'
#' @return list. Security error response
#'
handle_security_error = function(security_event, details = list(),
user_id = NULL, session_id = NULL) {
# Log security incident
if (!is.null(self$audit_logger)) {
$audit_logger$log_security_event(
selfevent_type = security_event,
severity = "HIGH",
description = paste("Security violation:", security_event),
threat_indicators = details,
user_id = user_id,
session_id = session_id
)
}
# Generic security message (don't reveal specific attack vectors)
<- "Access denied. This incident has been logged."
user_message
list(
type = "security_error",
message = user_message,
action_required = "contact_administrator",
session_terminated = TRUE
)
},
#' Create error boundary for Shiny modules
#'
#' @param expr expression. Code to execute with error handling
#' @param context list. Execution context
#' @param fallback_ui UI. Fallback UI to show on error
#'
#' @return UI or error result
#'
error_boundary = function(expr, context = list(), fallback_ui = NULL) {
tryCatch({
exprerror = function(e) {
},
<- self$handle_system_error(e, context)
error_response
if (is.null(fallback_ui)) {
<- private$create_error_ui(error_response)
fallback_ui
}
return(fallback_ui)
})
},
#' Async error handler for long-running operations
#'
#' @param future_expr future. Future expression to monitor
#' @param progress_callback function. Progress update callback
#' @param error_callback function. Error handling callback
#'
#' @return promise. Promise with error handling
#'
async_error_handler = function(future_expr, progress_callback = NULL,
error_callback = NULL) {
%>%
future_expr ::catch(function(error) {
promises
<- self$handle_system_error(
error_response
error, context = list(operation = "async_operation")
)
if (!is.null(error_callback)) {
error_callback(error_response)
}
return(error_response)
})
}
),
private = list(
#' Default error handling configuration
#'
#' @param config list. User configuration
#'
#' @return list. Complete configuration
#'
default_error_config = function(config) {
<- list(
default_config show_error_details = FALSE, # Set to TRUE only in development
log_all_errors = TRUE,
user_friendly_messages = TRUE,
max_error_length = 500,
retry_attempts = 3,
retry_delay_seconds = 2,
error_page_template = "default",
notification_duration = 5000 # milliseconds
)
modifyList(default_config, config)
},
#' Format validation errors for users
#'
#' @param errors list. List of error messages
#'
#' @return character. Formatted error message
#'
format_validation_errors = function(errors) {
if (length(errors) == 0) {
return("No errors found.")
}
if (length(errors) == 1) {
return(paste("Validation error:", errors[[1]]))
}
# Multiple errors
<- paste("-", errors, collapse = "\n")
error_list paste("Multiple validation errors found:\n", error_list)
},
#' Get stack trace from error
#'
#' @param error condition. Error object
#'
#' @return character. Stack trace
#'
get_stack_trace = function(error) {
tryCatch({
# Get call stack
<- sys.calls()
calls if (length(calls) > 0) {
<- sapply(calls, function(call) {
stack_trace paste(deparse(call), collapse = " ")
})return(paste(stack_trace, collapse = "\n"))
}error = function(e) {
}, return("Stack trace unavailable")
})
return("No stack trace available")
},
#' Check if error is recoverable
#'
#' @param error condition. Error object
#'
#' @return logical. TRUE if recoverable
#'
is_recoverable_error = function(error) {
# Define recoverable error patterns
<- c(
recoverable_patterns "connection.*timeout",
"network.*error",
"temporary.*unavailable",
"service.*busy"
)
<- tolower(error$message)
error_message
any(sapply(recoverable_patterns, function(pattern) {
grepl(pattern, error_message, perl = TRUE)
}))
},
#' Check if retry should be suggested
#'
#' @param error condition. Error object
#'
#' @return logical. TRUE if retry suggested
#'
should_suggest_retry = function(error) {
# Suggest retry for certain types of errors
<- c(
retry_patterns "timeout",
"connection",
"network",
"temporary",
"busy"
)
<- tolower(error$message)
error_message
any(sapply(retry_patterns, function(pattern) {
grepl(pattern, error_message)
}))
},
#' Create error UI component
#'
#' @param error_response list. Error response data
#'
#' @return UI. Error display UI
#'
create_error_ui = function(error_response) {
<- switch(error_response$type,
alert_class "validation_error" = "alert-warning",
"system_error" = "alert-danger",
"security_error" = "alert-danger",
"alert-info"
)
div(
class = paste("alert", alert_class, "alert-dismissible fade show"),
role = "alert",
# Error icon
icon("exclamation-triangle", class = "me-2"),
# Error message
strong("Error: "), error_response$message,
# Additional information
if (!is.null(error_response$error_id)) {
div(
class = "mt-2 small",
"Reference ID: ",
code(error_response$error_id)
)
},
# Retry button if applicable
if (isTRUE(error_response$retry_suggested)) {
div(
class = "mt-3",
actionButton(
"retry_operation",
"Retry",
class = "btn btn-sm btn-secondary"
)
)
},
# Dismiss button
$button(
tagstype = "button",
class = "btn-close",
`data-bs-dismiss` = "alert",
`aria-label` = "Close"
)
)
}
) )
Integration with t-Test Application
Implementing Validation in the t-Test Module
Apply comprehensive validation to the enterprise t-test application:
# File: R/mod_ttest_validated.R
#' Validated t-Test Module Server
#'
#' @description Enhanced t-test module with comprehensive validation,
#' security, and error handling for enterprise deployment.
#'
#' @param id character. Module identifier
#'
#' @export
<- function(id) {
mod_ttest_validated_server moduleServer(id, function(input, output, session) {
# Initialize enterprise frameworks
<- ValidationFramework$new()
validation_framework <- SecurityManager$new()
security_manager <- ErrorHandler$new(
error_handler config = list(show_error_details = FALSE),
audit_logger = validation_framework$audit_logger
)
# Create session and validate
<- security_manager$session_manager$create_session(
session_created $token,
sessionuser_info = list(ip_address = session$clientData$url_hostname)
)
if (!session_created) {
showNotification("Session could not be created. Please try again.",
type = "error")
return()
}
# Reactive values with validation
<- reactiveValues(
values raw_data = NULL,
validated_data = NULL,
test_results = NULL,
validation_results = NULL,
last_validation = NULL
)
# Input validation with rate limiting
observeEvent(input$group_input, {
# Check rate limiting
if (!security_manager$check_rate_limit(session$token, "data_input")) {
showNotification("Too many requests. Please wait before trying again.",
type = "warning")
return()
}
# Validate session
if (!security_manager$validate_session(session)) {
showNotification("Session expired. Please refresh the page.",
type = "error")
return()
}
# Sanitize input
<- tryCatch({
sanitized_input $sanitize_input(input$group_input, type = "text")
security_managererror = function(e) {
}, <- error_handler$handle_system_error(
error_response
e, context = list(operation = "input_sanitization", input_type = "group")
)showNotification(error_response$message, type = "error")
return(NULL)
})
if (is.null(sanitized_input)) return()
# Parse and validate group input
tryCatch({
<- parse_group_input(sanitized_input)
group_data
if (!is.null(group_data)) {
$raw_data$group <- group_data
values
# Trigger validation if both inputs are available
if (!is.null(values$raw_data$response)) {
validate_combined_data()
}
}error = function(e) {
}, <- error_handler$handle_system_error(
error_response
e,context = list(operation = "group_parsing")
)showNotification(error_response$message, type = "error")
})
})
observeEvent(input$response_input, {
# Rate limiting and session validation
if (!security_manager$check_rate_limit(session$token, "data_input")) {
showNotification("Too many requests. Please wait before trying again.",
type = "warning")
return()
}
if (!security_manager$validate_session(session)) {
showNotification("Session expired. Please refresh the page.",
type = "error")
return()
}
# Sanitize and validate response input
<- tryCatch({
sanitized_input $sanitize_input(input$response_input, type = "text")
security_managererror = function(e) {
}, <- error_handler$handle_system_error(
error_response
e,context = list(operation = "input_sanitization", input_type = "response")
)showNotification(error_response$message, type = "error")
return(NULL)
})
if (is.null(sanitized_input)) return()
tryCatch({
<- parse_response_input(sanitized_input)
response_data
if (!is.null(response_data)) {
$raw_data$response <- response_data
values
# Trigger validation if both inputs are available
if (!is.null(values$raw_data$group)) {
validate_combined_data()
}
}error = function(e) {
}, <- error_handler$handle_system_error(
error_response
e,context = list(operation = "response_parsing")
)showNotification(error_response$message, type = "error")
})
})
# Combined data validation function
<- function() {
validate_combined_data
if (is.null(values$raw_data$group) || is.null(values$raw_data$response)) {
return()
}
# Create data frame for validation
<- data.frame(
combined_data group = values$raw_data$group,
response = values$raw_data$response,
stringsAsFactors = FALSE
)
# Validate with t-test specific validator
<- validation_framework$validate(
validation_result data = combined_data,
validator_name = "ttest_data",
context = list(
min_group_size = 3,
recommended_group_size = 15,
max_missing_proportion = 0.3
)
)
$validation_results <- validation_result
values$last_validation <- Sys.time()
values
# Handle validation results
if (validation_result$valid) {
$validated_data <- validation_result$data
values
# Show success with any warnings
if (validation_result$has_warnings()) {
showNotification(
paste("Data validated successfully.",
"Warnings:", validation_result$get_warning_message()),
type = "warning",
duration = 8000
)else {
} showNotification("Data validated successfully!", type = "success")
}
else {
} $validated_data <- NULL
values
# Handle validation errors
<- error_handler$handle_validation_error(
error_response
validation_result,context = list(operation = "data_validation")
)
showNotification(error_response$message, type = "error", duration = 10000)
}
}
# Analysis execution with comprehensive error handling
observeEvent(input$run_analysis, {
# Rate limiting for analysis operations
if (!security_manager$check_rate_limit(session$token, "analysis")) {
showNotification("Analysis rate limit exceeded. Please wait before running another analysis.",
type = "warning")
return()
}
# Session validation
if (!security_manager$validate_session(session)) {
showNotification("Session expired. Please refresh the page.",
type = "error")
return()
}
# Data validation check
if (is.null(values$validated_data)) {
showNotification("Please provide valid data before running analysis.",
type = "error")
return()
}
# Show progress indicator
showNotification("Running statistical analysis...",
type = "message", id = "analysis_progress")
# Execute analysis with error boundary
<- error_handler$error_boundary(
analysis_result expr = {
# Log analysis start
$audit_logger$log_event(
validation_frameworkmessage = "Statistical analysis started",
level = "INFO",
category = "ANALYSIS",
details = list(
test_type = "independent_ttest",
sample_size = nrow(values$validated_data),
user_options = list(
alternative = input$alternative,
conf_level = input$conf_level,
var_equal = input$var_equal
)
),user_id = session$user,
session_id = session$token
)
# Run comprehensive t-test analysis
calculate_ttest_comprehensive(
data = values$validated_data,
options = list(
alternative = input$alternative,
conf_level = input$conf_level,
var_equal = as.logical(input$var_equal),
auto_method = input$auto_method
)
)
},context = list(
operation = "statistical_analysis",
test_type = "ttest"
),fallback_ui = div(
class = "alert alert-danger",
"Analysis failed. Please check your data and try again."
)
)
# Remove progress indicator
removeNotification("analysis_progress")
# Handle analysis results
if (is.list(analysis_result) && !inherits(analysis_result, "shiny.tag")) {
$test_results <- analysis_result
values
# Log successful analysis
$audit_logger$log_event(
validation_frameworkmessage = "Statistical analysis completed successfully",
level = "INFO",
category = "ANALYSIS",
details = list(
p_value = analysis_result$test_result$p.value,
effect_size = analysis_result$effect_size$cohens_d,
method_used = analysis_result$method_used
),user_id = session$user,
session_id = session$token
)
showNotification("Analysis completed successfully!", type = "success")
else {
} $test_results <- NULL
valuesshowNotification("Analysis failed. Please try again.", type = "error")
}
})
# Validation status display
$validation_status <- renderUI({
output
if (is.null(values$validation_results)) {
return(
div(
class = "alert alert-info",
icon("info-circle", class = "me-2"),
"Waiting for data input..."
)
)
}
if (values$validation_results$valid) {
<- list(
status_content div(
class = "alert alert-success",
icon("check-circle", class = "me-2"),
"Data validation passed"
)
)
# Add warnings if present
if (values$validation_results$has_warnings()) {
<- append(status_content, list(
status_content div(
class = "alert alert-warning mt-2",
icon("exclamation-triangle", class = "me-2"),
strong("Warnings: "),
$validation_results$get_warning_message()
values
)
))
}
return(status_content)
else {
} return(
div(
class = "alert alert-danger",
icon("times-circle", class = "me-2"),
strong("Validation failed: "),
$validation_results$get_error_message()
values
)
)
}
})
# Security monitoring
observe({
# Monitor for suspicious activity patterns
if (!is.null(input$group_input) && !is.null(input$response_input)) {
# Check for potential injection attempts
<- c(
suspicious_patterns "<script",
"javascript:",
"eval\\s*\\(",
"exec\\s*\\(",
"system\\s*\\("
)
<- paste(input$group_input, input$response_input)
combined_input
for (pattern in suspicious_patterns) {
if (grepl(pattern, combined_input, ignore.case = TRUE, perl = TRUE)) {
$handle_security_error(
error_handlersecurity_event = "injection_attempt",
details = list(
pattern_detected = pattern,
input_source = "user_interface"
),user_id = session$user,
session_id = session$token
)
# Terminate session for security
$close()
sessionreturn()
}
}
}
})
# Session cleanup on disconnect
$onSessionEnded(function() {
session
# Log session end
$audit_logger$log_event(
validation_frameworkmessage = "User session ended",
level = "INFO",
category = "SESSION",
user_id = session$user,
session_id = session$token
)
# Clean up session data
$session_manager$invalidate_session(session$token)
security_manager
})
# Return reactive values for testing and integration
return(values)
}) }
Common Questions About Data Validation and Security
Implement multi-layered validation with progressive disclosure of errors. Start with client-side validation for immediate user feedback, but always validate on the server for security. Use clear, actionable error messages that guide users toward correct input without revealing system vulnerabilities. The ValidationFramework provided implements this pattern with user-friendly messages while maintaining comprehensive security checks behind the scenes.
Input sanitization removes or modifies potentially dangerous content (like HTML tags or SQL injection attempts), while input validation checks if data meets expected criteria (like data types, ranges, or business rules). Both are essential: sanitization prevents attacks, while validation ensures data quality. Always sanitize first to remove threats, then validate to ensure data meets your application’s requirements. The SecurityManager handles sanitization while ValidationFramework manages validation.
Create comprehensive audit trails that capture who did what, when, and from where. Log all user actions, system events, validation results, and security incidents with immutable timestamps and unique identifiers. The AuditLogger provided implements 21 CFR Part 11 compliant logging with encryption, retention management, and compliance reporting. Ensure logs cannot be modified post-creation and implement proper access controls for log viewing and analysis.
Prioritize input sanitization (prevent injection attacks), session management (prevent unauthorized access), rate limiting (prevent abuse), and comprehensive audit logging (regulatory compliance). For clinical data, implement encryption at rest and in transit, proper access controls, and data anonymization techniques. The SecurityManager framework provides these enterprise-grade protections specifically designed for statistical applications in regulated environments.
Implement graceful degradation where errors don’t crash the application but provide helpful guidance. Show users clear, actionable error messages while logging detailed technical information for developers. Use progressive disclosure - show simple messages initially with options to view more details if needed. The ErrorHandler framework demonstrates this approach with user-friendly messages, retry suggestions, and comprehensive background logging for troubleshooting.
Test Your Understanding
You’re implementing validation for a clinical trial data entry application. Users report that validation is too slow, but security requirements mandate comprehensive checking. Which validation strategy optimizes both performance and security?
- Move all validation to client-side JavaScript for speed
- Implement asynchronous server-side validation with immediate client-side feedback
- Cache validation results and skip re-validation for similar inputs
- Reduce validation checks to improve performance
- Consider the security implications of client-side only validation
- Think about how to provide immediate feedback while maintaining security
- Clinical applications cannot compromise on validation thoroughness
- Performance and security don’t have to be mutually exclusive
B) Implement asynchronous server-side validation with immediate client-side feedback
Strategic Implementation:
Client-Side Layer (Immediate Feedback):
// Immediate basic validation for user experience
function validateInputFormat(input) {
// Basic format checking only
// Range validation, required fields
// NOT security validation
return {
valid: basicChecks(input),
message: "Please check your input format"
;
} }
Server-Side Layer (Security & Compliance):
# Comprehensive validation with security
<- validation_framework$validate(
validation_result data = sanitized_data,
validator_name = "clinical_data",
context = list(regulatory_compliance = TRUE)
)
Performance Optimization Techniques:
- Debounce validation to avoid excessive server calls
- Progressive validation - validate fields as completed
- Batch validation for multiple fields simultaneously
- Async processing to avoid blocking UI interactions
Security Maintenance:
- Server-side validation remains authoritative and cannot be bypassed
- Input sanitization prevents injection attacks before validation
- Audit logging captures all validation attempts for compliance
- Rate limiting prevents validation endpoint abuse
Clinical Compliance Benefits:
- Complete audit trail of all validation events
- Immutable server-side checks for regulatory review
- User experience optimization without compromising data integrity
- Error documentation supports validation evidence requirements
This approach satisfies both user experience demands and regulatory security requirements by layering fast client feedback over comprehensive server validation.
Your pharmaceutical Shiny application handles sensitive clinical trial data. You have limited development time to implement security measures. Which security implementation sequence provides the best protection against the most critical threats?
- Authentication → Input sanitization → Audit logging → Rate limiting
- Input sanitization → Audit logging → Authentication → Rate limiting
- Audit logging → Authentication → Input sanitization → Rate limiting
- Rate limiting → Authentication → Input sanitization → Audit logging
- Consider which threats can cause immediate and severe damage
- Think about the dependencies between security measures
- Pharmaceutical applications have specific regulatory requirements
- Some security measures enable effective implementation of others
B) Input sanitization → Audit logging → Authentication → Rate limiting
Threat-Based Prioritization Logic:
1. Input Sanitization (Immediate Critical Protection):
# Prevents catastrophic attacks that could compromise entire system
<- security_manager$sanitize_input(
sanitized_input
user_input, type = "clinical_data"
)# Blocks: SQL injection, script injection, file system attacks
2. Audit Logging (Regulatory Compliance Foundation):
# Required for regulatory compliance and incident investigation
$log_event(
audit_loggermessage = "Data access attempt",
category = "DATA_ACCESS",
details = list(sanitized_input_hash = hash(sanitized_input))
)# Enables: Compliance evidence, threat detection, incident response
3. Authentication (Access Control):
# Controls who can access the system
if (!authenticate_user(credentials)) {
$log_security_event("unauthorized_access_attempt")
audit_loggerreturn(access_denied())
}# Provides: User identity verification, access control
4. Rate Limiting (Abuse Prevention):
# Prevents system abuse and DoS attacks
if (!check_rate_limit(user_id, "clinical_data_access")) {
$log_event("rate_limit_exceeded")
audit_loggerreturn(rate_limited())
}# Protects: System availability, prevents brute force attacks
Strategic Reasoning:
- Input sanitization prevents immediate system compromise
- Audit logging provides regulatory compliance foundation and attack detection
- Authentication controls access but relies on sanitized inputs for security
- Rate limiting prevents abuse but requires authenticated context for effectiveness
This sequence protects against the most severe threats first while building the foundation for comprehensive security.
During a critical clinical trial analysis, your Shiny application encounters a statistical calculation error. The application must maintain data integrity, provide user guidance, ensure regulatory compliance, and enable quick recovery. Which error handling approach best addresses all these requirements?
- Display detailed error message and allow users to continue with partial results
- Log error details, show user-friendly message, preserve data state, and offer recovery options
- Restart the application automatically and lose current analysis state
- Show generic error message and require complete data re-entry
- Consider regulatory requirements for error documentation
- Think about user productivity and data preservation
- Clinical applications cannot afford data loss
- Error recovery should be guided and safe
B) Log error details, show user-friendly message, preserve data state, and offer recovery options
Comprehensive Error Response Implementation:
1. Detailed Logging (Regulatory Compliance):
$handle_system_error(
error_handlererror = calculation_error,
context = list(
operation = "clinical_statistical_analysis",
data_hash = digest(analysis_data),
analysis_parameters = list(
test_type = "survival_analysis",
confidence_level = 0.95,
patient_count = nrow(data)
),recovery_state = serialize_analysis_state()
),user_id = session$user,
session_id = session$token
)
2. User-Friendly Communication:
showNotification(
"Statistical calculation encountered an issue. Your data has been preserved. Please review the suggested actions below.",
type = "warning",
duration = NULL # Persistent until addressed
)
3. Data State Preservation:
# Automatic state preservation
<- list(
recovery_data validated_data = values$validated_data,
analysis_parameters = get_analysis_parameters(),
partial_results = values$partial_results,
timestamp = Sys.time(),
error_context = error_context
)
# Secure storage for recovery
save_recovery_state(recovery_data, session$token)
4. Guided Recovery Options:
# Recovery UI component
<- div(
recovery_ui class = "alert alert-info mt-3",
h5("Recovery Options:"),
actionButton("retry_calculation", "Retry Analysis", class = "btn-primary me-2"),
actionButton("modify_parameters", "Adjust Parameters", class = "btn-secondary me-2"),
actionButton("export_data", "Export Current Data", class = "btn-outline-primary"),
hr(),
p("If problems persist, contact support with Error ID: ", code(error_id))
)
Clinical Trial Benefits:
- No data loss - analysis can continue without re-entering data
- Audit compliance - complete error documentation for regulatory review
- User productivity - clear recovery path without starting over
- Data integrity - preserved state ensures consistency
- Support enablement - detailed logs help technical teams resolve issues
This approach balances regulatory requirements, user experience, and system reliability for mission-critical clinical applications.
Conclusion
Comprehensive data validation and security implementation transforms statistical applications from functional tools into enterprise-grade systems that protect sensitive data, ensure regulatory compliance, and maintain user trust. Through multi-layered validation frameworks, robust security protocols, and sophisticated error handling, you’ve learned to create applications that meet the rigorous standards required for pharmaceutical and clinical research environments.
The validation and security patterns implemented in this tutorial directly support career advancement by demonstrating advanced understanding of enterprise software engineering, regulatory compliance, and professional security practices. These skills distinguish developers who can create production-ready applications from those limited to prototype development.
Your mastery of input sanitization, audit logging, error boundaries, and recovery mechanisms creates applications that maintain data integrity and user productivity even under adverse conditions, ensuring that critical statistical insights remain accessible when stakeholders need them most.
Next Steps
Based on your comprehensive validation and security foundation, here are the recommended paths for continuing your enterprise development journey:
Immediate Next Steps (Complete These First)
- Statistical Rigor and Assumption Testing - Apply validation frameworks to ensure statistical method reliability
- Professional Statistical Visualizations - Secure and validate advanced plotting capabilities
- Practice Exercise: Implement the complete validation framework in your t-test application with security monitoring and audit logging
Building on Your Security Foundation (Choose Your Path)
For Quality Assurance Excellence:
For Production Readiness:
For Regulatory Excellence:
Long-term Goals (3-4 Months)
- Establish organizational security standards for statistical applications
- Create validation frameworks that support regulatory submission requirements
- Build security monitoring and incident response procedures for production applications
- Develop comprehensive audit systems that meet pharmaceutical compliance standards
Explore More Enterprise Development
Ready to enhance your validated and secured application with statistical excellence? Continue with comprehensive statistical rigor and assumption testing.
Reuse
Citation
@online{kassambara2025,
author = {Kassambara, Alboukadel},
title = {Bulletproof {Data} {Validation:} {Enterprise} {Security} for
{Statistical} {Applications}},
date = {2025-05-23},
url = {https://www.datanovia.com/learn/tools/shiny-apps/enterprise-development/data-validation-security.html},
langid = {en}
}