flowchart TB
    A[Internet Traffic] --> B[Traefik Load Balancer]
    B --> C[ShinyProxy Manager Node]
    B --> D[ShinyProxy Worker Node 1]
    B --> E[ShinyProxy Worker Node 2]
    C --> F[Docker Swarm Manager]
    D --> G[Docker Swarm Worker 1]
    E --> H[Docker Swarm Worker 2]
    F --> I[Shiny App Containers]
    G --> J[Shiny App Containers]
    H --> K[Shiny App Containers]
    L[Grafana Dashboard] --> M[InfluxDB Metrics]
    N[Swarmpit Management] --> F
    O[Let's Encrypt SSL] --> B
    P[Enterprise Auth] --> C
    P --> D
    P --> E
    style A fill:#e1f5fe
    style B fill:#f3e5f5
    style F fill:#e8f5e8
    style L fill:#fff3e0
    style P fill:#fce4ec
Key Takeaways
- Complete Production Stack: Integrate Docker Swarm orchestration with ShinyProxy for enterprise-grade scaling and high availability
- Automated SSL and Load Balancing: Traefik provides zero-configuration SSL certificates and intelligent traffic distribution across multiple nodes
- Comprehensive Monitoring: Built-in observability with Grafana, InfluxDB, and Swarmpit for complete infrastructure visibility
- Cost-Effective Scaling: Docker Swarm offers 80% of Kubernetes features with significantly reduced complexity and operational overhead
- Production-Ready Security: Enterprise authentication, network isolation, and automated security updates for compliance and reliability
Introduction
Enterprise Shiny deployments require sophisticated orchestration to handle the demands of multiple concurrent users, high availability requirements, and scalable infrastructure management. While individual components like ShinyProxy, Docker Swarm, and Traefik are powerful on their own, their true potential emerges when integrated into a cohesive production stack that automates scaling, security, and operational management.
This comprehensive guide walks you through building a complete enterprise orchestration platform that combines the container management capabilities of Docker Swarm, the multi-tenant architecture of ShinyProxy, the intelligent routing of Traefik, and comprehensive monitoring tools into a unified system that scales from departmental analytics to organization-wide analytical platforms.
The integrated approach eliminates the complexity of managing separate deployment tools while providing enterprise-grade features including automated SSL certificate management, intelligent load balancing, rolling updates with zero downtime, and comprehensive observability that meets compliance and operational requirements.
Complete Architecture Overview
The enterprise orchestration stack integrates five core components that work together to provide a production-ready analytics platform:
Integrated Component Architecture
Docker Swarm Orchestration Layer
Provides container orchestration with manager-worker node architecture, enabling automatic scaling, rolling updates, and high availability. Swarm manages the lifecycle of ShinyProxy instances and Shiny application containers across multiple nodes.
ShinyProxy Application Management
Handles multi-tenant Shiny application deployment with container-per-user isolation. Integrates with Swarm’s service discovery and load balancing while providing enterprise authentication and resource management.
Traefik Intelligent Routing
Serves as the edge router with automatic service discovery, SSL termination, and load balancing. Automatically discovers ShinyProxy instances in the Swarm and routes traffic intelligently based on health checks and availability.
Monitoring and Observability Stack
Comprehensive monitoring using Grafana for visualization, InfluxDB for metrics storage, and Swarmpit for cluster management. Provides real-time visibility into application performance, resource utilization, and system health.
Enterprise Integration Layer
Connects with organizational authentication systems, monitoring infrastructure, and compliance tools while maintaining security boundaries and audit capabilities.
Prerequisites and Infrastructure Setup
Infrastructure Requirements
Multi-Node Cluster Configuration:
- Manager Node: 8+ cores, 16GB+ RAM, 100GB+ SSD storage
- Worker Nodes: 4+ cores per node, 8GB+ RAM per node, 50GB+ SSD storage per node
- Network: Stable internal networking between nodes, external access for SSL certificates
- Operating System: Ubuntu 20.04+ or RHEL 8+ on all nodes
Network Configuration:
# Required ports for Docker Swarm communication
# Manager Node
sudo ufw allow 2377/tcp # Cluster management
sudo ufw allow 7946/tcp # Node communication
sudo ufw allow 7946/udp # Node communication
sudo ufw allow 4789/udp # Overlay network traffic
# Worker Nodes
sudo ufw allow 7946/tcp # Node communication
sudo ufw allow 7946/udp # Node communication
sudo ufw allow 4789/udp # Overlay network traffic
# HTTP/HTTPS traffic
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
Docker Swarm Cluster Initialization
Manager Node Setup:
# Initialize Docker Swarm on manager node
docker swarm init --advertise-addr $(hostname -I | awk '{print $1}')
# Get worker join token
docker swarm join-token worker
# Get manager join token (for additional managers)
docker swarm join-token manager
# Verify cluster status
docker node ls
Worker Node Configuration:
# On each worker node, run the join command from manager
docker swarm join --token SWMTKN-1-xxx-xxx <manager-ip>:2377
# Verify from manager node
docker node ls
Network Setup:
# Create overlay networks for service communication
docker network create --driver overlay --attachable traefik-public
docker network create --driver overlay --attachable shinyproxy-network
# Verify networks
docker network ls
Node Labeling and Constraints
# Label nodes for service placement
docker node update --label-add traefik.traefik-public-certificates=true <manager-node-id>
docker node update --label-add monitoring.grafana-data=true <manager-node-id>
docker node update --label-add monitoring.influxdb-data=true <manager-node-id>
# Label worker nodes for application workloads
docker node update --label-add workload.shiny-apps=true <worker-node-1-id>
docker node update --label-add workload.shiny-apps=true <worker-node-2-id>
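To confirm the labels landed where you expect before deploying any stacks, you can inspect each node. A quick check, iterating over the node IDs reported by docker node ls:

# Verify labels on every node (run on a manager)
for node in $(docker node ls -q); do
  docker node inspect "$node" --format '{{.Description.Hostname}}: {{.Spec.Labels}}'
done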
Traefik Orchestration Setup
Traefik Stack Configuration
Create the foundation routing and SSL management layer:
# traefik-stack.yml
version: '3.8'

services:
  traefik:
    # Pin to a Traefik v2 release: the providers.docker.swarmmode flag used
    # below was removed in Traefik v3, which moved Swarm support to a separate
    # providers.swarm provider
    image: traefik:v2.11
    command:
      # API and Dashboard
      - --api.dashboard=true
      - --api.insecure=false
      # Docker Swarm Provider
      - --providers.docker=true
      - --providers.docker.swarmmode=true
      - --providers.docker.exposedbydefault=false
      - --providers.docker.network=traefik-public
      # Entry Points
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      # SSL/TLS Configuration
      - --certificatesresolvers.letsencrypt.acme.tlschallenge=true
      - --certificatesresolvers.letsencrypt.acme.email=admin@company.com
      - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
      # Security and Headers
      - --entrypoints.websecure.http.middlewares=security-headers@docker
      # Logging and Monitoring
      - --log.level=INFO
      - --accesslog=true
      - --metrics.prometheus=true
      - --metrics.prometheus.addEntryPointsLabels=true
      - --metrics.prometheus.addServicesLabels=true
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik-public-certificates:/letsencrypt
    networks:
      - traefik-public
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
          - node.labels.traefik.traefik-public-certificates == true
      labels:
        # Dashboard Configuration
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.constraint-label=traefik-public
        # Dashboard Routing
        - traefik.http.routers.traefik-public-http.rule=Host(`traefik.company.com`)
        - traefik.http.routers.traefik-public-http.entrypoints=web
        - traefik.http.routers.traefik-public-http.middlewares=https-redirect
        - traefik.http.routers.traefik-public-https.rule=Host(`traefik.company.com`)
        - traefik.http.routers.traefik-public-https.entrypoints=websecure
        - traefik.http.routers.traefik-public-https.tls=true
        - traefik.http.routers.traefik-public-https.tls.certresolver=letsencrypt
        - traefik.http.routers.traefik-public-https.service=api@internal
        - traefik.http.routers.traefik-public-https.middlewares=admin-auth,security-headers
        # Middleware Definitions
        - traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
        - traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
        - traefik.http.middlewares.admin-auth.basicauth.users=admin:$$2y$$10$$hashed-password-here
        - traefik.http.middlewares.security-headers.headers.stsSeconds=31536000
        - traefik.http.middlewares.security-headers.headers.stsIncludeSubdomains=true
        - traefik.http.middlewares.security-headers.headers.stsPreload=true
        - traefik.http.middlewares.security-headers.headers.forceSTSHeader=true
        - traefik.http.middlewares.security-headers.headers.frameDeny=true
        - traefik.http.middlewares.security-headers.headers.contentTypeNosniff=true
        - traefik.http.middlewares.security-headers.headers.browserXssFilter=true
        - traefik.http.middlewares.security-headers.headers.referrerPolicy=strict-origin-when-cross-origin
        - traefik.http.middlewares.security-headers.headers.customRequestHeaders.X-Forwarded-Proto=https
        # Service Configuration
        - traefik.http.services.traefik-public.loadbalancer.server.port=8080

volumes:
  traefik-public-certificates:
    external: false

networks:
  traefik-public:
    external: true
Deploy Traefik Stack:
# Generate the hashed admin password referenced by the admin-auth middleware
# (paste the output into the basicauth.users label, doubling each $ as $$)
openssl passwd -apr1 "your-secure-password"
# Deploy Traefik stack
docker stack deploy -c traefik-stack.yml traefik
# Verify deployment
docker stack services traefik
docker service logs traefik_traefik
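Once the service converges, sanity-check routing and certificate issuance from outside the cluster. A quick hedged check, assuming traefik.company.com already resolves to the manager node:

# Expect a 301/308 redirect from HTTP to HTTPS (https-redirect middleware)
curl -sI http://traefik.company.com | head -n 1
# Inspect the certificate Traefik obtained from Let's Encrypt
echo | openssl s_client -connect traefik.company.com:443 \
  -servername traefik.company.com 2>/dev/null | openssl x509 -noout -issuer -dates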
ShinyProxy Swarm Integration
ShinyProxy Application Configuration
Configure ShinyProxy to work seamlessly with Docker Swarm orchestration:
# shinyproxy-application.yml
proxy:
  title: Enterprise Analytics Platform
  logo-url: file:///opt/shinyproxy/config/logo.png
  landing-page: /

  # Performance and Connection Settings
  heartbeat-rate: 10000
  heartbeat-timeout: 60000
  port: 8080
  bind-address: 0.0.0.0

  # Docker Swarm Backend Configuration
  container-backend: docker-swarm
  docker:
    internal-networking: true

  # Authentication (configure based on your enterprise requirements)
  authentication: openid
  openid:
    auth-url: https://sso.company.com/auth/realms/company/protocol/openid-connect/auth
    token-url: https://sso.company.com/auth/realms/company/protocol/openid-connect/token
    jwks-url: https://sso.company.com/auth/realms/company/protocol/openid-connect/certs
    logout-url: https://sso.company.com/auth/realms/company/protocol/openid-connect/logout
    client-id: shinyproxy-analytics
    client-secret: ${OIDC_CLIENT_SECRET}
    username-attribute: preferred_username
    roles-claim: groups

  # Admin Configuration
  admin-groups: [admin, shiny-administrators]

  # Application Specifications
  specs:
    - id: data-explorer
      display-name: Interactive Data Explorer
      description: Advanced data exploration and visualization platform
      container-image: company-registry/shiny-data-explorer:latest
      container-cmd: ["R", "-e", "shiny::runApp('/srv/shiny-server/app', host='0.0.0.0', port=3838)"]
      container-network: shinyproxy-network
      container-memory: "4GB"
      container-cpu-limit: 2
      container-cpu-reservation: 1
      access-groups: [data-analysts, data-scientists, admin]
      container-env:
        R_MAX_VSIZE: "3GB"
        SHINY_LOG_LEVEL: "INFO"

    - id: financial-dashboard
      display-name: Financial Analytics Dashboard
      description: Real-time financial performance monitoring and reporting
      container-image: company-registry/shiny-financial:latest
      container-cmd: ["R", "-e", "shiny::runApp('/srv/shiny-server/app', host='0.0.0.0', port=3838)"]
      container-network: shinyproxy-network
      container-memory: "6GB"
      container-cpu-limit: 3
      container-cpu-reservation: 2
      access-groups: [finance-team, executives, admin]
      container-env:
        R_MAX_VSIZE: "5GB"
        SHINY_LOG_LEVEL: "INFO"
        DATABASE_URL: "postgresql://findb:5432/analytics"

    - id: ml-modeling-suite
      display-name: Machine Learning Modeling Suite
      description: Advanced machine learning model development and deployment
      container-image: company-registry/shiny-ml-suite:latest
      container-cmd: ["R", "-e", "shiny::runApp('/srv/shiny-server/app', host='0.0.0.0', port=3838)"]
      container-network: shinyproxy-network
      container-memory: "8GB"
      container-cpu-limit: 4
      container-cpu-reservation: 2
      access-groups: [data-scientists, ml-engineers, admin]
      container-env:
        R_MAX_VSIZE: "7GB"
        PYTHON_PATH: "/usr/local/bin/python3"
        SHINY_LOG_LEVEL: "DEBUG"

  # Usage Statistics for Monitoring Integration
  usage-stats-url: http://influxdb:8086/write?db=shinyproxy_usagestats
  usage-stats-username: ${INFLUXDB_USER}
  usage-stats-password: ${INFLUXDB_PASSWORD}

# Server Configuration
server:
  useForwardHeaders: true
  forward-headers-strategy: native

# Logging Configuration
logging:
  level:
    root: INFO
    eu.openanalytics: DEBUG
  file:
    name: /opt/shinyproxy/logs/shinyproxy.log
    max-size: 10MB
    max-history: 30
ShinyProxy Stack Deployment
# shinyproxy-stack.yml
version: '3.8'

services:
  shinyproxy:
    image: openanalytics/shinyproxy:3.0.2
    environment:
      - SPRING_PROFILES_ACTIVE=production
      - OIDC_CLIENT_SECRET=${OIDC_CLIENT_SECRET}
      - INFLUXDB_USER=${INFLUXDB_USER}
      - INFLUXDB_PASSWORD=${INFLUXDB_PASSWORD}
    volumes:
      - ./config/shinyproxy-application.yml:/opt/shinyproxy/application.yml:ro
      - shinyproxy-logs:/opt/shinyproxy/logs
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - traefik-public
      - shinyproxy-network
    deploy:
      mode: replicated
      replicas: 2
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
      resources:
        limits:
          memory: 2GB
          cpus: '2'
        reservations:
          memory: 1GB
          cpus: '1'
      labels:
        # Traefik Configuration
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.constraint-label=traefik-public
        # HTTP to HTTPS Redirect
        - traefik.http.routers.shinyproxy-http.rule=Host(`analytics.company.com`)
        - traefik.http.routers.shinyproxy-http.entrypoints=web
        - traefik.http.routers.shinyproxy-http.middlewares=https-redirect
        # HTTPS Configuration
        - traefik.http.routers.shinyproxy-https.rule=Host(`analytics.company.com`)
        - traefik.http.routers.shinyproxy-https.entrypoints=websecure
        - traefik.http.routers.shinyproxy-https.tls=true
        - traefik.http.routers.shinyproxy-https.tls.certresolver=letsencrypt
        - traefik.http.routers.shinyproxy-https.middlewares=security-headers
        # Load Balancer Configuration
        - traefik.http.services.shinyproxy.loadbalancer.server.port=8080
        - traefik.http.services.shinyproxy.loadbalancer.sticky.cookie=true
        - traefik.http.services.shinyproxy.loadbalancer.sticky.cookie.name=shinyproxy-server
        - traefik.http.services.shinyproxy.loadbalancer.healthcheck.path=/actuator/health
        - traefik.http.services.shinyproxy.loadbalancer.healthcheck.interval=30s
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/actuator/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

volumes:
  shinyproxy-logs:
    external: false

networks:
  traefik-public:
    external: true
  shinyproxy-network:
    external: true
Deploy ShinyProxy Stack:
# Set required environment variables
export OIDC_CLIENT_SECRET="your-client-secret"
export INFLUXDB_USER="shinyproxy"
export INFLUXDB_PASSWORD="secure-password"
# Deploy ShinyProxy stack
docker stack deploy -c shinyproxy-stack.yml shinyproxy
# Monitor deployment
docker stack services shinyproxy
docker service logs shinyproxy_shinyproxy
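After the stack converges, probe the same health endpoint that both Traefik and the Swarm healthcheck rely on. A minimal check, assuming analytics.company.com resolves to the cluster:

# ShinyProxy's Spring Boot actuator endpoint should report {"status":"UP"}
curl -s https://analytics.company.com/actuator/health
# Confirm Traefik is issuing the sticky-session cookie configured above
curl -sI https://analytics.company.com | grep -i set-cookie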
Comprehensive Monitoring Stack
InfluxDB and Grafana Deployment
# monitoring-stack.yml
version: '3.8'

services:
  influxdb:
    image: influxdb:1.8
    environment:
      - INFLUXDB_DB=shinyproxy_usagestats
      - INFLUXDB_ADMIN_USER=${INFLUXDB_ADMIN_USER}
      - INFLUXDB_ADMIN_PASSWORD=${INFLUXDB_ADMIN_PASSWORD}
      - INFLUXDB_USER=${INFLUXDB_USER}
      - INFLUXDB_USER_PASSWORD=${INFLUXDB_PASSWORD}
      - INFLUXDB_HTTP_AUTH_ENABLED=true
    volumes:
      - influxdb-data:/var/lib/influxdb
    networks:
      - traefik-public
      - shinyproxy-network
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
          - node.labels.monitoring.influxdb-data == true
      resources:
        limits:
          memory: 2GB
          cpus: '2'
        reservations:
          memory: 1GB
          cpus: '1'
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.http.routers.influxdb.rule=Host(`influxdb.company.com`)
        - traefik.http.routers.influxdb.entrypoints=websecure
        - traefik.http.routers.influxdb.tls=true
        - traefik.http.routers.influxdb.tls.certresolver=letsencrypt
        - traefik.http.routers.influxdb.middlewares=admin-auth,security-headers
        - traefik.http.services.influxdb.loadbalancer.server.port=8086

  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_USER=${GRAFANA_ADMIN_USER}
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_INSTALL_PLUGINS=grafana-clock-panel,grafana-simple-json-datasource
      - GF_SERVER_ROOT_URL=https://grafana.company.com
    volumes:
      - grafana-data:/var/lib/grafana
      - ./config/grafana/provisioning:/etc/grafana/provisioning:ro
      - ./config/grafana/dashboards:/var/lib/grafana/dashboards:ro
    networks:
      - traefik-public
      - shinyproxy-network
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
          - node.labels.monitoring.grafana-data == true
      resources:
        limits:
          memory: 1GB
          cpus: '1'
        reservations:
          memory: 512MB
          cpus: '0.5'
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.http.routers.grafana-http.rule=Host(`grafana.company.com`)
        - traefik.http.routers.grafana-http.entrypoints=web
        - traefik.http.routers.grafana-http.middlewares=https-redirect
        - traefik.http.routers.grafana-https.rule=Host(`grafana.company.com`)
        - traefik.http.routers.grafana-https.entrypoints=websecure
        - traefik.http.routers.grafana-https.tls=true
        - traefik.http.routers.grafana-https.tls.certresolver=letsencrypt
        - traefik.http.routers.grafana-https.middlewares=security-headers
        - traefik.http.services.grafana.loadbalancer.server.port=3000
    # Note: depends_on is ignored by `docker stack deploy`; Grafana simply
    # retries until InfluxDB is reachable
    depends_on:
      - influxdb

  prometheus:
    image: prom/prometheus:latest
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=30d'
      - '--web.enable-lifecycle'
    volumes:
      - ./config/prometheus:/etc/prometheus:ro
      - prometheus-data:/prometheus
    networks:
      - traefik-public
      - shinyproxy-network
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          memory: 2GB
          cpus: '2'
        reservations:
          memory: 1GB
          cpus: '1'
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.http.routers.prometheus.rule=Host(`prometheus.company.com`)
        - traefik.http.routers.prometheus.entrypoints=websecure
        - traefik.http.routers.prometheus.tls=true
        - traefik.http.routers.prometheus.tls.certresolver=letsencrypt
        - traefik.http.routers.prometheus.middlewares=admin-auth,security-headers
        - traefik.http.services.prometheus.loadbalancer.server.port=9090

volumes:
  influxdb-data:
    external: false
  grafana-data:
    external: false
  prometheus-data:
    external: false

networks:
  traefik-public:
    external: true
  shinyproxy-network:
    external: true
Swarmpit Cluster Management
# swarmpit-stack.yml
version: '3.8'

services:
  app:
    image: swarmpit/swarmpit:latest
    environment:
      - SWARMPIT_DB=http://db:5984
      - SWARMPIT_INFLUXDB=http://influxdb:8086
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - traefik-public
      - swarmpit-net
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          memory: 1GB
          cpus: '1'
        reservations:
          memory: 512MB
          cpus: '0.5'
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.http.routers.swarmpit-http.rule=Host(`swarmpit.company.com`)
        - traefik.http.routers.swarmpit-http.entrypoints=web
        - traefik.http.routers.swarmpit-http.middlewares=https-redirect
        - traefik.http.routers.swarmpit-https.rule=Host(`swarmpit.company.com`)
        - traefik.http.routers.swarmpit-https.entrypoints=websecure
        - traefik.http.routers.swarmpit-https.tls=true
        - traefik.http.routers.swarmpit-https.tls.certresolver=letsencrypt
        - traefik.http.routers.swarmpit-https.middlewares=security-headers
        - traefik.http.services.swarmpit.loadbalancer.server.port=8080

  db:
    image: couchdb:2.3.0
    volumes:
      - swarmpit-db-data:/opt/couchdb/data
    networks:
      - swarmpit-net
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          memory: 512MB
          cpus: '0.5'
        reservations:
          memory: 256MB
          cpus: '0.25'

  agent:
    image: swarmpit/agent:latest
    environment:
      - DOCKER_API_VERSION=1.35
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - swarmpit-net
    deploy:
      mode: global
      resources:
        limits:
          memory: 64MB
          cpus: '0.1'
        reservations:
          memory: 32MB
          cpus: '0.05'

volumes:
  swarmpit-db-data:
    external: false

networks:
  traefik-public:
    external: true
  swarmpit-net:
    external: false
Deploy Monitoring Stack:
# Set monitoring environment variables
export INFLUXDB_ADMIN_USER="admin"
export INFLUXDB_ADMIN_PASSWORD="secure-admin-password"
export INFLUXDB_USER="shinyproxy"
export INFLUXDB_PASSWORD="secure-user-password"
export GRAFANA_ADMIN_USER="admin"
export GRAFANA_ADMIN_PASSWORD="secure-grafana-password"
# Deploy monitoring stack
docker stack deploy -c monitoring-stack.yml monitoring
# Deploy Swarmpit
docker stack deploy -c swarmpit-stack.yml swarmpit
# Verify deployments
docker stack services monitoring
docker stack services swarmpit
Grafana Dashboard Configuration
Create comprehensive dashboards for monitoring the entire stack:
{
  "dashboard": {
    "title": "ShinyProxy Enterprise Analytics Platform",
    "panels": [
      {
        "title": "Active ShinyProxy Instances",
        "type": "stat",
        "targets": [
          {"query": "SELECT count(distinct(\"service\")) FROM \"shinyproxy_instances\" WHERE time >= now() - 5m"}
        ]
      },
      {
        "title": "Active Shiny Applications",
        "type": "stat",
        "targets": [
          {"query": "SELECT sum(\"active_containers\") FROM \"shinyproxy_apps\" WHERE time >= now() - 1m"}
        ]
      },
      {
        "title": "User Sessions Over Time",
        "type": "graph",
        "targets": [
          {"query": "SELECT mean(\"concurrent_users\") FROM \"shinyproxy_usage\" WHERE time >= now() - 1h GROUP BY time(5m)"}
        ]
      },
      {
        "title": "Docker Swarm Node Status",
        "type": "table",
        "targets": [
          {"query": "SELECT last(\"node_status\"), last(\"availability\") FROM \"swarm_nodes\" GROUP BY \"node_name\""}
        ]
      },
      {
        "title": "Container Resource Usage",
        "type": "graph",
        "targets": [
          {"query": "SELECT mean(\"cpu_percent\") FROM \"container_stats\" WHERE time >= now() - 1h GROUP BY time(5m), \"container_name\""},
          {"query": "SELECT mean(\"memory_usage\") FROM \"container_stats\" WHERE time >= now() - 1h GROUP BY time(5m), \"container_name\""}
        ]
      },
      {
        "title": "Traefik Request Rate",
        "type": "graph",
        "targets": [
          {"query": "SELECT derivative(mean(\"traefik_requests_total\"), 1s) FROM \"traefik_metrics\" WHERE time >= now() - 1h GROUP BY time(1m)"}
        ]
      }
    ]
  }
}
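The dashboard above assumes Grafana can already reach InfluxDB. The provisioning directory mounted in the monitoring stack (./config/grafana/provisioning) can supply that datasource declaratively. A sketch, assuming the InfluxDB credentials exported earlier; the file path is a conventional choice, not mandated by the stack:

# config/grafana/provisioning/datasources/influxdb.yml (sketch)
apiVersion: 1
datasources:
  - name: InfluxDB-ShinyProxy
    type: influxdb
    access: proxy
    # Depending on your stack naming, the DNS name may need to be the
    # stack-qualified service name, e.g. monitoring_influxdb
    url: http://influxdb:8086
    database: shinyproxy_usagestats
    user: shinyproxy
    secureJsonData:
      password: secure-user-password   # substitute your ${INFLUXDB_PASSWORD}
    isDefault: true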
Production Scaling Strategies
Horizontal Scaling Configuration
Dynamic Node Addition:
# Add new worker nodes dynamically
# On new worker node
docker swarm join --token SWMTKN-1-xxx-xxx manager-ip:2377
# Label new node for workloads
docker node update --label-add workload.shiny-apps=true <new-node-id>
# Scale ShinyProxy instances
docker service update --replicas 3 shinyproxy_shinyproxy
# Verify scaling
docker service ps shinyproxy_shinyproxy
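Swarm has no built-in autoscaler, so replica counts are usually adjusted by an external loop. A minimal cron-able sketch, assuming the shinyproxy_shinyproxy service name from this guide; the threshold and replica ceiling are illustrative values to tune:

#!/bin/bash
# scale-on-cpu.sh - naive scale-up helper (sketch; run on a manager node)
SERVICE="shinyproxy_shinyproxy"
MAX_REPLICAS=4
THRESHOLD=80  # percent CPU

# Average CPU of this service's tasks; note `docker stats` only sees
# containers on the local node, so run this where the tasks are scheduled
avg=$(docker stats --no-stream --format '{{.Name}} {{.CPUPerc}}' \
  | awk -v svc="$SERVICE" '$1 ~ svc {gsub("%","",$2); s+=$2; n++} END {if (n) print int(s/n); else print 0}')

current=$(docker service inspect "$SERVICE" --format '{{.Spec.Mode.Replicated.Replicas}}')

if [ "$avg" -gt "$THRESHOLD" ] && [ "$current" -lt "$MAX_REPLICAS" ]; then
  docker service scale "$SERVICE=$((current + 1))"
fi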
Application Scaling Policies:
# Resource-based scaling configuration
proxy:
  specs:
    - id: high-demand-app
      container-memory: "2GB"
      container-cpu-limit: 2
      # Auto-scaling parameters
      container-env:
        SHINY_WORKER_PROCESSES: "4"
        R_MAX_VSIZE: "1.5GB"
      # Load balancing hints
      container-labels:
        traefik.backend.loadbalancer.stickiness: "true"
        traefik.backend.loadbalancer.method: "wrr"
Performance Optimization
Container Image Optimization:
# Multi-stage optimized Dockerfile for Shiny applications
FROM rocker/r-base:4.3.2 as builder
# Install build dependencies
RUN apt-get update && apt-get install -y \
\
libcurl4-gnutls-dev \
libssl-dev \
libxml2-dev
build-essential
# Pre-compile R packages
RUN R -e "install.packages(c('shiny', 'DT', 'plotly', 'dplyr'), repos='https://cran.rstudio.com/')"
# Production stage
FROM rocker/shiny:4.3.2
# Copy pre-compiled packages
COPY --from=builder /usr/local/lib/R/site-library /usr/local/lib/R/site-library
# Install only runtime dependencies
RUN apt-get update && apt-get install -y \
\
libcurl4-gnutls-dev \
libssl-dev && rm -rf /var/lib/apt/lists/*
# Application files
COPY ./app /srv/shiny-server/app/
# Optimization settings
RUN echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'))" >> /usr/local/lib/R/etc/Rprofile.site && \
echo "options(download.file.method = 'libcurl')" >> /usr/local/lib/R/etc/Rprofile.site
USER shiny
EXPOSE 3838
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3838/ || exit 1
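Because Swarm schedules app containers on any worker, the image must live in a registry every node can pull from. A usage sketch, reusing the company-registry name from this guide:

# Build, tag, and push so all worker nodes can pull the image
docker build -t company-registry/shiny-data-explorer:latest .
docker push company-registry/shiny-data-explorer:latest

# For private registries, forward credentials to the Swarm agents at deploy time
docker stack deploy --with-registry-auth -c shinyproxy-stack.yml shinyproxy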
Resource Monitoring and Alerting:
# Prometheus alerting rules
groups:
  - name: shinyproxy-alerts
    rules:
      - alert: ShinyProxyDown
        expr: up{job="shinyproxy"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "ShinyProxy instance is down"

      - alert: HighMemoryUsage
        expr: container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container memory usage is above 80%"

      - alert: SwarmNodeDown
        expr: swarm_node_info{state="down"} == 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Docker Swarm node is down"
Security Hardening and Compliance
Network Security Configuration
# Configure firewall rules for Swarm cluster
# Manager node
sudo ufw --force enable
sudo ufw default deny incoming
sudo ufw default allow outgoing
# Allow SSH
sudo ufw allow 22/tcp
# Allow Swarm ports
sudo ufw allow 2377/tcp # Cluster management
sudo ufw allow 7946 # Node communication
sudo ufw allow 4789/udp # Overlay network
# Allow HTTP/HTTPS
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
# Worker node configuration (similar but without 2377)
sudo ufw allow 7946 # Node communication
sudo ufw allow 4789/udp # Overlay network
Container Security Best Practices
# Enhanced security configuration
proxy:
  specs:
    - id: secure-app
      # Security constraints
      container-privileged: false
      container-network: shinyproxy-network
      # Resource limits (prevent resource exhaustion attacks)
      container-memory: "2GB"
      container-cpu-limit: 2
      # Security options
      container-security-opts:
        - no-new-privileges
        - apparmor:docker-default
      # Read-only root filesystem
      container-read-only: true
      # Temporary filesystem mounts
      container-tmpfs:
        - /tmp:rw,size=100m
        - /var/tmp:rw,size=50m
      # Environment security
      container-env:
        R_ENVIRON_USER: "/tmp/.Renviron"
        HOME: "/tmp"
      # User configuration
      container-user: "shiny"
SSL/TLS Security Enhancement
# Enhanced Traefik TLS configuration (traefik-stack.yml excerpt)
services:
  traefik:
    command:
      # ... other commands ...
      # Enhanced TLS settings
      - --entrypoints.websecure.http.tls.options=modern@file
      - --providers.file.filename=/etc/traefik/tls-config.yml
    volumes:
      # ... other volumes ...
      - ./config/traefik/tls-config.yml:/etc/traefik/tls-config.yml:ro

# tls-config.yml
tls:
  options:
    modern:
      minVersion: "VersionTLS12"
      maxVersion: "VersionTLS13"
      sniStrict: true
      cipherSuites:
        - "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
        - "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"
        - "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
      curvePreferences:
        - CurveP521
        - CurveP384
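You can verify the hardened policy actually took effect from outside the cluster. A quick check with openssl, assuming the production domain:

# Should succeed: TLS 1.2 is allowed by minVersion
echo | openssl s_client -connect analytics.company.com:443 -tls1_2 2>/dev/null \
  | grep -E 'Protocol|Cipher'
# Should fail: TLS 1.1 is below minVersion
echo | openssl s_client -connect analytics.company.com:443 -tls1_1 2>&1 \
  | grep -E 'alert|error' | head -n 1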
Troubleshooting and Maintenance
Common Issues and Solutions
Issue 1: Service Discovery Problems
Problem: ShinyProxy instances not appearing in Traefik dashboard or load balancing not working.
Solution:
# Check service registration
docker service ls
docker service inspect shinyproxy_shinyproxy
# Verify network connectivity
docker network ls
docker network inspect traefik-public
# Check Traefik logs
docker service logs traefik_traefik
# Restart services if needed
docker service update --force shinyproxy_shinyproxy
Issue 2: Container Startup Failures in Swarm
Problem: Shiny application containers fail to start on worker nodes.
Solution:
# Check node availability
docker node ls
# Verify image availability on worker nodes
docker service ps shinyproxy_shinyproxy --no-trunc
# Check node state and advertised resources
docker node inspect <worker-node-id> --format '{{.Status.State}}'
docker node inspect <worker-node-id> --format '{{.Description.Resources}}'
# Update resource allocations
docker service update --limit-memory 4GB --limit-cpu 2 shinyproxy_shinyproxy
Issue 3: SSL Certificate Issues
Problem: Let’s Encrypt certificates not renewing or HTTPS not working.
Solution:
# Check certificate status
docker exec $(docker ps -q -f name=traefik) cat /letsencrypt/acme.json
# Verify domain DNS resolution
nslookup analytics.company.com
# Check Traefik certificate resolver logs
docker service logs traefik_traefik | grep -i acme
# Force certificate refresh
docker service update --force traefik_traefik
Health Monitoring and Alerting
# Comprehensive health checking configuration
version: '3.8'
services:
  healthcheck-monitor:
    image: prom/blackbox-exporter:latest
    command:
      - '--config.file=/config/blackbox.yml'
    volumes:
      - ./config/blackbox.yml:/config/blackbox.yml:ro
    networks:
      - traefik-public
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager

# blackbox.yml - External monitoring configuration
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [200, 301, 302]
      method: GET
  shinyproxy_health:
    prober: http
    timeout: 10s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [200]
      method: GET
      basic_auth:
        username: "health-check"
        password: "secure-password"
Backup and Disaster Recovery
#!/bin/bash
# backup-stack.sh - Complete stack backup script
# Backup Docker Swarm configuration
docker node ls --format "table {{.ID}}\t{{.Hostname}}\t{{.Status}}\t{{.Availability}}" > swarm-nodes-backup.txt
# Backup stack configurations
mkdir -p backups/$(date +%Y%m%d)
cp -r config/ backups/$(date +%Y%m%d)/
cp *.yml backups/$(date +%Y%m%d)/
# Backup persistent volumes
docker run --rm -v traefik-public-certificates:/data -v $(pwd)/backups/$(date +%Y%m%d):/backup alpine tar czf /backup/traefik-certificates.tar.gz /data
docker run --rm -v grafana-data:/data -v $(pwd)/backups/$(date +%Y%m%d):/backup alpine tar czf /backup/grafana-data.tar.gz /data
docker run --rm -v influxdb-data:/data -v $(pwd)/backups/$(date +%Y%m%d):/backup alpine tar czf /backup/influxdb-data.tar.gz /data
# Upload to cloud storage (example with AWS S3)
aws s3 sync backups/$(date +%Y%m%d)/ s3://company-shinyproxy-backups/$(date +%Y%m%d)/
echo "Backup completed: $(date)"
Performance Monitoring and Optimization
Resource Usage Analytics
# Monitor cluster resource usage
docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}"
# Check service resource allocation
docker service inspect shinyproxy_shinyproxy --format '{{.Spec.TaskTemplate.Resources}}'
# Monitor node resource utilization
docker system df
docker system events --filter type=container --filter event=start
Performance Tuning Guidelines
Container Resource Optimization:
# Optimized resource allocation based on application profiles
proxy:
  specs:
    # Lightweight dashboard applications
    - id: executive-dashboard
      container-memory: "1GB"
      container-cpu-limit: 1
      container-cpu-reservation: 0.5

    # Memory-intensive analytics applications
    - id: big-data-analytics
      container-memory: "8GB"
      container-cpu-limit: 4
      container-cpu-reservation: 2
      container-env:
        R_MAX_VSIZE: "7GB"
        JAVA_OPTS: "-Xmx6g -XX:+UseG1GC"

    # High-concurrency applications
    - id: public-reporting
      container-memory: "2GB"
      container-cpu-limit: 2
      container-env:
        SHINY_WORKER_PROCESSES: "4"
        R_MAX_VSIZE: "1.5GB"
Common Questions About Enterprise Orchestration
Should you choose Docker Swarm or Kubernetes for enterprise Shiny deployments?
Docker Swarm provides roughly 80% of Kubernetes functionality with significantly reduced complexity, making it ideal for most enterprise Shiny deployments:
Docker Swarm Advantages:
- Simpler Setup: Native Docker integration requires minimal additional learning
- Resource Efficiency: Lower overhead compared to Kubernetes control plane
- Faster Deployment: Hours to deploy vs. days for Kubernetes setup
- Easier Maintenance: Fewer moving parts and components to manage
- Cost Effective: No specialized Kubernetes expertise required
When to Choose Kubernetes:
- Large Scale: 100+ nodes or complex multi-cloud deployments
- Advanced Features: Need for custom resource definitions, operators, or complex networking
- Existing Infrastructure: Organization already invested in Kubernetes ecosystem
- Compliance Requirements: Specific governance or policy management needs
For Most Organizations: Docker Swarm provides the optimal balance of features, complexity, and cost for Shiny application orchestration. The integrated approach with ShinyProxy, Traefik, and monitoring tools creates a production-ready platform without Kubernetes overhead.
What does this stack cost to operate?
The enterprise orchestration stack provides excellent cost efficiency compared to managed cloud services:
Minimum Production Setup:
- Manager Node: AWS t3.large (2 vCPU, 8GB RAM) - ~$60/month
- Worker Nodes: 2x AWS t3.medium (2 vCPU, 4GB RAM each) - ~$60/month total
- Storage: 200GB SSD across nodes - ~$20/month
- Total Infrastructure: ~$140/month
Scaling Considerations (indicative AWS costs per user load):
- 10 concurrent users: manager t3.large, 1x t3.medium worker (roughly $90/month)
- 50 concurrent users: manager t3.xlarge, 2x t3.large workers (roughly $200/month)
- 100 concurrent users: manager t3.xlarge, 3x t3.xlarge workers (roughly $400/month)
Cost Comparison:
- AWS ECS/Fargate: 2-3x more expensive for equivalent resources
- Managed Kubernetes: 3-4x more expensive including management overhead
- RStudio Connect: $12,000+/year licensing plus infrastructure
- Enterprise Shiny Solutions: $50,000+/year for comparable features
Hidden Cost Savings:
- No licensing fees for orchestration components
- Reduced operational overhead vs. complex cloud services
- Predictable scaling costs without vendor lock-in
- Reusable infrastructure for other containerized applications
How are SSL certificates managed across environments?
The Traefik integration provides automated SSL management that scales across development, staging, and production environments:
Automated Let’s Encrypt Integration:
# Production SSL configuration
command:
- "--certificatesresolvers.letsencrypt.acme.tlschallenge=true"
- "--certificatesresolvers.letsencrypt.acme.email=admin@company.com"
- "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
Multi-Environment Domain Strategy:
# Development environment
dev-analytics.company.com -> Development cluster
# Staging environment
staging-analytics.company.com -> Staging cluster
# Production environment
analytics.company.com -> Production cluster
Certificate Management Best Practices:
- Wildcard Certificates: Use *.company.com for multiple subdomains
- Staging Certificates: Test with Let's Encrypt staging environment first
- Certificate Monitoring: Set up alerts for certificate expiration (see the check below)
- Backup Strategy: Regular backups of certificate data volumes
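A lightweight expiry check you can wire into your alerting; a sketch, assuming the production domain:

# Print the expiry date of the certificate currently served
echo | openssl s_client -connect analytics.company.com:443 \
  -servername analytics.company.com 2>/dev/null \
  | openssl x509 -noout -enddate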
DNS Configuration:
# DNS records for complete setup
analytics.company.com A production-cluster-ip
staging-analytics.company.com A staging-cluster-ip
dev-analytics.company.com A development-cluster-ip
traefik.company.com A production-cluster-ip
grafana.company.com A production-cluster-ip
swarmpit.company.com A production-cluster-ip
Advanced SSL Features:
- HTTP to HTTPS automatic redirection
- HSTS security headers implementation
- Modern TLS configuration with secure cipher suites
- Certificate transparency logging integration
How should you approach backup and disaster recovery?
Comprehensive backup and disaster recovery requires protecting both configuration and data across all stack components:
Critical Backup Components:
# 1. Docker Swarm cluster state
docker node ls --format json > cluster-state-backup.json
docker service ls --format json > services-backup.json
# 2. Persistent volume data
docker run --rm -v traefik-public-certificates:/data -v $(pwd)/backup:/backup alpine \
  tar czf /backup/traefik-certs-$(date +%Y%m%d).tar.gz /data
# 3. Configuration files
tar czf config-backup-$(date +%Y%m%d).tar.gz *.yml config/
# 4. Application container images
docker save company-registry/shiny-app:latest | gzip > shiny-app-backup.tar.gz
Automated Backup Strategy:
# backup-service.yml
version: '3.8'
services:
  backup-agent:
    image: alpine:latest
    # Note: the stock alpine image does not ship the aws CLI; bake it into a
    # custom image or substitute a tool available in your environment
    command: |
      sh -c "
      while true; do
        # Daily backup routine ($$ keeps compose from interpolating the date)
        tar czf /backup/config-$$(date +%Y%m%d).tar.gz /config
        aws s3 sync /backup s3://company-disaster-recovery/shinyproxy/
        sleep 86400  # 24 hours
      done"
    volumes:
      - ./config:/config:ro
      - backup-storage:/backup
    deploy:
      placement:
        constraints:
          - node.role == manager

volumes:
  backup-storage:
Disaster Recovery Procedures:
Complete Cluster Recovery:
# 1. Rebuild Swarm cluster
docker swarm init --advertise-addr <new-manager-ip>
# 2. Restore configuration
aws s3 sync s3://company-disaster-recovery/shinyproxy/latest/ ./restore/
tar xzf restore/config-backup.tar.gz
# 3. Restore persistent data
docker volume create traefik-public-certificates
docker run --rm -v traefik-public-certificates:/data -v $(pwd)/restore:/backup alpine \
  tar xzf /backup/traefik-certs-backup.tar.gz -C /
# 4. Redeploy services
docker stack deploy -c traefik-stack.yml traefik
docker stack deploy -c shinyproxy-stack.yml shinyproxy
docker stack deploy -c monitoring-stack.yml monitoring
Testing Recovery Procedures:
# Quarterly disaster recovery testing
./scripts/backup-stack.sh
./scripts/simulate-disaster.sh # Controlled failure simulation
./scripts/restore-stack.sh # Recovery procedure validation
./scripts/verify-functionality.sh # End-to-end testing
Recovery Time Objectives:
- Configuration Restore: < 30 minutes
- Service Recovery: < 60 minutes
- Full Functionality: < 2 hours
- Data Loss Tolerance: < 24 hours (with daily backups)
Test Your Understanding
What are the key benefits of integrating Docker Swarm with ShinyProxy compared to running ShinyProxy on a single Docker host?
A) Docker Swarm provides better container isolation than single-host Docker
B) Swarm enables high availability, load distribution, and rolling updates across multiple nodes
C) ShinyProxy requires Docker Swarm for proper authentication integration
D) Container startup times are faster in Swarm mode
- Think about what happens when your single Docker host fails
- Consider how traffic and workload can be distributed across multiple machines
- Remember the deployment and update capabilities discussed
B) Swarm enables high availability, load distribution, and rolling updates across multiple nodes
Key advantages of Docker Swarm integration:
High Availability:
- Multiple ShinyProxy instances across different nodes prevent single points of failure
- Automatic failover when nodes or services become unavailable
- Service redundancy ensures continuous application availability

Load Distribution:
- Intelligent workload distribution across available nodes
- Dynamic container placement based on resource availability
- Better resource utilization across the cluster

Rolling Updates:
- Zero-downtime deployments with gradual service updates
- Automatic rollback capabilities if updates fail
- Controlled update strategies (parallel, sequential)

Operational Benefits:
- Centralized cluster management and monitoring
- Automated service recovery and healing
- Simplified scaling operations across multiple nodes
While container isolation remains the same, the orchestration benefits make Swarm essential for enterprise production environments.
Complete this Traefik service configuration for ShinyProxy load balancing:
labels:
  - traefik.enable=true
  - traefik.docker.network=traefik-public
  - traefik.http.routers.shinyproxy-https.rule=Host(`analytics.company.com`)
  - traefik.http.routers.shinyproxy-https.entrypoints=websecure
  - traefik.http.routers.shinyproxy-https.tls.certresolver=letsencrypt
  - traefik.http.services.shinyproxy.loadbalancer.server.port=8080
  - traefik.http.services.shinyproxy.loadbalancer.______=true
What configuration is needed in the blank to ensure WebSocket connections work properly?
- Think about how Shiny applications maintain state between client and server
- Consider what happens when load balancer sends requests to different backend instances
- Remember that Shiny uses WebSocket connections for reactive updates
- traefik.http.services.shinyproxy.loadbalancer.sticky.cookie=true
Why sticky sessions are essential:
WebSocket Connection Requirements:
- Shiny applications use WebSocket connections for real-time reactive updates
- WebSocket connections must remain with the same backend server throughout the session
- Load balancing without stickiness would break these persistent connections

Session State Management:
- ShinyProxy maintains user session state in memory on specific instances
- Routing users to different instances would lose their application state
- Sticky cookies ensure users always return to their assigned ShinyProxy instance
Complete Sticky Configuration:
- traefik.http.services.shinyproxy.loadbalancer.sticky.cookie=true
- traefik.http.services.shinyproxy.loadbalancer.sticky.cookie.name=shinyproxy-server
- traefik.http.services.shinyproxy.loadbalancer.sticky.cookie.secure=true
This ensures reliable user experience while maintaining load balancing benefits for new sessions.
Your organization needs to scale the enterprise Shiny stack to handle 200 concurrent users with high availability requirements. Which scaling approach would you recommend?
A) Scale vertically by adding more CPU and memory to existing nodes
B) Add more ShinyProxy replicas on the same nodes with higher resource limits
C) Add worker nodes and distribute ShinyProxy instances with monitoring-based auto-scaling
D) Migrate to a managed Kubernetes service for better scaling capabilities
- Consider the resource requirements per concurrent user
- Think about fault tolerance and high availability requirements
- Remember the cost and complexity implications discussed
C) Add worker nodes and distribute ShinyProxy instances with monitoring-based auto-scaling
Why this is the optimal approach:
Scalability Architecture:
# Recommended configuration for 200 concurrent users
# Manager Node: 1x t3.xlarge (4 vCPU, 16GB RAM)
# Worker Nodes: 4x t3.large (2 vCPU, 8GB RAM each)
# ShinyProxy Replicas: 3-4 instances across nodes
High Availability Benefits:
- Node Redundancy: Failure of any single node doesn't impact service
- Geographic Distribution: Nodes can be in different availability zones
- Load Distribution: Even resource utilization across the cluster
- Fault Isolation: Problems on one node don't cascade
Monitoring-Based Scaling:
# Scaling triggers based on metrics
CPU_THRESHOLD="80%" # Scale up when average CPU > 80%
MEMORY_THRESHOLD="85%" # Scale up when memory > 85%
RESPONSE_TIME_THRESHOLD="2s" # Scale up when response time > 2s
Resource Optimization:
- Container-per-user: Each user gets dedicated 200MB-400MB
- Dynamic Allocation: Resources scale with actual usage
- Cost Efficiency: Pay only for used resources vs. over-provisioned vertical scaling
Why Not Other Options:
- Option A: Single points of failure, limited scalability ceiling
- Option B: Resource contention, no fault tolerance improvement
- Option D: Unnecessary complexity and 3x cost increase for equivalent functionality
The horizontal scaling approach provides optimal balance of performance, reliability, and cost-effectiveness.
Conclusion
The enterprise orchestration stack combining Docker Swarm, ShinyProxy, Traefik, and comprehensive monitoring represents the evolution of production Shiny deployment. By integrating these components into a cohesive platform, organizations can achieve enterprise-grade reliability, security, and scalability while maintaining the simplicity that makes R-based analytics accessible to data science teams.
This integrated approach eliminates the operational complexity of managing disparate deployment tools while providing automated SSL certificate management, intelligent load balancing, rolling updates with zero downtime, and comprehensive observability. The result is a production-ready analytics platform that scales from departmental tools to organization-wide analytical infrastructure without the complexity and cost overhead of managed cloud services or enterprise software licenses.
The Docker Swarm foundation provides 80% of Kubernetes functionality with significantly reduced operational overhead, making it ideal for organizations that need enterprise features without enterprise complexity. Combined with Traefik’s modern reverse proxy capabilities and ShinyProxy’s multi-tenant container orchestration, this stack delivers a robust, cost-effective solution for scaling analytical applications.
Whether you’re building internal dashboards for executive teams, deploying client-facing analytics for external stakeholders, or creating comprehensive analytical platforms that serve hundreds of concurrent users, this enterprise orchestration approach provides the foundation for reliable, secure, and maintainable Shiny application deployment that grows with organizational needs.
Next Steps
Based on what you’ve learned in this comprehensive guide, here are the recommended paths for implementing and mastering enterprise Shiny orchestration:
Immediate Next Steps (Complete These First)
- ShinyProxy Enterprise Deployment - Master ShinyProxy fundamentals before implementing the complete orchestration stack
- Docker Containerization for Shiny - Optimize your container images for production deployment and performance
- Practice Exercise: Set up a three-node test cluster and deploy the complete stack with monitoring to validate your understanding
Building on Your Foundation (Choose Your Path)
For Infrastructure Focus:
- Production Deployment Overview - Implement the orchestration stack on AWS, GCP, or Azure with cloud-native features
- Scaling and Long-term Maintenance - Advanced optimization techniques for high-load environments
For Operations Focus:
- Production Deployment and Monitoring - Implement comprehensive observability and alerting for production operations
- Security Best Practices - Harden your deployment for enterprise security requirements
For Advanced Integration:
- Enterprise Development Overview - When and how to migrate from Docker Swarm to Kubernetes orchestration
- User Authentication and Security - Advanced integration patterns with organizational identity systems
Long-term Goals (2-4 Weeks)
- Deploy a production enterprise orchestration stack serving multiple business units
- Implement automated CI/CD pipelines for application and infrastructure deployment
- Create organizational standards and templates for Shiny application deployment
- Establish monitoring, alerting, and incident response procedures for production operations