Infrastructure as Code with Ansible⚓︎
Overview⚓︎
This article provides a complete guide to implementing Infrastructure as Code (IaC) using Ansible to manage Ubuntu servers from a Synology NAS. The solution enables consistent, version-controlled server provisioning and ongoing management, including automated maintenance, Docker container orchestration, security hardening, and disaster recovery.
Architecture⚓︎
Control Machine⚓︎
- Platform: Synology NAS
- Ansible Runtime: Docker container (`cytopia/ansible:latest`)
- Project Directory: `/volume2/ansible-infrastructure`
- SSH Key Location: `/volume2/ansible-infrastructure/ssh-keys`
Target Servers⚓︎
- Operating System: Ubuntu 24.04 LTS
- Authentication: SSH key-based
- Privilege Escalation: `sudo` with password (`--ask-become-pass`)
Data Protection⚓︎
- Infrastructure State: Fully managed via Ansible, reproducible from code
- Application Data: Backed up with custom scripts that preserve all Docker volumes
- Recovery Strategy: Rebuild with Ansible + restore from backup
Project Structure⚓︎
```
/volume2/ansible-infrastructure/
├── ansible.cfg                      # Ansible configuration
├── inventory/
│   ├── xenlab-local.yml             # Local server inventory
│   └── portable-vps.yml             # VPS inventory
├── playbooks/
│   ├── discover-server.yml          # System discovery
│   ├── local-infrastructure.yml     # Local server deployment
│   ├── vps-infrastructure.yml       # VPS deployment
│   ├── verify-base-system.yml       # Base system verification
│   ├── verify-docker.yml            # Docker role verification
│   ├── verify-scripts.yml           # Scripts role verification
│   └── verify-cron.yml              # Cron role verification
├── roles/
│   ├── xenlab-local/                # Consolidated local server role
│   │   ├── defaults/main.yml        # Role defaults
│   │   ├── tasks/main.yml           # All tasks
│   │   └── templates/               # Configuration templates
│   └── vps-applications/            # Consolidated VPS role
│       ├── defaults/main.yml        # Role defaults
│       ├── tasks/main.yml           # All tasks
│       └── templates/               # Configuration templates
├── group_vars/
│   ├── local_servers.yml            # Local server group variables
│   └── vps_servers.yml              # VPS server group variables
└── ssh-keys/
    ├── ansible_key                  # Private SSH key
    └── ansible_key.pub              # Public SSH key
```
Step-by-Step Implementation⚓︎
1. Initial Setup⚓︎
Create Project Structure⚓︎
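The layout shown above can be bootstrapped with a few `mkdir` calls. A minimal sketch (paths are relative here so it can be tried anywhere; on the NAS, run it from `/volume2`):

```shell
# Sketch of the project skeleton; on the Synology NAS run this from /volume2
# so BASE becomes /volume2/ansible-infrastructure.
BASE=ansible-infrastructure
mkdir -p "$BASE"/{inventory,playbooks,group_vars,ssh-keys}
mkdir -p "$BASE"/roles/{xenlab-local,vps-applications}/{defaults,tasks,templates}
chmod 700 "$BASE/ssh-keys"   # keep private keys readable only by the owner
```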
Generate SSH Keys⚓︎
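A dedicated key pair for Ansible can be generated with `ssh-keygen`. The ed25519 key type, empty passphrase (suited to unattended runs), and comment string below are assumptions, not necessarily what the original setup used:

```shell
# Generate the Ansible key pair into the project's ssh-keys/ directory.
# Run from the project root (/volume2/ansible-infrastructure on the NAS).
mkdir -p ssh-keys
ssh-keygen -t ed25519 -N "" -C "ansible@synology-nas" -f ssh-keys/ansible_key
chmod 600 ssh-keys/ansible_key   # private key must not be world-readable
```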
Configure Ansible⚓︎
Create `/volume2/ansible-infrastructure/ansible.cfg`:

```ini
[defaults]
host_key_checking = False
inventory = inventory/
private_key_file = ssh-keys/ansible_key
roles_path = roles
stdout_callback = yaml
retry_files_enabled = False
gathering = smart

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
pipelining = True
```
Create Inventories⚓︎
Local Server (`/volume2/ansible-infrastructure/inventory/xenlab-local.yml`):

```yaml
---
all:
  children:
    local_servers:
      hosts:
        xenlab:
          ansible_host: 100.102.180.92
          ansible_user: dave
          ansible_ssh_private_key_file: ssh-keys/ansible_key
```
VPS Server (`/volume2/ansible-infrastructure/inventory/portable-vps.yml`):

```yaml
---
all:
  children:
    vps_servers:
      hosts:
        portable-vps:
          ansible_host: YOUR_VPS_IP
          ansible_user: unifiadmin
          ansible_ssh_private_key_file: ssh-keys/ansible_key
```
2. Consolidated Role Architecture⚓︎
Xenlab-Local Role (roles/xenlab-local/)⚓︎
Purpose: Complete local server configuration with comprehensive variable handling
Architecture:

- Role defaults (`defaults/main.yml`): Comprehensive defaults ensure consistent deployments
- Group variables (`group_vars/local_servers.yml`): Override defaults with environment-specific values
- Template safety checks: Handle undefined variables gracefully
- Task conditionals: Skip optional features when variables unavailable
Features:

- Base system hardening and package installation
- User and group management with retry logic
- Docker environment setup with latest version detection
- Custom automation scripts deployment
- Scheduled task management
- Robust deployment handling for fresh Ubuntu systems
VPS-Applications Role (roles/vps-applications/)⚓︎
Purpose: Complete VPS configuration with comprehensive variable handling
Architecture:

- Role defaults (`defaults/main.yml`): All variables defined with sensible defaults
- Group variables (`group_vars/vps_servers.yml`): Environment-specific overrides
- Safety checks: Prevent undefined variable errors
- Multi-cloud compatibility: Works across all cloud providers
Features:

- Security hardening (UFW firewall, Fail2Ban)
- Docker environment optimized for VPS
- NAS integration with CIFS mounting
- Monitoring and notification system (ntfy.sh, healthcheck.io)
- Automated backup scripts (Unifi, Golinks, dotfiles)
- Multi-cloud portability
Key Scripts Deployed:

- `/usr/local/bin/unifi-backup.sh` - Weekly Unifi controller backup
- `/usr/local/bin/backup-golinks.sh` - Daily Tailscale Golinks backup
- `/usr/local/bin/internet-connectivity-check.sh` - 5-minute connectivity monitoring
- `/usr/local/bin/monitor-exit-node.sh` - 5-minute Tailscale exit node monitoring
- `/usr/local/bin/docker-maintenance.sh` - Daily Docker updates and cleanup
- `/usr/local/bin/backup-dotfiles.sh` - Weekly dotfiles backup to GitHub
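The article does not reproduce the script contents. As an illustration only, a check in the spirit of `internet-connectivity-check.sh` might look like the sketch below; the probe hosts and the ntfy.sh topic are placeholders, not the deployed values:

```shell
# Illustrative connectivity check (not the actual deployed script).
is_online() {
  # Succeed as soon as any probe host answers a single ping.
  for host in 1.1.1.1 8.8.8.8; do
    ping -c 1 -W 3 "$host" >/dev/null 2>&1 && return 0
  done
  return 1
}

if ! is_online; then
  # Hypothetical ntfy.sh topic -- replace with your own.
  curl -s -d "Internet connectivity lost on $(hostname)" \
    https://ntfy.sh/YOUR_TOPIC >/dev/null || true
fi
```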
3. Execution Commands⚓︎
Local Infrastructure Deployment⚓︎
Discovery:
```shell
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/discover-server.yml --ask-become-pass -v"
```
Dry Run:
```shell
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/local-infrastructure.yml --check --diff --ask-become-pass -v"
```
Production Deployment:
```shell
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/local-infrastructure.yml --ask-become-pass -v"
```
VPS Infrastructure Deployment⚓︎
Dry Run:
```shell
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/portable-vps.yml playbooks/vps-infrastructure.yml --check --diff --ask-become-pass -v"
```
Production Deployment:
```shell
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/portable-vps.yml playbooks/vps-infrastructure.yml --ask-become-pass -v"
```
Variable Management Strategy⚓︎
Layered Variable Architecture⚓︎
The infrastructure uses a role defaults + group_vars override strategy:
1. Role Defaults (`roles/*/defaults/main.yml`):
    - Comprehensive defaults for all variables
    - Ensure roles are self-contained and functional
    - Provide sensible fallback values
2. Group Variables (`group_vars/*.yml`):
    - Environment-specific overrides
    - Clean organization of configuration
    - Higher precedence than role defaults
3. Template Safety Checks:
    - `{% if variable is defined %}` checks in templates
    - Graceful handling of undefined variables
    - Fallback configurations when needed
4. Task Conditionals:
    - `when: variable is defined` on optional tasks
    - Skip features when variables unavailable
    - Prevent deployment failures
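A template safety check of this kind might look like the following hypothetical `daemon.json` template (the roles' real templates are not shown in this article):

```jinja
{# templates/daemon.json.j2 -- hypothetical example: falls back to an empty
   JSON object when docker_daemon_config is undefined #}
{% if docker_daemon_config is defined %}
{{ docker_daemon_config | to_nice_json }}
{% else %}
{}
{% endif %}
```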
Example Variable Flow⚓︎
```yaml
# roles/xenlab-local/defaults/main.yml (always available)
disk_space_threshold: 85
docker_compose_version: "2.39.0"
```

```yaml
# group_vars/local_servers.yml (overrides defaults)
disk_space_threshold: 90   # Override default
# docker_compose_version uses the default value
```
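A task conditional works in the same spirit; the task below is illustrative, not taken from the roles:

```yaml
# Skipped cleanly -- rather than failing -- when docker_daemon_config is unset.
- name: Deploy Docker daemon configuration (optional)
  ansible.builtin.template:
    src: daemon.json.j2
    dest: /etc/docker/daemon.json
    mode: "0644"
  when: docker_daemon_config is defined
```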
Making Updates and Changes⚓︎
Recommended Workflow⚓︎
When making infrastructure changes, follow this disciplined workflow to maintain consistency and avoid configuration drift:
1. Update Ansible First (Preferred Method)⚓︎
For new packages, scripts, or configuration changes:
```shell
# Edit the appropriate group variables
nano /volume2/ansible-infrastructure/group_vars/local_servers.yml
# or
nano /volume2/ansible-infrastructure/group_vars/vps_servers.yml

# Test with a dry run
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/local-infrastructure.yml --check --diff --ask-become-pass -v"

# Apply if the changes look correct
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/local-infrastructure.yml --ask-become-pass -v"
```
2. Audit with Discovery⚓︎
Run discovery periodically to identify configuration drift:
```shell
# Monthly audit to see what's changed
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/discover-server.yml --ask-become-pass -v"
```
Review the output for unexpected changes and decide whether to:
- Add them to Ansible configurations (if they should be permanent)
- Document them as acceptable exceptions
- Remove them as unwanted drift
3. Handle Manual Changes⚓︎
When manual changes are needed:
For Testing/Troubleshooting:
- Make the change manually
- Document it as temporary
- Remove it when done, or add it to Ansible if it should be permanent
For Emergency Fixes:
- Make the immediate fix
- Update Ansible configurations afterward
- Run Ansible to ensure consistency across all servers
Configuration Drift Detection⚓︎
Create a simple script to monitor for significant changes:
```bash
#!/bin/bash
# /home/dave/scripts/drift-check.sh
CURRENT_PACKAGES=$(dpkg --get-selections | wc -l)
LAST_COUNT=$(cat /tmp/package_count 2>/dev/null || echo 0)

if [ "$CURRENT_PACKAGES" != "$LAST_COUNT" ]; then
    echo "Package count changed: $LAST_COUNT -> $CURRENT_PACKAGES"
    # Optional: Send notification
fi

echo "$CURRENT_PACKAGES" > /tmp/package_count
```
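To run the check on a schedule, a crontab entry along these lines would work; the daily 07:00 timing and log filename are suggestions, reusing the cron-jobs log directory this setup already maintains:

```cron
# Hypothetical entry for crontab -e on the local server
0 7 * * * /home/dave/scripts/drift-check.sh >> /var/log/cron-jobs/drift-check.log 2>&1
```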
Documentation Strategy⚓︎
Keep a simple log of manual changes:
```markdown
# /volume2/ansible-infrastructure/MANUAL_CHANGES.md

## 2025-07-29
- Temporarily installed `strace` for debugging - removed after troubleshooting
- Added `htop` to essential_packages in local_servers.yml

## 2025-07-15
- Manual Docker cleanup during disk space issue - added to docker cleanup script
```
Disaster Recovery (DR)⚓︎
Comprehensive Backup Strategy⚓︎
Local Infrastructure: The system includes robust backup scripts that capture:

- `/home`: All user data, including docker-compose.yml files
- `/var/spool/cron`: All scheduled tasks
- `/etc/apt`: Package manager configuration
- `/var/lib/docker/volumes`: ALL Docker volumes (complete application data)

VPS Infrastructure: Automated backups include:

- Unifi controller data
- Tailscale Golinks configuration
- User dotfiles and configurations
- Docker volumes and application data
Recovery Process⚓︎
1. Infrastructure Rebuild (Ansible): 5-10 minutes
    - System hardening and packages
    - System updates (for fresh deployments)
    - Docker environment setup
    - User and group management
    - Automation scripts and scheduling
    - Logging infrastructure
2. Data Restoration (Backup): 10-20 minutes
    - Complete Docker volume restoration
    - User data and configurations
    - Application state preservation
3. Service Startup (Docker Compose): 2-5 minutes
    - All containers automatically recreated
    - Networks and volumes reconnected
    - Applications fully operational
SSH Key Deployment for Recovery⚓︎
Note: `ssh-copy-id` is not available on the Synology NAS. Use this alternative:

```shell
cat /volume2/ansible-infrastructure/ssh-keys/ansible_key.pub | \
  ssh dave@NEW_SERVER_IP "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys"
```
Security Considerations⚓︎
SSH Hardening⚓︎
- Key-based authentication only
- Root login disabled
- Password authentication disabled
- Custom SSH port (configurable)
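As an illustration, settings like these could be enforced with `ansible.builtin.lineinfile`; this is a sketch, not the roles' actual tasks, and it assumes a `restart sshd` handler exists elsewhere:

```yaml
# Hedged sketch: enforce key sshd_config options, validating the file before
# it is written so a typo cannot lock you out.
- name: Harden sshd configuration
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?{{ item.key }}\s'
    line: "{{ item.key }} {{ item.value }}"
    validate: 'sshd -t -f %s'
  loop:
    - { key: PasswordAuthentication, value: "no" }
    - { key: PermitRootLogin, value: "no" }
    - { key: PubkeyAuthentication, value: "yes" }
  notify: restart sshd   # handler assumed to be defined in the role
```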
VPS-Specific Security⚓︎
- UFW Firewall: Configured with essential ports only
- Fail2Ban: SSH brute force protection
- Network Security: Tailscale mesh networking
- Secure Mounting: CIFS credentials properly secured
Privilege Escalation⚓︎
- No passwordless sudo
- Explicit password prompts for privilege escalation
- Minimal privilege principle applied
Backup Security⚓︎
- Automated backup with notification systems
- Container shutdown during backup for consistency
- Comprehensive logging and monitoring
- Encrypted transport for VPS backups
Maintenance & Monitoring⚓︎
Log Locations⚓︎
Local Infrastructure:
- Custom Scripts: `/var/log/custom-scripts/`
- Cron Jobs: `/var/log/cron-jobs/`
VPS Infrastructure:
- Monitoring Scripts: Integrated with ntfy.sh notifications
- Healthcheck Integration: Real-time service verification
- Backup Logs: Automated logging with retention
Automated Maintenance⚓︎
Local Infrastructure:
- Disk space monitoring: Every 6 hours with alerts
- System backups: Weekly with 30-day retention
- Docker cleanup: Weekly resource optimization
VPS Infrastructure:
- Internet connectivity: Every 5 minutes
- Exit node monitoring: Every 5 minutes with healthcheck
- Docker maintenance: Daily updates and cleanup
- Automated backups: Unifi (weekly), Golinks (daily), dotfiles (weekly)
Troubleshooting⚓︎
Common Issues & Solutions⚓︎
Variable Loading Issues:
- Previous Issue: `'docker_daemon_config' is undefined`
- Solution: Role defaults ensure variables are always available
Permission Errors During Deployment:
- Symptom: `usermod: Permission denied` or `/etc/passwd` lock errors
- Solution: The consolidated roles handle this automatically with retry logic
APT Lock Issues:
- Symptom: `Could not get lock /var/lib/apt/lists/lock`
- Solution: The roles wait for package managers and handle this automatically
Template Errors:
- Previous Issue: Jinja2 template undefined variable errors
- Solution: Template safety checks with proper variable existence validation
SSH Connection Issues:
- Symptom: `No such file or directory: b'ssh'`
- Solution: Ensure `apk add --no-cache openssh-client` is included in the Docker commands
VPS-Specific Issues:
- NAS Mount Failures: Check CIFS credentials and network connectivity
- Notification Failures: Verify ntfy.sh tokens and connectivity
- Healthcheck Failures: Confirm healthcheck.io endpoints are accessible
Validation Commands⚓︎
Test SSH connectivity:
```shell
ssh -i /volume2/ansible-infrastructure/ssh-keys/ansible_key dave@100.102.180.92
ssh -i /volume2/ansible-infrastructure/ssh-keys/ansible_key unifiadmin@VPS_IP
```
Ping All Hosts:
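An ad-hoc reachability check can be run from the same container, mirroring the playbook invocations above (a sketch using Ansible's built-in `ping` module):

```shell
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible all -i inventory/xenlab-local.yml -m ping"
```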
Test Inventory:
```shell
ansible-inventory -i inventory/xenlab-local.yml --list
ansible-inventory -i inventory/portable-vps.yml --list
```
Customization⚓︎
Modifying Variables⚓︎
Update group variables for environment-specific configuration:
Local Infrastructure (group_vars/local_servers.yml):
```yaml
disk_space_threshold: 90     # Override default 85%
backup_retention_days: 60    # Override default 30 days
```
VPS Infrastructure (group_vars/vps_servers.yml):
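For example (the variable names below are illustrative, not necessarily the keys the roles define):

```yaml
backup_retention_days: 14                          # shorter retention on the smaller VPS disk
ntfy_topic: "vps-alerts"                           # hypothetical notification topic
healthcheck_ping_url: "https://hc-ping.com/YOUR-UUID"  # hypothetical healthcheck endpoint
```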
Add New Hosts⚓︎
Local Servers (inventory/xenlab-local.yml):
```yaml
all:
  children:
    local_servers:
      hosts:
        xenlab:
          ansible_host: 100.102.180.92
        xenlab-test:
          ansible_host: 10.1.1.22
```
VPS Servers (inventory/portable-vps.yml):
```yaml
all:
  children:
    vps_servers:
      hosts:
        portable-vps:
          ansible_host: YOUR_VPS_IP
        backup-vps:
          ansible_host: BACKUP_VPS_IP
```
Summary & Benefits⚓︎
Operational Excellence⚓︎
- Layered Architecture: Role defaults + group_vars ensure reliable deployments
- Consolidated Roles: Single roles per deployment type for simplicity
- Symmetric Structure: Consistent patterns across local and VPS deployments
- Repeatable Deployments: Version-controlled, documented infrastructure
- Fast Rebuilds: total recovery in roughly 20–35 minutes (see Recovery Process)
- Robust Error Handling: Handles fresh Ubuntu systems automatically
Disaster Recovery⚓︎
- Complete Automation: Infrastructure + data restoration
- Multi-Environment Support: Local servers and cloud VPS
- Comprehensive Backups: All Docker volumes and application data
- Quick Recovery: Minimal downtime with automated processes
Security & Monitoring⚓︎
- Security Hardening: SSH, firewall, and intrusion prevention
- Real-time Monitoring: Connectivity, services, and health checks
- Automated Notifications: ntfy.sh and healthcheck.io integration
- Centralized Logging: Comprehensive audit trails
Scalability & Portability⚓︎
- Multi-Cloud Support: Deploy VPS to any Ubuntu 24.04 provider
- Consistent Management: Single Ansible configuration for all environments
- Group-Based Variables: Flexible configuration management
- Consolidated Roles: Simplified maintenance and updates
Variable Management Excellence⚓︎
- Self-Contained Roles: Always work regardless of environment
- Clean Organization: Group variables for environment-specific settings
- Safety Checks: Template and task conditionals prevent failures
- Professional Standards: Enterprise-grade variable precedence handling
Final Thoughts⚓︎
This Ansible-based IaC framework delivers a scalable, secure, and maintainable infrastructure management solution, with consistent variable handling and a symmetric structure across local and cloud deployments. The consolidated role architecture with extensive defaults keeps deployments reliable without sacrificing simplicity. With robust error handling, security hardening, comprehensive monitoring, automated backups, and multi-cloud portability, it provides production-ready infrastructure management from a Synology NAS control plane.