Xenlab Rebuild with Ansible⚓︎
🚨 Emergency Server Recovery Guide⚓︎
This guide assumes the Xenlab server has failed and needs to be rebuilt completely.
Prerequisites Checklist⚓︎
Before starting, ensure you have:
- New Ubuntu 24.04 server deployed
- Root or sudo access to new server
- Synology NAS accessible
- Network connectivity between NAS and new server
- Access to latest server backup
Phase 1: Ubuntu Server Setup⚓︎
Get New Server IP Address⚓︎
Write down the IP: _________________
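A quick way to find the address from the new server's console (standard Ubuntu tooling; adjust the interface if you have more than one):

```bash
# Print the primary IPv4 address of this machine
hostname -I | awk '{print $1}'

# Or list all interfaces with their addresses
ip -4 -brief addr show
```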
Create User Account⚓︎
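A minimal sketch of the user setup, run from the new server's console. The username `dave` matches the Ansible inventory; change it everywhere if you use a different account:

```bash
# Create the deployment user and grant sudo (run as root or via sudo)
sudo adduser --gecos "" dave
sudo usermod -aG sudo dave
```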
Deploy SSH Key⚓︎
```bash
# ssh-copy-id is not available on Synology NAS, use this one-liner instead:
cat /volume2/ansible-infrastructure/ssh-keys/ansible_key.pub | ssh dave@NEW_SERVER_IP "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys"
```
Test SSH access:
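For example, from the NAS (replace `NEW_SERVER_IP` with the address you wrote down above):

```bash
# Key-based login should succeed without a password prompt
ssh -i /volume2/ansible-infrastructure/ssh-keys/ansible_key dave@NEW_SERVER_IP "echo SSH OK"
```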
Phase 2: Update Ansible Configuration⚓︎
Update Server IP in Inventory⚓︎
Change the IP address:
```yaml
all:
  children:
    local_servers:
      hosts:
        xenlab:
          ansible_host: NEW_SERVER_IP_HERE  # ← Update this line
          ansible_user: dave
          ansible_ssh_private_key_file: ssh-keys/ansible_key
```
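One way to sanity-check the edited inventory before running any playbooks, reusing the same containerized Ansible pattern this guide relies on:

```bash
# Confirm the inventory parses and xenlab resolves to the new IP
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  ansible-inventory -i inventory/xenlab-local.yml --list
```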
Phase 3: Deploy Infrastructure with Ansible⚓︎
Test Connectivity⚓︎
```bash
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible -i inventory/xenlab-local.yml all -m ping"
```
Expected output: `xenlab | SUCCESS => {"ping": "pong"}`
Optional: Run Discovery⚓︎
```bash
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/discover-server.yml --ask-become-pass -v"
```
Phase 3.5: Test Deployment (Recommended)⚓︎
Test Deployment First⚓︎
```bash
# Test deployment in check mode first
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/local-infrastructure.yml --check --diff --ask-become-pass -v"
```
Deploy Complete Infrastructure⚓︎
```bash
# Deploy infrastructure
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/local-infrastructure.yml --ask-become-pass -v"
```
When prompted for the BECOME password, enter dave's sudo password.
Note: The deployment uses a consolidated xenlab-local role with comprehensive defaults and bulletproof variable handling. Both local and VPS deployments are tested and working.
What Ansible Builds Automatically⚓︎
✅ Base System Configuration⚓︎
- SSH hardening (key-only auth, disable root login)
- Essential packages (curl, wget, git, htop, vim, tree, etc.)
- System package updates and security hardening
- User shell configuration
- Service management
- Robust deployment handling (apt locks, systemd timeouts, user/group management)
✅ Docker Environment⚓︎
- Docker Compose v2.39.0+ installation (latest version automatically detected)
- Docker daemon configuration (logging, storage driver)
- Docker networks and volumes
- Docker Compose directory:
/home/dave/.docker/compose/
✅ Automation Scripts⚓︎
- `/home/dave/scripts/disk-space-check.sh`
- `/home/dave/scripts/backup/system-backup.sh`
- `/home/dave/scripts/maintenance/docker-cleanup.sh`
- Log directory: `/var/log/custom-scripts/`
- Logrotate configuration
✅ Scheduled Tasks⚓︎
- Disk space monitoring (every 6 hours)
- System backups (Sunday 2:30 AM)
- Docker cleanup (Monday 3:00 AM)
- Log directory:
/var/log/cron-jobs/
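The schedules above translate to crontab entries roughly like the following. This is a hypothetical sketch: the real entries (and log file names) come from the Ansible role, so verify with `crontab -l` after deployment.

```bash
# Disk space monitoring, every 6 hours
0 */6 * * * /home/dave/scripts/disk-space-check.sh >> /var/log/cron-jobs/disk-space.log 2>&1
# System backup, Sunday 2:30 AM
30 2 * * 0 /home/dave/scripts/backup/system-backup.sh >> /var/log/cron-jobs/backup.log 2>&1
# Docker cleanup, Monday 3:00 AM
0 3 * * 1 /home/dave/scripts/maintenance/docker-cleanup.sh >> /var/log/cron-jobs/docker-cleanup.log 2>&1
```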
Phase 4: Restore Data⚓︎
Extract Backup⚓︎
```bash
# SSH to the new server
ssh -i /volume2/ansible-infrastructure/ssh-keys/ansible_key dave@NEW_SERVER_IP

# Navigate to root for full restore
cd /

# Extract latest backup (adjust path and filename as needed)
sudo tar xzf /path/to/backup/xenlab-YYYY-MM-DD.tgz
```
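To verify you have the right archive before (or after) extracting, list its contents first:

```bash
# Inspect the archive without extracting; expect /home, /var/spool/cron,
# /etc/apt, and /var/lib/docker/volumes near the top of the listing
sudo tar tzf /path/to/backup/xenlab-YYYY-MM-DD.tgz | head -20
```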
This restores:
- All Docker volumes (`/var/lib/docker/volumes/`)
- Docker Compose configuration (`/home/dave/.docker/compose/`)
- All user data (`/home/`)
- Cron jobs (`/var/spool/cron/`)
- Package configurations (`/etc/apt/`)
Phase 5: Start Services⚓︎
Start Docker Compose Services⚓︎
```bash
# Navigate to Docker Compose directory
cd /home/dave/.docker/compose

# Start all services
docker-compose up -d
```
Verify Everything is Running⚓︎
```bash
# Check Docker containers
docker ps

# Check system services
systemctl status docker
systemctl status cron

# Check cron jobs
crontab -l

# Check custom scripts
ls -la /home/dave/scripts/
```
Complete Recovery Timeline⚓︎
- Ubuntu Install: 15-20 minutes
- User Setup: 2-3 minutes
- Ansible Deployment: 5-10 minutes
- Data Restore: 10-20 minutes (depending on backup size)
- Service Startup: 2-5 minutes
Common Issues & Solutions⚓︎
Variable Loading Issues⚓︎
Solution: Roles now have comprehensive defaults that ensure all variables are available, preventing undefined variable errors.
Permission Errors During Deployment⚓︎
Solution: The consolidated xenlab-local role handles this automatically with retry logic.
APT Lock Issues⚓︎
Solution: The consolidated role waits for package managers and handles this automatically.
Template Errors⚓︎
Solution: Templates now include safety checks for undefined variables.
Emergency Commands Reference⚓︎
Quick Server Info⚓︎
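A quick snapshot of the server's state, useful when triaging over SSH (standard Linux tools; trim to taste):

```bash
# Hostname, addresses, root disk usage, memory, and uptime at a glance
hostname
hostname -I
df -h /
free -h
uptime
```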
If Ansible Fails⚓︎
```bash
# Test SSH manually
ssh -i /volume2/ansible-infrastructure/ssh-keys/ansible_key dave@NEW_SERVER_IP

# Check SSH key permissions
ls -la /volume2/ansible-infrastructure/ssh-keys/

# Fix SSH key permissions if needed
chmod 600 /volume2/ansible-infrastructure/ssh-keys/ansible_key
chmod 644 /volume2/ansible-infrastructure/ssh-keys/ansible_key.pub
```
Dry Run Mode (Testing)⚓︎
```bash
# Test all changes without applying them
docker run --rm -it \
  -v /volume2/ansible-infrastructure:/ansible \
  -w /ansible \
  cytopia/ansible:latest \
  sh -c "apk add --no-cache openssh-client && ansible-playbook -i inventory/xenlab-local.yml playbooks/local-infrastructure.yml --check --diff --ask-become-pass -v"
```
Success Verification⚓︎
Server is fully restored when:
- SSH access works with key authentication
- Docker and Docker Compose are running
- All containers from `docker ps` match the original setup
- All cron jobs are scheduled (`crontab -l`)
- Custom scripts exist and are executable
- Log directories exist with proper permissions
- All applications are accessible
Backup Integration⚓︎
Existing backup script captures:
- `/home`: all user data, including docker-compose.yml
- `/var/spool/cron`: all cron jobs
- `/etc/apt`: package manager configuration
- `/var/lib/docker/volumes`: ALL Docker volumes
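The capture list above suggests a backup command along these lines. This is a hypothetical sketch only; the actual backup script lives on the server and may differ in naming and options:

```bash
# Archive the four capture paths into the NAS backup share,
# named by date to match the xenlab-YYYY-MM-DD.tgz pattern
sudo tar czf /mnt/Backup/Server-Backups/xenlab-$(date +%F).tgz \
  /home /var/spool/cron /etc/apt /var/lib/docker/volumes
```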
This backup + Ansible approach provides:
- Infrastructure: rebuilt by Ansible using the consolidated `xenlab-local` role
- Data: restored from comprehensive backups
- Complete Recovery: both system and application state
Emergency Contacts & Resources⚓︎
Ansible Project Location: /volume2/ansible-infrastructure
SSH Keys Location: /volume2/ansible-infrastructure/ssh-keys
Docker Compose Location: /home/dave/.docker/compose
Backup Location: /mnt/Backup/Server-Backups
Recovery Architecture:
- Ansible: Infrastructure foundation with bulletproof variable handling
- Backup Script: Complete data preservation
- Docker Compose: Application orchestration
Post-Recovery Tasks⚓︎
- Update documentation with any changes made during recovery
- Test all services to ensure they're working properly
- Update monitoring with new server details if needed
- Verify backup script is working on restored server
- Schedule next DR test for one year from now
Key Advantages of This Approach⚓︎
✅ Complete Automation⚓︎
- Infrastructure rebuilt from consolidated role with comprehensive defaults
- Data restored from comprehensive backups
- Services automatically restarted
✅ Minimal Manual Steps⚓︎
- Only Ubuntu install and user creation required
- Everything else is automated
✅ Comprehensive Coverage⚓︎
- System configuration, Docker environment, automation scripts
- ALL Docker volumes preserved in backups
- Scheduled tasks and logging infrastructure
✅ Fast Recovery⚓︎
- 30-60 minute complete rebuild
- Tested and documented process
- Bulletproof variable handling prevents deployment failures
Recovery completed successfully! 🎉