# Server Provisioning Checklist

## AWS / Forge Setup

- [ ] Use Forge to create the server
- [ ] Tag the EC2 instance and its root volume
- [ ] After creation, attach an Elastic IP
- [ ] Add monitoring in Forge
- [ ] Update root volume to gp3
- [ ] Enable AWS Backup
- [ ] Set up Forge database backups
- [ ] Set up SSH key access for team members

## OS Tooling

- [ ] Install atop (`apt install atop`, verify it runs via systemd and writes to `/var/log/atop/`)
- [ ] Install htop (`apt install htop`)
- [ ] Install gdu or ncdu (`apt install gdu` or `apt install ncdu`) for disk usage analysis

## Redis Hardening

- [ ] Set `maxmemory` to an appropriate limit (e.g. `2gb` for a 16 GB server)
- [ ] Set `maxmemory-policy allkeys-lru`
- [ ] Disable RDB persistence if not needed (`save ""`) to prevent fork-based OOM
- [ ] Persist config: `redis-cli CONFIG REWRITE`
- [ ] Verify config survives reboot: check `/etc/redis/redis.conf` directly

## Laravel / Horizon / Pulse

- [ ] Verify Horizon trim settings in `config/horizon.php` (recent/completed: 60 min or less)
- [ ] If Pulse is enabled, ensure `pulse:work` is running in supervisor
- [ ] If Pulse is not used, disable it entirely (remove provider or `PULSE_ENABLED=false`)
- [ ] Set queue worker memory limits (`--memory=256`) and max jobs (`--max-jobs=500`)

## PHP-FPM

- [ ] Remove unused PHP-FPM pools/versions (only keep the version the site uses)
- [ ] Tune `pm.max_children` based on available RAM and per-worker memory usage

## Swap

- [ ] Verify swap is configured (at least 2 GB for a 16 GB server)
- [ ] Check `vm.swappiness` is set appropriately (default 60 is fine for most cases)

## Security

- [ ] Verify UFW is enabled and only allows necessary ports (22, 80, 443)
- [ ] Disable password-based SSH login (`PasswordAuthentication no`)
- [ ] Verify unattended-upgrades is enabled for security patches

## Deployment

- [ ] Verify deployment script does not spawn hundreds of parallel processes (serialize unzip/rm)
- [ ] Cap node build memory: `NODE_OPTIONS=--max-old-space-size=512` in deploy script
- [ ] Test a deploy on the new server before going live

## Monitoring / Alerting

- [ ] Set up memory usage alerting (CloudWatch, Forge, or similar) so OOM situations are caught before they crash the server
- [ ] Set up disk usage alerting (logs and atop files can fill disks over time)
- [ ] Configure atop log retention (`/etc/default/atop`, default keeps 28 days)
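
For the PHP-FPM item on tuning `pm.max_children`, the usual arithmetic is "RAM left over after everything else, divided by memory per worker". The sketch below is one rough way to estimate it. The function name `fpm_max_children` and the example figures (6144 MB reserved for the OS, Redis, database, and queue workers; 64 MB per PHP-FPM worker) are illustrative assumptions, not values from this checklist. Measure real per-worker memory on the server (for example with atop or htop) before settling on a number.

```python
def fpm_max_children(total_mb: int, reserved_mb: int, per_worker_mb: int) -> int:
    """Estimate pm.max_children from a simple RAM budget.

    total_mb      -- total RAM on the server, in MB
    reserved_mb   -- RAM set aside for everything that is not PHP-FPM
                     (OS, Redis maxmemory, database, queue workers, ...)
    per_worker_mb -- measured average memory of one PHP-FPM worker, in MB
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    available_mb = total_mb - reserved_mb
    if available_mb <= 0:
        raise ValueError("reserved_mb leaves no RAM for PHP-FPM")
    # Round down so the pool cannot exceed the budget; never return less than 1.
    return max(1, available_mb // per_worker_mb)


if __name__ == "__main__":
    # Hypothetical 16 GB server: 6144 MB reserved, 64 MB per worker (assumed figures).
    print(fpm_max_children(total_mb=16384, reserved_mb=6144, per_worker_mb=64))  # 160
```

Treat the result as an upper bound rather than a target: a lower value leaves headroom for deploys and traffic spikes, which is the same reason the checklist caps Redis, queue workers, and the node build.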