B0 — Production Infrastructure
Status. Locked. This page is the single source of truth for the production stack and supersedes any conflicting statement on earlier pages. Where another page disagrees, B0 wins.
Posture. Web-first, mobile-supported. The Laravel Control Center is the primary operator surface. The iPhone app is a fully-featured remote that uses the same API. No business logic lives outside Laravel.
1. Locked production stack
Canonical diagrams (Option 1). For system-level diagrams (architecture, run lifecycle, Git/snapshots, RAG flows, security boundaries, export flows), link to 🧭ARCHITECTURE_DIAGRAMS (Canonical) instead of duplicating Mermaid here.
| Layer | Choice | Notes |
|---|---|---|
| Operating system | CentOS-compatible Linux (AlmaLinux 9 / Rocky Linux 9 recommended) | Long support window, RPM ecosystem, SELinux available. |
| Web server | LiteSpeed Web Server (LSWS) 6.x (Enterprise or OpenLiteSpeed) | Event-driven, HTTP/2 + HTTP/3, native LSAPI. |
| PHP runtime | lsphp 8.3 via LSAPI | Faster than PHP-FPM under LSWS; no separate process manager. |
| Backend | Laravel 11 | See B1 for app skeleton. |
| Database | MySQL 8.0 or MariaDB 11.4+ | Default. PostgreSQL+pgvector is an opt-in upgrade path, not the default. |
| Cache / queue / locks / pubsub | Redis 7 | Sessions, cache, queues, workspace locks, broadcast events. |
| Job runner | Laravel Horizon under Supervisor (or systemd unit) | Supervisors: agents-default, agents-long, rag-index, docs. |
| Realtime | Laravel Reverb for web WS + SSE for iPhone | Both fanned out from one ConsoleEventService::publish(). |
| SSL / TLS | Let's Encrypt (acme.sh or certbot) or server-provided cert | HTTP/3 enabled, HSTS preload. |
| Storage | Local disk for workspaces + snapshots; S3-compatible for long-term snapshot archives (optional) | S3 path keeps prod nodes stateless beyond active workspaces. |
| Deployment | Git-based with atomic release directories | See §8. |
| Process supervision | Supervisor preferred; systemd units acceptable | Reverb + Horizon + workspace janitor. |
| Backup | mysqldump to S3 daily + wal-style binlog snapshot | RPO ≤ 24 h, RTO ≤ 1 h. |
| Monitoring | LSWS access log + Laravel Telescope (non-prod) + Horizon dashboard + Prometheus node exporter | Optional Sentry for app exceptions. |
| Time | chrony synced to NTP | Required for SSE/token expiry correctness. |
Explicitly not in the production stack (rejected for v1): Nginx, Apache HTTP Server, PHP-FPM, PostgreSQL-as-default, Docker as the production runtime (Docker is dev-only). Any earlier spec page that implies otherwise defers to this one.
2. Server provisioning (CentOS / AlmaLinux / Rocky)
Run as root or with sudo. Steps assume a fresh AlmaLinux 9 box.
# 2.1 base hardening
dnf -y update
dnf -y install epel-release dnf-plugins-core firewalld policycoreutils-python-utils chrony git curl tar unzip jq
systemctl enable --now chronyd firewalld
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --permanent --add-port=443/udp # HTTP/3 (QUIC)
firewall-cmd --reload
# 2.2 user accounts
adduser deploy
usermod -aG wheel deploy
mkdir -p /home/deploy/.ssh && chmod 700 /home/deploy/.ssh
# add deploy key to /home/deploy/.ssh/authorized_keys
chown -R deploy:deploy /home/deploy/.ssh
# 2.3 SELinux: keep enforcing; we'll grant the contexts we need below.
sestatus
# 2.4 LiteSpeed repo + LSWS Enterprise (or OpenLiteSpeed)
# OpenLiteSpeed path:
rpm -Uvh https://rpms.litespeedtech.com/centos/litespeed-repo-1.2-1.el8.noarch.rpm
dnf -y install openlitespeed
# OR LSWS Enterprise (license required):
# follow https://docs.litespeedtech.com/ then `dnf install lsws`
# 2.5 lsphp 8.3 + extensions
dnf -y install lsphp83 lsphp83-common lsphp83-mysqlnd lsphp83-pdo lsphp83-mbstring \
lsphp83-opcache lsphp83-redis lsphp83-bcmath lsphp83-gd lsphp83-intl \
lsphp83-soap lsphp83-xml lsphp83-zip lsphp83-process lsphp83-pecl-imagick \
lsphp83-pecl-zip
# 2.6 database
dnf -y module reset mysql && dnf -y module enable mysql:8.0 && dnf -y install mysql-server
# or MariaDB:
# dnf -y install mariadb-server
systemctl enable --now mysqld
mysql_secure_installation
# 2.7 redis
dnf -y install redis
systemctl enable --now redis
# 2.8 supervisor
dnf -y install supervisor
systemctl enable --now supervisord
# 2.9 git + node (Node only for Vite asset build in CI; not required at runtime)
dnf -y install git
curl -fsSL https://rpm.nodesource.com/setup_20.x | bash -
dnf -y install nodejs
# 2.10 composer
curl -sS https://getcomposer.org/installer | /usr/local/lsws/lsphp83/bin/php -- --install-dir=/usr/local/bin --filename=composer
3. LiteSpeed virtual host configuration
Layout on disk:
/var/www/agent-workspace/
  current/            → symlink to releases/<timestamp>
  releases/<timestamp>/
  shared/
    .env
    storage/          → mounted into each release
    snapshots/
    workspaces/       → actual project clones live here (one dir per project)
  logs/
3.1 OpenLiteSpeed vhost (web admin or /usr/local/lsws/conf/vhosts/agent-workspace/vhconf.conf)
docRoot /var/www/agent-workspace/current/public
enableGzip 1
enableBr 1
adminEmails ops@example.com
index {
useServer 0
indexFiles index.php
}
errorlog $SERVER_ROOT/logs/agent-workspace.error.log {
useServer 0
logLevel NOTICE
rollingSize 50M
keepDays 14
}
accesslog $SERVER_ROOT/logs/agent-workspace.access.log {
useServer 0
logFormat "%h %l %u %t \"%r\" %>s %b %D"
rollingSize 50M
keepDays 14
}
scripthandler {
add lsapi:lsphp83 php
}
extprocessor lsphp83 {
type lsapi
address uds://tmp/lshttpd/lsphp83.sock
maxConns 35
env PHP_LSAPI_CHILDREN=35
env PHP_LSAPI_MAX_REQUESTS=10000
initTimeout 60
retryTimeout 0
pcKeepAliveTimeout 60
respBuffer 0
autoStart 2
path /usr/local/lsws/lsphp83/bin/lsphp
backlog 100
instances 1
priority 0
memSoftLimit 2048M
memHardLimit 2560M
procSoftLimit 400
procHardLimit 500
}
rewrite {
enable 1
autoLoadHtaccess 1
rules <<<END
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [L]
END
}
3.2 Listeners
- 80 (HTTP) — redirects to 443.
- 443 (HTTPS / HTTP/2) — binds the vhost above.
- 443/UDP (HTTP/3 QUIC) — enable in LSWS listener settings.
- Reverb upstream: LSWS reverse-proxies /app/{appId} and /apps/{appId}/events to 127.0.0.1:8080 (Reverb).
3.3 Reverb upstream context
context /app/ {
type proxy
handler reverbBackend
websocket 1
addDefaultCharset off
}
context /apps/ {
type proxy
handler reverbBackend
websocket 1
addDefaultCharset off
}
extprocessor reverbBackend {
type proxy
address 127.0.0.1:8080
maxConns 100
initTimeout 30
retryTimeout 0
respBuffer 0
}
3.4 SELinux contexts
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/agent-workspace(/.*)?"
semanage fcontext -a -t httpd_sys_rw_content_t "/var/www/agent-workspace/shared/workspaces(/.*)?"
restorecon -R /var/www/agent-workspace
setsebool -P httpd_can_network_connect 1 # outbound to OpenAI/Anthropic
setsebool -P httpd_execmem 0
If you exec git or other binaries from PHP, prefer dedicated user contexts (run agent commands as the deploy user via a setuid-free dispatcher) rather than relaxing SELinux.
4. PHP & Laravel runtime tuning
/usr/local/lsws/lsphp83/etc/php.ini overrides (drop in 99-agent.ini):
memory_limit = 512M
max_execution_time = 120 ; web requests; agent jobs run via queue, not web
upload_max_filesize = 64M
post_max_size = 64M
opcache.enable = 1
opcache.enable_cli = 1
opcache.memory_consumption = 256
opcache.interned_strings_buffer = 32
opcache.max_accelerated_files = 30000
opcache.validate_timestamps = 0 ; flip to 1 in staging
realpath_cache_size = 4096K
realpath_cache_ttl = 600
date.timezone = UTC
expose_php = Off
Laravel-side:
- php artisan config:cache && php artisan route:cache && php artisan view:cache && php artisan event:cache after every deploy.
- php artisan optimize:clear only when troubleshooting.
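To sanity-check these limits against the box's RAM, it can help to compute the worst case in which every lsphp child reaches memory_limit at the same time. A minimal sketch, assuming the child count is the PHP_LSAPI_CHILDREN value from §3.1; the function name is hypothetical and not part of the spec:

```shell
#!/usr/bin/env bash
# Hypothetical helper: worst-case memory (MiB) if every lsphp child
# reaches memory_limit simultaneously.
lsphp_worst_case_mb() {
  local children="$1" memory_limit_mb="$2"
  echo $(( children * memory_limit_mb ))
}

# Values from §3.1 (PHP_LSAPI_CHILDREN=35) and the ini above (memory_limit = 512M).
lsphp_worst_case_mb 35 512   # prints 17920
```

Most requests stay far below memory_limit, so treat the result as an upper bound rather than a sizing target.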
5. MySQL / MariaDB tuning
/etc/my.cnf.d/agent-workspace.cnf:
[mysqld]
bind-address = 127.0.0.1
default-authentication-plugin = caching_sha2_password
character-set-server = utf8mb4
collation-server = utf8mb4_0900_ai_ci
max_connections = 200
max_allowed_packet = 64M
innodb_buffer_pool_size = 2G # ~50-70% of RAM in prod
innodb_log_file_size = 512M
innodb_flush_log_at_trx_commit = 1
innodb_flush_method = O_DIRECT
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1.0
log_error_verbosity = 2
RAG on MySQL. The default storage for rag_chunks.embedding is a JSON column (see B1 §4.7). Cosine similarity is computed in the app layer. This works up to roughly 50k chunks per project. Past that, the operator either:
- enables MariaDB 11.7+ vector indexes (when GA in the host distro), or
- switches DB_CONNECTION to PostgreSQL with pgvector (the upgrade migration is already part of rag_chunks and is idempotent).
This is the only vector-store-related decision pending; the application code is driver-aware on day one.
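For reference, the similarity that the app layer computes over two stored embeddings is the dot product divided by the product of the vector norms. The sketch below illustrates that arithmetic in shell/awk only; the production computation lives in Laravel, and the function name and the space-separated input format are assumptions made for the example:

```shell
#!/usr/bin/env bash
# Hypothetical illustration: cosine similarity of two space-separated
# float vectors of equal length.
cosine_similarity() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, " "); m = split(b, y, " ")
    if (n != m || n == 0) {
      print "error: vectors must be non-empty and of equal length" > "/dev/stderr"
      exit 1
    }
    for (i = 1; i <= n; i++) {
      dot += x[i] * y[i]      # running dot product
      na  += x[i] * x[i]      # squared norm of a
      nb  += y[i] * y[i]      # squared norm of b
    }
    if (na == 0 || nb == 0) {
      print "error: zero-length vector" > "/dev/stderr"
      exit 1
    }
    printf "%.4f\n", dot / (sqrt(na) * sqrt(nb))
  }'
}

cosine_similarity "1 0" "0 1"   # orthogonal vectors: prints 0.0000
```

Because every query scores each candidate chunk this way, the cost grows linearly with the chunk count, which is why the ~50k ceiling exists.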
6. Redis tuning
/etc/redis/redis.conf:
bind 127.0.0.1
protected-mode yes
maxmemory 1gb
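# cache and queue keys share this instance, so allkeys-lru can evict queued jobs under memory pressure;
# size maxmemory with headroom, or use volatile-lru so that only keys with a TTL are evicted
maxmemory-policy allkeys-lru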
appendonly yes
appendfsync everysec
Laravel uses four logical Redis databases via database/connections:
cache: db 0
session: db 1
queue: db 2 (Horizon)
locks: db 3 (workspace:lock:{project_id})
7. Process supervision (Supervisor)
/etc/supervisord.d/agent-workspace.ini:
[program:agent-horizon]
process_name=%(program_name)s
command=/usr/local/lsws/lsphp83/bin/php /var/www/agent-workspace/current/artisan horizon
autostart=true
autorestart=true
user=deploy
redirect_stderr=true
stdout_logfile=/var/www/agent-workspace/logs/horizon.log
stopwaitsecs=3600
[program:agent-reverb]
process_name=%(program_name)s
command=/usr/local/lsws/lsphp83/bin/php /var/www/agent-workspace/current/artisan reverb:start --host=127.0.0.1 --port=8080
autostart=true
autorestart=true
user=deploy
redirect_stderr=true
stdout_logfile=/var/www/agent-workspace/logs/reverb.log
[program:agent-schedule]
process_name=%(program_name)s
command=/usr/local/lsws/lsphp83/bin/php /var/www/agent-workspace/current/artisan schedule:work
autostart=true
autorestart=true
user=deploy
redirect_stderr=true
stdout_logfile=/var/www/agent-workspace/logs/schedule.log
Apply with systemctl enable --now supervisord, then supervisorctl reread && supervisorctl update.
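After an update it is worth confirming that all three programs came up. A minimal sketch that parses `supervisorctl status` output from stdin; the helper name is hypothetical, and it assumes the default status layout of program name followed by state:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `supervisorctl status` output on stdin and
# succeeds only if every program named in the arguments is RUNNING.
all_running() {
  local status program
  status="$(cat)"
  for program in "$@"; do
    if ! printf '%s\n' "$status" |
        grep -Eq "^${program}[[:space:]]+RUNNING([[:space:]]|\$)"; then
      echo "not running: ${program}" >&2
      return 1
    fi
  done
}

# Usage on the server:
#   supervisorctl status | all_running agent-horizon agent-reverb agent-schedule
```

The function exits non-zero on the first program that is missing or in any state other than RUNNING, so it can gate a deploy or a health check.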
8. Git-based deployment (zero-downtime, atomic)
Deploy as the deploy user. Strategy: clone into a timestamped release dir, run the build, then atomically swap the current symlink.
#!/usr/bin/env bash
set -euo pipefail
APP=/var/www/agent-workspace
REL="$APP/releases/$(date +%Y%m%d%H%M%S)"
git clone --depth 1 --branch "${1:-main}" git@github.com:org/agent-workspace.git "$REL"
cd "$REL"
# wire shared paths
ln -sfn "$APP/shared/.env" "$REL/.env"
rm -rf "$REL/storage"
ln -sfn "$APP/shared/storage" "$REL/storage"
ln -sfn "$APP/shared/workspaces" "$REL/storage/app/workspaces"
ln -sfn "$APP/shared/snapshots" "$REL/storage/app/snapshots"
# build
composer install --no-dev --prefer-dist --optimize-autoloader --no-interaction
npm ci && npm run build
php artisan migrate --force
php artisan storage:link || true
php artisan config:cache && php artisan route:cache && php artisan view:cache && php artisan event:cache
php artisan filament:optimize
# atomic swap
ln -sfn "$REL" "$APP/current"
# restart workers (graceful)
php "$REL/artisan" horizon:terminate
supervisorctl restart agent-reverb
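# nudge LSWS to pick up the new release; with opcache.validate_timestamps=0 (§4) a touch alone
# does not invalidate cached scripts, so restart lsphp if stale code is still served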
touch "$REL/public/index.php"
# retention: keep last 5 releases
ls -1dt "$APP/releases"/* | tail -n +6 | xargs -r rm -rf
Rollback is ln -sfn $APP/releases/<previous> $APP/current && supervisorctl restart agent-reverb && php artisan horizon:terminate.
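Whether `ln -sfn` replaces an existing link in a single step depends on the `ln` implementation. Building a temporary link and renaming it over `current` makes the swap independent of that, and the same helper can serve rollback. A minimal sketch, assuming GNU coreutils (`mv -T`) and the releases/<timestamp> layout from §3; both function names are hypothetical:

```shell
#!/usr/bin/env bash
# Point $app/current at $target by creating a temporary symlink and
# renaming it over the old one, so readers see either the old link or
# the new one, never a missing link.
swap_current() {
  local app="$1" target="$2"
  ln -sfn "$target" "$app/current.tmp"
  mv -T "$app/current.tmp" "$app/current"
}

# Print the release directory that sorts immediately before the one
# `current` points at (timestamped names sort chronologically).
# Fails if `current` already points at the oldest release.
previous_release() {
  local app="$1" current prev="" rel
  current="$(readlink "$app/current")"
  for rel in "$app"/releases/*/; do
    rel="${rel%/}"
    if [ "$rel" = "$current" ]; then
      [ -n "$prev" ] && echo "$prev" && return 0
      return 1
    fi
    prev="$rel"
  done
  return 1
}

# Rollback usage:
#   swap_current "$APP" "$(previous_release "$APP")"
```

The worker restarts from the rollback line above still apply after the swap.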
9. SSL / TLS
dnf -y install epel-release
curl https://get.acme.sh | sh -s email=ops@example.com
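# acme.sh defaults to ZeroSSL; --server letsencrypt selects the CA named in §1
~/.acme.sh/acme.sh --issue --server letsencrypt -d agent.example.com -w /var/www/agent-workspace/current/public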
~/.acme.sh/acme.sh --install-cert -d agent.example.com \
--key-file /usr/local/lsws/conf/cert/agent.key \
--fullchain-file /usr/local/lsws/conf/cert/agent.crt \
--reloadcmd "systemctl reload lsws"
In LSWS listener 443: point keyFile/certFile to the files above, enable HTTP/2 and HTTP/3 (QUIC), and enable HSTS with max-age=31536000; includeSubDomains; preload.
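Once the listener is configured, the HSTS policy can be verified from the response headers. A minimal sketch that reads headers on stdin (for example from `curl -sI`); the helper name is hypothetical, and it only checks that the three directives listed above are present:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the Strict-Transport-Security
# header contains max-age=31536000, includeSubDomains and preload.
hsts_ok() {
  local value
  # Strip CRs, take the first HSTS header, lowercase its value.
  value="$(tr -d '\r' | grep -i '^strict-transport-security:' |
           head -n 1 | cut -d: -f2- | tr 'A-Z' 'a-z')"
  [ -n "$value" ] || return 1
  case "$value" in *max-age=31536000*) ;; *) return 1 ;; esac
  case "$value" in *includesubdomains*) ;; *) return 1 ;; esac
  case "$value" in *preload*) ;; *) return 1 ;; esac
}

# Usage:
#   curl -sI https://agent.example.com | hsts_ok && echo "HSTS policy present"
```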
10. Backups & disaster recovery
- Nightly: mysqldump --single-transaction --routines --triggers --events agent_workspace | gzip > $S3/db/$(date +%F).sql.gz.
- Hourly: binlog increment to S3.
- Snapshots: workspace_snapshots.archive_path is either local + rsync to S3 nightly, or S3 directly when AGENT_SNAPSHOT_DRIVER=s3.
- Restore drill: quarterly. Bring up a sibling node, restore the latest dump, replay binlogs, run php artisan agent:reindex --all, run the smoke suite.
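The nightly dump only meets the 24 h RPO if it actually ran. A minimal freshness check, assuming the dumps are visible as local files (a staging directory or a mounted bucket); the function name and the default of 1440 minutes are illustrative choices, not part of the spec:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if $dir holds at least one *.sql.gz dump
# modified within the last $max_age_min minutes (default 1440 = 24 h).
dump_is_fresh() {
  local dir="$1" max_age_min="${2:-1440}"
  find "$dir" -maxdepth 1 -type f -name '*.sql.gz' \
       -mmin "-${max_age_min}" | grep -q .
}

# Usage (the path is a placeholder):
#   dump_is_fresh /path/to/db-dumps || echo "no dump in the last 24 h" >&2
```

Wiring this into monitoring turns a silently failed cron job into an alert well before the next restore drill.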
11. Observability
- LSWS access log (format as configured in §3.1) → shipped to your log store.
- Laravel logs via daily channel under storage/logs/.
- Horizon dashboard at /horizon (admin-only).
- Telescope allowed in staging only.
- Reverb logs to logs/reverb.log; success metric: WS message lag p95 < 250 ms.
- Node exporter + LSWS exporter for Prometheus (optional).
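The Reverb success metric needs a p95 over collected lag samples. A minimal sketch using the nearest-rank method, assuming one lag value in milliseconds per line on stdin; how the samples are collected is not specified here, and the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the nearest-rank 95th percentile of the
# numeric samples read from stdin (one per line).
p95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 95 + 99) / 100)   # ceil(0.95 * NR)
      print v[rank]
    }'
}

# Usage (lag.txt is a placeholder for wherever samples are stored):
#   [ "$(p95 < lag.txt)" -lt 250 ] && echo "WS lag within target"
```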
12. Security posture (infra-layer)
- SELinux enforcing in prod.
- firewalld allows only 80/443/443-udp inbound.
- SSH on a non-default port with key-only auth; fail2ban enabled.
- All app secrets live in server_params (B1 §4.14), not in environment variables, with the sole exception of bootstrap keys (APP_KEY, DB DSN, Redis URL).
- Outbound to api.openai.com and api.anthropic.com only; enforce via firewalld ipset if compliance requires.
- No third-party CDN dependencies in the operator UI — self-hosted Monaco, Tailwind, fonts.
13. iPhone client posture (infra view)
The iPhone app is treated as just another HTTPS client by this layer. It does not connect to the database, Redis, Git, OpenAI, Anthropic, or any internal service — only to LSWS on 443. It cannot read .env, server paths, or secrets. Pairing happens entirely through the Control Center (B1 §8 + spec page 18 §13).
Prohibitions enforced by the API surface:
- No endpoint returns plaintext secrets (regex-scanned in CI).
- No endpoint accepts a workspace path outside the project root.
- No endpoint exposes a shell, Git binary, or filesystem write to arbitrary paths.
- All mutating endpoints require a Sanctum token with the correct ability (see B1 §8.3).
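The path rule above amounts to a containment check after normalisation. The enforcement itself belongs in the Laravel API; the sketch below only illustrates the logic in shell, assuming GNU realpath (`-m` resolves paths that do not exist yet), with a hypothetical function name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if $2, after resolving "." and ".."
# segments and symlinks, is $1 itself or lives underneath it.
path_within_root() {
  local root candidate
  root="$(realpath -m -- "$1")" || return 1
  candidate="$(realpath -m -- "$2")" || return 1
  case "$candidate" in
    "$root"|"$root"/*) return 0 ;;   # exact root or a descendant
    *) return 1 ;;
  esac
}

# Usage:
#   path_within_root /srv/project /srv/project/../etc/passwd || echo "rejected"
```

Matching on "$root"/ rather than a bare prefix keeps a sibling such as /srv/project-evil from passing as a child of /srv/project.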
14. Conflict resolution against earlier spec pages
| Earlier statement | Status | Resolution |
|---|---|---|
| "PostgreSQL 16 is the primary DB" (page 01) | Superseded | MySQL 8 / MariaDB 11.4+ is default; PG+pgvector is opt-in. |
| "Backend-first, iPhone-first" (page 01 callout) | Repolished | Web-first, mobile-supported. Build order still backend → web Control Center → iPhone. |
| "Web console (later, optional)" (page 01 §1) | Superseded | Web Control Center is part of v1; iPhone is alongside, not in front. |
| "Nginx / Apache" anywhere | Removed | LSWS is the only supported web server in prod. |
| "pgvector required" (page 03) | Softened | Driver-aware migration ships both paths; MySQL JSON is default. |
| "Mobile-only architecture" or "iPhone-first" wording | Removed | Parity rule from spec page 18 governs. |
Future edits to pages 01, 03, and 13 must reference B0 in their headers.
15. Acceptance criteria for B0 (infra ready)
- curl -I https://agent.example.com returns HTTP/2 and the HSTS header.
- curl --http3 -I https://agent.example.com succeeds.
- supervisorctl status shows agent-horizon, agent-reverb, agent-schedule all RUNNING.
- mysql -e "SELECT VERSION();" returns 8.0+ or MariaDB 11.4+.
- redis-cli ping returns PONG; redis-cli -n 2 keys 'queues:*' lists Horizon queues after a first php artisan horizon run.
- SELinux is enforcing; an LSWS request to /horizon succeeds without AVC denials in /var/log/audit/audit.log.
- The deploy script in §8 completes end-to-end on a clean release dir and the new release is live in <30 s.
- A QUIC-enabled iPhone reaches /api/me over HTTP/3 (verified via Charles or nscurl).
- Killing agent-reverb causes the Control Center to fall back to polling without losing console events; restarting it resumes WS push within 5 s.
- CI secret-leak scanner (B1 §13) reports 0 hits.
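The database criterion can be scripted rather than read by eye. A minimal sketch that takes the string returned by `SELECT VERSION()` and applies the B0 floor (MySQL 8.0+ or MariaDB 11.4+); the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the version string meets the B0 floor.
db_version_ok() {
  local version="$1" major minor min_major min_minor
  major="${version%%.*}"                      # text before the first dot
  minor="${version#*.}"; minor="${minor%%.*}" # text between first and second dot
  case "$version" in
    *MariaDB*) min_major=11; min_minor=4 ;;
    *)         min_major=8;  min_minor=0 ;;
  esac
  [ "$major" -gt "$min_major" ] ||
    { [ "$major" -eq "$min_major" ] && [ "$minor" -ge "$min_minor" ]; }
}

# Usage:
#   db_version_ok "$(mysql -N -e 'SELECT VERSION();')" || echo "unsupported DB version" >&2
```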
16. Operational runbook (one-pager)
- Deploy: sudo -iu deploy /var/www/agent-workspace/shared/bin/deploy.sh main
- Rollback: swap current symlink to the prior releases/..., restart agent-reverb, php artisan horizon:terminate.
- Restart web: systemctl restart lsws
- Restart workers: supervisorctl restart agent-horizon agent-reverb
- Tail logs: tail -f /var/www/agent-workspace/logs/*.log /usr/local/lsws/logs/error.log
- DB backup now: /var/www/agent-workspace/shared/bin/backup.sh
- Rotate secrets: update via Filament → /admin/server-params, then php artisan config:clear.
- Reindex one project: php artisan agent:reindex --project={id}
17. Out of scope for B0
- Kubernetes / multi-node clustering. v1 is single-node; horizontal scaling is a B7+ topic.
- Bring-your-own-LLM gateways (LiteLLM, vLLM). v1 uses OpenAI + Anthropic + Claude Code directly.
- Per-tenant isolation. v1 is single-tenant; tenancy is schema-ready but not enabled.
From now on every other spec page that talks about web server, OS, PHP runtime, DB engine, or process supervision defers to this page. Future build prompts (B2 onward) extend B0 with workload-specific deltas only.