# MVNexus Operations Runbook

## Daily checks

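- Health endpoints: `GET /health/app`, `GET /health/database`, `GET /health/queue`, `GET /health/reverb`
- Queue depth: `php artisan queue:monitor redis:default,redis:long` (the command requires the `connection:queue` names) or Redis `LLEN` on the queue keys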
- Failed jobs: `php artisan queue:failed`
- Application logs: `storage/logs/laravel.log` (or centralized JSON logs)
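
The endpoint checks above can be scripted. The following is a minimal sketch, assuming each health endpoint answers `200` when healthy (the runbook does not state the response contract) and that `BASE_URL` points at the application; `BASE_URL`, `http_status`, and `check_health` are illustrative names, not part of MVNexus.

```bash
#!/usr/bin/env bash
# Assumption: a healthy endpoint returns HTTP 200.
# BASE_URL is a placeholder; set it to the real host for your environment.
BASE_URL="${BASE_URL:-http://localhost}"

# Print the HTTP status code for a URL ("000" if the request fails outright).
http_status() {
  curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$1"
}

# Probe all four endpoints; return non-zero if any of them is unhealthy.
check_health() {
  local failed=0 endpoint status
  for endpoint in app database queue reverb; do
    status="$(http_status "$BASE_URL/health/$endpoint")"
    if [ "$status" = "200" ]; then
      echo "ok   /health/$endpoint"
    else
      echo "FAIL /health/$endpoint (HTTP $status)"
      failed=1
    fi
  done
  return "$failed"
}
```

Run it as `BASE_URL=https://your-host check_health`; the non-zero return code makes it usable from cron or a CI job.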

## Structured logging

Production recommendation:

```env
LOG_CHANNEL=stack
LOG_STACK=daily,stderr
LOG_LEVEL=info
```

Ship `stderr` to your log aggregator (CloudWatch, Datadog, ELK).

Do not log:

- Bearer tokens or Sanctum secrets
- Encrypted setting values
- Full request bodies for auth endpoints
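
One way to audit for the first item is to scan existing logs for strings that look like bearer tokens. This is a heuristic sketch, not a guarantee: the pattern (the word `Bearer` followed by at least 16 token-like characters) and the function name `count_token_leaks` are assumptions chosen for illustration.

```bash
# Count log lines that appear to contain a bearer token.
# Heuristic: "Bearer", whitespace, then 16 or more token-like characters.
count_token_leaks() {
  grep -cE 'Bearer[[:space:]]+[A-Za-z0-9._|~+/=-]{16,}' "$1"
}
```

For example, `count_token_leaks storage/logs/laravel.log` prints the number of suspicious lines; anything above zero is worth investigating.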

## Incident response

### API 5xx spike

1. Check the PHP-FPM and nginx error logs
2. Verify database connectivity (`/health/database`)
3. Roll back the latest deploy if the spike correlates with it
4. Scale queue workers if the errors stem from a queue backlog

### Queue backlog

Start a worker (or additional workers) on the affected queues:

```bash
php artisan queue:work redis --queue=default,long --tries=3 --backoff=10,30,60
```
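
To gauge the backlog before and after adding workers, the queue lists can be measured directly. A minimal sketch, assuming the Redis queue driver's usual key layout (`queues:<name>` for ready jobs, preceded by whatever key prefix the Redis connection is configured with); `REDIS_PREFIX`, `queue_depth`, and `report_backlog` are illustrative names.

```bash
# Assumption: ready jobs live in the Redis list "<prefix>queues:<name>".
# Delayed and reserved jobs are stored under separate keys and are not counted.
REDIS_PREFIX="${REDIS_PREFIX:-}"

# Print the number of ready jobs on one queue.
queue_depth() {
  redis-cli LLEN "${REDIS_PREFIX}queues:$1"
}

# Print the depth of each named queue and flag any above the threshold.
report_backlog() {
  local threshold="$1" queue depth
  shift
  for queue in "$@"; do
    depth="$(queue_depth "$queue")"
    if [ "$depth" -gt "$threshold" ]; then
      echo "$queue: $depth pending (over threshold $threshold)"
    else
      echo "$queue: $depth pending"
    fi
  done
}
```

For example, `report_backlog 100 default long` checks both queues used by the worker command above against a threshold of 100 jobs.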

Review the `failed_jobs` table, fix the underlying cause, then retry:

```bash
php artisan queue:retry all
```
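
Retrying everything at once is not always desirable. The sketch below wraps the more selective artisan commands in small helper functions; the function names are illustrative, and the job ID is whatever `php artisan queue:failed` shows for the job in question.

```bash
# List failed jobs along with their IDs.
list_failed() { php artisan queue:failed; }

# Retry a single failed job by the ID shown in the listing.
retry_one() { php artisan queue:retry "$1"; }

# Retry only the failed jobs that belong to one queue.
retry_queue() { php artisan queue:retry --queue="$1"; }

# Delete a failed job that should not be retried.
forget_one() { php artisan queue:forget "$1"; }
```

A cautious sequence is `list_failed`, then `retry_one <id>` on a single job to confirm the fix, and only then `php artisan queue:retry all`.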

### Reverb / realtime down

1. Check that the `reverb` container or process is running
2. Verify `BROADCAST_CONNECTION=reverb` and the Reverb keys in `.env`
3. Confirm that nginx proxies WebSocket traffic on `/app`
4. Expect the frontend to fall back gracefully when `VITE_REVERB_APP_KEY` is unset
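
The `.env` check in step 2 can be automated. A minimal sketch, assuming the standard Reverb variable names (`REVERB_APP_ID`, `REVERB_APP_KEY`, `REVERB_APP_SECRET`); the function name `missing_reverb_keys` is illustrative, and the check only confirms that each key has a non-empty value, not that the value is correct.

```bash
# Print each required Reverb setting that is missing or empty in an env file.
missing_reverb_keys() {
  local env_file="$1" key
  for key in BROADCAST_CONNECTION REVERB_APP_ID REVERB_APP_KEY REVERB_APP_SECRET; do
    # A key counts as present only if it has at least one character after "=".
    grep -qE "^${key}=.+" "$env_file" || echo "$key"
  done
}
```

Running `missing_reverb_keys .env` prints nothing when all four settings are present.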

### Database maintenance

- Take backups before running migrations
- Run `php artisan migrate --force` during a maintenance window
- Rebuild caches after the migration (for example, `php artisan optimize`)
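
The migrate-then-recache steps can be chained so that a failed migration stops the sequence. This is a sketch under assumptions: the backup step is left out because the runbook does not name the database engine or backup tool, and `migrate_and_recache` is an illustrative name.

```bash
# Run migrations, then clear and rebuild the framework caches.
# Each step runs only if the previous one succeeded.
# Assumption: a backup has already been taken by whatever tool you use.
migrate_and_recache() {
  php artisan migrate --force &&
  php artisan optimize:clear &&
  php artisan optimize
}
```

Call `migrate_and_recache` between `php artisan down` and `php artisan up` so users never hit a half-migrated schema.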

## Maintenance mode

```bash
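# Respond with 503 and Retry-After: 60; browsers on the maintenance page reload every 15 seconds
php artisan down --retry=60 --refresh=15
# Deploy and run migrations here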
php artisan up
```

## Scaling guidance

- Horizontally scale `app` (PHP-FPM) behind a load balancer with shared sessions (Redis)
- Scale `queue` workers independently
- Run a single `scheduler` leader (or use cron on one node)
- Reverb may require sticky sessions or Redis-backed scaling (`REVERB_SCALING_ENABLED=true`) when run on multiple nodes; see the Laravel Reverb docs
