First Deployment Runbook
Purpose
Execute the first real deployment with a repeatable sequence that covers infrastructure, secrets, webhook cutover, smoke checks, scheduler rollout, and rollback.
Preconditions
- main is green in CI.
- Terraform baseline has already been reviewed for the target environment.
- You have access to:
  - GCP project
  - GitHub repo settings
  - Telegram bot token
  - Supabase project and three database URLs:
    - owner DATABASE_URL for migrations only
    - APP_DATABASE_URL for authenticated request paths
    - WORKER_DATABASE_URL for bot and scheduler workers
Required Configuration Inventory
Terraform variables
Required in your environment *.tfvars:
- project_id
- region
- environment
- bot_api_image
- mini_app_image
Recommended:
- app_database_url_secret_id = "app-database-url"
- worker_database_url_secret_id = "worker-database-url"
- telegram_bot_token_secret_id = "telegram-bot-token"
- openai_api_key_secret_id = "openai-api-key"
- bot_mini_app_allowed_origins
- scheduler_timezone
- scheduler_paused = true
- scheduler_dry_run = true
Household chat/topic bindings are no longer deployment config. Configure them in Telegram with
/setup, /bind_purchase_topic, /bind_feedback_topic, and /bind_payments_topic after deploy.
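Before planning, it can help to confirm that the tfvars file actually assigns every required variable listed above. The following is a minimal sketch, not part of the repository: the function name missing_tfvars is hypothetical, and it only looks for top-level `name = ...` assignments without validating the values.

```shell
# Print each required variable name that has no top-level assignment
# in the given tfvars file. Exits non-zero if any are missing.
# NOTE: missing_tfvars is a hypothetical helper, not a repository script.
missing_tfvars() (
  file="$1"
  shift
  status=0
  for key in "$@"; do
    # Match "key = ..." at the start of a line, allowing optional spaces.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
      echo "$key"
      status=1
    fi
  done
  exit "$status"
)

# Example, using the dev.tfvars path from this runbook:
# missing_tfvars infra/terraform/dev.tfvars \
#   project_id region environment bot_api_image mini_app_image
```

An empty result with exit status 0 means every listed name is assigned somewhere in the file.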
Secret Manager values
Create the secret resources via Terraform, then add secret versions for:
- telegram-bot-token
- telegram-webhook-secret
- scheduler-shared-secret
- app-database-url
- worker-database-url
- optional openai-api-key
GitHub Actions secrets
Required for CD:
- GCP_PROJECT_ID
- GCP_WORKLOAD_IDENTITY_PROVIDER
- GCP_SERVICE_ACCOUNT
Required for a real deploy:
- DATABASE_URL
GitHub Actions variables
Set if you do not want the defaults:
- GCP_REGION
- ARTIFACT_REPOSITORY
- CLOUD_RUN_SERVICE_BOT
- CLOUD_RUN_SERVICE_MINI
Phase 1: Local Readiness
Run the quality gates locally from the deployment ref:
bun run format:check
bun run lint
bun run typecheck
bun run test
bun run build
If the release includes schema changes, also run:
bun run db:check
E2E_SMOKE_ALLOW_WRITE=true bun run test:e2e
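The gates above are only useful if a failure stops the sequence. One way to run them as a single step is a small wrapper that halts at the first failing command. This is a sketch under that assumption; run_gates is a hypothetical name, not a script in the repository.

```shell
# Run each command in order; stop at the first failure and report it.
# NOTE: run_gates is a hypothetical helper, not a repository script.
run_gates() {
  for gate in "$@"; do
    echo "==> $gate"
    if ! sh -c "$gate"; then
      echo "FAILED: $gate" >&2
      return 1
    fi
  done
  echo "all gates passed"
}

# Example:
# run_gates "bun run format:check" "bun run lint" "bun run typecheck" \
#   "bun run test" "bun run build"
```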
Phase 2: Provision or Reconcile Infrastructure
- Prepare environment-specific variables:
cp infra/terraform/terraform.tfvars.example infra/terraform/dev.tfvars
- Initialize Terraform with the correct state bucket:
terraform -chdir=infra/terraform init -backend-config="bucket=<terraform-state-bucket>"
- Review and apply:
terraform -chdir=infra/terraform plan -var-file=dev.tfvars
terraform -chdir=infra/terraform apply -var-file=dev.tfvars
- Capture outputs:
BOT_API_URL="$(terraform -chdir=infra/terraform output -raw bot_api_service_url)"
MINI_APP_URL="$(terraform -chdir=infra/terraform output -raw mini_app_service_url)"
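- If you did not know the mini app URL before the first apply, set bot_mini_app_allowed_origins = ["<mini-app-url>"] in dev.tfvars, using the value of MINI_APP_URL, and apply again. A tfvars file does not expand shell variables, so write the URL out literally.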
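Because a tfvars file does not expand shell variables, the captured URL has to be written into dev.tfvars as a literal string. A minimal sketch of a helper that renders that line from MINI_APP_URL follows. The name render_allowed_origins is illustrative, and the https-only check assumes the origin is a Cloud Run service URL.

```shell
# Print the tfvars assignment for a single allowed origin.
# Rejects anything that is not an https:// URL and drops a trailing slash,
# because a browser Origin header never ends with one.
# NOTE: render_allowed_origins is a hypothetical helper.
render_allowed_origins() (
  url="${1%/}"
  case "$url" in
    https://?*) ;;
    *) echo "expected an https:// URL, got: $1" >&2; exit 1 ;;
  esac
  printf 'bot_mini_app_allowed_origins = ["%s"]\n' "$url"
)

# Example: print the line, then paste it into infra/terraform/dev.tfvars.
# render_allowed_origins "$MINI_APP_URL"
```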
Phase 3: Add Runtime Secret Versions
Use the real project ID from Terraform variables:
echo -n "<telegram-bot-token>" | gcloud secrets versions add telegram-bot-token --data-file=- --project <project_id>
echo -n "<telegram-webhook-secret>" | gcloud secrets versions add telegram-webhook-secret --data-file=- --project <project_id>
echo -n "<scheduler-shared-secret>" | gcloud secrets versions add scheduler-shared-secret --data-file=- --project <project_id>
echo -n "<app-database-url>" | gcloud secrets versions add app-database-url --data-file=- --project <project_id>
echo -n "<worker-database-url>" | gcloud secrets versions add worker-database-url --data-file=- --project <project_id>
Add optional secret versions only if those integrations are enabled.
For a functional household deployment, set both app_database_url_secret_id and
worker_database_url_secret_id in dev.tfvars before the apply that creates the Cloud Run
services. Otherwise the bot deploys without APP_DATABASE_URL and WORKER_DATABASE_URL, and mini
app auth, finance commands, reminders, purchase ingestion, and anonymous feedback remain disabled.
Keep DATABASE_URL out of normal runtime secrets. It is only required in GitHub Actions for the
migration step that runs before deploy.
Keep telegram_bot_token_secret_id = "telegram-bot-token" aligned with the actual bot token
secret name. CD uses that secret to sync the Telegram command menu after deploy.
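A stray newline or space in a token or database URL is an easy mistake when piping values into Secret Manager, and it tends to show up later as an authentication or connection failure. The sketch below checks a value before it is uploaded. The function name secret_is_clean is hypothetical, and it assumes that none of these secrets legitimately contain whitespace.

```shell
# Succeeds only if the value on stdin is non-empty and contains no
# whitespace or line breaks.
# NOTE: secret_is_clean is a hypothetical helper, not a repository script.
secret_is_clean() (
  # Command substitution strips trailing newlines, so append a sentinel
  # character and remove it again to keep them visible.
  value="$(cat; printf x)"
  value="${value%x}"
  [ -n "$value" ] || exit 1
  case "$value" in
    *[[:space:]]*) exit 1 ;;
  esac
  exit 0
)

# Example (printf '%s' writes no trailing newline in any POSIX shell):
# printf '%s' "$VALUE" | secret_is_clean &&
#   printf '%s' "$VALUE" | gcloud secrets versions add app-database-url \
#     --data-file=- --project <project_id>
```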
Phase 4: Configure GitHub CD
Populate GitHub repository secrets. The three GCP values come from the Terraform outputs; DATABASE_URL is the owner database URL used for migrations:
- GCP_PROJECT_ID
- GCP_WORKLOAD_IDENTITY_PROVIDER
- GCP_SERVICE_ACCOUNT
- DATABASE_URL
If you prefer the GitHub CLI:
gh secret set GCP_PROJECT_ID
gh secret set GCP_WORKLOAD_IDENTITY_PROVIDER
gh secret set GCP_SERVICE_ACCOUNT
gh secret set DATABASE_URL
Set GitHub repository variables if you want to override the defaults used by .github/workflows/cd.yml.
- optional TELEGRAM_BOT_TOKEN_SECRET_ID
  - only needed if your bot token secret name is not telegram-bot-token
Phase 5: Trigger the First Deployment
You have two safe options:
- Merge the deployment ref into main and let CD run after successful CI.
- Trigger CD manually from the GitHub Actions UI with workflow_dispatch.
The workflow will:
- run bun run db:migrate before deploy
- build and push bot and mini app images
- deploy both Cloud Run services
Phase 6: Telegram Webhook Cutover
After the bot service is live, set the webhook explicitly:
export TELEGRAM_BOT_TOKEN="$(gcloud secrets versions access latest --secret telegram-bot-token --project <project_id>)"
export TELEGRAM_WEBHOOK_SECRET="$(gcloud secrets versions access latest --secret telegram-webhook-secret --project <project_id>)"
export TELEGRAM_WEBHOOK_URL="${BOT_API_URL}/webhook/telegram"
bun run ops:telegram:webhook set
bun run ops:telegram:webhook info
If you want to discard queued updates during cutover:
export TELEGRAM_DROP_PENDING_UPDATES=true
bun run ops:telegram:webhook set
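The webhook commands above appear to take their inputs from the environment, so an unset variable (for example BOT_API_URL lost in a new terminal session) can yield a wrong webhook URL. A small preflight check is sketched here. missing_vars is a hypothetical name, and the check only confirms that each variable is set and non-empty in the current shell, not that it was exported or holds a valid value.

```shell
# Print the name of every listed variable that is unset or empty.
# Exits non-zero if anything is missing.
# NOTE: missing_vars is a hypothetical helper; pass plain variable names only,
# because the names are expanded with eval.
missing_vars() (
  status=0
  for name in "$@"; do
    # Indirect lookup that works in POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  exit "$status"
)

# Example:
# missing_vars BOT_API_URL TELEGRAM_BOT_TOKEN TELEGRAM_WEBHOOK_SECRET \
#   TELEGRAM_WEBHOOK_URL && bun run ops:telegram:webhook set
```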
Phase 7: Post-Deploy Smoke Checks
Run the smoke script:
export BOT_API_URL
export MINI_APP_URL
export TELEGRAM_EXPECTED_WEBHOOK_URL="${BOT_API_URL}/webhook/telegram"
bun run ops:deploy:smoke
The smoke script verifies:
- bot health endpoint
- mini app root delivery
- mini app auth endpoint is mounted
- scheduler endpoint rejects unauthenticated requests
- Telegram webhook matches the expected URL when bot token is provided
Production deploys should also set MINI_APP_ALLOWED_ORIGINS explicitly. The browser path remains
bot API only; there is no supported direct browser access to Supabase.
Phase 8: Scheduler Enablement
First release:
- Keep scheduler_paused = true and scheduler_dry_run = true on initial deploy.
- After smoke checks pass, set scheduler_paused = false and apply Terraform.
- Trigger one job manually:
gcloud scheduler jobs run household-dev-utilities --location <region> --project <project_id>
- Verify the reminder request succeeded and produced dryRun: true logs.
- Set scheduler_dry_run = false and apply Terraform.
- Trigger one job again and verify the delivery side behaves as expected.
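The enablement sequence is effectively a walk through the combinations of the two scheduler flags. As a quick reference, the sketch below maps the current pair of values to the next action. The function name next_scheduler_step is hypothetical, and the advice for the paused-with-dry-run-off state is a suggestion rather than something this runbook prescribes.

```shell
# Map the current (scheduler_paused, scheduler_dry_run) pair to the next
# action in the staged rollout.
# NOTE: next_scheduler_step is a hypothetical helper, not a repository script.
next_scheduler_step() {
  case "$1/$2" in
    true/true)
      echo "after smoke checks pass: set scheduler_paused = false and apply" ;;
    false/true)
      echo "run one job, confirm dryRun: true logs, then set scheduler_dry_run = false and apply" ;;
    false/false)
      echo "run one job and verify real delivery" ;;
    true/false)
      # Reachable after a rollback; suggested handling, not from the runbook.
      echo "paused with dry run off: consider setting scheduler_dry_run = true before unpausing" ;;
    *)
      echo "usage: next_scheduler_step <paused> <dry_run> (true|false)" >&2
      return 2 ;;
  esac
}

# Example:
# next_scheduler_step true true
```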
Rollback
If the release is unhealthy:
- Pause scheduler jobs again in Terraform:
terraform -chdir=infra/terraform apply -var-file=dev.tfvars -var='scheduler_paused=true'
- Move Cloud Run traffic back to the last healthy revision:
gcloud run revisions list --service <bot-service-name> --region <region> --project <project_id>
gcloud run services update-traffic <bot-service-name> --region <region> --project <project_id> --to-revisions <previous-revision>=100
gcloud run revisions list --service <mini-service-name> --region <region> --project <project_id>
gcloud run services update-traffic <mini-service-name> --region <region> --project <project_id> --to-revisions <previous-revision>=100
- If webhook traffic must stop immediately:
bun run ops:telegram:webhook delete
- If migrations were additive, leave schema in place and roll application code back.
- If a destructive migration failed, stop and use the rollback SQL prepared in that PR.
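When choosing the revision for update-traffic, it can be convenient to pick the one deployed immediately before the unhealthy one. The sketch below does that from a list of revision names ordered newest first. revision_before is a hypothetical name, and the gcloud field names in the example comment are assumptions to verify against your gcloud version. The immediately preceding revision is not guaranteed to be healthy, so confirm it before shifting traffic.

```shell
# Read revision names (newest first, one per line) on stdin and print the
# one listed directly after the given revision, i.e. the next-older one.
# Exits non-zero if the given revision is absent or has no predecessor.
# NOTE: revision_before is a hypothetical helper, not a repository script.
revision_before() {
  awk -v current="$1" '
    found { print; printed = 1; exit }
    $0 == current { found = 1 }
    END { if (!printed) exit 1 }
  '
}

# Example (the --format and --sort-by field names are assumptions):
# gcloud run revisions list --service <bot-service-name> --region <region> \
#   --project <project_id> --format='value(metadata.name)' \
#   --sort-by='~metadata.creationTimestamp' | revision_before <current-revision>
```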
Dev-to-Prod Promotion Notes
- Repeat the same sequence with a separate prod.tfvars and Terraform state.
- Keep separate GCP projects for dev and prod when possible.
- Do not unpause production scheduler jobs until prod smoke checks are complete.