lokalise-incident-runbook

Execute Lokalise incident response procedures with triage, mitigation, and postmortem. Use when responding to Lokalise-related outages, investigating errors, or running post-incident reviews for Lokalise integration failures. Trigger with phrases like "lokalise incident", "lokalise outage", "lokalise down", "lokalise on-call", "lokalise emergency", "translations broken".

allowed-tools: Read, Grep, Bash(curl:*), Bash(lokalise2:*)
version: 1.0.0
license: MIT
author: Jeremy Longshore <jeremy@intentsolutions.io>

Allowed Tools

Read, Grep, Bash(curl:*), Bash(lokalise2:*)

Provided by Plugin

lokalise-pack

Claude Code skill pack for Lokalise (24 skills)

saas packs v1.0.0

Installation

This skill is included in the lokalise-pack plugin:

/plugin install lokalise-pack@claude-code-plugins-plus

Instructions

# Lokalise Incident Runbook

## Overview

Rapid incident response procedures for Lokalise-related outages and issues.

## Prerequisites

- Access to Lokalise dashboard and status page
- Application logs and monitoring dashboards
- Communication channels (Slack, PagerDuty)
- Rollback procedures documented

## Severity Levels

| Level | Definition | Response Time | Examples |
|-------|------------|---------------|----------|
| P1 | Complete outage | < 15 min | Lokalise API unreachable, all translations missing |
| P2 | Degraded service | < 1 hour | High latency, partial failures, key languages affected |
| P3 | Minor impact | < 4 hours | Webhook delays, non-critical translations missing |
| P4 | No user impact | Next business day | Monitoring gaps, sync delays |

## Quick Triage Commands

```bash
# 1. Check Lokalise API status
curl -s https://api.lokalise.com/api2/system/health | jq

# 2. Check Lokalise status page
curl -s https://status.lokalise.com/api/v2/status.json | jq '.status.description'

# 3. Test API authentication (prints only the HTTP status code)
curl -s -o /dev/null -w "%{http_code}" \
  -H "X-Api-Token: $LOKALISE_API_TOKEN" \
  "https://api.lokalise.com/api2/projects?limit=1"

# 4. Check your app's translation health
curl -s https://your-app.com/health/lokalise | jq

# 5. Check recent error logs (last 50 matching lines)
grep -i "lokalise" /var/log/app/*.log | tail -50
```

## Decision Tree

```
Translations missing/broken?
├─ YES: Is https://status.lokalise.com showing incident?
│   ├─ YES → Enable fallback translations. Monitor status page.
│   └─ NO → Check our integration. Continue triage below.
└─ NO: Is sync failing?
    ├─ YES → Check CI/CD pipeline, API token validity
    └─ NO → Likely localized issue. Check specific component.

API returning errors?
├─ 401 → Token expired/invalid. Rotate token.
├─ 403 → Permission denied. Check project access.
├─ 404 → Project/key not found. Verify IDs.
├─ 429 → Rate limited. Enable backoff/queuing.
└─ 5xx → Lokalise issue. Enable fallback, monitor status.
```

## Immediate Actions by Error Type

### 401/403 - Authentication/Authorization

```bash
# Verify token is set (prints an empty value if it is not)
echo "Token set: ${LOKALISE_API_TOKEN:+YES}"

# Test token validity
curl -s -H "X-Api-Token: $LOKALISE_API_TOKEN" \
  "https://api.lokalise.com/api2/projects?limit=1" | jq '.projects[0].name // .error'

# If token is invalid:
# 1. Generate new token in Lokalise Profile Settings
# 2. Update in secret manager
# 3. Restart affected services
```

### 429 - Rate Limited

```bash
# Check current rate limit status (dump the response headers of a GET request)
curl -s -o /dev/null -D - -H "X-Api-Token: $LOKALISE_API_TOKEN" \
  "https://api.lokalise.com/api2/projects" | grep -iE "x-rate-?limit"

# Immediate mitigation:
# 1. Reduce concurrent requests
# 2. Add delays between API calls
# 3. Enable request queuing

# Application config
export LOKALISE_RATE_LIMIT_MODE=queue
export LOKALISE_REQUEST_DELAY_MS=200
```

### 5xx - Lokalise Server Errors

```bash
# Check Lokalise status
curl -s https://status.lokalise.com/api/v2/status.json | jq

# Enable graceful degradation
export LOKALISE_FALLBACK_ENABLED=true

# Use bundled translations as fallback
# Your app should fall back to static locale files

# Monitor for resolution
watch -n 60 'curl -s https://status.lokalise.com/api/v2/status.json | jq ".status.description"'
```

### Missing Translations

```bash
# Check if translations exist in Lokalise
lokalise2 --token "$LOKALISE_API_TOKEN" \
  --project-id "$LOKALISE_PROJECT_ID" \
  key list --filter-untranslated-any 1 | head -20

# Check local translation files
ls -la src/locales/
jq 'keys | length' src/locales/en.json

# If files missing, re-download
npm run i18n:pull
```

## Communication Templates

### Internal (Slack)

```
:red_circle: P1 INCIDENT: Lokalise Integration
Status: INVESTIGATING
Impact: [Translations not loading for all users]
Current action: [Checking API status and enabling fallback]
Next update: [Time + 15 min]
Incident commander: @[name]
```

### External (Status Page)

```
Translation Service Issue

We're experiencing issues with our translation service.
Some users may see English text instead of their preferred language.

We're actively investigating and will provide updates.

Last updated: [timestamp]
```

## Post-Incident

### Evidence Collection

```bash
# Generate debug bundle
bash scripts/lokalise-debug-bundle.sh

# Export relevant logs
grep -i "lokalise" /var/log/app/*.log > incident-logs.txt

# Export metrics snapshot for the last 2 hours
# (query_range requires explicit start, end, and step parameters)
end=$(date +%s)
start=$((end - 7200))
curl -s -G "http://prometheus:9090/api/v1/query_range" \
  --data-urlencode "query=lokalise_errors_total" \
  --data-urlencode "start=$start" \
  --data-urlencode "end=$end" \
  --data-urlencode "step=60" > metrics.json

# Screenshot dashboards
# Document timeline in incident channel
```

### Postmortem Template

```markdown
## Incident: Lokalise [Error Type]

**Date:** YYYY-MM-DD
**Duration:** X hours Y minutes
**Severity:** P[1-4]
**Incident Commander:** [Name]

### Summary
[1-2 sentence description of what happened]

### Timeline (all times UTC)
- HH:MM - [Alert triggered / Issue reported]
- HH:MM - [Investigation started]
- HH:MM - [Root cause identified]
- HH:MM - [Mitigation applied]
- HH:MM - [Service restored]

### Root Cause
[Technical explanation of what went wrong]

### Impact
- Users affected: [N]
- Duration of user impact: [X minutes]
- Translations affected: [Languages/keys]

### What Went Well
- [Positive aspects of response]

### What Could Be Improved
- [Areas for improvement]

### Action Items
- [ ] [Preventive measure] - Owner - Due date
- [ ] [Documentation update] - Owner - Due date
- [ ] [Monitoring improvement] - Owner - Due date
```

## Rollback Procedures

### Revert to Bundled Translations

```bash
# If translations are corrupted, revert to the last known-good commit
# (HEAD~1 assumes the previous commit was good; substitute the correct ref if not)
git checkout HEAD~1 -- src/locales/

# Rebuild and deploy
npm run build
npm run deploy

# Disable auto-sync temporarily
export LOKALISE_AUTO_SYNC=false
```

### Re-download from Lokalise

```bash
# Download reviewed translations only
lokalise2 --token "$LOKALISE_API_TOKEN" \
  --project-id "$LOKALISE_PROJECT_ID" \
  file download \
  --format json \
  --filter-data reviewed \
  --unzip-to ./src/locales
```

## Output

- Issue identified and categorized
- Remediation applied
- Stakeholders notified
- Evidence collected for postmortem

## Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| Can't reach status page | Network issue | Use a mobile connection or VPN |
| Logs unavailable | Logging down | Check logging service |
| Fallback not working | Not configured | Enable static fallback |
| Token rotation fails | Permission denied | Escalate to admin |

## Examples

### One-Line Health Check

```bash
# jq -e exits non-zero when curl produced no output or .status is null/false
curl -sf https://your-app.com/health/lokalise | jq -e '.status' || echo "UNHEALTHY"
```

### Quick Fallback Enable

```typescript
// Supplied by your app. loadRemoteTranslations is a hypothetical name for the normal loader.
declare function loadBundledTranslations(locale: string): Record<string, string>;
declare function loadRemoteTranslations(locale: string): Record<string, string>;

// Emergency fallback - use in incident
process.env.LOKALISE_FALLBACK_ENABLED = "true";

// Your translation loader should check this
function loadTranslations(locale: string): Record<string, string> {
  const useFallback = process.env.LOKALISE_FALLBACK_ENABLED === "true";
  if (useFallback) {
    return loadBundledTranslations(locale);
  }
  return loadRemoteTranslations(locale);
}
```

## Resources

- [Lokalise Status Page](https://status.lokalise.com)
- [Lokalise Support](mailto:support@lokalise.com)
- [Community Forum](https://community.lokalise.com)

## Next Steps

For data handling, see `lokalise-data-handling`.
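
As a supplement to the Decision Tree, the "API returning errors?" branch can be sketched as a small shell helper for on-call use. This is a minimal sketch rather than part of the skill itself: the function name `lokalise_triage_hint` is hypothetical, the 2xx and catch-all branches are assumptions added for completeness, and the other messages restate the runbook's own guidance.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an HTTP status code to the runbook's first action.
lokalise_triage_hint() {
  local code="$1"
  case "$code" in
    2??) echo "OK: API reachable and token accepted." ;;            # assumption: not in the runbook
    401) echo "Token expired/invalid. Rotate token." ;;
    403) echo "Permission denied. Check project access." ;;
    404) echo "Project/key not found. Verify IDs." ;;
    429) echo "Rate limited. Enable backoff/queuing." ;;
    5??) echo "Lokalise issue. Enable fallback, monitor status." ;;
    *)   echo "Unexpected status '$code'. Continue manual triage." ;; # e.g. 000 when curl gets no response
  esac
}

# Usage (feeds in the status code from triage command 3):
#   code=$(curl -s -o /dev/null -w "%{http_code}" \
#     -H "X-Api-Token: $LOKALISE_API_TOKEN" \
#     "https://api.lokalise.com/api2/projects?limit=1")
#   lokalise_triage_hint "$code"
```

Keeping the mapping in one function means the on-call engineer reads the same wording as the decision tree, and the catch-all branch makes an unreachable API (curl reports `000`) visible instead of silently printing nothing.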

Skill file: plugins/saas-packs/lokalise-pack/skills/lokalise-incident-runbook/SKILL.md