
databricks-webhooks-events

Configure Databricks job notifications, webhooks, and event handling. Use when setting up Slack/Teams notifications, configuring alerts, or integrating Databricks events with external systems. Trigger with phrases like "databricks webhook", "databricks notifications", "databricks alerts", "job failure notification", "databricks slack".

Version: 1.0.0 | License: MIT | Author: Jeremy Longshore <jeremy@intentsolutions.io>

Allowed Tools

Read, Write, Edit, Bash(databricks:*)

Provided by Plugin

databricks-pack

Claude Code skill pack for Databricks (24 skills)

saas-packs v1.0.0

Installation

This skill is included in the databricks-pack plugin:

/plugin install databricks-pack@claude-code-plugins-plus


Instructions

# Databricks Webhooks & Events

## Overview

Configure notifications and event handling for Databricks jobs and pipelines.

## Prerequisites

- Databricks workspace access
- Webhook endpoint (Slack, Teams, PagerDuty, etc.)
- Job permissions for notification configuration

## Instructions

### Step 1: Configure Job Notifications

```yaml
# resources/jobs.yml
resources:
  jobs:
    etl_pipeline:
      name: etl-pipeline

      # Email notifications
      email_notifications:
        on_start:
          - team@company.com
        on_success:
          - success-alerts@company.com
        on_failure:
          - oncall@company.com
          - pagerduty@company.pagerduty.com
        on_duration_warning_threshold_exceeded:
          - oncall@company.com
        no_alert_for_skipped_runs: true
        no_alert_for_canceled_runs: false

      # Webhook notifications
      webhook_notifications:
        on_start:
          - id: ${var.slack_webhook_id}
        on_success:
          - id: ${var.slack_webhook_id}
        on_failure:
          - id: ${var.pagerduty_webhook_id}
          - id: ${var.slack_webhook_id}

      # System notification destinations
      notification_settings:
        no_alert_for_skipped_runs: true
        no_alert_for_canceled_runs: false
```

### Step 2: Create Webhook Destinations

```python
# scripts/setup_webhooks.py
from typing import Optional

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.settings import (
    SlackConfig,
    MicrosoftTeamsConfig,
    PagerdutyConfig,
    GenericWebhookConfig,
)


def setup_slack_webhook(
    w: WorkspaceClient,
    name: str,
    webhook_url: str,
) -> str:
    """Create Slack notification destination."""
    destination = w.notification_destinations.create(
        display_name=name,
        config=SlackConfig(url=webhook_url),
    )
    return destination.id


def setup_teams_webhook(
    w: WorkspaceClient,
    name: str,
    webhook_url: str,
) -> str:
    """Create Microsoft Teams notification destination."""
    destination = w.notification_destinations.create(
        display_name=name,
        config=MicrosoftTeamsConfig(url=webhook_url),
    )
    return destination.id


def setup_pagerduty_webhook(
    w: WorkspaceClient,
    name: str,
    integration_key: str,
) -> str:
    """Create PagerDuty notification destination."""
    destination = w.notification_destinations.create(
        display_name=name,
        config=PagerdutyConfig(
            integration_key=integration_key,
        ),
    )
    return destination.id


def setup_generic_webhook(
    w: WorkspaceClient,
    name: str,
    webhook_url: str,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> str:
    """Create generic webhook destination."""
    config = GenericWebhookConfig(
        url=webhook_url,
        username=username,
        password=password,
    )
    destination = w.notification_destinations.create(
        display_name=name,
        config=config,
    )
    return destination.id


# Usage
w = WorkspaceClient()
slack_id = setup_slack_webhook(
    w, "Data Team Slack", "https://hooks.slack.com/services/T00/B00/XXX"
)
print(f"Slack webhook ID: {slack_id}")
```

### Step 3: Custom Webhook Handler

```python
# webhook_handler/app.py
from datetime import datetime

from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route("/databricks/webhook", methods=["POST"])
def handle_databricks_webhook():
    """Handle incoming Databricks job notifications."""
    payload = request.json

    # Parse Databricks webhook payload
    event_type = payload.get("event_type")
    job_id = payload.get("job_id")
    run_id = payload.get("run_id")
    run_page_url = payload.get("run_page_url")

    # Job run state info
    state = payload.get("state", {})
    life_cycle_state = state.get("life_cycle_state")
    result_state = state.get("result_state")
    state_message = state.get("state_message")

    # Process based on event type
    if event_type == "jobs.on_failure":
        handle_job_failure(payload)
    elif event_type == "jobs.on_success":
        handle_job_success(payload)
    elif event_type == "jobs.on_start":
        handle_job_start(payload)

    return jsonify({"status": "received"})


def handle_job_failure(payload: dict):
    """Process job failure notification."""
    job_name = payload.get("job_name", "Unknown Job")
    run_id = payload.get("run_id")
    error_message = payload.get("state", {}).get("state_message", "")

    # Build a Slack Block Kit message for the failure
    slack_message = {
        "blocks": [
            {
                "type": "header",
                "text": {
                    "type": "plain_text",
                    "text": f"Job Failed: {job_name}",
                    "emoji": True
                }
            },
            {
                "type": "section",
                "fields": [
                    {"type": "mrkdwn", "text": f"*Run ID:* {run_id}"},
                    {"type": "mrkdwn", "text": f"*Time:* {datetime.now().isoformat()}"},
                ]
            },
            {
                "type": "section",
                "text": {
                    "type": "mrkdwn",
                    "text": f"*Error:* {error_message[:500]}"
                }
            },
            {
                "type": "actions",
                "elements": [
                    {
                        "type": "button",
                        "text": {"type": "plain_text", "text": "View Run"},
                        "url": payload.get("run_page_url"),
                    }
                ]
            }
        ]
    }

    # Send to Slack (implement your Slack client)
    # slack_client.send_message(slack_message)

    # Create PagerDuty incident
    # pagerduty_client.create_incident(...)


def handle_job_success(payload: dict):
    """Process job success notification."""
    # Log metrics, update dashboards, etc.
    pass


def handle_job_start(payload: dict):
    """Process job start notification."""
    # Update job tracking, send start notification, etc.
    pass


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

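Databricks gives webhook endpoints only a short window to respond (the exact limit is worth verifying for your workspace), so if the failure handling above does slow work inline (Slack calls, PagerDuty incidents, database writes), deliveries can time out, as noted in the Error Handling table below. A common mitigation is to acknowledge immediately and process in the background. The sketch below is a minimal, standard-library version of that pattern; the route and payload keys mirror the handler above, and the timeout behavior is an assumption rather than something this skill guarantees.

```python
# webhook_handler/async_app.py
# Sketch: acknowledge fast, process later. Not a drop-in replacement for the
# handler above; it only shows the queue/worker pattern.
import queue
import threading

from flask import Flask, request, jsonify

app = Flask(__name__)
events: "queue.Queue[dict]" = queue.Queue(maxsize=1000)


def worker() -> None:
    """Drain queued payloads and do the slow work off the request path."""
    while True:
        payload = events.get()
        try:
            # Slow processing goes here: Slack/PagerDuty calls, DB writes, etc.
            print("processing", payload.get("event_type"), payload.get("run_id"))
        finally:
            events.task_done()


threading.Thread(target=worker, daemon=True).start()


@app.route("/databricks/webhook", methods=["POST"])
def handle_databricks_webhook():
    """Enqueue the payload and return immediately."""
    try:
        events.put_nowait(request.json or {})
    except queue.Full:
        # Still acknowledge so the sender does not keep retrying; log and move on.
        return jsonify({"status": "dropped"}), 202
    return jsonify({"status": "queued"}), 200


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```
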
### Step 4: System Tables for Event Monitoring

```sql
-- Query job run events from system tables
SELECT
  job_id,
  job_name,
  run_id,
  result_state,
  error_message,
  start_time,
  end_time,
  (end_time - start_time) / 1000 / 60 as duration_minutes
FROM system.lakeflow.job_run_timeline
WHERE start_time > current_timestamp() - INTERVAL 24 HOURS
ORDER BY start_time DESC;

-- Alert on failures
SELECT *
FROM system.lakeflow.job_run_timeline
WHERE result_state = 'FAILED'
  AND start_time > current_timestamp() - INTERVAL 1 HOUR;

-- Calculate failure rate by job
SELECT
  job_name,
  COUNT(*) as total_runs,
  SUM(CASE WHEN result_state = 'FAILED' THEN 1 ELSE 0 END) as failures,
  ROUND(SUM(CASE WHEN result_state = 'FAILED' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) as failure_rate
FROM system.lakeflow.job_run_timeline
WHERE start_time > current_timestamp() - INTERVAL 7 DAYS
GROUP BY job_name
HAVING failure_rate > 5
ORDER BY failure_rate DESC;
```

### Step 5: SQL Alert Integration

```sql
-- Create SQL alert for job failures
CREATE ALERT job_failure_alert AS
SELECT
  COUNT(*) as failure_count,
  COLLECT_LIST(job_name) as failed_jobs
FROM system.lakeflow.job_run_timeline
WHERE result_state = 'FAILED'
  AND start_time > current_timestamp() - INTERVAL 15 MINUTES
HAVING failure_count > 0
SCHEDULE CRON '0/15 * * * *'
NOTIFICATIONS (
  email_addresses = ['oncall@company.com'],
  webhook_destinations = ['slack-alerts']
);
```

## Output

- Configured notification destinations
- Job notifications active
- Custom webhook handler deployed
- Monitoring queries ready

## Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| Webhook not triggered | Invalid destination ID | Verify destination exists |
| Email not received | Invalid email | Check email addresses |
| Duplicate notifications | Multiple configs | Deduplicate notification settings |
| Webhook timeout | Slow handler | Optimize webhook endpoint |

## Examples

### Slack Block Kit Message

```python
def format_slack_notification(run_info: dict) -> dict:
    """Format rich Slack notification."""
    return {
        "blocks": [
            {
                "type": "header",
                "text": {
                    "type": "plain_text",
                    "text": f"{run_info['status_emoji']} Job: {run_info['job_name']}"
                }
            },
            {
                "type": "section",
                "fields": [
                    {"type": "mrkdwn", "text": f"*Status:* {run_info['status']}"},
                    {"type": "mrkdwn", "text": f"*Duration:* {run_info['duration']}"},
                    {"type": "mrkdwn", "text": f"*Run ID:* {run_info['run_id']}"},
                    {"type": "mrkdwn", "text": f"*Cluster:* {run_info['cluster']}"},
                ]
            },
            {
                "type": "actions",
                "elements": [
                    {
                        "type": "button",
                        "text": {"type": "plain_text", "text": "View Run"},
                        "url": run_info['run_url'],
                        "style": "primary"
                    },
                    {
                        "type": "button",
                        "text": {"type": "plain_text", "text": "View Logs"},
                        "url": run_info['logs_url']
                    }
                ]
            }
        ]
    }
```

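To deliver this payload, the simplest route is an HTTP POST to a Slack incoming webhook. The snippet below is a minimal sketch: it reuses `format_slack_notification` from the example above, the webhook URL is a placeholder, and the `run_info` values are made-up illustrations of the keys that function reads.

```python
import requests

# Example values only; the keys match what format_slack_notification() reads.
run_info = {
    "status_emoji": ":white_check_mark:",
    "status": "SUCCESS",
    "duration": "12m 30s",
    "run_id": "123456",
    "cluster": "etl-cluster",
    "job_name": "etl-pipeline",
    "run_url": "https://<workspace-url>/#job/123/run/456",
    "logs_url": "https://<workspace-url>/#job/123/run/456",
}

# Placeholder Slack incoming webhook URL; store the real one as a secret, not in code.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T00/B00/XXX"

resp = requests.post(SLACK_WEBHOOK_URL, json=format_slack_notification(run_info), timeout=10)
resp.raise_for_status()
```
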
## Resources

- [Job Notifications](https://docs.databricks.com/workflows/jobs/job-notifications.html)
- [Notification Destinations](https://docs.databricks.com/administration-guide/workspace/notification-destinations.html)
- [System Tables](https://docs.databricks.com/administration-guide/system-tables/index.html)

## Next Steps

For performance tuning, see `databricks-performance-tuning`.

Skill file: plugins/saas-packs/databricks-pack/skills/databricks-webhooks-events/SKILL.md