# Databricks Audit Logs

## Overview

Databricks is a unified analytics platform built on Apache Spark. Audit logs capture account and workspace activity including user actions, API calls, and administrative changes.

Panther can ingest [Databricks audit logs](https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery) delivered to an S3 bucket. These logs provide comprehensive visibility into administrative actions, user authentication patterns, data access, and notebook execution for security monitoring and compliance.

## How to onboard Databricks audit logs to Panther

### Prerequisites

* A Databricks account with audit log delivery configured
  * Databricks audit log delivery requires Databricks Premium or Enterprise tier
* An AWS S3 bucket where Databricks audit logs can be delivered
* Administrative access to configure Databricks audit log delivery

### Step 1: Configure Databricks audit log delivery to S3

1. Log in to your Databricks account console.
2. Navigate to **Settings** > **Account Settings** > **Audit Log Delivery**.
3. Click **Create log delivery**.
4. Configure the S3 destination:
   * **Destination**: Select **Amazon S3**.
   * **S3 Bucket**: Enter your S3 bucket name (e.g., `my-databricks-audit-logs`).
   * **S3 Prefix**: (Optional) Enter a prefix for organizing logs (e.g., `databricks/audit/`).
   * **Region**: Select the AWS region where your S3 bucket is located.
5. Configure delivery settings:
   * **Log Type**: Select **Audit Logs**.
   * **Delivery Path Pattern**: Databricks uses the pattern: `workspaceId=<workspaceId>/date=<yyyy-mm-dd>/auditlogs_<id>.json`.
6. Click **Create** to enable audit log delivery.

Databricks will begin delivering audit logs to your specified S3 bucket within a few hours.
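For reference, the delivery path pattern above can be parsed back into its components (workspace ID, delivery date, and file ID), which is useful when validating a prefix filter or auditing delivered objects. A minimal sketch; the example key and prefix are illustrative:

```python
import re

# Matches the Databricks delivery path pattern:
#   [<prefix>/]workspaceId=<workspaceId>/date=<yyyy-mm-dd>/auditlogs_<id>.json
KEY_PATTERN = re.compile(
    r"(?:(?P<prefix>.+)/)?"
    r"workspaceId=(?P<workspace_id>[^/]+)/"
    r"date=(?P<date>\d{4}-\d{2}-\d{2})/"
    r"auditlogs_(?P<id>[^./]+)\.json$"
)

def parse_audit_key(key: str) -> dict:
    """Return the prefix, workspace ID, date, and file ID encoded in an S3 key."""
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"not a Databricks audit log key: {key}")
    return match.groupdict()

parts = parse_audit_key(
    "databricks/audit/workspaceId=1234567890/date=2024-05-01/auditlogs_abc123.json"
)
print(parts["workspace_id"], parts["date"])  # → 1234567890 2024-05-01
```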

### Step 2: Create a new S3 source in Panther

1. In the left-hand navigation bar of your Panther Console, click **Configure** > **Log Sources**.
2. Click **Create New**.
3. Search for "Databricks", then click its tile.
4. In the upper-right corner, click **Start Setup**.
5. On the **Configuration** page, fill in the following fields:
   * **Name**: Enter a descriptive name for the source, e.g. `Databricks Audit Logs`.
   * **AWS Account ID**: Enter the AWS account ID where your S3 bucket is located.
   * **Bucket Name**: Enter the S3 bucket name.
   * **KMS Key ARN**: (Optional) If your S3 bucket uses KMS encryption, enter the KMS key ARN.
   * **S3 Prefix Filter**: (Optional) If you specified a prefix in Step 1, enter it here to limit which objects Panther processes.
6. Click **Setup**.
7. On the **Infrastructure** page, you will see instructions for setting up the necessary AWS infrastructure to allow Panther to read from your S3 bucket. Follow the instructions to:
   * Create an IAM role for Panther to assume
   * Grant the role permissions to read from your S3 bucket
   * Configure S3 event notifications to notify Panther when new audit logs arrive
8. Click **Setup**.
9. You will be directed to a success screen:

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-e55cedf82c6a6adc66ec5c14ebdcb164c3b1dcca%2FScreenshot%202023-08-03%20at%204.33.30%20PM.png?alt=media" alt="The success screen reads, &#x22;Everything looks good! Panther will now automatically pull &#x26; process logs from your account&#x22;" width="281"><figcaption></figcaption></figure>

   * You can optionally enable one or more [Detection Packs](https://docs.panther.com/detections/panther-managed/packs).
   * The **Trigger an alert when no events are processed** setting defaults to **YES**. We recommend leaving this enabled so that you are alerted if data stops flowing from the log source. The timeframe is configurable and defaults to 24 hours.

     <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-c48119abd559990173004bde99ff4907fdd2ded2%2FScreenshot%202023-08-03%20at%204.26.54%20PM.png?alt=media" alt="The &#x22;Trigger an alert when no events are processed&#x22; toggle is set to YES. The &#x22;How long should Panther wait before it sends you an alert that no events have been processed&#x22; setting is set to 1 Day" width="320"><figcaption></figcaption></figure>
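The read access Panther needs on the bucket generally reduces to `s3:GetObject` and `s3:ListBucket`. A minimal policy sketch for orientation only; the bucket name is a placeholder, and the Console generates the exact template for your account:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PantherReadAuditLogs",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-databricks-audit-logs",
        "arn:aws:s3:::my-databricks-audit-logs/*"
      ]
    }
  ]
}
```

If the bucket is encrypted with a customer-managed KMS key, the role also needs `kms:Decrypt` on that key.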

## Panther-managed detections

See [Panther-managed](https://docs.panther.com/detections/panther-managed) rules for Databricks in the [panther-analysis GitHub repository](https://github.com/panther-labs/panther-analysis/tree/main/rules/databricks_rules).
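Custom rules follow the same shape as the Panther-managed ones: a Python `rule(event)` function returning a boolean, with an optional `title(event)` for the alert title. A minimal sketch that flags failed logins; the `accounts` service name, the `login`/`tokenLogin` action names, and the 401 status check are assumptions to verify against your own events:

```python
# Action names that represent authentication attempts (illustrative values;
# confirm against the actionName values in your own Databricks audit logs).
FAILED_LOGIN_ACTIONS = {"login", "tokenLogin"}

def rule(event):
    # Alert on authentication attempts against the accounts service
    # that were rejected with an HTTP 401 response.
    if event.get("serviceName") != "accounts":
        return False
    if event.get("actionName") not in FAILED_LOGIN_ACTIONS:
        return False
    response = event.get("response") or {}
    return response.get("statusCode") == 401

def title(event):
    actor = (event.get("userIdentity") or {}).get("email", "<unknown user>")
    return f"Databricks failed login by {actor}"
```

In the Panther runtime, `event` is an event object whose `get` behaves like a dictionary's, so the rule can be unit-tested locally with plain dicts.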

## Supported log types

### Databricks.Audit

Databricks audit logs capture account and workspace activity including user actions, API calls, and administrative changes.

Reference: [Databricks Audit Log Delivery Documentation](https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery)

```yaml
schema: Databricks.Audit
description: Databricks audit logs capture account and workspace activity including user actions, API calls, and administrative changes.
referenceURL: https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery
fields:
  - name: version
    description: Schema version (e.g., 2.0)
    type: string
  - name: auditLevel
    description: Scope of the event (ACCOUNT_LEVEL or WORKSPACE_LEVEL)
    type: string
  - name: timestamp
    required: true
    description: Event timestamp in Unix milliseconds
    type: timestamp
    timeFormats:
      - unix_ms
    isEventTime: true
  - name: orgId
    description: Organization identifier
    type: string
  - name: shardName
    description: Shard designation
    type: string
  - name: accountId
    description: Databricks account UUID
    type: string
  - name: sourceIPAddress
    description: IP address origin of the request
    type: string
    indicators:
      - ip
  - name: userAgent
    description: Client user agent string
    type: string
  - name: sessionId
    description: Session identifier
    type: string
  - name: requestId
    description: Unique request identifier
    type: string
  - name: serviceName
    required: true
    description: Service that performed the action
    type: string
  - name: actionName
    required: true
    description: Specific action executed
    type: string
  - name: userIdentity
    description: Information about the actor
    type: object
    fields:
      - name: email
        description: User's email address
        type: string
        indicators:
          - email
      - name: subjectName
        description: Alternative user identifier
        type: string
        indicators:
          - username
  - name: requestParams
    description: Action-specific request parameters
    type: json
  - name: response
    description: Response information
    type: object
    fields:
      - name: statusCode
        description: HTTP response status code
        type: bigint
      - name: errorMessage
        description: Error message if applicable
        type: string
      - name: result
        description: Operation result data
        type: json
  - name: MAX_LOG_MESSAGE_LENGTH
    description: Maximum log message length in bytes
    type: bigint
```
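A single delivered event matching the schema above looks roughly like the following; all field values are illustrative:

```json
{
  "version": "2.0",
  "auditLevel": "WORKSPACE_LEVEL",
  "timestamp": 1714586096000,
  "orgId": "1234567890123456",
  "accountId": "11111111-2222-3333-4444-555555555555",
  "sourceIPAddress": "203.0.113.10",
  "userAgent": "Mozilla/5.0",
  "requestId": "a0b1c2d3-e4f5-6789-abcd-ef0123456789",
  "serviceName": "notebook",
  "actionName": "runCommand",
  "userIdentity": {
    "email": "user@example.com"
  },
  "requestParams": {
    "notebookId": "123456"
  },
  "response": {
    "statusCode": 200
  }
}
```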
