# Databricks Audit Logs

## Overview

Databricks is a unified analytics platform built on Apache Spark. Audit logs capture account and workspace activity including user actions, API calls, and administrative changes.

Panther can ingest [Databricks audit logs](https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery) delivered to an S3 bucket. These logs provide comprehensive visibility into administrative actions, user authentication patterns, data access, and notebook execution for security monitoring and compliance.

## How to onboard Databricks audit logs to Panther

### Prerequisites

* A Databricks account with audit log delivery configured
  * Databricks audit log delivery requires Databricks Premium or Enterprise tier
* An AWS S3 bucket where Databricks audit logs can be delivered
* Administrative access to configure Databricks audit log delivery

### Step 1: Configure Databricks audit log delivery to S3

1. Log in to your Databricks account console.
2. Navigate to **Settings** > **Account Settings** > **Audit Log Delivery**.
3. Click **Create log delivery**.
4. Configure the S3 destination:
   * **Destination**: Select **Amazon S3**.
   * **S3 Bucket**: Enter your S3 bucket name (e.g., `my-databricks-audit-logs`).
   * **S3 Prefix**: (Optional) Enter a prefix for organizing logs (e.g., `databricks/audit/`).
   * **Region**: Select the AWS region where your S3 bucket is located.
5. Configure delivery settings:
   * **Log Type**: Select **Audit Logs**.
   * **Delivery Path Pattern**: Databricks uses the pattern: `workspaceId=<workspaceId>/date=<yyyy-mm-dd>/auditlogs_<id>.json`.
6. Click **Create** to enable audit log delivery.

Databricks will begin delivering audit logs to your specified S3 bucket within a few hours.
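Once delivery starts, each object key in the bucket follows the path pattern shown above. As a quick sanity check, the sketch below parses such a key into its components. This is an illustrative helper, not part of Databricks or Panther; it assumes a numeric workspace ID and an optional bucket prefix like the `databricks/audit/` example from step 4.

```python
import re
from typing import Optional

# Matches the Databricks audit log delivery path pattern from Step 1:
#   workspaceId=<workspaceId>/date=<yyyy-mm-dd>/auditlogs_<id>.json
# An optional S3 prefix (e.g. "databricks/audit/") may precede it.
PATH_PATTERN = re.compile(
    r"workspaceId=(?P<workspace_id>\d+)/"
    r"date=(?P<date>\d{4}-\d{2}-\d{2})/"
    r"auditlogs_(?P<delivery_id>[^/]+)\.json$"
)

def parse_audit_log_key(key: str) -> Optional[dict]:
    """Extract workspace ID, date, and delivery ID from an S3 object key."""
    match = PATH_PATTERN.search(key)
    return match.groupdict() if match else None

parts = parse_audit_log_key(
    "databricks/audit/workspaceId=1234567890/date=2024-05-01/auditlogs_abc123.json"
)
```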

### Step 2: Create a new S3 source in Panther

1. In the left-hand navigation bar of your Panther Console, click **Configure** > **Log Sources**.
2. Click **Create New**.
3. Search for "Databricks", then click its tile.
4. In the upper-right corner, click **Start Setup**.
5. On the **Configuration** page, fill in the following fields:
   * **Name**: Enter a descriptive name for the source, e.g. `Databricks Audit Logs`.
   * **AWS Account ID**: Enter the AWS account ID where your S3 bucket is located.
   * **Bucket Name**: Enter the S3 bucket name.
   * **KMS Key ARN**: (Optional) If your S3 bucket uses KMS encryption, enter the KMS key ARN.
   * **S3 Prefix Filter**: (Optional) If you specified a prefix in Step 1, enter it here to limit which objects Panther processes.
6. Click **Setup**.
7. On the **Infrastructure** page, you will see instructions for setting up the necessary AWS infrastructure to allow Panther to read from your S3 bucket. Follow the instructions to:
   * Create an IAM role for Panther to assume
   * Grant the role permissions to read from your S3 bucket
   * Configure S3 event notifications to notify Panther when new audit logs arrive
8. Click **Setup**.
9. You will be directed to a success screen:

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-e55cedf82c6a6adc66ec5c14ebdcb164c3b1dcca%2FScreenshot%202023-08-03%20at%204.33.30%20PM.png?alt=media" alt="The success screen reads, &#x22;Everything looks good! Panther will now automatically pull &#x26; process logs from your account&#x22;" width="281"><figcaption></figcaption></figure>

   * You can optionally enable one or more [Detection Packs](https://docs.panther.com/detections/panther-managed/packs).
   * The **Trigger an alert when no events are processed** setting defaults to **YES**. We recommend leaving this enabled so that Panther alerts you if the source stops sending data for a configurable period of time (24 hours by default).

     <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-c48119abd559990173004bde99ff4907fdd2ded2%2FScreenshot%202023-08-03%20at%204.26.54%20PM.png?alt=media" alt="The &#x22;Trigger an alert when no events are processed&#x22; toggle is set to YES. The &#x22;How long should Panther wait before it sends you an alert that no events have been processed&#x22; setting is set to 1 Day" width="320"><figcaption></figcaption></figure>

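The read permissions in step 7 typically amount to an IAM policy along these lines. The Panther Console generates the exact role and policy template for you, so treat this as an illustrative sketch; the bucket name is the hypothetical one from Step 1, and you would also attach a `kms:Decrypt` statement if the bucket uses KMS encryption.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadDatabricksAuditLogs",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-databricks-audit-logs",
        "arn:aws:s3:::my-databricks-audit-logs/*"
      ]
    }
  ]
}
```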
## Panther-managed detections

See [Panther-managed](https://docs.panther.com/detections/panther-managed) rules for Databricks in the [panther-analysis GitHub repository](https://github.com/panther-labs/panther-analysis/tree/main/rules/databricks_rules).
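Beyond the managed rules, you can write custom Python detections over `Databricks.Audit` events. The minimal sketch below alerts on failed login attempts; the field names (`serviceName`, `actionName`, `response.statusCode`) come from the schema on this page, the `rule()`/`title()` shape follows Panther's Python detection convention, and the specific `accounts`/`login` values are assumptions about the event contents, not guaranteed by this page.

```python
def rule(event: dict) -> bool:
    """Alert on failed Databricks login attempts (illustrative)."""
    if event.get("serviceName") != "accounts":
        return False
    if event.get("actionName") != "login":
        return False
    # response.statusCode is a bigint per the Databricks.Audit schema
    status = (event.get("response") or {}).get("statusCode")
    return status is not None and status != 200

def title(event: dict) -> str:
    actor = (event.get("userIdentity") or {}).get("email", "unknown user")
    ip = event.get("sourceIPAddress", "unknown IP")
    return f"Failed Databricks login for {actor} from {ip}"
```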

## Supported log types

### Databricks.Audit

Databricks audit logs capture account and workspace activity including user actions, API calls, and administrative changes.

Reference: [Databricks Audit Log Delivery Documentation](https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery)

```yaml
schema: Databricks.Audit
description: Databricks audit logs capture account and workspace activity including user actions, API calls, and administrative changes.
referenceURL: https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery
fields:
  - name: version
    description: Schema version (e.g., 2.0)
    type: string
  - name: auditLevel
    description: Scope of the event (ACCOUNT_LEVEL or WORKSPACE_LEVEL)
    type: string
  - name: timestamp
    required: true
    description: Event timestamp in Unix milliseconds
    type: timestamp
    timeFormats:
      - unix_ms
    isEventTime: true
  - name: orgId
    description: Organization identifier
    type: string
  - name: shardName
    description: Shard designation
    type: string
  - name: accountId
    description: Databricks account UUID
    type: string
  - name: sourceIPAddress
    description: IP address origin of the request
    type: string
    indicators:
      - ip
  - name: userAgent
    description: Client user agent string
    type: string
  - name: sessionId
    description: Session identifier
    type: string
  - name: requestId
    description: Unique request identifier
    type: string
  - name: serviceName
    required: true
    description: Service that performed the action
    type: string
  - name: actionName
    required: true
    description: Specific action executed
    type: string
  - name: userIdentity
    description: Information about the actor
    type: object
    fields:
      - name: email
        description: User's email address
        type: string
        indicators:
          - email
      - name: subjectName
        description: Alternative user identifier
        type: string
        indicators:
          - username
  - name: requestParams
    description: Action-specific request parameters
    type: json
  - name: response
    description: Response information
    type: object
    fields:
      - name: statusCode
        description: HTTP response status code
        type: bigint
      - name: errorMessage
        description: Error message if applicable
        type: string
      - name: result
        description: Operation result data
        type: json
  - name: MAX_LOG_MESSAGE_LENGTH
    description: Maximum log message length in bytes
    type: bigint
```
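To make the schema concrete, here is a sketch that parses a raw audit event and converts its `timestamp` field, which the schema declares as Unix milliseconds (`timeFormats: unix_ms`). The sample values are invented for illustration; only the field names and types follow the schema above.

```python
import json
from datetime import datetime, timezone

# Illustrative raw Databricks audit event; values are invented but the
# field names and types follow the Databricks.Audit schema above.
raw = """{
  "version": "2.0",
  "auditLevel": "WORKSPACE_LEVEL",
  "timestamp": 1714560000000,
  "serviceName": "notebook",
  "actionName": "runCommand",
  "sourceIPAddress": "203.0.113.7",
  "userIdentity": {"email": "analyst@example.com"},
  "response": {"statusCode": 200}
}"""

event = json.loads(raw)

# timestamp is Unix *milliseconds*, so divide by 1000 before converting
event_time = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
```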

