# Managing Google Cloud Storage (GCS) Log Sources with Terraform (Beta)

## Overview

{% hint style="info" %}
Managing Google Cloud Storage (GCS) log sources with Terraform is in open beta starting with Panther version 1.121, and is available to all customers. Please share any bug reports and feature requests with your Panther support team.
{% endhint %}

You can define your Google Cloud Storage (GCS) log source in Terraform using the Panther [Terraform provider](https://registry.terraform.io/providers/panther-labs/panther/latest). This allows you to manage your GCS log sources as infrastructure as code, enabling version control and automated deployments.

Other methods to create a GCS log source include using the [Panther API](https://docs.panther.com/panther-developer-workflows/api/rest/log-sources/gcs-sources) directly and [manual creation in the Panther Console](https://docs.panther.com/data-onboarding/data-transports/google/cloud-storage).

## How to define your Panther GCS log source in Terraform

The following sections outline how to define your GCS log source in HashiCorp Configuration Language (HCL).

### Prerequisites

* Before starting, ensure you have an API URL and token with the `Manage Log Sources` permission. This is required to complete Step 3.
  * If needed, follow [these instructions for creating an API token in the Panther Console](https://docs.panther.com/api#how-to-create-a-panther-api-token).
* Set up your Google Cloud Platform (GCP) infrastructure following the [GCS Source setup guide](https://docs.panther.com/data-onboarding/data-transports/google/cloud-storage#step-2-create-required-google-cloud-platform-gcp-infrastructure).
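
The linked setup guide is the authoritative reference for the GCP infrastructure; as a rough orientation, the typical commands are sketched below. All resource names (`my-log-bucket`, `panther-notifications`, `my-panther-subscription`) are illustrative, and the required IAM permissions are covered in the guide.

```shell
# Create the bucket that will receive logs (name is illustrative)
gsutil mb gs://my-log-bucket

# Create a Pub/Sub topic and the subscription Panther will pull from
gcloud pubsub topics create panther-notifications
gcloud pubsub subscriptions create my-panther-subscription \
  --topic=panther-notifications

# Send object-created notifications from the bucket to the topic
gsutil notification create -t panther-notifications -f json \
  -e OBJECT_FINALIZE gs://my-log-bucket
```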

### Step 1: Choose an authentication method

Select an authentication method for accessing your GCS bucket:

* **Service Account**: Uses a Google Cloud service account with a JSON key file
* **Workload Identity Federation**: Uses Google Cloud Workload Identity Federation with AWS

The authentication method you select determines which of the variables defined in Step 2, below, you must provide values for in Step 3.
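
For orientation, the two credential documents have different shapes. The abbreviated examples below are illustrative only; the field values are placeholders, and the actual files you obtain from Google Cloud contain additional fields.

A service account JSON key file:

```json
{
  "type": "service_account",
  "project_id": "my-gcp-project",
  "private_key_id": "…",
  "private_key": "-----BEGIN PRIVATE KEY-----\n…",
  "client_email": "panther-ingest@my-gcp-project.iam.gserviceaccount.com"
}
```

A Workload Identity Federation credential configuration file:

```json
{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/…/locations/global/workloadIdentityPools/…/providers/…",
  "subject_token_type": "urn:ietf:params:aws:token-type:aws4_request"
}
```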

### Step 2: Define variables

Define a `variables.tf` file with the variables shown in the code block below.

```hcl
variable "panther_api_token" {
  description = "Panther API token"
  type        = string
  sensitive   = true
}

variable "panther_api_url" {
  description = "Panther API URL"
  type        = string
}

variable "integration_label" {
  description = "The name of the GCS log source integration"
  type        = string
}

variable "gcs_bucket" {
  description = "The name of the GCS bucket to pull logs from"
  type        = string
}

variable "subscription_id" {
  description = "The Pub/Sub subscription ID for GCS bucket notifications"
  type        = string
}

variable "credentials_type" {
  description = "Type of credentials (service_account or wif)"
  type        = string
  validation {
    condition     = contains(["service_account", "wif"], var.credentials_type)
    error_message = "Credentials type must be either 'service_account' or 'wif'."
  }
}

// Auth variables are specific to credentials_type. See the table in Step 3
variable "credentials" {
  description = "Service account JSON key or WIF credential configuration file content"
  type        = string
  sensitive   = true
}

variable "project_id" {
  description = "Google Cloud Project ID (required for WIF, optional for service account)"
  type        = string
  default     = ""
}

// (Optional) Relevant only when log_stream_type = "JsonArray"
variable "json_array_envelope_field" {
  description = "Envelope field for json array stream"
  type        = string
  default     = ""
}

// (Optional) Relevant only when log_stream_type = "XML"
variable "xml_root_element" {
  description = "Root element name for XML log streams"
  type        = string
  default     = ""
}
```

### Step 3: Provide values for the defined variables

Add a `*.tfvars` file that assigns values to the variables you defined in Step 2. Note that to complete this section, you will need the API URL and token outlined in the Prerequisites section.

* Your `panther_api_url` value should be your root API URL. This is either:
  * A [GraphQL API URL](https://docs.panther.com/api/graphql#step-1-identify-your-panther-graphql-api-url) without the `/public/graphql` suffix
  * A [REST API URL](https://docs.panther.com/api/rest#step-1-identify-your-panther-rest-api-url) as-is (REST URLs do not have a suffix after the root URL)

```hcl
panther_api_token         = "XXXXXXXXXX"
panther_api_url           = "https://your-panther-url/v1"
integration_label         = "my-gcs-logs"
gcs_bucket                = "my-log-bucket"
subscription_id           = "my-panther-subscription"
credentials_type          = "service_account" // service_account or wif
credentials               = "{ ... }" // JSON keyfile or credential config content
project_id                = "my-gcp-project" // Required for WIF, optional for service_account
json_array_envelope_field = "" // Optional, relevant only when log_stream_type = "JsonArray"
xml_root_element          = "" // Optional, relevant only when log_stream_type = "XML"
```
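
Because the API token and credentials are sensitive, you may prefer to keep them out of version-controlled `.tfvars` files. Terraform reads any environment variable named `TF_VAR_<name>` as the value for the input variable `<name>`; the sketch below uses an illustrative key file name.

```shell
# Placeholder key file for illustration; in practice this is the JSON key
# downloaded from the Google Cloud console.
echo '{"type": "service_account"}' > service-account-key.json

# Terraform treats TF_VAR_<name> environment variables as values for the
# corresponding input variables, keeping secrets out of .tfvars files.
export TF_VAR_panther_api_token="XXXXXXXXXX"
export TF_VAR_credentials="$(cat service-account-key.json)"
```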

#### **Authentication method-specific variables**

In your `*.tfvars` file, provide values for the variables listed in the **Additional variables** column below, based on the authentication method you chose in Step 1.

The `credentials_type` field must match the type of credentials provided in the `credentials` field.

<table><thead><tr><th width="243">Authentication method</th><th width="204">credentials_type value</th><th>Additional variables</th></tr></thead><tbody><tr><td>Service account authentication</td><td><code>service_account</code></td><td><code>credentials</code> (JSON keyfile content)</td></tr><tr><td>Workload Identity Federation authentication</td><td><code>wif</code></td><td><code>credentials</code> (credential configuration file content), <code>project_id</code></td></tr></tbody></table>

### Step 4: Define the Terraform provider

Add the [Panther](https://registry.terraform.io/providers/panther-labs/panther/latest) Terraform provider.

```hcl
terraform {
  required_providers {
    panther = {
      source  = "panther-labs/panther"
      version = "~> 0.2.10"
    }
  }
}
```

### Step 5: Define the Panther GCS log source

The following HCL configuration defines the GCS log source in Panther.

```hcl
provider "panther" {
  token = var.panther_api_token
  url   = var.panther_api_url
}

resource "panther_gcssource" "demo_gcs_source" {
  integration_label = var.integration_label
  log_stream_type   = "JSON" // Options: Auto, JSON, JsonArray, Lines, and XML
  gcs_bucket        = var.gcs_bucket
  subscription_id   = var.subscription_id
  credentials_type  = var.credentials_type
  credentials       = var.credentials
  project_id        = var.project_id // Required for WIF, optional for service_account

  prefix_log_types = [{
    prefix            = "audit-logs/"
    excluded_prefixes = ["audit-logs/exclude/"]
    log_types         = ["GCP.AuditLog"]
  }, {
    prefix            = "cloudtrail/"
    excluded_prefixes = []
    log_types         = ["AWS.CloudTrail"]
  }]

  // (Optional) Configure based on log_stream_type
  log_stream_type_options = {
    json_array_envelope_field = var.json_array_envelope_field // Relevant for "JsonArray"
    xml_root_element          = var.xml_root_element          // Relevant for "XML"
  }
}
```
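
With the provider and resource defined, the standard Terraform workflow applies. The `.tfvars` file name below is illustrative; use whatever name you gave the file from Step 3.

```shell
terraform init                          # download the panther-labs/panther provider
terraform plan -var-file="gcs.tfvars"   # preview the log source that will be created
terraform apply -var-file="gcs.tfvars"  # create the GCS log source in Panther
```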

#### Prefix-based log type mapping

Unlike HTTP or S3 sources that use a simple array of log types, GCS sources use `prefix_log_types` to map different prefixes within the bucket to specific log types:

* **`prefix`**: The GCS object prefix to match (e.g., `audit-logs/`, `application-logs/`)
* **`excluded_prefixes`**: Optional array of prefixes to exclude within the main prefix
* **`log_types`**: Array of log type schemas to apply to objects matching this prefix

This allows a single GCS bucket to contain multiple types of logs organized by prefix, with a different schema applied to each. For example, with the mapping above, an object at `audit-logs/2024/01/log.json` is classified as `GCP.AuditLog`, while objects under `audit-logs/exclude/` are ignored.

## Complete example

For a complete working example with GCP infrastructure setup, see the [Panther auxiliary repository](https://github.com/panther-labs/panther-auxiliary/tree/main/terraform/panther_gcs_transport_type_infra).
