Data Lake Queries
The Panther API supports the following data lake operations:
  • Listing your data lake databases, tables, and columns
  • Executing a data lake (Data Explorer) query using SQL
  • Executing an Indicator Search query
  • Canceling any currently-running query
  • Fetching the details of any previously executed query
  • Listing all currently-running or previously-executed queries with optional filters
See the sections below for GraphQL queries and mutations covering common operations, as well as end-to-end examples of typical workflows.

Common Operations

Below are some of the most common GraphQL operations in Panther. These examples demonstrate how you can use a GraphQL client (or curl) to make a call to Panther's GraphQL API.
For more information on what fields and operations exist in the API, please see the Discovering the Schema documentation.
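If you prefer not to install a GraphQL client, every operation on this page can also be sent as a plain HTTP POST, which is what curl would do under the hood. The sketch below uses only the Python standard library to show the request shape; the URL and API key are placeholders, and `build_graphql_request` is a helper name of our own, not part of any Panther SDK.

```python
import json
import urllib.request

# Placeholders -- substitute your deployment's values
PANTHER_API_URL = "https://YOUR_PANTHER_API_URL"
API_KEY = "YOUR_API_KEY"

def build_graphql_request(query, variables=None):
    """Package a GraphQL operation as a standard HTTP POST request.

    GraphQL-over-HTTP expects a JSON body with `query` (and optionally
    `variables`); Panther authenticates via the X-API-Key header.
    """
    payload = json.dumps({"query": query, "variables": variables or {}}).encode()
    return urllib.request.Request(
        PANTHER_API_URL,
        data=payload,
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )

# Example: list data lake databases. Against a real deployment you would
# send this with urllib.request.urlopen(req) and json-decode the response.
req = build_graphql_request("query { dataLakeDatabases { name } }")
```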

Database Entities

Listing all database entities
Fetching the entities of a particular database
```graphql
# `AllDatabaseEntities` is a nickname for the operation
query AllDatabaseEntities {
  dataLakeDatabases {
    name
    description
    tables {
      name
      description
      columns {
        name
        description
        type
      }
    }
  }
}
```
```graphql
# `DatabaseEntities` is a nickname for the operation
query DatabaseEntities {
  dataLakeDatabase(name: "panther_logs.public") {
    name
    description
    tables {
      name
      description
      columns {
        name
        description
        type
      }
    }
  }
}
```

Executing queries

Executing a data lake (Data Explorer) query
Executing an Indicator Search query
Canceling a query
```graphql
# `IssueDataLakeQuery` is a nickname for the operation
mutation IssueDataLakeQuery {
  executeDataLakeQuery(input: {
    sql: "select * from panther_logs.public.aws_alb limit 50"
  }) {
    id # the unique ID of the query
  }
}
```
```graphql
# `IssueIndicatorSearchQuery` is a nickname for the operation
mutation IssueIndicatorSearchQuery {
  executeIndicatorSearchQuery(input: {
    indicators: ["286103014039", "126103014049"]
    startTime: "2022-04-01T00:00:00.000Z"
    endTime: "2022-04-30T23:59:59.000Z"
    indicatorName: p_any_aws_account_ids # or leave blank for auto-detect
  }) {
    id # the unique ID of the query
  }
}
```
```graphql
# `AbandonQuery` is a nickname for the operation
mutation AbandonQuery {
  cancelDataLakeQuery(input: { id: "1234-5678" }) {
    id # return the ID that got canceled
  }
}
```

Fetching results for a data lake or Indicator Search query

When you execute a data lake or Indicator Search query, it can take a few seconds to a few minutes for results to come back. To confirm that the query has completed, you must check the status of the query (polling).
You can use the following query to check the query status, while also fetching its results if available:
Fetching the first page of results
Fetching subsequent pages of results
```graphql
# `QueryResults` is a nickname for the operation
query QueryResults {
  dataLakeQuery(id: "1234-1234-1234-1234") { # the unique ID of the query
    message
    status
    results {
      edges {
        node
      }
    }
  }
}
```
```graphql
# `QueryResults` is a nickname for the operation
query QueryResults {
  dataLakeQuery(id: "1234-1234-1234-1234") { # the unique ID of the query
    message
    status
    results(input: { cursor: "5678-5678-5678-5678" }) { # the value of `endCursor`
      edges {
        node
      }
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
```
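The polling described above can be sketched as a small generic loop. In the sketch below, `fetch_status` is a stand-in for the GraphQL `dataLakeQuery` call shown in the tabs (it should return a dict with at least `status` and `message` keys); the helper name and the timeout behavior are ours, not part of the API.

```python
import time

def poll_until_done(fetch_status, interval_seconds=1.0, timeout_seconds=300.0):
    """Call `fetch_status()` until the query leaves the `running` state.

    Sleeps between polls to avoid hammering the API, and gives up after
    `timeout_seconds`.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = fetch_status()
        if result["status"] != "running":
            return result
        if time.monotonic() > deadline:
            raise TimeoutError("query did not finish in time")
        time.sleep(interval_seconds)

# Example with a fake fetcher that "completes" on the third poll
responses = iter([
    {"status": "running", "message": "still running"},
    {"status": "running", "message": "still running"},
    {"status": "succeeded", "message": "done"},
])
final = poll_until_done(lambda: next(responses), interval_seconds=0)
```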
The expected values of status and results depend on the query's status:
  • If the query is still running:
    • status will have a value of running
    • results will have a value of null
  • If the query has failed:
    • status will have a value of failed
    • results will have a value of null and the error message will be available in the message key
  • If the query has completed
    • status will have a value of succeeded
    • results will be populated
All of the above, along with the possible values for status and any additional fields you are allowed to request, can be found in our Documentation Explorer or GraphQL schema file.
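The three cases above can be collapsed into a small helper. This is a sketch of our own (the function name is not part of the API); it simply branches on the `status` field of the `dataLakeQuery` response.

```python
def interpret_query_response(query):
    """Map the `status`/`message`/`results` fields of a `dataLakeQuery`
    response onto the three cases described above."""
    status = query["status"]
    if status == "running":
        # results is null; keep polling
        return ("pending", None)
    if status == "failed":
        # results is null; the error is surfaced in the `message` key
        raise RuntimeError(query["message"])
    # succeeded: results are populated
    return ("done", query["results"])

# Examples
state, _ = interpret_query_response(
    {"status": "running", "message": "still running", "results": None}
)
done, results = interpret_query_response(
    {"status": "succeeded", "message": "ok",
     "results": {"edges": [{"node": {"a": 1}}]}}
)
```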

Fetching metadata around a data lake or Indicator Search query

In the example above, we requested the results of a Panther query. It is also possible to request additional metadata about the query.
In the following example, we request this metadata along with the first page of results:
```graphql
# `QueryMetadata` is a nickname for the operation
query QueryMetadata {
  dataLakeQuery(id: "1234-1234-1234-1234") { # the unique ID of the query
    name
    isScheduled
    issuedBy {
      ... on User {
        email
      }
      ... on APIToken {
        name
      }
    }
    sql
    message
    status
    startedAt
    completedAt
    results {
      edges {
        node
      }
    }
  }
}
```

Listing data lake and Indicator Search queries

Fetching the first page
Fetching subsequent pages
Fetching a filtered set
```graphql
# `ListDataLakeQueries` is a nickname for the operation
query ListDataLakeQueries {
  dataLakeQueries {
    name
    isScheduled
    issuedBy {
      ... on User {
        email
      }
      ... on APIToken {
        name
      }
    }
    sql
    message
    status
    startedAt
    completedAt
    results { # we're only fetching the first page of results for each query
      edges {
        node
      }
    }
  }
}
```
```graphql
# `ListDataLakeQueries` is a nickname for the operation
query ListDataLakeQueries {
  dataLakeQueries(input: { cursor: "5678-5678-5678-5678" }) { # the value of `endCursor`
    name
    isScheduled
    issuedBy {
      ... on User {
        email
      }
      ... on APIToken {
        name
      }
    }
    sql
    message
    status
    startedAt
    completedAt
    results { # we're only fetching the first page of results for each query
      edges {
        node
      }
    }
    pageInfo {
      endCursor
      hasNextPage
    }
  }
}
```
```graphql
# `ListDataLakeQueries` is a nickname for the operation
query ListDataLakeQueries {
  dataLakeQueries(input: { contains: "aws_alb", isScheduled: true }) {
    name
    isScheduled
    issuedBy {
      ... on User {
        email
      }
      ... on APIToken {
        name
      }
    }
    sql
    message
    status
    startedAt
    completedAt
    results { # we're only fetching the first page of results for each query
      edges {
        node
      }
    }
  }
}
```

End-to-end Examples

Below, we will build on the Common Operations examples to showcase an end-to-end flow.

Execute a data lake (Data Explorer) Query

NodeJS
Python
```javascript
// npm install graphql graphql-request

import { GraphQLClient, gql } from 'graphql-request';

const client = new GraphQLClient(
  'YOUR_PANTHER_API_URL',
  { headers: { 'X-API-Key': 'YOUR_API_KEY' } }
);

// `IssueQuery` is a nickname for the query. You can fully omit it.
const issueQuery = gql`
  mutation IssueQuery($sql: String!) {
    executeDataLakeQuery(input: { sql: $sql }) {
      id
    }
  }
`;

// `GetQueryResults` is a nickname for the query. You can fully omit it.
const getQueryResults = gql`
  query GetQueryResults($id: ID!, $cursor: String) {
    dataLakeQuery(id: $id) {
      message
      status
      results(input: { cursor: $cursor }) {
        edges {
          node
        }
        pageInfo {
          endCursor
          hasNextPage
        }
      }
    }
  }
`;

(async () => {
  try {
    // an accumulator that holds all result nodes that we fetch
    let allResults = [];
    // a helper to know when to exit the loop
    let hasMore = true;
    // the pagination cursor
    let cursor = null;

    // issue a query
    const mutationData = await client.request(issueQuery, {
      sql: 'select * from panther_logs.public.aws_alb limit 5',
    });

    // Start polling the query until it returns results. From there,
    // keep fetching pages until there are no more left
    do {
      const queryData = await client.request(getQueryResults, {
        id: mutationData.executeDataLakeQuery.id,
        cursor,
      });

      // if it's still running, print a message, wait a bit, and keep polling
      if (queryData.dataLakeQuery.status === 'running') {
        console.log(queryData.dataLakeQuery.message);
        await new Promise((resolve) => setTimeout(resolve, 1000));
        continue;
      }

      // if it's not running & it's not completed, then it's
      // either cancelled or it has errored out. In this case,
      // throw an exception
      if (queryData.dataLakeQuery.status !== 'completed') {
        throw new Error(queryData.dataLakeQuery.message);
      }

      allResults = [...allResults, ...queryData.dataLakeQuery.results.edges.map((edge) => edge.node)];

      hasMore = queryData.dataLakeQuery.results.pageInfo.hasNextPage;
      cursor = queryData.dataLakeQuery.results.pageInfo.endCursor;
    } while (hasMore);

    console.log(`Your query returned ${allResults.length} result(s)!`);
  } catch (err) {
    console.error(err.response);
  }
})();
```
```python
# pip install gql aiohttp

import time

from gql import gql, Client
from gql.transport.aiohttp import AIOHTTPTransport

transport = AIOHTTPTransport(
    url="YOUR_PANTHER_API_URL",
    headers={"X-API-Key": "YOUR_API_KEY"}
)

client = Client(transport=transport, fetch_schema_from_transport=True)

# `IssueQuery` is a nickname for the query. You can fully omit it.
issue_query = gql(
    """
    mutation IssueQuery($sql: String!) {
      executeDataLakeQuery(input: { sql: $sql }) {
        id
      }
    }
    """
)

# `GetQueryResults` is a nickname for the query. You can fully omit it.
get_query_results = gql(
    """
    query GetQueryResults($id: ID!, $cursor: String) {
      dataLakeQuery(id: $id) {
        message
        status
        results(input: { cursor: $cursor }) {
          edges {
            node
          }
          pageInfo {
            endCursor
            hasNextPage
          }
        }
      }
    }
    """
)

# an accumulator that holds all results that we fetch from all pages
all_results = []
# a helper to know when to exit the loop
has_more = True
# the pagination cursor
cursor = None

# Issue a Data Lake (Data Explorer) query
mutation_data = client.execute(
    issue_query,
    variable_values={
        "sql": "select * from panther_logs.public.aws_alb limit 5"
    }
)

# Start polling the query until it returns results. From there,
# keep fetching pages until there are no more left
while has_more:
    query_data = client.execute(
        get_query_results,
        variable_values={
            "id": mutation_data["executeDataLakeQuery"]["id"],
            "cursor": cursor
        }
    )

    # if it's still running, print a message, wait a bit, and keep polling
    if query_data["dataLakeQuery"]["status"] == "running":
        print(query_data["dataLakeQuery"]["message"])
        time.sleep(1)
        continue

    # if it's not running & it's not completed, then it's
    # either cancelled or it has errored out. In this case,
    # raise an exception
    if query_data["dataLakeQuery"]["status"] != "completed":
        raise Exception(query_data["dataLakeQuery"]["message"])

    all_results.extend([edge["node"] for edge in query_data["dataLakeQuery"]["results"]["edges"]])
    has_more = query_data["dataLakeQuery"]["results"]["pageInfo"]["hasNextPage"]
    cursor = query_data["dataLakeQuery"]["results"]["pageInfo"]["endCursor"]

print(f"Query returned {len(all_results)} result(s)!")
```

Execute an Indicator Search query

NodeJS
Python
```javascript
// npm install graphql graphql-request

import { GraphQLClient, gql } from 'graphql-request';

const client = new GraphQLClient(
  'YOUR_PANTHER_API_URL',
  { headers: { 'X-API-Key': 'YOUR_API_KEY' } }
);

// `IssueQuery` is a nickname for the query. You can fully omit it.
const issueQuery = gql`
  mutation IssueQuery($input: ExecuteIndicatorSearchQueryInput!) {
    executeIndicatorSearchQuery(input: $input) {
      id
    }
  }
`;

// `GetQueryResults` is a nickname for the query. You can fully omit it.
const getQueryResults = gql`
  query GetQueryResults($id: ID!, $cursor: String) {
    dataLakeQuery(id: $id) {
      message
      status
      results(input: { cursor: $cursor }) {
        edges {
          node
        }
        pageInfo {
          endCursor
          hasNextPage
        }
      }
    }
  }
`;

(async () => {
  try {
    // an accumulator that holds all result nodes that we fetch
    let allResults = [];
    // a helper to know when to exit the loop
    let hasMore = true;
    // the pagination cursor
    let cursor = null;

    // issue a query
    const mutationData = await client.request(issueQuery, {
      input: {
        indicators: ["226103014039"],
        startTime: "2022-03-29T00:00:00.001Z",
        endTime: "2022-03-30T00:00:00.001Z",
        indicatorName: "p_any_aws_account_ids"
      }
    });

    // Keep fetching pages until there are no more left
    do {
      const queryData = await client.request(getQueryResults, {
        id: mutationData.executeIndicatorSearchQuery.id,
        cursor,
      });

      // if it's still running, print a message, wait a bit, and keep polling
      if (queryData.dataLakeQuery.status === 'running') {
        console.log(queryData.dataLakeQuery.message);
        await new Promise((resolve) => setTimeout(resolve, 1000));
        continue;
      }

      // if it's not running & it's not completed, then it's
      // either cancelled or it has errored out. In this case,
      // throw an exception
      if (queryData.dataLakeQuery.status !== 'completed') {
        throw new Error(queryData.dataLakeQuery.message);
      }

      allResults = [...allResults, ...queryData.dataLakeQuery.results.edges.map((edge) => edge.node)];

      hasMore = queryData.dataLakeQuery.results.pageInfo.hasNextPage;
      cursor = queryData.dataLakeQuery.results.pageInfo.endCursor;
    } while (hasMore);

    console.log(`Your query returned ${allResults.length} result(s)!`);
  } catch (err) {
    console.error(err.response);
  }
})();
```
```python
# pip install gql aiohttp

import time

from gql import gql, Client
from gql.transport.aiohttp import AIOHTTPTransport

transport = AIOHTTPTransport(
    url="YOUR_PANTHER_API_URL",
    headers={"X-API-Key": "YOUR_API_KEY"}
)

client = Client(transport=transport, fetch_schema_from_transport=True)

# `IssueQuery` is a nickname for the query. You can fully omit it.
issue_query = gql(
    """
    mutation IssueQuery($input: ExecuteIndicatorSearchQueryInput!) {
      executeIndicatorSearchQuery(input: $input) {
        id
      }
    }
    """
)

# `GetQueryResults` is a nickname for the query. You can fully omit it.
get_query_results = gql(
    """
    query GetQueryResults($id: ID!, $cursor: String) {
      dataLakeQuery(id: $id) {
        message
        status
        results(input: { cursor: $cursor }) {
          edges {
            node
          }
          pageInfo {
            endCursor
            hasNextPage
          }
        }
      }
    }
    """
)

# an accumulator that holds all results that we fetch from all pages
all_results = []
# a helper to know when to exit the loop
has_more = True
# the pagination cursor
cursor = None

# Issue an Indicator Search query
mutation_data = client.execute(
    issue_query,
    variable_values={
        "input": {
            "indicators": ["226103014039"],
            "startTime": "2022-03-29T00:00:00.001Z",
            "endTime": "2022-03-30T00:00:00.001Z",
            "indicatorName": "p_any_aws_account_ids"
        }
    }
)

# Start polling the query until it returns results. From there,
# keep fetching pages until there are no more left
while has_more:
    query_data = client.execute(
        get_query_results,
        variable_values={
            "id": mutation_data["executeIndicatorSearchQuery"]["id"],
            "cursor": cursor
        }
    )

    # if it's still running, print a message, wait a bit, and keep polling
    if query_data["dataLakeQuery"]["status"] == "running":
        print(query_data["dataLakeQuery"]["message"])
        time.sleep(1)
        continue

    # if it's not running & it's not completed, then it's
    # either cancelled or it has errored out. In this case,
    # raise an exception
    if query_data["dataLakeQuery"]["status"] != "completed":
        raise Exception(query_data["dataLakeQuery"]["message"])

    all_results.extend([edge["node"] for edge in query_data["dataLakeQuery"]["results"]["edges"]])
    has_more = query_data["dataLakeQuery"]["results"]["pageInfo"]["hasNextPage"]
    cursor = query_data["dataLakeQuery"]["results"]["pageInfo"]["endCursor"]

print(f"Query returned {len(all_results)} result(s)!")
```