
Commit 40313af (parent 53ad61c)

feat: improve s3 access log parsing along with documentation updates

Signed-off-by: Kavindu Dodanduwa <[email protected]>

File tree

8 files changed: +66 additions, -11 deletions


packages/awsfirehose/_dev/build/docs/README.md

Lines changed: 7 additions & 3 deletions

```diff
@@ -52,11 +52,15 @@ This is a current limitation in Firehose, which we are working with AWS to resolve.
 
 To effectively utilize your data, you need to install the necessary integrations. Log in to Kibana, navigate to **Management** > **Integrations** in the sidebar.
 
-- First install the **Amazon Data Firehose Integration**. This integration will receive and route your Firehose data to various backend datasets. Find it by searching or browsing the catalog.
+- First install **Amazon Data Firehose Integration** assets. Assets installed through this integration will receive and route your Firehose data to various backend datasets. Find it by searching or browsing the catalog.
 
   ![Amazon Data Firehose](../img/amazonfirehose.png)
-
-- Second install the **AWS integration**. This integration provides assets such as index templates, ingest pipelines, and dashboards that will serve as the destination points for the data routed by the Firehose Integration.
+
+  Navigate to the **Settings** tab and click **Install Amazon Data Firehose**. Confirm by clicking **Install Amazon Data Firehose** in the popup.
+
+  ![Install Firehose assets](../img/firehose_settings.png)
+
+- Then install the **AWS integration** assets. This integration provides assets such as index templates, ingest pipelines, and dashboards that will serve as the destination points for the data routed by the Firehose Integration.
 
   Find the **AWS** integration by searching or browsing the catalog.
 
   ![AWS integration](../img/aws.png)
```

packages/awsfirehose/changelog.yml

Lines changed: 5 additions & 0 deletions

```diff
@@ -1,4 +1,9 @@
 # newer versions go on top
+- version: "1.9.0"
+  changes:
+    - description: Improve access log classification and improve documentation
+      type: enhancement
+      link: https://github.com/elastic/integrations/pull/15309
 - version: "1.8.2"
   changes:
     - description: Remove regex from ingest pipeline
```

packages/awsfirehose/data_stream/logs/_dev/test/pipeline/test-s3access-log.json

Lines changed: 15 additions & 0 deletions

```diff
@@ -29,6 +29,21 @@
             "data_stream.dataset": "awsfirehose",
             "aws.kinesis.name": "firehose-s3access-logs-to-elastic",
             "event.id": "37670326805251200781477669690942747782212394134076063744"
+        },
+        {
+            "cloud.region": "us-east-1",
+            "aws.firehose.arn": "arn:aws:firehose:us-east-2:123456:deliverystream/firehose-s3access-logs-to-elastic",
+            "data_stream.namespace": "default",
+            "message": "7cd47ef2be amzn-s3-demo-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be e5042925-b524-4b3b-a869-f3881e78ff3a S3.COMPUTE.OBJECT.CHECKSUM example-object - - - - 1048576 - - - - -bPf7qjG4XwYdPgDQTl72GW/uotRhdPz2UryEyAFLDSRmKrakUkJCYLtAw6fdANcrsUYc1M/kIulXM1u5vZQT5g== - - - - - -",
+            "aws.kinesis.type": "deliverystream",
+            "data_stream.type": "logs",
+            "aws.firehose.request_id": "971ae05f-a128-4a7f-b623-30f9bc513e55",
+            "cloud.provider": "aws",
+            "@timestamp": "2023-07-25T21:04:35Z",
+            "cloud.account.id": "123456",
+            "data_stream.dataset": "awsfirehose",
+            "aws.kinesis.name": "firehose-s3access-logs-to-elastic",
+            "event.id": "checksum-test"
         }
     ]
 }
```

packages/awsfirehose/data_stream/logs/_dev/test/pipeline/test-s3access-log.json-expected.json

Lines changed: 21 additions & 0 deletions

```diff
@@ -41,6 +41,27 @@
             },
             "event.id": "37670326805251200781477669690942747782212394134076063744",
             "message": "36c1f05b76016b78528454e6e0c60e2b7ff7aa20c0a5e4c748276e5b0a2debd2 test-s3-ks [01/Aug/2019:00:24:41 +0000] 89.160.20.156 arn:aws:sts::123456:assumed-role/AWSServiceRoleForTrustedAdvisor/TrustedAdvisor_627959692251_784ab70b-8cc9-4d37-a2ec-2ff4d0c08af9 44EE8651683CB4DA REST.GET.LOCATION - \"GET /test-s3-ks/?location&aws-account=627959692251 HTTP/1.1\" 200 - 142 - 17 - \"-\" \"AWS-Support-TrustedAdvisor, aws-internal/3 aws-sdk-java/1.11.590 Linux/4.9.137-0.1.ac.218.74.329.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.212-b03 java/1.8.0_212 vendor/Oracle_Corporation\" - BsCfJedfuSnds2QFoxi+E/O7M6OEWzJnw4dUaes/2hyA363sONRJKzB7EOY+Bt9DTHYUn+HoHxI= SigV4 ECDHE-RSA-AES128-SHA AuthHeader - TLSv1.2"
+        },
+        {
+            "@timestamp": "2023-07-25T21:04:35Z",
+            "aws.firehose.arn": "arn:aws:firehose:us-east-2:123456:deliverystream/firehose-s3access-logs-to-elastic",
+            "aws.firehose.request_id": "971ae05f-a128-4a7f-b623-30f9bc513e55",
+            "aws.kinesis.name": "firehose-s3access-logs-to-elastic",
+            "aws.kinesis.type": "deliverystream",
+            "cloud.account.id": "123456",
+            "cloud.provider": "aws",
+            "cloud.region": "us-east-1",
+            "data_stream.dataset": "aws.s3access",
+            "data_stream.namespace": "default",
+            "data_stream.type": "logs",
+            "ecs": {
+                "version": "8.11.0"
+            },
+            "event": {
+                "dataset": "aws.s3access"
+            },
+            "event.id": "checksum-test",
+            "message": "7cd47ef2be amzn-s3-demo-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be e5042925-b524-4b3b-a869-f3881e78ff3a S3.COMPUTE.OBJECT.CHECKSUM example-object - - - - 1048576 - - - - -bPf7qjG4XwYdPgDQTl72GW/uotRhdPz2UryEyAFLDSRmKrakUkJCYLtAw6fdANcrsUYc1M/kIulXM1u5vZQT5g== - - - - - -"
         }
     ]
 }
```

packages/awsfirehose/data_stream/logs/elasticsearch/ingest_pipeline/default.yml

Lines changed: 10 additions & 4 deletions

```diff
@@ -66,6 +66,7 @@ processors:
         // AWS S3 Access Logs, CloudFront, and ELB Logs
         else {
           // Inlined logic for splitting tokens
+          // Note - Tokenization splits the Time field into two sections, hence the token count after index 3 (0 based) is increased by 1
           def tokens_result = new ArrayList();
           StringBuilder currentToken = new StringBuilder();
           boolean insideQuotes = false;
@@ -89,16 +90,21 @@ processors:
           def tokens = tokens_result.toArray(new String[0]);
           def tokenCount = tokens.length;
 
-          // Check for S3 Access logs first using a more reliable token count
+          // Check for S3 Access logs first using a more reliable token content check
           if (tokenCount >= 24) {
             def hostHeader = tokens[23];
             if (hostHeader.contains('s3') && hostHeader.contains('amazonaws.com')) {
               ctx.event.dataset = 'aws.s3access';
+              return;
+            }
+
+            // Check for operation field content. Refer - https://docs.aws.amazon.com/AmazonS3/latest/userguide/LogFormat.html#log-record-fields
+            def opField = tokens[7];
+            if (opField.startsWith("SOAP.") || opField.startsWith("REST.") || opField.startsWith("BATCH.") || opField.startsWith("WEBSITE.") || opField.startsWith("S3.")) {
+              ctx.event.dataset = 'aws.s3access';
+              return;
             }
           }
-          if (tokenCount == 25) {
-            ctx.event.dataset = 'aws.s3access';
-          }
 
           // Fallback to CloudFront and ELB if S3 check fails
           if (ctx.event.dataset == null) {
```
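The pipeline change above can be illustrated outside Painless. The following is a simplified, hypothetical Python sketch of the quote-aware tokenizer and the two S3 checks; the Painless script in `default.yml` is the authoritative implementation, and the function names here are invented for illustration:

```python
# Sketch of the S3 access log classification (not the actual Painless script).
# Field positions follow the AWS S3 server access log format; because the
# bracketed time field "[06/Feb/2019:00:00:38 +0000]" splits on its inner
# space, indices after 3 shift by 1, putting the Operation field at index 7
# and the Host Header at index 23.

S3_OPERATION_PREFIXES = ("SOAP.", "REST.", "BATCH.", "WEBSITE.", "S3.")

def tokenize(line):
    """Split on spaces, keeping double-quoted runs together as one token."""
    tokens, current, inside_quotes = [], [], False
    for ch in line:
        if ch == '"':
            inside_quotes = not inside_quotes
            current.append(ch)
        elif ch == ' ' and not inside_quotes:
            if current:
                tokens.append(''.join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append(''.join(current))
    return tokens

def classify_s3_access(line):
    """Return 'aws.s3access' if the line looks like an S3 server access log."""
    tokens = tokenize(line)
    if len(tokens) >= 24:
        # First check: the Host Header field names an S3 endpoint.
        host_header = tokens[23]
        if 's3' in host_header and 'amazonaws.com' in host_header:
            return 'aws.s3access'
        # Second check: the Operation field uses a known S3 operation prefix.
        if tokens[7].startswith(S3_OPERATION_PREFIXES):
            return 'aws.s3access'
    return None  # fall through to the CloudFront / ELB checks
```

The new `S3.COMPUTE.OBJECT.CHECKSUM` test record carries `-` in the Host Header position, so it fails the endpoint check but matches the `S3.` operation prefix; the content-based checks replace the brittle `tokenCount == 25` heuristic.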
packages/awsfirehose/docs/README.md

Lines changed: 7 additions & 3 deletions

```diff
@@ -52,11 +52,15 @@ This is a current limitation in Firehose, which we are working with AWS to resolve.
 
 To effectively utilize your data, you need to install the necessary integrations. Log in to Kibana, navigate to **Management** > **Integrations** in the sidebar.
 
-- First install the **Amazon Data Firehose Integration**. This integration will receive and route your Firehose data to various backend datasets. Find it by searching or browsing the catalog.
+- First install **Amazon Data Firehose Integration** assets. Assets installed through this integration will receive and route your Firehose data to various backend datasets. Find it by searching or browsing the catalog.
 
   ![Amazon Data Firehose](../img/amazonfirehose.png)
-
-- Second install the **AWS integration**. This integration provides assets such as index templates, ingest pipelines, and dashboards that will serve as the destination points for the data routed by the Firehose Integration.
+
+  Navigate to the **Settings** tab and click **Install Amazon Data Firehose**. Confirm by clicking **Install Amazon Data Firehose** in the popup.
+
+  ![Install Firehose assets](../img/firehose_settings.png)
+
+- Then install the **AWS integration** assets. This integration provides assets such as index templates, ingest pipelines, and dashboards that will serve as the destination points for the data routed by the Firehose Integration.
 
   Find the **AWS** integration by searching or browsing the catalog.
 
   ![AWS integration](../img/aws.png)
```
(Binary image file added, 32.5 KB; preview not rendered.)

packages/awsfirehose/manifest.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 format_version: "3.3.0"
 name: awsfirehose
 title: Amazon Data Firehose
-version: 1.8.2
+version: 1.9.0
 description: Stream logs and metrics from Amazon Data Firehose into Elastic Cloud.
 type: integration
 categories:
```
