diff --git a/packages/github/_dev/build/docs/README.md b/packages/github/_dev/build/docs/README.md index 52d81135cc8..61f72a66d2e 100644 --- a/packages/github/_dev/build/docs/README.md +++ b/packages/github/_dev/build/docs/README.md @@ -25,14 +25,22 @@ For Organizations: The GitHub audit log records all events related to the GitHub organization/enterprise. See [Organization audit log actions](https://docs.github.com/en/organizations/keeping-your-organization-secure/reviewing-the-audit-log-for-your-organization#audit-log-actions) and [Enterprise audit log actions](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/about-the-audit-log-for-your-enterprise) for more details. -Github integration can collect audit logs from three sources: [Github API](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/using-the-audit-log-api-for-your-enterprise), [Azure Event Hubs](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-event-hubs), and [AWS S3 or AWS SQS](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-amazon-s3). +The GitHub integration can collect audit logs from the following sources: [GitHub API](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/using-the-audit-log-api-for-your-enterprise), [Azure Event Hubs](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-event-hubs), [Azure Blob Storage](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-blob-storage), [AWS S3 or AWS SQS](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-amazon-s3), and [Google Cloud Storage](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-google-cloud-storage). -When using Github API to collect audit log events, below requirements must be met for Personal Access Token (PAT): +When using the GitHub API to collect audit log events, the following requirements must be met for the Personal Access Token (PAT): - You must use a Personal Access Token with `read:audit_log` scope. This applies to both organization and enterprise admins. - If you're an enterprise admin, ensure your token also includes `admin:enterprise` scope to access enterprise-wide logs. 
To collect audit log events from Azure Event Hubs, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-event-hubs) to setup audit log streaming. +To collect audit log events from Azure Blob Storage, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-blob-storage) to setup audit log streaming. To collect audit log events from AWS S3 or AWS SQS, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-amazon-s3) to setup audit log streaming. For more details, refer to this [documentation](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise). +To collect audit log events from Google Cloud Storage, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-google-cloud-storage) to setup audit log streaming. + +For Filebeat input documentation, refer to the following pages: + - [Azure Event Hub](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-azure-eventhub) + - [Azure Blob Storage](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-azure-blob-storage) + - [AWS S3](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-aws-s3) + - [Google Cloud Storage](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-gcs) *This integration is not compatible with GitHub Enterprise server.* diff --git a/packages/github/_dev/deploy/docker/docker-compose.yml b/packages/github/_dev/deploy/docker/docker-compose.yml index 4bffa205686..0fc10891651 100644 --- a/packages/github/_dev/deploy/docker/docker-compose.yml +++ b/packages/github/_dev/deploy/docker/docker-compose.yml @@ -12,3 +12,40 @@ services: - http-server - --addr=:8080 - --config=/files/config.yml + azure-blob-storage-emulator: + image: mcr.microsoft.com/azure-storage/azurite:latest + command: azurite-blob --blobHost 0.0.0.0 --blobPort 10000 --skipApiVersionCheck --disableProductStyleUrl + ports: + - "10000/tcp" + healthcheck: + test: nc 127.0.0.1 10000 -z + interval: 10s + timeout: 5s + azure-cli: + image: mcr.microsoft.com/azure-cli + depends_on: + azure-blob-storage-emulator: + condition: service_healthy + volumes: + - ./sample_logs:/sample_logs + entrypoint: > + sh -c " + sleep 5 && + export AZURE_STORAGE_CONNECTION_STRING='DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://azure-blob-storage-emulator:10000/devstoreaccount1;' && + az storage container create --name test-container && + az storage blob upload --container-name test-container --file /sample_logs/cloud-storage-data.log --name cloud-storage-data.log" + + gcs-mock-service: + image: golang:1.24.7-alpine + working_dir: /app + volumes: + - ./gcs-mock-service:/app + - 
./files/manifest.yml:/files/manifest.yml:ro + - ./sample_logs/:/data:ro + ports: + - "4443/tcp" + healthcheck: + test: "wget --no-verbose --tries=1 --spider http://localhost:4443/health || exit 1" + interval: 10s + timeout: 5s + command: go run main.go -manifest /files/manifest.yml diff --git a/packages/github/_dev/deploy/docker/files/config.yml b/packages/github/_dev/deploy/docker/files/config.yml index 2e41dc7ae33..199abe1ef7c 100644 --- a/packages/github/_dev/deploy/docker/files/config.yml +++ b/packages/github/_dev/deploy/docker/files/config.yml @@ -766,7 +766,7 @@ rules: "closed_at": null, "author_association": "NONE", "active_lock_reason": null, - "body": "Structured Threat Information Expression (STIX) is a language for expressing cyber threat and observable information. While we have several Threat Intel integrations which map STIX formatted data to Elastic Common Schema, users will always have need to ingest IOC's from threat feeds that we don't support out of the box. 'How do I ingest STIX feeds' remains a very common questions across community Slack, Discuss, Github, etc. A custom package would solve for this. \r\n\r\nTo allow for the broad range of STIX formatted feeds, we should provide a way for users to ingest data from ANY STIX feed, via a 'Custom STIX' package. The package will leverage our httpjson input under the hood, but include an ingest pipeline which maps STIX fields to ECS (we expect there'll still be a need for custom fields, as not all STIX fields have a corresponding field in ECS). \r\n\r\nThere may be cases where some feeds/vendors don't strictly conform to STIX, and in those cases, users may have to modify our pipeline and that's ok. We'll focus on correctly formatted STIX data. ", + "body": "Structured Threat Information Expression (STIX) is a language for expressing cyber threat and observable information. While we have several Threat Intel integrations which map STIX formatted data to Elastic Common Schema, users will always have need to ingest IOC's from threat feeds that we don't support out of the box. 'How do I ingest STIX feeds' remains a very common questions across community Slack, Discuss, GitHub, etc. A custom package would solve for this. \r\n\r\nTo allow for the broad range of STIX formatted feeds, we should provide a way for users to ingest data from ANY STIX feed, via a 'Custom STIX' package. The package will leverage our httpjson input under the hood, but include an ingest pipeline which maps STIX fields to ECS (we expect there'll still be a need for custom fields, as not all STIX fields have a corresponding field in ECS). \r\n\r\nThere may be cases where some feeds/vendors don't strictly conform to STIX, and in those cases, users may have to modify our pipeline and that's ok. We'll focus on correctly formatted STIX data. 
", "reactions": { "url": "https://api.github.com/repos/elastic/integrations/issues/4710/reactions", "total_count": 0, diff --git a/packages/github/_dev/deploy/docker/files/manifest.yml b/packages/github/_dev/deploy/docker/files/manifest.yml new file mode 100644 index 00000000000..8adc1f3afc2 --- /dev/null +++ b/packages/github/_dev/deploy/docker/files/manifest.yml @@ -0,0 +1,5 @@ +buckets: + testbucket: + files: + - path: /data/cloud-storage-data.log + content-type: application/x-ndjson diff --git a/packages/github/_dev/deploy/docker/gcs-mock-service/go.mod b/packages/github/_dev/deploy/docker/gcs-mock-service/go.mod new file mode 100644 index 00000000000..df08767f7e8 --- /dev/null +++ b/packages/github/_dev/deploy/docker/gcs-mock-service/go.mod @@ -0,0 +1,5 @@ +module gcs-mock-service + +go 1.24.7 + +require gopkg.in/yaml.v3 v3.0.1 diff --git a/packages/github/_dev/deploy/docker/gcs-mock-service/go.sum b/packages/github/_dev/deploy/docker/gcs-mock-service/go.sum new file mode 100644 index 00000000000..a62c313c5b0 --- /dev/null +++ b/packages/github/_dev/deploy/docker/gcs-mock-service/go.sum @@ -0,0 +1,4 @@ +gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM= +gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= +gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= +gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= diff --git a/packages/github/_dev/deploy/docker/gcs-mock-service/main.go b/packages/github/_dev/deploy/docker/gcs-mock-service/main.go new file mode 100644 index 00000000000..391e75c7efe --- /dev/null +++ b/packages/github/_dev/deploy/docker/gcs-mock-service/main.go @@ -0,0 +1,297 @@ +// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +// or more contributor license agreements. Licensed under the Elastic License; +// you may not use this file except in compliance with the Elastic License. + +package main + +import ( + "encoding/json" + "flag" + "fmt" + "io" + "log" + "net/http" + "os" + "strconv" + "strings" + + "gopkg.in/yaml.v3" +) + +func main() { + host := flag.String("host", "0.0.0.0", "host to listen on") + port := flag.String("port", "4443", "port to listen on") + manifest := flag.String("manifest", "", "path to YAML manifest file for preloading buckets and objects") + flag.Parse() + + addr := fmt.Sprintf("%s:%s", *host, *port) + + fmt.Printf("Starting mock GCS server on http://%s\n", addr) + if *manifest != "" { + m, err := readManifest(*manifest) + if err != nil { + log.Fatalf("error reading manifest: %v", err) + } + if err := processManifest(m); err != nil { + log.Fatalf("error processing manifest: %v", err) + } + } else { + fmt.Println("Store is empty. 
Create buckets and objects via API calls.") + } + + // setup HTTP handlers + mux := http.NewServeMux() + // health check + mux.HandleFunc("/health", healthHandler) + // standard gcs api calls + mux.HandleFunc("GET /storage/v1/b/{bucket}/o", handleListObjects) + mux.HandleFunc("GET /storage/v1/b/{bucket}/o/{object...}", handleGetObject) + mux.HandleFunc("POST /storage/v1/b", handleCreateBucket) + mux.HandleFunc("POST /upload/storage/v1/b/{bucket}/o", handleUploadObject) + mux.HandleFunc("POST /upload/storage/v1/b/{bucket}/o/{object...}", handleUploadObject) + // direct path-style gcs sdk calls + mux.HandleFunc("GET /{bucket}/o/{object...}", handleGetObject) + mux.HandleFunc("GET /{bucket}/{object...}", handleGetObject) + // debug: log all requests + loggedMux := loggingMiddleware(mux) + + if err := http.ListenAndServe(addr, loggedMux); err != nil { + log.Fatalf("failed to start server: %v", err) + } +} + +// loggingMiddleware logs incoming HTTP requests. +func loggingMiddleware(next http.Handler) http.Handler { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + fmt.Printf("%s %s\n", r.Method, r.URL.Path) + next.ServeHTTP(w, r) + }) +} + +// readManifest reads and parses the YAML manifest file. +func readManifest(path string) (*Manifest, error) { + data, err := os.ReadFile(path) + if err != nil { + return nil, fmt.Errorf("failed to read manifest: %w", err) + } + + var manifest Manifest + if err := yaml.Unmarshal(data, &manifest); err != nil { + return nil, fmt.Errorf("failed to parse manifest: %w", err) + } + + return &manifest, nil +} + +// processManifest creates buckets and uploads objects as specified in the manifest. +func processManifest(manifest *Manifest) error { + for bucketName, bucket := range manifest.Buckets { + for _, file := range bucket.Files { + fmt.Printf("preloading data for bucket: %s | path: %s | content-type: %s...\n", + bucketName, file.Path, file.ContentType) + + if err := createBucket(bucketName); err != nil { + return fmt.Errorf("failed to create bucket '%s': %w", bucketName, err) + } + data, err := os.ReadFile(file.Path) + if err != nil { + return fmt.Errorf("failed to read bucket data file '%s': %w", file.Path, err) + } + pathParts := strings.Split(file.Path, "/") + if _, err := uploadObject(bucketName, pathParts[len(pathParts)-1], data, file.ContentType); err != nil { + return fmt.Errorf("failed to create object '%s' in bucket '%s': %w", file.Path, bucketName, err) + } + } + } + return nil +} + +// healthHandler responds with a simple "OK" message for health checks. +func healthHandler(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusOK) + fmt.Fprint(w, "OK") +} + +// handleListObjects lists all objects in the specified bucket. +func handleListObjects(w http.ResponseWriter, r *http.Request) { + bucketName := r.PathValue("bucket") + + if bucket, ok := inMemoryStore[bucketName]; ok { + response := GCSListResponse{ + Kind: "storage#objects", + Items: []GCSObject{}, + } + for name, object := range bucket { + item := GCSObject{ + Kind: "storage#object", + Name: name, + Bucket: bucketName, + Size: strconv.Itoa(len(object.Data)), + ContentType: object.ContentType, + } + response.Items = append(response.Items, item) + } + w.Header().Set("Content-Type", "application/json") + json.NewEncoder(w).Encode(response) + return + } + http.Error(w, "not found", http.StatusNotFound) +} + +// handleGetObject retrieves a specific object from a bucket. 
+func handleGetObject(w http.ResponseWriter, r *http.Request) { + bucketName := r.PathValue("bucket") + objectName := r.PathValue("object") + + if bucketName == "" || objectName == "" { + http.Error(w, "not found: invalid URL format", http.StatusNotFound) + return + } + + if bucket, ok := inMemoryStore[bucketName]; ok { + if object, ok := bucket[objectName]; ok { + w.Header().Set("Content-Type", object.ContentType) + w.Write(object.Data) + return + } + } + http.Error(w, "not found", http.StatusNotFound) +} + +// handleCreateBucket creates a new bucket. +func handleCreateBucket(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.NotFound(w, r) + return + } + + var bucketInfo struct { + Name string `json:"name"` + } + if err := json.NewDecoder(r.Body).Decode(&bucketInfo); err != nil { + http.Error(w, "invalid JSON body", http.StatusBadRequest) + return + } + if bucketInfo.Name == "" { + http.Error(w, "bucket name is required", http.StatusBadRequest) + return + } + + if err := createBucket(bucketInfo.Name); err != nil { + http.Error(w, err.Error(), http.StatusConflict) + return + } + + w.WriteHeader(http.StatusOK) + json.NewEncoder(w).Encode(bucketInfo) +} + +// handleUploadObject uploads an object to a specified bucket. +func handleUploadObject(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.NotFound(w, r) + return + } + + bucketName := r.PathValue("bucket") + objectName := r.URL.Query().Get("name") + if objectName == "" { + objectName = r.PathValue("object") + } + + if bucketName == "" || objectName == "" { + http.Error(w, "missing bucket or object name", http.StatusBadRequest) + return + } + + data, err := io.ReadAll(r.Body) + if err != nil { + http.Error(w, "failed to read request body", http.StatusInternalServerError) + return + } + defer r.Body.Close() + + contentType := r.Header.Get("Content-Type") + if contentType == "" { + contentType = "application/octet-stream" + } + + response, err := uploadObject(bucketName, objectName, data, contentType) + if err != nil { + http.Error(w, err.Error(), http.StatusNotFound) + return + } + + w.Header().Set("Content-Type", "application/json") + json.NewEncoder(w).Encode(response) +} + +func createBucket(bucketName string) error { + if _, exists := inMemoryStore[bucketName]; exists { + return fmt.Errorf("bucket already exists") + } + inMemoryStore[bucketName] = make(map[string]ObjectData) + log.Printf("created bucket: %s", bucketName) + return nil +} + +func uploadObject(bucketName, objectName string, data []byte, contentType string) (*GCSObject, error) { + if _, ok := inMemoryStore[bucketName]; !ok { + return nil, fmt.Errorf("bucket not found") + } + + inMemoryStore[bucketName][objectName] = ObjectData{ + Data: data, + ContentType: contentType, + } + log.Printf("created object '%s' in bucket '%s' with Content-Type '%s'", + objectName, bucketName, contentType) + + return &GCSObject{ + Kind: "storage#object", + Name: objectName, + Bucket: bucketName, + Size: strconv.Itoa(len(data)), + ContentType: contentType, + }, nil +} + +// The in-memory store to hold ObjectData structs. +var inMemoryStore = make(map[string]map[string]ObjectData) + +// ObjectData stores the raw data and its content type. +type ObjectData struct { + Data []byte + ContentType string +} + +// GCSListResponse mimics the structure of a real GCS object list response. 
+type GCSListResponse struct { + Kind string `json:"kind"` + Items []GCSObject `json:"items"` +} + +// GCSObject mimics the structure of a GCS object resource with ContentType. +type GCSObject struct { + Kind string `json:"kind"` + Name string `json:"name"` + Bucket string `json:"bucket"` + Size string `json:"size"` + ContentType string `json:"contentType"` +} + +// Manifest represents the top-level structure of the YAML file +type Manifest struct { + Buckets map[string]Bucket `yaml:"buckets"` +} + +// Bucket represents each bucket and its files +type Bucket struct { + Files []File `yaml:"files"` +} + +// File represents each file entry inside a bucket +type File struct { + Path string `yaml:"path"` + ContentType string `yaml:"content-type"` +} diff --git a/packages/github/_dev/deploy/docker/sample_logs/cloud-storage-data.log b/packages/github/_dev/deploy/docker/sample_logs/cloud-storage-data.log new file mode 100644 index 00000000000..1fd019d2515 --- /dev/null +++ b/packages/github/_dev/deploy/docker/sample_logs/cloud-storage-data.log @@ -0,0 +1,3 @@ +{"@timestamp": 1698579600000, "action": "user.login", "active": true, "actor": "john_doe", "actor_id": 12345, "actor_location": {"country_name": "USA", "ip": "192.168.1.1"}, "org_id": 67890, "org": "tech-corp", "user_id": 12345, "business_id": 56789, "business": "tech-enterprise", "message": "User logged in successfully.", "name": "John Doe", "device": "laptop", "login_method": "password"} +{"actor":"github-actor","org":"Example-Org","action":"organization_default_label.create","created_at":1583364251067} +{"actor":"github-actor","org":"Example-Org","created_at":1608939056939,"action":"org.oauth_app_access_approved","actor_location":{"country_code":"US"}} diff --git a/packages/github/changelog.yml b/packages/github/changelog.yml index 550578f2dff..48f1ad0e2db 100644 --- a/packages/github/changelog.yml +++ b/packages/github/changelog.yml @@ -1,4 +1,9 @@ # newer versions go on top +- version: "2.15.0" + changes: + - description: Added support for abs and gcs inputs in audit data stream. + type: enhancement + link: https://github.com/elastic/integrations/pull/15303 - version: "2.14.0" changes: - description: Add links panel widget in dashboards. @@ -196,7 +201,7 @@ link: https://github.com/elastic/integrations/pull/7976 - version: 1.23.1 changes: - - description: Fix docs for Github Audit log permissions. + - description: Fix docs for GitHub Audit log permissions. type: bugfix link: https://github.com/elastic/integrations/pull/7954 - version: 1.23.0 @@ -296,7 +301,7 @@ link: https://github.com/elastic/integrations/pull/5765 - version: "1.9.0" changes: - - description: Release Github datastreams as GA. + - description: Release GitHub datastreams as GA. 
type: enhancement link: https://github.com/elastic/integrations/pull/5677 - version: "1.8.2" @@ -306,12 +311,12 @@ link: https://github.com/elastic/integrations/pull/5123 - version: "1.8.1" changes: - - description: Fix pagination in Github audit + - description: Fix pagination in GitHub audit type: bugfix link: https://github.com/elastic/integrations/issues/5210 - version: "1.8.0" changes: - - description: Add Github Issues datastream + - description: Add GitHub Issues datastream type: enhancement link: https://github.com/elastic/integrations/pull/5292 - version: "1.7.0" @@ -346,7 +351,7 @@ link: https://github.com/elastic/integrations/pull/3881 - version: "1.2.2" changes: - - description: Update Github Secret Scanning fingerprint with resolved_at + - description: Update GitHub Secret Scanning fingerprint with resolved_at type: bugfix link: https://github.com/elastic/integrations/pull/3802 - version: "1.2.1" diff --git a/packages/github/data_stream/audit/_dev/test/policy/test-abs.expected b/packages/github/data_stream/audit/_dev/test/policy/test-abs.expected new file mode 100644 index 00000000000..dddc09c1aeb --- /dev/null +++ b/packages/github/data_stream/audit/_dev/test/policy/test-abs.expected @@ -0,0 +1,43 @@ +inputs: + - data_stream: + namespace: ep + meta: + package: + name: github + name: test-abs-github + streams: + - account_name: devstoreaccount1 + auth.shared_credentials.account_key: ${SECRET_0} + containers: + - name: test-container + data_stream: + dataset: github.audit + type: logs + file_selectors: null + max_workers: 3 + poll: true + poll_interval: 15s + publisher_pipeline.disable_host: true + tags: + - preserve_original_event + - preserve_duplicate_custom_fields + - forwarded + - github.audit + type: azure-blob-storage + use_output: default +output_permissions: + default: + _elastic_agent_checks: + cluster: + - monitor + _elastic_agent_monitoring: + indices: [] + uuid-for-permissions-on-related-indices: + indices: + - names: + - logs-github.audit-ep + privileges: + - auto_configure + - create_doc +secret_references: + - {} diff --git a/packages/github/data_stream/audit/_dev/test/policy/test-abs.yml b/packages/github/data_stream/audit/_dev/test/policy/test-abs.yml new file mode 100644 index 00000000000..b9564bd3a77 --- /dev/null +++ b/packages/github/data_stream/audit/_dev/test/policy/test-abs.yml @@ -0,0 +1,13 @@ +input: azure-blob-storage +vars: +data_stream: + vars: + account_name: devstoreaccount1 + service_account_key: "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==" + number_of_workers: 3 + poll: true + poll_interval: 15s + containers: | + - name: test-container + preserve_original_event: true + preserve_duplicate_custom_fields: true diff --git a/packages/github/data_stream/audit/_dev/test/policy/test-gcs.expected b/packages/github/data_stream/audit/_dev/test/policy/test-gcs.expected new file mode 100644 index 00000000000..1933ba72070 --- /dev/null +++ b/packages/github/data_stream/audit/_dev/test/policy/test-gcs.expected @@ -0,0 +1,43 @@ +inputs: + - data_stream: + namespace: ep + meta: + package: + name: github + name: test-gcs-github + streams: + - auth.credentials_json.account_key: ${SECRET_0} + buckets: + - name: testbucket + data_stream: + dataset: github.audit + type: logs + file_selectors: null + max_workers: 3 + poll: true + poll_interval: 15s + project_id: gcs-project + publisher_pipeline.disable_host: true + tags: + - preserve_original_event + - preserve_duplicate_custom_fields + - 
forwarded + - github.audit + type: gcs + use_output: default +output_permissions: + default: + _elastic_agent_checks: + cluster: + - monitor + _elastic_agent_monitoring: + indices: [] + uuid-for-permissions-on-related-indices: + indices: + - names: + - logs-github.audit-ep + privileges: + - auto_configure + - create_doc +secret_references: + - {} diff --git a/packages/github/data_stream/audit/_dev/test/policy/test-gcs.yml b/packages/github/data_stream/audit/_dev/test/policy/test-gcs.yml new file mode 100644 index 00000000000..65c4885b493 --- /dev/null +++ b/packages/github/data_stream/audit/_dev/test/policy/test-gcs.yml @@ -0,0 +1,13 @@ +input: gcs +vars: +data_stream: + vars: + project_id: gcs-project + number_of_workers: 3 + poll: true + poll_interval: 15s + service_account_key: "{\"type\":\"service_account\",\"project_id\":\"fake-gcs-project\"}" + buckets: | + - name: testbucket + preserve_original_event: true + preserve_duplicate_custom_fields: true diff --git a/packages/github/data_stream/audit/_dev/test/system/test-abs-config.yml b/packages/github/data_stream/audit/_dev/test/system/test-abs-config.yml new file mode 100644 index 00000000000..c2de315a68c --- /dev/null +++ b/packages/github/data_stream/audit/_dev/test/system/test-abs-config.yml @@ -0,0 +1,16 @@ +deployer: docker +service: azure-blob-storage-emulator +input: azure-blob-storage +vars: +data_stream: + vars: + account_name: devstoreaccount1 + service_account_key: "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==" + storage_url: "http://{{Hostname}}:{{Port}}/devstoreaccount1/" + number_of_workers: 3 + poll: true + poll_interval: 15s + containers: | + - name: test-container +assert: + hit_count: 3 diff --git a/packages/github/data_stream/audit/_dev/test/system/test-gcs-config.yml b/packages/github/data_stream/audit/_dev/test/system/test-gcs-config.yml new file mode 100644 index 00000000000..d937cb37dc4 --- /dev/null +++ b/packages/github/data_stream/audit/_dev/test/system/test-gcs-config.yml @@ -0,0 +1,16 @@ +deployer: docker +service: gcs-mock-service +input: gcs +vars: +data_stream: + vars: + project_id: fake-gcs-project + alternative_host: "http://{{Hostname}}:{{Port}}" + number_of_workers: 1 + poll: true + poll_interval: 15s + service_account_key: "{\"type\":\"service_account\",\"project_id\":\"fake-gcs-project\"}" + buckets: | + - name: testbucket +assert: + hit_count: 3 diff --git a/packages/github/data_stream/audit/agent/stream/abs.yml.hbs b/packages/github/data_stream/audit/agent/stream/abs.yml.hbs new file mode 100644 index 00000000000..683822fa2ad --- /dev/null +++ b/packages/github/data_stream/audit/agent/stream/abs.yml.hbs @@ -0,0 +1,60 @@ +{{#if account_name}} +account_name: {{account_name}} +{{/if}} +{{#if oauth2}} +auth.oauth2: + client_id: {{client_id}} + client_secret: {{client_secret}} + tenant_id: {{tenant_id}} +{{/if}} +{{#if service_account_key}} +auth.shared_credentials.account_key: {{service_account_key}} +{{/if}} +{{#if service_account_uri}} +auth.connection_string.uri: {{service_account_uri}} +{{/if}} +{{#if storage_url}} +storage_url: {{storage_url}} +{{/if}} +{{#if number_of_workers}} +max_workers: {{number_of_workers}} +{{/if}} +{{#if poll}} +poll: {{poll}} +{{/if}} +{{#if poll_interval}} +poll_interval: {{poll_interval}} +{{/if}} + +{{#if containers}} +containers: +{{containers}} +{{/if}} +{{#if file_selectors}} +file_selectors: +{{file_selectors}} +{{/if}} +{{#if timestamp_epoch}} +timestamp_epoch: {{timestamp_epoch}} +{{/if}} +{{#if 
expand_event_list_from_field}} +expand_event_list_from_field: {{expand_event_list_from_field}} +{{/if}} + +tags: +{{#if preserve_original_event}} + - preserve_original_event +{{/if}} +{{#if preserve_duplicate_custom_fields}} + - preserve_duplicate_custom_fields +{{/if}} +{{#each tags as |tag|}} + - {{tag}} +{{/each}} +{{#contains "forwarded" tags}} +publisher_pipeline.disable_host: true +{{/contains}} +{{#if processors}} +processors: +{{processors}} +{{/if}} diff --git a/packages/github/data_stream/audit/agent/stream/gcs.yml.hbs b/packages/github/data_stream/audit/agent/stream/gcs.yml.hbs new file mode 100644 index 00000000000..f72ce4a7acd --- /dev/null +++ b/packages/github/data_stream/audit/agent/stream/gcs.yml.hbs @@ -0,0 +1,53 @@ +{{#if project_id}} +project_id: {{project_id}} +{{/if}} +{{#if alternative_host}} +alternative_host: {{alternative_host}} +{{/if}} +{{#if service_account_key}} +auth.credentials_json.account_key: {{service_account_key}} +{{/if}} +{{#if service_account_file}} +auth.credentials_file.path: {{service_account_file}} +{{/if}} +{{#if number_of_workers}} +max_workers: {{number_of_workers}} +{{/if}} +{{#if poll}} +poll: {{poll}} +{{/if}} +{{#if poll_interval}} +poll_interval: {{poll_interval}} +{{/if}} +{{#if buckets}} +buckets: +{{buckets}} +{{/if}} +{{#if file_selectors}} +file_selectors: +{{file_selectors}} +{{/if}} +{{#if timestamp_epoch}} +timestamp_epoch: {{timestamp_epoch}} +{{/if}} +{{#if expand_event_list_from_field}} +expand_event_list_from_field: {{expand_event_list_from_field}} +{{/if}} + +tags: +{{#if preserve_original_event}} + - preserve_original_event +{{/if}} +{{#if preserve_duplicate_custom_fields}} + - preserve_duplicate_custom_fields +{{/if}} +{{#each tags as |tag|}} + - {{tag}} +{{/each}} +{{#contains "forwarded" tags}} +publisher_pipeline.disable_host: true +{{/contains}} +{{#if processors}} +processors: +{{processors}} +{{/if}} diff --git a/packages/github/data_stream/audit/fields/beats.yml b/packages/github/data_stream/audit/fields/beats.yml index b024cda7f40..18b2ddf7138 100644 --- a/packages/github/data_stream/audit/fields/beats.yml +++ b/packages/github/data_stream/audit/fields/beats.yml @@ -19,3 +19,27 @@ - name: log.offset type: long description: Log offset. +- name: azure.storage + type: group + fields: + - name: container.name + type: keyword + description: The name of the Azure Blob Storage container. + - name: blob.name + type: keyword + description: The name of the Azure Blob Storage blob object. + - name: blob.content_type + type: keyword + description: The content type of the Azure Blob Storage blob object. +- name: gcs.storage + type: group + fields: + - name: bucket.name + type: keyword + description: The name of the Google Cloud Storage bucket. + - name: object.name + type: keyword + description: The name of the Google Cloud Storage object. + - name: object.content_type + type: keyword + description: The content type of the Google Cloud Storage object. diff --git a/packages/github/data_stream/audit/manifest.yml b/packages/github/data_stream/audit/manifest.yml index 86468a21966..5eca98c27b6 100644 --- a/packages/github/data_stream/audit/manifest.yml +++ b/packages/github/data_stream/audit/manifest.yml @@ -98,7 +98,9 @@ streams: multi: false required: false show_user: false - description: "Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. \nThis executes in the agent before the logs are parsed. 
\nSee [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details.\n" + description: > + "Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. \nThis executes in the agent before the logs are parsed. \nSee [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details.\n" + - input: azure-eventhub description: Collect GitHub audit logs from Azure Event Hub title: GitHub Audit Logs from Azure Event Hub @@ -424,3 +426,307 @@ streams: # yvgJ38BRsFOtkRuAGSf6ZUwTO8JJRRIFnpUzXflAnGivK9M13D5GEQMmIl6U9Pvk # sxSmbIUfc2SGJGCJD4I= # -----END CERTIFICATE----- + - input: azure-blob-storage + title: GitHub Audit Logs + description: Collect GitHub audit logs from Azure Blob Storage + template_path: abs.yml.hbs + enabled: false + vars: + - name: account_name + type: text + title: Account Name + description: | + This attribute is required for various internal operations with respect to authentication, creating service clients and blob clients which are used internally for various processing purposes. + required: true + show_user: true + - name: client_id + type: text + title: Client ID (OAuth2) + description: Client ID of Azure Account. This is required if 'Collect logs using OAuth2 authentication' is enabled. + required: false + show_user: true + secret: true + - name: client_secret + type: password + title: Client Secret (OAuth2) + description: Client Secret of Azure Account. This is required if 'Collect logs using OAuth2 authentication' is enabled. + required: false + show_user: true + secret: true + - name: tenant_id + type: text + title: Tenant ID (OAuth2) + description: Tenant ID of Azure Account. This is required if 'Collect logs using OAuth2 authentication' is enabled. + multi: false + required: false + show_user: true + - name: service_account_key + type: password + title: Service Account Key + description: | + This attribute contains the access key, found under the Access keys section on Azure Cloud, under the respective storage account. A single storage account can contain multiple containers, and they will all use this common access key. + required: false + show_user: true + secret: true + - name: service_account_uri + type: text + title: Service Account URI + description: | + This attribute contains the connection string, found under the Access keys section on Azure Cloud, under the respective storage account. A single storage account can contain multiple containers, and they will all use this common connection string. + required: false + show_user: false + - name: storage_url + type: text + title: Storage URL + description: | + Use this attribute to specify a custom storage URL if required. By default it points to azure cloud storage. Only use this if there is a specific need to connect to a different environment where blob storage is available. + URL format : {{protocol}}://{{account_name}}.{{storage_uri}}. + required: false + show_user: false + - name: number_of_workers + type: integer + title: Maximum number of workers + multi: false + required: false + show_user: true + default: 3 + description: Determines how many workers are spawned per container. + - name: poll + type: bool + title: Polling + multi: false + required: false + show_user: true + default: true + description: Determines if the container will be continuously polled for new documents. 
+ - name: poll_interval + type: text + title: Polling interval + multi: false + required: false + show_user: true + default: 15s + description: Determines the time interval between polling operations. + - name: containers + type: yaml + title: Containers + description: > + "This attribute contains the details about a specific container like, name, number_of_workers, poll, poll_interval etc. The attribute 'name' is specific to a container as it describes the container name, while the fields number_of_workers, poll, poll_interval can exist both at the container level and at the global level. \nIf you have already defined the attributes globally, then you can only specify the container name in this yaml config. If you want to override any specific attribute for a container, then, you can define it here. Any attribute defined in the yaml will override the global definitions. \nPlease see the relevant [documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-azure-blob-storage.html#attrib-containers) for further information.\n" + + required: true + show_user: true + default: | + #- name: azure-container1 + # max_workers: 3 + # poll: true + # poll_interval: 15s + #- name: azure-container2 + # max_workers: 3 + # poll: true + # poll_interval: 10s + - name: file_selectors + type: yaml + title: File Selectors + multi: false + required: false + show_user: false + default: | + # - regex: "event/" + description: > + "If the container will have events that correspond to files that this integration shouldn’t process, file_selectors can be used to limit the files that are downloaded. \nThis is a list of selectors which is made up of regex patterns. The regex should match the container filepath. Regexes use [RE2 syntax](https://pkg.go.dev/regexp/syntax). \nFiles that don’t match one of the regexes will not be processed. \nThis process happens locally on the host hence it is an expensive operation. It is recommended to use this attribute only if there is a specific need to filter out files locally.\n" + + - name: timestamp_epoch + type: integer + title: Timestamp Epoch + multi: false + required: false + description: > + "This attribute can be used to filter out files/blobs which have a timestamp older than the specified value. The value of this attribute should be in unix epoch (seconds) format. \nThis process happens locally on the host hence it is an expensive operation. It is recommended to use this attribute only if there is a specific need to filter out files locally.\n" + + show_user: false + - name: expand_event_list_from_field + type: text + title: Expand Event List From Field + multi: false + required: false + show_user: false + description: > + "If the file-set using this input expects to receive multiple messages bundled under a specific field or an array of objects then the config option for 'expand_event_list_from_field' can be specified. This setting will be able to split the messages under the group value into separate events. \nThis can be specified at the global level or at the container level. For more info please refer to the [documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-azure-blob-storage.html#attrib-expand_event_list_from_field).\n" + + - name: preserve_original_event + required: true + show_user: true + title: Preserve original event + description: Preserves a raw copy of the original event, added to the field `event.original`. 
+ type: bool + multi: false + default: false + - name: preserve_duplicate_custom_fields + required: true + show_user: false + title: Preserve duplicate custom fields + description: Preserve github.audit fields that were copied to Elastic Common Schema (ECS) fields. + type: bool + multi: false + default: false + - name: processors + type: yaml + title: Processors + multi: false + required: false + show_user: false + description: | + Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. This executes in the agent before the logs are parsed. See [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details. + - name: tags + type: text + title: Tags + description: Tags to include in the published event. + required: true + default: + - forwarded + - github.audit + multi: true + show_user: false + - input: gcs + title: GitHub Audit Logs + description: Collect GitHub audit logs from Google Cloud Storage. + template_path: gcs.yml.hbs + enabled: false + vars: + - name: project_id + type: text + title: Project Id + description: | + This attribute is required for various internal operations with respect to authentication, creating service clients and bucket clients which are used internally for various processing purposes. + multi: false + required: true + show_user: true + default: my-project-id + - name: alternative_host + type: text + title: Alternative Host + description: Used to override the default host for the storage client (default is storage.googleapis.com) + required: false + multi: false + show_user: false + - name: service_account_key + type: password + title: Credentials JSON Key + description: | + This attribute contains the JSON service account credentials string, which can be generated from the google cloud console. Refer to [Service Account Keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) for details. + Required if a Service Account File is not provided. + multi: false + required: false + show_user: true + secret: true + - name: service_account_file + type: text + title: Credentials File Path + description: | + This attribute contains the service account credentials file, which can be generated from the google cloud console. Refer to [Service Account Keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) for details. + Required if a Service Account Key is not provided. + multi: false + required: false + show_user: false + - name: number_of_workers + type: integer + title: Maximum number of workers + multi: false + required: false + show_user: true + default: 3 + description: Determines how many workers are spawned per bucket. + - name: poll + type: bool + title: Polling + multi: false + required: false + show_user: true + default: true + description: Determines if the bucket will be continuously polled for new documents. + - name: poll_interval + type: text + title: Polling Interval + multi: false + required: false + show_user: true + default: 15s + description: Determines the time interval between polling operations. + - name: buckets + type: yaml + title: Buckets + description: >- + This attribute contains the details about a specific bucket like, name, max_workers, poll and poll_interval. The attribute 'name' is specific to a bucket as it describes the bucket name, while the fields max_workers, poll and poll_interval can exist both at the bucket level and at the global level. 
If you have already defined the attributes globally, then you can only specify the name in this yaml config. If you want to override any specific attribute for a specific bucket, then, you can define it here. Any attribute defined in the yaml will override the global definitions. Please see the relevant [documentation](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-gcs#attrib-buckets) for further information. + required: true + show_user: true + default: | + # You can define as many buckets as you want here. + #- name: gcs_bucket_1 + #- name: gcs_bucket_2 + # The config below is an example of how to override the global config. + #- name: gcs_bucket_3 + # max_workers: 3 + # poll: true + # poll_interval: 10s + - name: file_selectors + type: yaml + title: File Selectors + multi: false + required: false + show_user: false + default: | + # - regex: "event/" + description: > + "If the bucket will have events that correspond to files that this integration shouldn’t process, file_selectors can be used to limit the files that are processed. \nThis is a list of selectors which is made up of regex patterns. The regex should match the bucket filepath. Regexes use [RE2 syntax](https://pkg.go.dev/regexp/syntax). \nFiles that don’t match one of the regexes will not be processed. \nThis process happens locally on the host hence it is an expensive operation. It is recommended to use this attribute only if there is a specific need to filter out files locally.\n" + + - name: timestamp_epoch + type: integer + title: Timestamp Epoch + multi: false + required: false + description: > + "This attribute can be used to filter out files/objects which have a timestamp older than the specified value. The value of this attribute should be in unix epoch (seconds) format. \nThis process happens locally on the host hence it is an expensive operation. It is recommended to use this attribute only if there is a specific need to filter out files locally.\n" + + show_user: false + - name: expand_event_list_from_field + type: text + title: Expand Event List From Field + multi: false + required: false + show_user: false + description: > + "If the file-set using this input expects to receive multiple messages bundled under a specific field or an array of objects then the config option for 'expand_event_list_from_field' can be specified. This setting will be able to split the messages under the group value into separate events. \nThis can be specified at the global level or at the bucket level. For more info please refer to the [documentation](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-gcs#attrib-expand_event_list_from_field-gcs).\n" + + - name: preserve_original_event + required: true + show_user: true + title: Preserve original event + description: Preserves a raw copy of the original event, added to the field `event.original`. + type: bool + multi: false + default: false + - name: preserve_duplicate_custom_fields + required: true + show_user: false + title: Preserve duplicate custom fields + description: Preserve github.audit fields that were copied to Elastic Common Schema (ECS) fields. + type: bool + multi: false + default: false + - name: processors + type: yaml + title: Processors + multi: false + required: false + show_user: false + description: | + Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. This executes in the agent before the logs are parsed. 
See [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details. + - name: tags + type: text + title: Tags + multi: true + required: true + show_user: false + default: + - forwarded + - github.audit diff --git a/packages/github/data_stream/dependabot/elasticsearch/ingest_pipeline/default.yml b/packages/github/data_stream/dependabot/elasticsearch/ingest_pipeline/default.yml index 34e0371e621..32da2270fc8 100644 --- a/packages/github/data_stream/dependabot/elasticsearch/ingest_pipeline/default.yml +++ b/packages/github/data_stream/dependabot/elasticsearch/ingest_pipeline/default.yml @@ -260,7 +260,7 @@ processors: source: >- ZonedDateTime start = ZonedDateTime.parse(ctx.event.start); ZonedDateTime end = ZonedDateTime.parse(ctx.event.end); ctx.event.duration = ChronoUnit.NANOS.between(start, end); ################################# - # For Github Overview Dashboard # + # For GitHub Overview Dashboard # ################################# - lowercase: field: github.dependabot.state diff --git a/packages/github/data_stream/issues/manifest.yml b/packages/github/data_stream/issues/manifest.yml index 41f4ce1e2be..a1335e109ff 100644 --- a/packages/github/data_stream/issues/manifest.yml +++ b/packages/github/data_stream/issues/manifest.yml @@ -1,5 +1,5 @@ type: logs -title: Github Issue +title: GitHub Issue release: beta streams: - input: httpjson @@ -119,5 +119,5 @@ streams: show_user: false description: "Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. \nThis executes in the agent before the logs are parsed. \nSee [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details.\n" template_path: httpjson.yml.hbs - title: Github Issues + title: GitHub Issues description: Collect GitHub issues as events via the API diff --git a/packages/github/docs/README.md b/packages/github/docs/README.md index 10cd3e9447b..023d9d15a1d 100644 --- a/packages/github/docs/README.md +++ b/packages/github/docs/README.md @@ -25,14 +25,22 @@ For Organizations: The GitHub audit log records all events related to the GitHub organization/enterprise. See [Organization audit log actions](https://docs.github.com/en/organizations/keeping-your-organization-secure/reviewing-the-audit-log-for-your-organization#audit-log-actions) and [Enterprise audit log actions](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/about-the-audit-log-for-your-enterprise) for more details. -Github integration can collect audit logs from three sources: [Github API](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/using-the-audit-log-api-for-your-enterprise), [Azure Event Hubs](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-event-hubs), and [AWS S3 or AWS SQS](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-amazon-s3). 
+The GitHub integration can collect audit logs from the following sources: [GitHub API](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/using-the-audit-log-api-for-your-enterprise), [Azure Event Hubs](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-event-hubs), [Azure Blob Storage](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-blob-storage), [AWS S3 or AWS SQS](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-amazon-s3), and [Google Cloud Storage](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-google-cloud-storage). -When using Github API to collect audit log events, below requirements must be met for Personal Access Token (PAT): +When using the GitHub API to collect audit log events, the following requirements must be met for the Personal Access Token (PAT): - You must use a Personal Access Token with `read:audit_log` scope. This applies to both organization and enterprise admins. - If you're an enterprise admin, ensure your token also includes `admin:enterprise` scope to access enterprise-wide logs. To collect audit log events from Azure Event Hubs, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-event-hubs) to setup audit log streaming. +To collect audit log events from Azure Blob Storage, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-azure-blob-storage) to setup audit log streaming. To collect audit log events from AWS S3 or AWS SQS, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-amazon-s3) to setup audit log streaming. For more details, refer to this [documentation](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise). +To collect audit log events from Google Cloud Storage, follow the [guide](https://docs.github.com/en/enterprise-cloud@latest/admin/monitoring-activity-in-your-enterprise/reviewing-audit-logs-for-your-enterprise/streaming-the-audit-log-for-your-enterprise#setting-up-streaming-to-google-cloud-storage) to setup audit log streaming. 
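+
+Once audit log streaming is in place, the integration's Azure Blob Storage and Google Cloud Storage inputs poll the containers or buckets listed in the integration policy. The sketch below is a minimal, illustrative example of the Buckets setting for the Google Cloud Storage input (the Containers setting for Azure Blob Storage follows the same shape); the bucket names are placeholders, and the per-bucket values simply override the global worker and polling settings:
+
+```yaml
+- name: github-audit-bucket-1
+- name: github-audit-bucket-2
+  # Per-bucket overrides take precedence over the global input settings.
+  max_workers: 3
+  poll: true
+  poll_interval: 10s
+```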
+ +For Filebeat input documentation, refer to the following pages: + - [Azure Event Hub](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-azure-eventhub) + - [Azure Blob Storage](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-azure-blob-storage) + - [AWS S3](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-aws-s3) + - [Google Cloud Storage](https://www.elastic.co/docs/reference/beats/filebeat/filebeat-input-gcs) *This integration is not compatible with GitHub Enterprise server.* @@ -44,11 +52,17 @@ To collect audit log events from AWS S3 or AWS SQS, follow the [guide](https://d | aws.s3.bucket.arn | The AWS S3 bucket ARN. | keyword | | aws.s3.bucket.name | The AWS S3 bucket name. | keyword | | aws.s3.object.key | The AWS S3 Object key. | keyword | +| azure.storage.blob.content_type | The content type of the Azure Blob Storage blob object. | keyword | +| azure.storage.blob.name | The name of the Azure Blob Storage blob object. | keyword | +| azure.storage.container.name | The name of the Azure Blob Storage container. | keyword | | data_stream.dataset | Data stream dataset name. | constant_keyword | | data_stream.namespace | Data stream namespace. | constant_keyword | | data_stream.type | Data stream type. | constant_keyword | | event.dataset | Event dataset | constant_keyword | | event.module | Event module | constant_keyword | +| gcs.storage.bucket.name | The name of the Google Cloud Storage bucket. | keyword | +| gcs.storage.object.content_type | The content type of the Google Cloud Storage object. | keyword | +| gcs.storage.object.name | The name of the Google Cloud Storage object. | keyword | | github.active | | boolean | | github.actor_id | The id of the actor who performed the action. | keyword | | github.actor_ip | The IP address of the entity performing the action. | ip | diff --git a/packages/github/elasticsearch/transform/latest_code_scanning/transform.yml b/packages/github/elasticsearch/transform/latest_code_scanning/transform.yml index a46e300f258..25f3111fa94 100644 --- a/packages/github/elasticsearch/transform/latest_code_scanning/transform.yml +++ b/packages/github/elasticsearch/transform/latest_code_scanning/transform.yml @@ -22,7 +22,7 @@ latest: - github.code_scanning.created_at sort: "event.ingested" description: >- - Latest Code Scanning Alerts from Github's Code Scanning. As code scanning alerts get updated (dismissed/reopened), this transform stores only the latest state of each code scanning alert inside the destination index. Thus the transform's destination index contains only the latest state of the alerts. + Latest Code Scanning Alerts from GitHub's Code Scanning. As code scanning alerts get updated (dismissed/reopened), this transform stores only the latest state of each code scanning alert inside the destination index. Thus the transform's destination index contains only the latest state of the alerts. frequency: 30s settings: # This is required to prevent the transform from clobbering the Fleet-managed mappings. diff --git a/packages/github/elasticsearch/transform/latest_dependabot/transform.yml b/packages/github/elasticsearch/transform/latest_dependabot/transform.yml index 98874cb833f..7c3ff6acb4b 100644 --- a/packages/github/elasticsearch/transform/latest_dependabot/transform.yml +++ b/packages/github/elasticsearch/transform/latest_dependabot/transform.yml @@ -22,7 +22,7 @@ latest: - github.dependabot.created_at sort: "event.ingested" description: >- - Latest Alerts from Github's Dependabot. 
As Alerts get updated, this transform stores only the latest state of each alert inside the destination index. Thus the transform's destination index contains only the latest state of the alert. + Latest Alerts from GitHub's Dependabot. As Alerts get updated, this transform stores only the latest state of each alert inside the destination index. Thus the transform's destination index contains only the latest state of the alert. frequency: 30s settings: # This is required to prevent the transform from clobbering the Fleet-managed mappings. diff --git a/packages/github/elasticsearch/transform/latest_issues/transform.yml b/packages/github/elasticsearch/transform/latest_issues/transform.yml index 390de440073..0324d6d4d8a 100644 --- a/packages/github/elasticsearch/transform/latest_issues/transform.yml +++ b/packages/github/elasticsearch/transform/latest_issues/transform.yml @@ -22,7 +22,7 @@ latest: - github.issues.created_at sort: "event.ingested" description: >- - Latest Issues from Github. As issues get updated, this transform stores only the latest state of each issue inside the destination index. Thus the transform's destination index contains only the latest state of the issue. + Latest Issues from GitHub. As issues get updated, this transform stores only the latest state of each issue inside the destination index. Thus the transform's destination index contains only the latest state of the issue. frequency: 30s settings: # This is required to prevent the transform from clobbering the Fleet-managed mappings. diff --git a/packages/github/elasticsearch/transform/latest_secret_scanning/transform.yml b/packages/github/elasticsearch/transform/latest_secret_scanning/transform.yml index 10a2a1aabc2..29c5f5c694d 100644 --- a/packages/github/elasticsearch/transform/latest_secret_scanning/transform.yml +++ b/packages/github/elasticsearch/transform/latest_secret_scanning/transform.yml @@ -22,7 +22,7 @@ latest: - github.secret_scanning.created_at sort: "event.ingested" description: >- - Latest Secret Scanning Alerts from Github's Secret Scanning. As secret scanning alerts get updated, this transform stores only the latest state of each secret scanning alert inside the destination index. Thus the transform's destination index contains only the latest state of the alerts. + Latest Secret Scanning Alerts from GitHub's Secret Scanning. As secret scanning alerts get updated, this transform stores only the latest state of each secret scanning alert inside the destination index. Thus the transform's destination index contains only the latest state of the alerts. frequency: 30s settings: # This is required to prevent the transform from clobbering the Fleet-managed mappings. diff --git a/packages/github/kibana/dashboard/github-4da91aa0-12fc-11ed-af77-016e1a977d80.json b/packages/github/kibana/dashboard/github-4da91aa0-12fc-11ed-af77-016e1a977d80.json index 198711177d2..7267acdb683 100644 --- a/packages/github/kibana/dashboard/github-4da91aa0-12fc-11ed-af77-016e1a977d80.json +++ b/packages/github/kibana/dashboard/github-4da91aa0-12fc-11ed-af77-016e1a977d80.json @@ -1490,7 +1490,7 @@ "id": "", "params": { "fontSize": 12, - "markdown": "This dashboard provides an overview of the alerts ingested from Github Code Scanning.\n\nThe dashboard provides details on code scanning alerts that are open and resolved. It deep-dives into the top 10 repositories where code scanning alerts are found. It also calculates the mean-time to resolve (or dismiss) an open code scanning alert. 
The dashboard presents a view of alerts by severity and code scanning rules defining the alerts. Finally, it gives a layout of top users resolving the code scanning alerts.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)", + "markdown": "This dashboard provides an overview of the alerts ingested from GitHub Code Scanning.\n\nThe dashboard provides details on code scanning alerts that are open and resolved. It deep-dives into the top 10 repositories where code scanning alerts are found. It also calculates the mean-time to resolve (or dismiss) an open code scanning alert. The dashboard presents a view of alerts by severity and code scanning rules defining the alerts. Finally, it gives a layout of top users resolving the code scanning alerts.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)", "openLinksInNewTab": false }, "title": "", diff --git a/packages/github/kibana/dashboard/github-591d69e0-17b6-11ed-809a-7b4be950fe9c.json b/packages/github/kibana/dashboard/github-591d69e0-17b6-11ed-809a-7b4be950fe9c.json index ba125cd54d5..bcd8bea67fc 100644 --- a/packages/github/kibana/dashboard/github-591d69e0-17b6-11ed-809a-7b4be950fe9c.json +++ b/packages/github/kibana/dashboard/github-591d69e0-17b6-11ed-809a-7b4be950fe9c.json @@ -1219,7 +1219,7 @@ "id": "", "params": { "fontSize": 12, - "markdown": "This dashboard provides an overview of the events ingested from Github.\n\nThe dashboard provides details on secret scanning alerts that are open and resolved. It deep-dives into the top 10 repositories where secret scanning alerts are found. It also calculates the mean-time to resolve (or dismiss) an open secret scanning alert. The dashboard presents a view of the type of secrets that are currently open. Finally, it gives a layout of top users resolving the secret scanning alerts.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)", + "markdown": "This dashboard provides an overview of the events ingested from GitHub.\n\nThe dashboard provides details on secret scanning alerts that are open and resolved. It deep-dives into the top 10 repositories where secret scanning alerts are found. It also calculates the mean-time to resolve (or dismiss) an open secret scanning alert. The dashboard presents a view of the type of secrets that are currently open. Finally, it gives a layout of top users resolving the secret scanning alerts.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)", "openLinksInNewTab": false }, "title": "", diff --git a/packages/github/kibana/dashboard/github-6197be80-220c-11ed-88c4-e3caca48250a.json b/packages/github/kibana/dashboard/github-6197be80-220c-11ed-88c4-e3caca48250a.json index c32d789f02c..17fe4bcf867 100644 --- a/packages/github/kibana/dashboard/github-6197be80-220c-11ed-88c4-e3caca48250a.json +++ b/packages/github/kibana/dashboard/github-6197be80-220c-11ed-88c4-e3caca48250a.json @@ -1268,7 +1268,7 @@ "id": "", "params": { "fontSize": 12, - "markdown": "This dashboard provides an overview of the alerts ingested from Github Code Scanning.\n\nThe dashboard provides details on code scanning alerts that are open and resolved. It deep-dives into the top 10 repositories where code scanning alerts are found. It also calculates the mean-time to resolve (or dismiss) an open code scanning alert. The dashboard presents a view of alerts by severity and code scanning rules defining the alerts. 
Finally, it gives a layout of top users resolving the code scanning alerts.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)",
+ "markdown": "This dashboard provides an overview of the alerts ingested from GitHub Code Scanning.\n\nThe dashboard provides details on code scanning alerts that are open and resolved. It deep-dives into the top 10 repositories where code scanning alerts are found. It also calculates the mean-time to resolve (or dismiss) an open code scanning alert. The dashboard presents a view of alerts by severity and code scanning rules defining the alerts. Finally, it gives a layout of top users resolving the code scanning alerts.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)",
"openLinksInNewTab": false }, "title": "",
diff --git a/packages/github/kibana/dashboard/github-6a6d7c40-17ab-11ed-809a-7b4be950fe9c.json b/packages/github/kibana/dashboard/github-6a6d7c40-17ab-11ed-809a-7b4be950fe9c.json
index 991d9334771..f7e43cab2dd 100644
--- a/packages/github/kibana/dashboard/github-6a6d7c40-17ab-11ed-809a-7b4be950fe9c.json
+++ b/packages/github/kibana/dashboard/github-6a6d7c40-17ab-11ed-809a-7b4be950fe9c.json
@@ -1475,7 +1475,7 @@ "id": "", "params": { "fontSize": 12,
- "markdown": "This dashboard provides an overview of the alerts ingested from Github Code Scanning, Secret Scanning, and Dependabot.\n\nThe dashboard provides an overview of code scanning, secret scanning, and dependabot alerts that are open and resolved. It deep-dives into the top 10 repositories where alerts are found. The dashboard presents a view of alerts by severity. The dashboard gives a view alerts by type of GHAS Product.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)",
+ "markdown": "This dashboard provides an overview of the alerts ingested from GitHub Code Scanning, Secret Scanning, and Dependabot.\n\nThe dashboard provides an overview of code scanning, secret scanning, and dependabot alerts that are open and resolved. It deep-dives into the top 10 repositories where alerts are found. The dashboard presents a view of alerts by severity. The dashboard gives a view of alerts by type of GHAS product.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)",
"openLinksInNewTab": false }, "title": "",
diff --git a/packages/github/kibana/dashboard/github-f0104680-ae18-11ed-83fa-df5d96a45724.json b/packages/github/kibana/dashboard/github-f0104680-ae18-11ed-83fa-df5d96a45724.json
index b2d7d847ea0..c704906c8b4 100644
--- a/packages/github/kibana/dashboard/github-f0104680-ae18-11ed-83fa-df5d96a45724.json
+++ b/packages/github/kibana/dashboard/github-f0104680-ae18-11ed-83fa-df5d96a45724.json
@@ -76,7 +76,7 @@ "store": "appState" }, "meta": {
- "alias": "Github Issues",
+ "alias": "GitHub Issues",
"disabled": false, "indexRefName": "kibanaSavedObjectMeta.searchSourceJSON.filter[0].meta.index", "negate": false,
@@ -1301,7 +1301,7 @@ "id": "", "params": { "fontSize": 12,
- "markdown": "This dashboard provides an overview of the issues ingested from Github.\n\nThe dashboard provides details on issues that are open and resolved. It provides a view of the top 10 repositories with issues. It also calculates the mean-time to fix (or close) an issue. The dashboard presents a view of top labels that are assigned to the issues. 
Finally, it gives a layout of top users creating and fixing the issues.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)", + "markdown": "This dashboard provides an overview of the issues ingested from GitHub.\n\nThe dashboard provides details on issues that are open and resolved. It provides a view of the top 10 repositories with issues. It also calculates the mean-time to fix (or close) an issue. The dashboard presents a view of top labels that are assigned to the issues. Finally, it gives a layout of top users creating and fixing the issues.\n\n[**Integrations Page**](/app/integrations/detail/github/overview)", "openLinksInNewTab": false }, "title": "", diff --git a/packages/github/manifest.yml b/packages/github/manifest.yml index 56a22bda6fd..5ac74c604c3 100644 --- a/packages/github/manifest.yml +++ b/packages/github/manifest.yml @@ -1,13 +1,13 @@ name: github title: GitHub -version: "2.14.0" +version: "2.15.0" description: Collect logs from GitHub with Elastic Agent. type: integration format_version: "3.4.0" categories: [security, "productivity_security"] conditions: kibana: - version: "^8.16.0 || ^9.0.0" + version: "^8.17.1 || ^9.0.0" icons: - src: /img/github.svg title: GitHub @@ -68,6 +68,12 @@ policy_templates: - type: azure-eventhub title: "Collect GitHub logs from Azure Event Hub" description: "Collect GitHub logs from Azure Event Hub" + - type: azure-blob-storage + title: Collect GitHub logs from Azure Blob Storage + description: Collect GitHub logs from Azure Blob Storage. + - type: gcs + title: Collect GitHub logs from Google Cloud Storage + description: Collect GitHub logs from Google Cloud Storage. - type: cel title: Collect GitHub Security Advisories data via API description: Collect GitHub Security Advisories data via API.
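For reviewers who want to exercise the two new inputs added by the `azure-blob-storage` and `gcs` policy templates outside of Fleet, the sketch below shows roughly how the equivalent standalone Filebeat inputs could be configured. This is a minimal, hypothetical example: the storage account, container, project, bucket, and credential values are placeholders and are not taken from the integration or its test setup.

```yaml
filebeat.inputs:
  # Azure Blob Storage: poll a container for exported GitHub audit log blobs.
  - type: azure-blob-storage
    account_name: examplestorageaccount           # placeholder storage account
    auth.shared_credentials.account_key: "<key>"  # a connection string or OAuth2 can be used instead
    containers:
      - name: example-github-audit-logs           # placeholder container name
        poll: true
        poll_interval: 5m

  # Google Cloud Storage: poll a bucket for exported GitHub audit log objects.
  - type: gcs
    project_id: example-project                   # placeholder GCP project ID
    auth.credentials_file.path: /path/to/service-account.json
    buckets:
      - name: example-github-audit-logs           # placeholder bucket name
        poll: true
        poll_interval: 5m
```

The managed integration surfaces the same options through the new policy templates, so the placeholder values above correspond to the settings a user fills in when adding the GitHub integration for these sources.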