Incident Detection and Response in Google Cloud Platform (GCP)
Author Eric Evans
Detecting events and reacting to them quickly is essential to strengthening security posture in the cloud. One benefit of running infrastructure in the cloud is the ability to develop and iterate on solutions quickly, usually through event-driven architecture in which serverless functions key off events of interest. These events typically reside in the logs generated by your cloud service provider (CSP). The logs are ingested by a system that looks for patterns matching indicators of compromise (IoCs), and further infrastructure then reacts to those events to remediate any findings. In this article, we will explore the options that are natively available in Google Cloud Platform (GCP). Third-party solutions for event detection and response in GCP also exist, each with its own pros and cons.
Google Cloud Platform
In GCP, an entire offering called Operations (formerly known as Stackdriver) monitors your cloud environment and improves observability through logs, metrics, alerts, and so on. A good detection strategy in GCP starts with this service. Once the right data is flowing into the platform, filters can be used to export matching log entries to wherever they need to go (known as destination sinks). Cloud-native detection systems such as Security Health Analytics (SHA) and Event Threat Detection (ETD) can then bring potential IoCs to light using these logs as a source of truth, and Pub/Sub and Cloud Functions can be used to respond to the findings these tools generate.
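As a rough illustration of the response step, the sketch below shows the general shape of a Python function triggered by a Pub/Sub message. Pub/Sub-triggered background functions receive the message body base64-encoded in the "data" field of the event. Everything else here is an assumption made for illustration: the flat JSON payload with "category" and "severity" keys, the function names, and the response policy are all hypothetical and would need to be adapted to the actual message format of whatever tool publishes the finding.

```python
import base64
import json


def decode_pubsub_event(event):
    """Decode the base64-encoded JSON payload carried in a Pub/Sub event."""
    raw = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(raw)


def choose_response(finding):
    """Pick a response action for a finding.

    Hypothetical policy: the action names and the severity/category
    keys are illustrative assumptions, not part of any GCP API.
    """
    severity = finding.get("severity", "").upper()
    if severity in ("CRITICAL", "HIGH"):
        return "page_on_call"
    if finding.get("category"):
        return "open_ticket"
    return "ignore"


def handle_finding(event, context=None):
    """Entry point shaped like a Pub/Sub-triggered background function."""
    finding = decode_pubsub_event(event)
    action = choose_response(finding)
    # A real function would call a remediation or ticketing API here.
    print(f"{finding.get('category', 'UNKNOWN')}: {action}")
    return action
```

Keeping the decoding and the decision logic in separate functions makes the policy easy to test locally without deploying anything.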
As noted above, Operations is where a robust security foundation starts in Google Cloud. The component of Operations most useful to security practitioners is Cloud Logging, which collects a variety of log sources that describe events in your cloud environment. These include:
Data Access Logs – contain information about API calls that happen within Google Cloud. This includes any events that read, create, or modify data within the platform.
Admin Activity Logs – contain information about API calls that perform administrative actions on resources within Google Cloud. This includes events that create or modify infrastructure.
System Event Logs – contain information about administrative actions that modify your environment and are performed by Google’s systems (not API calls).
Agent Logs – contain logs that are forwarded to Cloud Logging via an installed agent. These are useful for custom or hybrid logging solutions that may not fit into the other categories.
Access Transparency Logs – contain information about actions that Google performs for support and/or troubleshooting purposes.
VPC Flow Logs – contain information about network traffic within your Virtual Private Clouds (VPCs). Very useful for network monitoring, forensics, security analysis, and cost management.
Firewall Logs – contain information about actions taken as a result of firewall rules within GCP. Useful for debugging firewall configurations and checking allowed/denied traffic against those rules.
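To make the link between these log types and export filters concrete, here is a minimal sketch that builds a Cloud Logging filter string selecting one or more of them by log name. The helper names and the dictionary keys are hypothetical. The log IDs reflect the names these logs commonly appear under and should be verified against your own project before being used in a sink. Agent logs are omitted because their log names depend on what the agent is configured to forward.

```python
# Log IDs as they commonly appear in Cloud Logging (URL-encoded slash).
# Treat these as assumptions and confirm them in your own project.
LOG_IDS = {
    "admin_activity": "cloudaudit.googleapis.com%2Factivity",
    "data_access": "cloudaudit.googleapis.com%2Fdata_access",
    "system_event": "cloudaudit.googleapis.com%2Fsystem_event",
    "access_transparency": "cloudaudit.googleapis.com%2Faccess_transparency",
    "vpc_flow": "compute.googleapis.com%2Fvpc_flows",
    "firewall": "compute.googleapis.com%2Ffirewall",
}


def log_name(project_id, log_type):
    """Return the fully qualified log name for one log type."""
    return f"projects/{project_id}/logs/{LOG_IDS[log_type]}"


def build_filter(project_id, log_types):
    """Build a filter that matches entries from any of the given log types."""
    clauses = [f'logName="{log_name(project_id, t)}"' for t in log_types]
    return " OR ".join(clauses)


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(build_filter("my-project", ["admin_activity", "firewall"]))
```

A string like this could serve as the filter for a destination sink or be pasted into the log viewer to check what it matches before exporting anything.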
Security Health Analytics
Security Health Analytics (SHA) is a service that monitors your GCP environment for conditions that may be of security concern. SHA uses Cloud Audit logs such as Admin Activity and Data Access logs to generate findings that are shown in Security Command Center (SCC). Examples of findings that SHA can generate include:
Users that don’t use two-step verification (2SV)
Unrestricted GCP API keys