DevOps training

Logging, Monitoring, and Observability in Google Cloud

Control your infrastructure and application

50% theory, 50% practice
Rating: 3.7/5 (20 ratings)
Ratings come from end-of-training evaluations. The score is an average across the following themes: Richness of content • Quality of presentation • Theory/practice ratio • Relevance of examples • Interest in practical work
Duration: 3 days (21 hours)
Official • New • On-site • Remote • Certifying

  • Introduction to Google Cloud Monitoring Tools

    • Explain the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Incident Response and Management (IRM)
    • Explain the purpose and capabilities of Google Cloud components focused on application performance management: Debugger, Trace, Profiler, and Service Monitoring
  • Avoiding Customer Pain

    • Construct a monitoring base from the four golden signals: latency, traffic, errors, and saturation
    • Define critical system measures with Service Level Indicators (SLIs)
    • Use Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to measure, and avoid, customer pain
    • Achieve harmony between developers and operations with SLO-based error budgets
  • Monitoring Critical Systems

    • Choose best-practice monitoring project architectures
    • Differentiate Cloud IAM roles for monitoring
    • Use the default dashboards appropriately
    • Build custom dashboards to show resource consumption and application load
    • Define uptime checks to track aliveness and latency
  • Alerting Policies

    • Develop alerting strategies
    • Define alerting policies
    • Add notification channels
    • Identify types of alerts and common uses for each
    • Construct and alert on resource groups
    • Manage alerting policies programmatically
  • Advanced Logging and Analysis

    • Identify and choose among resource tagging approaches
    • Define log sinks (inclusion filters) and exclusion filters
    • Create metrics based on logs
    • Export logs to BigQuery
  • Working with Audit Logs

    • Use Admin Activity, Data Access, and System Event audit logs
    • Track who did what, and when
  • Configuring Google Cloud Services for Observability

    • Integrate Logging and Monitoring agents into Compute Engine VMs and images
    • Enable and utilize Kubernetes Monitoring
    • Extend and clarify Kubernetes Monitoring with Prometheus
    • Expose custom metrics through code, and with the help of OpenCensus
  • Monitoring the Google Cloud VPC

    • Collect and analyze VPC Flow, Firewall Rules, and Cloud NAT logs
    • Enable Packet Mirroring
    • Explain the capabilities of Network Intelligence Center
  • Managing Incidents

    • Handle incidents systematically
    • Define incident management roles and communication channels
    • Mitigate incident impact
    • Troubleshoot root causes
    • Resolve the incident
    • Document incident in a postmortem process
  • Investigating Application Performance Issues

    • Use Error Reporting to identify and understand your application errors
    • Debug production code to correct code defects
    • Trace latency through layers of service interaction to eliminate performance bottlenecks
    • Profile and identify resource-intensive functions in an application
  • Optimizing the Costs of Monitoring

    • Analyze resource utilization costs for monitoring-related components within Google Cloud
    • Implement best practices for controlling the cost of monitoring within Google Cloud
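
To give a flavour of the "SLO-based error budgets" topic from the Avoiding Customer Pain module, here is a minimal sketch of the underlying arithmetic. It is not taken from the course material. The names `ErrorBudget` and `compute_error_budget` are hypothetical, and the request-based budget formula (allowed failures = (1 - SLO target) × total requests) is an assumption about how the budget is defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Illustrative container; not part of any Google Cloud library."""
    allowed_failures: float    # failures the SLO tolerates over the window
    remaining_failures: float  # budget left (negative means the SLO is breached)
    consumed_fraction: float   # share of the budget already spent


def compute_error_budget(slo_target: float,
                         total_requests: int,
                         failed_requests: int) -> ErrorBudget:
    """Compute a request-based error budget for one compliance window.

    Assumption: the SLI is the ratio of good requests to total requests,
    so the budget is (1 - slo_target) * total_requests.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests < 0 or not 0 <= failed_requests <= total_requests:
        raise ValueError("request counts are inconsistent")

    allowed = (1.0 - slo_target) * total_requests
    remaining = allowed - failed_requests
    # With no traffic there is no budget to spend, so report 0 consumed.
    consumed = failed_requests / allowed if allowed > 0 else 0.0
    return ErrorBudget(allowed, remaining, consumed)


if __name__ == "__main__":
    # Hypothetical numbers: a 99.9% SLO over one million requests.
    budget = compute_error_budget(0.999, 1_000_000, 250)
    print(f"allowed failures:   {budget.allowed_failures:.0f}")
    print(f"remaining failures: {budget.remaining_failures:.0f}")
    print(f"budget consumed:    {budget.consumed_fraction:.0%}")
```

With a 99.9% target, roughly one request in a thousand may fail before the objective is missed. Teams typically use the remaining budget to decide whether to keep shipping changes or to prioritise reliability work.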

Last updated: 04/05/2024 at 13:05