Prometheus Up & Running

Book Description

Get up to speed with Prometheus, the metrics-based monitoring system used by thousands of organizations in production. This practical guide provides application developers, sysadmins, and DevOps practitioners with a hands-on introduction to the important aspects of Prometheus, including infrastructure and application monitoring, dashboarding and alerting, direct code instrumentation, and metric collection from third-party systems with exporters.

This open source system has gained popularity over the past few years for good reason. With its simple yet powerful data model and query language, Prometheus does one thing and does it well. Author and Prometheus core developer Brian Brazil walks you through Prometheus setup, the Node Exporter, and the Alertmanager, then shows you how to use them for application and infrastructure monitoring.

  • Know where and how much to apply instrumentation to your application code
  • Expose metrics through client libraries to make them available to Prometheus
  • Identify metrics with labels: unique key-value pairs associated with a time series
  • Get an introduction to Grafana, a popular tool for building dashboards
  • Learn how to use the Node Exporter to monitor your infrastructure
  • Use service discovery to provide different views of your machines and services
  • Use Prometheus with Kubernetes, and examine exporters you can use with containers
  • Convert data from other monitoring systems into the Prometheus format

Table of Contents

  1. Preface
    1. Expanding the Known
  2. I. Introduction
  3. 1. What is Prometheus?
    1. What is Monitoring?
      1. A Brief and Incomplete History of Monitoring
      2. Categories of Monitoring
    2. Prometheus Architecture
      1. Client Libraries
      2. Exporters
      3. Service Discovery
      4. Scraping
      5. Storage
      6. Dashboards
      7. Recording Rules and Alerts
      8. Alertmanagement
      9. Long Term Storage
    3. What is Prometheus Not?
  4. 2. Getting Started with Prometheus
    1. Running Prometheus
    2. Using the Expression Browser
    3. Running the Node Exporter
    4. Alerting
  5. II. Application Monitoring
  6. 3. Instrumentation
    1. A Simple Program
    2. The Counter
      1. Counting Exceptions
      2. Counting Size
    3. The Gauge
      1. Using Gauges
      2. Callbacks
    4. The Summary
    5. The Histogram
      1. Buckets
    6. Unittesting Instrumentation
    7. Approaching Instrumentation
      1. What should I instrument?
      2. How Much Should I instrument?
      3. What should I name my metrics?
  7. 4. Exposition
    1. Python
      1. WSGI
      2. Twisted
      3. Multiprocess with Gunicorn
    2. Go
    3. Java
      1. HTTPServer
      2. Servlet
    4. Pushgateway
    5. Bridges
    6. Parsers
    7. Exposition Format
      1. Metric Types
      2. Labels
      3. Escaping
      4. Timestamps
      5. check metrics
  8. 5. Labels
    1. What Are Labels?
    2. Instrumentation and Target Labels
    3. Instrumentation
      1. Metric
      2. Multiple Labels
      3. Child
    4. Aggregating
    5. Label Patterns
      1. Enum
      2. Info
    6. When to use Labels
      1. Cardinality
  9. 6. Dashboarding with Grafana
    1. Installation
    2. Data Source
    3. Dashboards, Rows, and Panels
      1. Avoiding the Wall of Graphs
    4. Graph Panel
      1. Time Controls
    5. Singlestat Panel
    6. Table Panel
    7. Templating
  10. III. Infrastructure Monitoring
  11. 7. Node Exporter
    1. CPU Collector
    2. Filesystem Collector
    3. Diskstats Collector
    4. Netdev Collector
    5. Meminfo Collector
    6. Hwmon Collector
    7. Stat Collector
    8. Uname Collector
    9. Loadavg Collector
    10. Textfile Collector
      1. Using the Textfile Collector
      2. Timestamps
  12. 8. Service Discovery
    1. Service Discovery Mechanisms
      1. Static
      2. File
      3. Consul
      4. EC2
    2. Relabelling
      1. Choosing What To Scrape
      2. Target Labels
    3. How To Scrape
      1. metric_relabel_configs
      2. Label Clashes and honor_labels
  13. 9. Containers and Kubernetes
    1. cAdvisor
      1. CPU
      2. Memory
      3. Labels
    2. Kubernetes
      1. Running in Kubernetes
      2. Service Discovery
      3. kube-state-metrics
  14. Index