Monitoring a home Docker setup

When running a home server consisting of one or more nodes, with some or all services in Docker, you may feel the need to monitor your environment, or even better, have full observability.

The most frequently mentioned option for this is a combination of Prometheus and Grafana: a solution that requires a lot of work to set up fully, plus work on one's applications and detailed configuration for full visibility. Another possibility is the free tier of New Relic, which offers remote insight into metrics and logs, but it also requires work on containers or applications to get full visibility.

Monitoring with Beszel

The first and simplest candidate to make this possible is Beszel. Beszel can run as a local service or in Docker and consists of a web frontend (the hub) and an agent that can be installed on multiple systems; the agent even supports Windows and macOS. Installation in Docker is an easy job, and once it's running there is insightful information on system metrics and Docker services, and even some logs.
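For a first try, a Docker-based setup can be sketched roughly as below. Treat the image names (henrygd/beszel and henrygd/beszel-agent), the default hub port 8090 and the KEY variable as assumptions from memory of the Beszel docs and verify them there before use.

# Hub (web frontend), assumed to listen on 8090 and store its data in /beszel_data
docker run -d --name beszel \
  -p 8090:8090 \
  -v ./beszel_data:/beszel_data \
  henrygd/beszel

# Agent, one per monitored host; KEY is the public key shown in the hub when
# adding a system, and the Docker socket mount lets it report container metrics
docker run -d --name beszel-agent \
  --network host \
  -e KEY="<public key from the hub>" \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  henrygd/beszel-agent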

Observability with Coroot

My personal choice for monitoring a home server is Coroot. In my current setup on a Rocky Linux 9.x system, Coroot consists of a ClickHouse server to store metrics, logs, traces and profiles, the Coroot service itself, a node-agent and a cluster-agent. The node-agent collects the metrics and logs of all present services through eBPF, and the cluster-agent is needed if one wants detailed information on databases like MySQL, Postgres or Redis.

Another advantage of Coroot is its AI root cause analysis, which can provide helpful insights when investigating incidents. With an account you get ten analyses for free each month.


Because of the control I want over ClickHouse, it runs as a local service for convenience. That control is about scaling down ClickHouse's memory usage, scaling down logging on disk and in the database, and making easy changes to the data. The only downside is that ClickHouse has to be updated manually with yum/dnf.
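That manual update then comes down to something like:

sudo dnf update -y clickhouse-server clickhouse-client
sudo systemctl restart clickhouse-server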

The Coroot services run in Docker through a docker-compose file. In a standard setup Prometheus is required; in this setup ClickHouse is used as a supported alternative.

Installing ClickHouse

Installing ClickHouse is done by adding the repository, installing the packages, making some adjustments and starting it up.

sudo dnf install -y yum-utils
sudo dnf config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo

sudo dnf install -y clickhouse-server clickhouse-client

Before starting the service, create the file /etc/clickhouse-server/config.d/z_log_disable.xml and put the following content in it:

<?xml version="1.0"?>
<clickhouse>
    <asynchronous_metric_log remove="1"/>
    <metric_log remove="1"/>
    <latency_log remove="1"/>
    <query_thread_log remove="1"/>
    <query_log remove="1"/>
    <query_views_log remove="1"/>
    <part_log remove="1"/>
    <session_log remove="1"/>
    <text_log remove="1"/>
    <trace_log remove="1"/>
    <crash_log remove="1"/>
    <opentelemetry_span_log remove="1"/>
    <zookeeper_log remove="1"/>
</clickhouse>
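To catch a typo in this file before the first start, it can be validated with xmllint (on Rocky Linux it comes with the libxml2 package, usually already installed):

sudo dnf install -y libxml2   # provides xmllint
xmllint --noout /etc/clickhouse-server/config.d/z_log_disable.xml && echo "XML OK"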

After this adjust cache sizes in /etc/clickhouse-server/config.xml:

  <mark_cache_size>268435456</mark_cache_size>                 <!-- 256 MiB -->
  <index_mark_cache_size>67108864</index_mark_cache_size>      <!-- 64 MiB -->
  <uncompressed_cache_size>16777216</uncompressed_cache_size>  <!-- 16 MiB -->

Adjust the memory usage ratio in /etc/clickhouse-server/config.xml:

<max_server_memory_usage_to_ram_ratio>0.75</max_server_memory_usage_to_ram_ratio> <!-- cap ClickHouse at 75% of RAM -->

Lower the thread pool size in /etc/clickhouse-server/config.xml:

 <!-- default: <max_thread_pool_size>10000</max_thread_pool_size> -->
 <max_thread_pool_size>5000</max_thread_pool_size>

And start things up:

sudo systemctl daemon-reload
sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
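A quick sanity check that the server is running and that the disabled log tables are not being created (add --password if you set one for the default user during installation):

sudo systemctl status clickhouse-server --no-pager
clickhouse-client --query "SELECT version()"
clickhouse-client --query "SHOW TABLES FROM system LIKE '%log'"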

Installing Coroot

Before installing Coroot, check whether the requirements are met: at least kernel 5.1, although 4.2 is also supported. This installation differs from the original docker-compose file: Prometheus is not used, and ClickHouse runs as a local service. Another difference is data retention: normally that is seven days for traces, logs, profiles and metrics, with Coroot keeping its own local cache of metrics for 30 days. In this setup the retention of the data stored in ClickHouse is set to 21 days (the --*-ttl flags in the compose file below). With eighteen local and Docker services, the amount of data kept for all of this averages around 3 GB on my system.
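Checking the kernel version and whether BTF type information is exposed (used by eBPF-based tooling) is quick; Rocky Linux 9 ships a 5.14 kernel, so this should be fine out of the box:

uname -r                     # kernel version
ls /sys/kernel/btf/vmlinux   # present when the kernel is built with BTF support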

Coroot, the node-agent and the cluster-agent run as Docker services via docker-compose, with the following content in a docker-compose.yaml that you have to create locally.

name: coroot

volumes:
  node_agent_data: {}
  cluster_agent_data: {}
  coroot_data: {}

services:
  coroot:
    restart: always
    image: ghcr.io/coroot/coroot${LICENSE_KEY:+-ee} # set 'coroot-ee' as the image if LICENSE_KEY is defined
    pull_policy: always
    user: root
    volumes:
      - coroot_data:/data
    ports:
      - 8080:8080 # note: published ports are ignored when network_mode is host
    command:
      - '--data-dir=/data'
      - '--bootstrap-refresh-interval=15s'
      - '--bootstrap-clickhouse-address=127.0.0.1:9000'
      - '--bootstrap-prometheus-url=http://127.0.0.1:9090'
      - '--global-prometheus-use-clickhouse'
      - '--global-prometheus-url=http://127.0.0.1:9090'
      - '--global-refresh-interval=15s'
      - '--cache-ttl=31d'
      - '--traces-ttl=21d'
      - '--logs-ttl=21d'
      - '--profiles-ttl=21d'
      - '--metrics-ttl=21d'
    environment:
      - LICENSE_KEY=${LICENSE_KEY:-}
      - GLOBAL_PROMETHEUS_USE_CLICKHOUSE
      - CLICKHOUSE_SPACE_MANAGER_USAGE_THRESHOLD=75         # Set cleanup threshold to 75%
      - CLICKHOUSE_SPACE_MANAGER_MIN_PARTITIONS=2           # Always keep at least 2 partitions
    network_mode: host

  node-agent:
    restart: always
    image: ghcr.io/coroot/coroot-node-agent
    pull_policy: always
    privileged: true
    pid: "host"
    volumes:
      - /sys/kernel/tracing:/sys/kernel/tracing
      - /sys/kernel/debug:/sys/kernel/debug
      - /sys/fs/cgroup:/host/sys/fs/cgroup
      - node_agent_data:/data
    command:
      - '--collector-endpoint=http://192.168.1.160:8080'
      - '--cgroupfs-root=/host/sys/fs/cgroup'
      - '--wal-dir=/data'

  cluster-agent:
    restart: always
    image: ghcr.io/coroot/coroot-cluster-agent
    pull_policy: always
    volumes:
      - cluster_agent_data:/data
    command:
      - '--coroot-url=http://192.168.1.160:8080'
      - '--metrics-scrape-interval=15s'
      - '--metrics-wal-dir=/data'
    depends_on:
      - coroot

After creating this file and making some adjustments to your own liking and network preferences, run docker compose up -d, then go to your IP address on port 8080; you now have access to Coroot, where it asks you to set the admin credentials.
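Bringing it up and keeping an eye on the first start could look like this (service names as defined in the compose file above):

docker compose up -d
docker compose ps                # all three services should be running
docker compose logs -f coroot    # watch the Coroot service start up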

In my setup Watchtower takes care of updating docker containers and this works well for the Coroot services.
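For completeness, a minimal Watchtower sketch; it needs access to the Docker socket to pull new images and restart containers, and the interval (in seconds) is just an example value:

docker run -d --name watchtower \
  --restart always \
  -v /var/run/docker.sock:/var/run/docker.sock \
  containrrr/watchtower --interval 86400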

Happy observability :-)
