A year ago I was given the task of investigating future monitoring options for our team, with a new application architecture coming our way. We are a small group that keeps some ~600 non-standard servers up and running. These servers are responsible for safe and fast vessel voyage handling and for communications with key partners.
Documentation: docs.coroot.com
Currently we use CheckMK and Graylog to get a grip on our environment, which consists of more or less monolithic/SOA Linux and Windows architectures running in VMware, with some on remote sites. The upcoming system replaces a twenty-year-old ERP application and is a microservices architecture in a containerized Docker/Kubernetes (k8s) environment. This is something that is going to happen with a lot of our applications in the future.
The developers delivering the new system work in a modern DevOps style, so logically things like Istio, Grafana and Prometheus are already in place. Searching the internet and comparing several monitoring and observability systems, we soon found out that all of them require a lot of setup work before they give a usable overview of the coming system.
And then there was Coroot. I don't even know how I bumped into it, but as someone who likes things simple and manageable by a bigger group, I immediately loved it.
I set up a small proof of concept on an Ubuntu Linux laptop running some Docker containers, with a Coroot version that did not have RBAC at the time. I did the Docker setup and only struggled with some ports that were already in use. After that I gave a demonstration to the group so they could get a grip on what is coming. Coroot even shows local services like Redis and ntopng running on the laptop. Recently I gave the group a microservices architecture training to get them acquainted with this new architecture and with why we need this kind of observability.
It's an amazing experience to set something up in less than thirty minutes and almost immediately have a visual overview of your (containerized) applications. I recommend Coroot to anyone using a containerized landscape or running services locally, because of the insights it gives into one's IT services.
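The port clashes mentioned above are easy to spot before starting any containers. What follows is only a minimal sketch, not part of Coroot itself: the helper name `port_is_free` is made up for this post, and the port list is an assumption that should be replaced with whatever ports your own Docker setup publishes.

```python
import socket

# Assumption: example ports only; replace with the ports your setup publishes.
CANDIDATE_PORTS = [8080, 9090]


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if no TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds,
        # which means something is already listening on that port.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for port in CANDIDATE_PORTS:
        state = "free" if port_is_free(port) else "already in use"
        print(f"port {port}: {state}")
```

Running this on the laptop before the Docker setup would have shown which ports needed to be remapped.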
Things that are immediately available after starting it up are:
- Automatic service discovery
- Prebuilt dashboards
- CPU, memory & disk storage
- DNS information if applicable
- Networking I/O metrics
- Application logs and log patterns
- Heatmaps through continuous profiling
- Traces
- JVM behavior
- Deployment status
- Database response times
- SLO
- Costs of running on AWS, Azure or GCP if you need this
With some extra setup, metrics and state information are available for databases like Redis, MongoDB, MySQL and PostgreSQL.
To take things further, we are now examining a bare-metal setup as shown in the pictures above, monitoring some servers that run Graylog, Elasticsearch and Wazuh on bare metal. It just works and gives us helpful application-level observability right out of the box. Because Coroot is built around observing containers, it groups every application running under the same service name together 🙃🙂
One thing we really like is that it has SRE practices built in, so there is modern, quality-based observability for every part of one's applications. In addition, notifications can be configured for commonly used providers like PagerDuty, Slack, Microsoft Teams and others. RBAC has been in place for a few months now.
Coroot basically uses eBPF to gather its information: a node-agent sends the data to Coroot, and it is stored in Prometheus and ClickHouse. On top of that, Coroot has keen intelligence for visualizing the collected data. Because of eBPF, Coroot only works on Linux, with kernel version 5.1 or later. In the paid version there is AI that can examine a problem and point to its cause. Good thought has gone into using AI for this.
The people behind Coroot are smart and know what they are doing. Coroot is under ongoing development, which includes making deployment easier all the time. There is an open Community edition and a paid Enterprise edition, and being open, they are fully present on GitHub. Documentation is well covered on their website. When there is a problem, as we experienced with a new node-agent, they react fast on GitHub; our problem was solved within 30 minutes. Good work.
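For colleagues new to the SRE way of thinking, the arithmetic behind an availability SLO and its error budget fits in a few lines. This is a generic sketch of the idea, not Coroot's own calculation; the function name and the numbers in the example are made up for illustration.

```python
def error_budget_remaining(total: int, failed: int, slo_target: float) -> float:
    """Return the fraction of the error budget that is still unspent.

    total      -- number of requests in the evaluation window
    failed     -- number of those requests that failed
    slo_target -- availability objective, e.g. 0.999 for 99.9%

    A negative result means the budget is overspent.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1")
    # The SLO allows this many failures in the window.
    allowed_failures = total * (1 - slo_target)
    return 1 - failed / allowed_failures


if __name__ == "__main__":
    # Hypothetical window: 100,000 requests, 50 failures, 99.9% objective.
    remaining = error_budget_remaining(100_000, 50, 0.999)
    print(f"error budget remaining: {remaining:.0%}")
```

With a 99.9% objective, 100,000 requests allow about 100 failures, so 50 failures use up roughly half of the budget.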
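With ~600 servers it helps to check the kernel requirement up front. The sketch below compares a kernel release string (as reported by `uname -r`) against the 5.1 minimum mentioned above. The function names are my own, and it only looks at the major and minor version; it does not check whether the needed eBPF features are actually enabled in that kernel.

```python
import platform
import re
from typing import Tuple

# Minimum kernel version named in the text above.
MIN_KERNEL = (5, 1)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_new_enough(release: str, minimum: Tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True if the release is at or above the minimum version."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    if platform.system() != "Linux":
        print("not a Linux host, so the kernel check does not apply")
    else:
        release = platform.release()
        verdict = "ok" if kernel_is_new_enough(release) else "too old"
        print(f"kernel {release}: {verdict}")
```

The same function can be fed release strings collected from an inventory, so older servers can be listed before rolling out the node-agent.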
Website: coroot.com
Demonstration website: demo.coroot.com