Add logging to native health check #231

fsamuel-bs · 2018-03-21T14:01:18Z

We fail after 2 minutes when checking the native health check.
This might not be enough time for all the Docker containers to come up.
We can configure the timeout, but it was quite hard to debug this without logging.


fsamuel-bs · 2018-03-21T14:01:55Z

...le-core/src/main/java/com/palantir/docker/compose/connection/waiting/ClusterHealthCheck.java


 @FunctionalInterface
 public interface ClusterHealthCheck {
+    Logger log = LoggerFactory.getLogger(ClusterHealthCheck.class);


Having this here is sad. Should I make ClusterHealthCheck an abstract class and have this as a private variable?

Nope - instead, note that this is only used in NativeHealthCheck and do:

    static ClusterHealthCheck nativeHealthChecks() {
        return new ClusterHealthCheck() {
            // Instance field: before Java 16, anonymous classes cannot declare non-constant static fields.
            private final Logger log = LoggerFactory.getLogger(ClusterHealthCheck.class);

            @Override
            public SuccessOrFailure isClusterHealthy(Cluster cluster) throws InterruptedException {
                Set<String> unhealthyContainers = new LinkedHashSet<>();
                .........
            }
        };
    }
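
A minimal, self-contained sketch of that pattern, under stated assumptions: `HealthCheck` is a hypothetical stand-in for `ClusterHealthCheck`, and `java.util.logging` replaces the project's `LoggerFactory` only so the example compiles without dependencies. A field declared in an interface is implicitly `public static final`, so a logger there becomes part of the public API; a field on the anonymous class stays private.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for ClusterHealthCheck; not the project's real interface.
@FunctionalInterface
interface HealthCheck {
    boolean isHealthy(String target);

    // The logger lives on the anonymous implementation, not on the interface,
    // so the interface itself exposes no fields.
    static HealthCheck loggingCheck() {
        return new HealthCheck() {
            private final Logger log = Logger.getLogger(HealthCheck.class.getName());

            @Override
            public boolean isHealthy(String target) {
                log.fine("Checking health of " + target);
                // Placeholder rule for the sketch: an empty name counts as unhealthy.
                return !target.isEmpty();
            }
        };
    }
}
```

The interface stays a `@FunctionalInterface`, so other checks can still be written as lambdas; only the implementation that needs logging pays for the anonymous class.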

j-baker · 2018-03-27T15:44:07Z

...le-core/src/main/java/com/palantir/docker/compose/connection/waiting/ClusterHealthCheck.java

@@ -52,13 +56,18 @@ static ClusterHealthCheck nativeHealthChecks() {
        return cluster -> {
            Set<String> unhealthyContainers = new LinkedHashSet<>();
            try {
+                log.info("Checking health of containers {}",


These may be a little spammy, FYI. I think this is done in a fairly tight loop, so you may end up with many log lines.

Instead, I'd recommend doing this like:

    Set<String> lastUnhealthyContainers = new LinkedHashSet<>();
    return new ClusterHealthCheck() {
        @Override
        public SuccessOrFailure isClusterHealthy(Cluster cluster) throws InterruptedException {
            Set<String> currentUnhealthyContainers = new LinkedHashSet<>();
            for (Container container : cluster.allContainers()) {
                State state = container.state();
                if (state == State.UNHEALTHY) {
                    currentUnhealthyContainers.add(container.getContainerName());
                }
            }
            // Log only when the set of unhealthy containers differs from the previous check.
            if (!currentUnhealthyContainers.equals(lastUnhealthyContainers)) {
                log(currentUnhealthyContainers);
                lastUnhealthyContainers.clear();
                lastUnhealthyContainers.addAll(currentUnhealthyContainers);
            }
            if (!currentUnhealthyContainers.isEmpty()) {
                return SuccessOrFailure.failure("The following containers are not healthy: " ...);
            }
            return SuccessOrFailure.success();
        }
    };

where you only log if the state has actually changed.
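
The change-detection idea can be isolated into a small helper, sketched here under assumptions: `ChangeOnlyReporter` and its `report` method are hypothetical names, and a `Consumer<String>` stands in for the real logger so the example has no dependencies. Unlike the snippet above, this version also reports the very first observation, even when it is an empty set.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper: emits a message only when the unhealthy set changes between polls.
final class ChangeOnlyReporter {
    private final Consumer<String> sink;
    // null means nothing has been reported yet, so the first call always emits.
    private Set<String> lastUnhealthy = null;

    ChangeOnlyReporter(Consumer<String> sink) {
        this.sink = sink;
    }

    // Returns true if a message was emitted for this poll.
    boolean report(Set<String> currentUnhealthy) {
        if (currentUnhealthy.equals(lastUnhealthy)) {
            return false; // same state as the previous poll: stay quiet
        }
        // Copy defensively so later mutation by the caller cannot affect the comparison.
        lastUnhealthy = new LinkedHashSet<>(currentUnhealthy);
        sink.accept("Unhealthy containers: " + currentUnhealthy);
        return true;
    }
}
```

Called once per health-check iteration, this keeps a tight polling loop from producing one log line per poll while still recording every transition, including the final transition to an empty set.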

Add logging to native health check

0578a60
