Health Checks

Gateway health monitoring and zombie process reaping.

Overview

The gateway runs periodic health checks to confirm that the system is operating correctly. The checks monitor service health and connection status, and clean up orphaned processes that can accumulate during normal operation.

Health Check Timer

The health check runs every 60 seconds and performs:

Channel connection status checks
Memory database accessibility check
Zombie process reaping
Disk space monitoring (optional)
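
The timer described above can be pictured as a loop that runs each check once per interval and records a result per check. The following is a minimal sketch, not the gateway's actual code: the names run_health_checks and health_check_loop, the check registry, and the "ok"/"failed" result strings are all assumptions made for illustration. Only the 60-second interval comes from this page.

```python
import time
from typing import Callable, Dict, Iterator, Optional

# Interval stated in this page; everything else below is a hypothetical sketch.
HEALTH_CHECK_INTERVAL_SECONDS = 60

CheckFn = Callable[[], bool]


def run_health_checks(checks: Dict[str, CheckFn]) -> Dict[str, str]:
    """Run every registered check once and report a status per check."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception as exc:
            # One broken check must not prevent the remaining checks from running.
            results[name] = f"failed: {exc}"
    return results


def health_check_loop(
    checks: Dict[str, CheckFn],
    interval: float = HEALTH_CHECK_INTERVAL_SECONDS,
    max_runs: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict[str, str]]:
    """Yield one result set per interval; max_runs=None loops forever."""
    runs = 0
    while max_runs is None or runs < max_runs:
        yield run_health_checks(checks)
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval)
```

Isolating each check in its own try block means that an unreachable memory database, for example, still lets the zombie reaper run in the same cycle.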

Zombie Process Reaper

During normal operation, the gateway spawns claude --stream-json subprocesses for agent interactions. If one of these processes crashes or times out without being cleaned up, it lingers as a zombie that consumes system resources.

The reaper:

Scans for claude processes with the stream-json flag
Checks their age using POSIX etime format
Kills any that are older than 5 minutes

# What the reaper does internally (macOS-compatible)
ps -eo pid,etime,command | grep "stream-json"

The reaper matches only on "stream-json". It must NEVER match broadly on "claude", because that would kill active Claude Code sessions running on the same machine.
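
To illustrate the narrow match, here is a minimal sketch that filters the output of the ps command shown above. It is an illustration under assumptions, not the reaper's real implementation: the function names find_stream_json_processes and list_ps and the constant MATCH_TOKEN are hypothetical.

```python
import subprocess
from typing import List, Tuple

# Narrow match token from this page. Matching on "claude" alone would also
# select interactive Claude Code sessions, which must never be killed.
MATCH_TOKEN = "stream-json"


def find_stream_json_processes(ps_output: str) -> List[Tuple[int, str, str]]:
    """Return (pid, etime, command) for each ps line whose command
    contains the stream-json flag."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 2)  # pid, etime, full command
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # header row or malformed line
        pid, etime, command = int(parts[0]), parts[1], parts[2]
        if MATCH_TOKEN in command:
            matches.append((pid, etime, command))
    return matches


def list_ps() -> str:
    """Run the same ps invocation shown above and return its raw output."""
    return subprocess.run(
        ["ps", "-eo", "pid,etime,command"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
```

Calling find_stream_json_processes(list_ps()) returns only the agent subprocesses. A plain claude session without the flag is never selected.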

etime Parsing

The reaper uses the POSIX etime format ([[dd-]hh:]mm:ss), which works on macOS. It does not use etimes (elapsed time in seconds), which is available in Linux ps but not in macOS ps.
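
Converting an etime value to seconds makes the 5-minute comparison straightforward. The sketch below shows one way to parse the [[dd-]hh:]mm:ss layout. The names parse_etime, is_stale, and MAX_AGE_SECONDS are hypothetical, and treating "older than 5 minutes" as strictly greater than 300 seconds is an assumption.

```python
import re

# POSIX etime layout: [[dd-]hh:]mm:ss (days require hours; hours are optional).
_ETIME_RE = re.compile(r"^(?:(?:(\d+)-)?(\d+):)?(\d+):(\d+)$")

MAX_AGE_SECONDS = 5 * 60  # the 5-minute threshold described on this page


def parse_etime(etime: str) -> int:
    """Convert a POSIX etime string such as '07:12' or '2-03:04:05' to seconds."""
    match = _ETIME_RE.match(etime.strip())
    if match is None:
        raise ValueError(f"unrecognized etime value: {etime!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def is_stale(etime: str, max_age: int = MAX_AGE_SECONDS) -> bool:
    """True when the process has been running longer than max_age seconds."""
    return parse_etime(etime) > max_age
```

For example, parse_etime("01:02:03") yields 3723 seconds, so a process with that elapsed time would be considered stale.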

Checking Health

Via CLI

argent gateway status

Output includes:

Gateway process status (running/stopped)
RPC probe result (ok/failed)
Listening port
Uptime
Connected channels

Via Dashboard

The dashboard header shows a connection indicator:

Green: Connected and healthy
Yellow: Connected with warnings
Red: Disconnected or unhealthy

Common Health Issues

RPC Probe Fails

The gateway is running but not responding to RPC calls:

Check the gateway logs: argent gateway logs
Verify the port is not blocked by a firewall
Check for native module ABI mismatches (see Configuration)
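
To rule out a blocked or closed port, a plain TCP connection attempt is a quick first test. This is a minimal sketch: port_is_reachable is a hypothetical helper, and the host and port must be taken from your own gateway configuration, since this page does not state them. A successful connection only shows that something accepts connections on that port. It does not prove that RPC calls succeed.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If this returns False while argent gateway status reports the process as running, a firewall rule or a wrong host or port setting is a likely cause.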

High Zombie Count

If you see many zombie processes accumulating:

The reaper should handle these automatically
If they persist, check if the 5-minute timeout is appropriate for your workload
Manual cleanup: argent gateway restart

Memory Database Locked

If the health check reports a locked database:

Check for other processes accessing ~/.argentos/memory.db
The WAL (Write-Ahead Logging) mode should prevent most lock issues
Restart the gateway if the lock persists
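
The lock state can also be probed directly. The sketch below assumes that memory.db is a SQLite database, which the WAL reference suggests but this page does not state outright. The helper name probe_database is hypothetical. It reads the journal mode and then briefly requests the write lock, so it reports a lock only when another connection is holding one.

```python
import os
import sqlite3
from typing import Tuple


def probe_database(path: str, timeout: float = 1.0) -> Tuple[str, bool]:
    """Return (journal_mode, locked) for the SQLite file at path.

    Assumes the file is a SQLite database. The probe takes the write lock
    for an instant and rolls back, so it never modifies data.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(path)  # avoid creating an empty database
    # isolation_level=None puts the connection in autocommit mode so the
    # explicit BEGIN/ROLLBACK below are the only transaction statements.
    conn = sqlite3.connect(path, timeout=timeout, isolation_level=None)
    journal_mode = "unknown"
    try:
        journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
        conn.execute("BEGIN IMMEDIATE")  # fails if another writer holds the lock
        conn.execute("ROLLBACK")
        return journal_mode, False
    except sqlite3.OperationalError as exc:
        if "locked" not in str(exc):
            raise
        return journal_mode, True
    finally:
        conn.close()
```

A call such as probe_database(os.path.expanduser("~/.argentos/memory.db")) that returns a journal mode other than "wal" would suggest that WAL mode is not active for the file.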

