The Agent Manager serves as the central control plane for managing the lifecycle, configuration, and secure communication of all deployed UTMStack agents. It is responsible for bootstrapping the core infrastructure, ensuring database schema consistency, and establishing the gRPC server that agents use to phone home.
System Architecture
The Agent Manager orchestrates several critical sub-components upon startup. It acts as the bridge between the backend database, the update distribution system, and the remote agent nodes.
Agent Manager Initialization
The initialization phase of the Agent Manager is strictly ordered to ensure that dependencies (like the database) are fully available before accepting connections from remote agents.
The Agent Manager fails fast during database migration. If the database schema cannot be verified or migrated, the service logs the error and exits with a non-zero status instead of accepting agent connections, which prevents data corruption.
Startup Sequence
The system bootstraps the catcher logging utility (via the ThreatWinds Go SDK) to ensure all startup events, including the initial agent-manager process launch, are properly recorded and shipped.
The manager connects to the database and runs MigrateDatabase(). This ensures the schema matches the current binary version.
updates.InitUpdatesManager() is launched in a dedicated goroutine so that agent updates are handled asynchronously without blocking the main goroutine.
Finally, agent.InitGrpcServer() is called to bind the network ports and begin accepting incoming agent connections.
Implementation Details
The core initialization logic resides in agent-manager/main.go. Note the deliberate 5-second sleep on database failure: it gives asynchronously shipped logs time to flush to the logging backend before the process exits.
// agent-manager/main.go
package main

import (
	"os"
	"time"

	"github.com/threatwinds/go-sdk/catcher"
	"github.com/utmstack/UTMStack/agent-manager/agent"
	"github.com/utmstack/UTMStack/agent-manager/database"
	"github.com/utmstack/UTMStack/agent-manager/updates"
)

func main() {
	catcher.Info("Starting Agent Manager", map[string]any{"process": "agent-manager"})

	// Step 1: Ensure database schema is up-to-date
	err := database.MigrateDatabase()
	if err != nil {
		_ = catcher.Error("failed to migrate database", err, map[string]any{"process": "agent-manager"})
		// Sleep allows the logging buffer to flush before termination
		time.Sleep(5 * time.Second)
		os.Exit(1)
	}

	// Step 2: Start the updates manager asynchronously
	go updates.InitUpdatesManager()

	// Step 3: Start the blocking gRPC server
	agent.InitGrpcServer()
}
Because the gRPC server initialization is a blocking call, it must be the last function executed in the main() block. Any services that need to run concurrently (like the Updates Manager) must be spawned in separate goroutines beforehand.
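The ordering constraint can be isolated from the real packages for illustration. The following is a minimal sketch, not the project's code: run, errExample, and the three injected functions are hypothetical stand-ins for the migration, the Updates Manager, and the blocking server call.

```go
package main

import (
	"errors"
	"fmt"
)

// errExample is a hypothetical sentinel error used only by this sketch.
var errExample = errors.New("example failure")

// run applies the startup order described above to injected stand-ins:
// the migration must succeed first, background work is spawned second,
// and the blocking serve call comes last.
func run(migrate func() error, background func(), serve func() error) error {
	if err := migrate(); err != nil {
		// Fail fast: nothing else starts when the schema is not ready.
		return fmt.Errorf("failed to migrate database: %w", err)
	}
	go background() // spawned before serve, because serve blocks
	return serve()  // returns only when the server stops
}

func main() {
	started := make(chan struct{})
	err := run(
		func() error { return nil },  // migration succeeds
		func() { close(started) },    // background work signals that it ran
		func() error {                // stand-in for a blocking server loop
			<-started
			return errExample
		},
	)
	fmt.Println("server returned:", err)
}
```

Passing the steps in as functions is a design choice made for this sketch only; it makes it easy to confirm that a failed migration prevents both the background routine and the server from starting.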
Agent Node Initialization
While the Agent Manager handles the server-side control plane, the individual Agent nodes have their own lightweight initialization lifecycle. The agent binary is designed to be minimal, delegating complex routing and execution to a command-line interface (CLI) framework.
// agent/main.go
package main

import (
	"github.com/utmstack/UTMStack/agent/cmd"
	"github.com/utmstack/UTMStack/agent/config"
	"github.com/utmstack/UTMStack/agent/utils"
)

func main() {
	// Initialize local file-based logging
	utils.InitLogger(config.ServiceLogFile)

	// Execute the root CLI command
	cmd.Execute()
}
Component Breakdown
Troubleshooting
If the Agent Manager exits shortly after startup, this almost always indicates a database migration failure. Check the container logs for the "failed to migrate database" error.
Common causes:
The database credentials in the environment variables are incorrect.
The database server is unreachable due to network policies.
The database user lacks schema modification privileges.
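A quick first check for the credentials case is whether the expected environment variables are set at all. The sketch below assumes nothing about the actual variable names, which this page does not list; they are passed as command-line arguments. missingEnv is a hypothetical helper, and it only detects unset or empty variables, not wrong values.

```go
package main

import (
	"fmt"
	"os"
)

// missingEnv returns the names that lookup reports as unset or empty.
func missingEnv(names []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range names {
		if value, ok := lookup(name); !ok || value == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// The variable names come from the command line because they are
	// deployment-specific and not documented here.
	if missing := missingEnv(os.Args[1:], os.LookupEnv); len(missing) > 0 {
		fmt.Println("unset or empty:", missing)
		os.Exit(1)
	}
	fmt.Println("all listed variables are set")
}
```

Run it inside the Agent Manager container with the variable names your deployment uses as arguments.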
If the Agent Manager is running but agents cannot connect:
Verify that the gRPC server initialized successfully (it should be the last log entry on successful startup).
Ensure network firewalls allow inbound traffic on the designated gRPC port.
Check the local agent logs (ServiceLogFile) to verify the agent is attempting to connect to the correct Manager IP/hostname.
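To separate network problems from application problems, a plain TCP probe run from the agent host can confirm whether the Manager's gRPC port is reachable. This is a minimal sketch using only the Go standard library; probeTCP is a hypothetical helper, and the address is taken from the command line because the port is deployment-specific. It checks TCP reachability only and performs no gRPC or TLS handshake.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// probeTCP returns nil if a TCP connection to addr (host:port) can be
// opened within timeout, and a wrapped error otherwise.
func probeTCP(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return fmt.Errorf("cannot reach %s: %w", addr, err)
	}
	return conn.Close()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: probe <host:port>")
		os.Exit(2)
	}
	if err := probeTCP(os.Args[1], 3*time.Second); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	fmt.Println("TCP connection succeeded:", os.Args[1])
}
```

The same probe, run from the Agent Manager host against the database address, helps with the "database server is unreachable" case listed earlier.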
The Updates Manager runs in a background goroutine. If updates are failing:
Verify that the InitUpdatesManager() routine didn't encounter a panic.
Check if the Agent Manager has sufficient disk space or network access to retrieve the upstream update payloads before distributing them.
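In Go, a panic that is not recovered inside a goroutine terminates the whole process, so a panic in a background routine would also take the gRPC server down. Whether the Updates Manager recovers from panics internally is not stated here. The sketch below shows one common way to contain and report such a panic; goSafe and its callback are hypothetical and not part of the UTMStack code.

```go
package main

import "fmt"

// goSafe runs fn in its own goroutine. If fn panics, the panic value is
// passed to onPanic instead of crashing the process. The returned channel
// is closed when the goroutine has finished.
func goSafe(fn func(), onPanic func(recovered any)) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done) // runs last, after the recovery handler below
		defer func() {
			if r := recover(); r != nil {
				onPanic(r)
			}
		}()
		fn()
	}()
	return done
}

func main() {
	done := goSafe(
		func() { panic("update payload unavailable") }, // stand-in for a failing routine
		func(r any) { fmt.Println("background routine panicked:", r) },
	)
	<-done
	fmt.Println("process is still running")
}
```

A recovered routine does not restart on its own; the callback is the place to log the failure so that it shows up during troubleshooting.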