Pipelines and Pizza 🍕

Grafana Alloy on Kubernetes: Three Deployments, One Collector


Before this project, our observability tooling was basic SolarWinds. It could tell us when a server went down, but rarely why. No centralized log collection. No log analytics. No correlation between a network blip and an application failure. We had security tools, antivirus suites, and the usual compliance stack, but operationally? We were flying mostly blind, calling vendors one at a time and hoping one of them had the answer.

This is part 3a of the LGTM on Nutanix series. In article 1 we covered the architecture overview. In article 2 we set up Nutanix Objects as the storage backend. Today we put the collection layer on the floor — the part of the kitchen that takes raw ingredients from every supplier and gets them prepped before they hit the oven.


Why Alloy

Grafana Alloy is Grafana Labs’ OpenTelemetry Collector distribution. It replaced Grafana Agent (since deprecated) and is the recommended collector across the Grafana ecosystem. One binary handles metrics, logs, traces, and continuous profiles: the universal mixer in our pizza kitchen.

Picking Alloy was the easy call. Here is the back-of-the-napkin matrix:

| Option | Why We Looked | Why We Passed (or Picked) |
|---|---|---|
| Grafana Alloy | Native LGTM integration, one binary | Picked. Aligned with Grafana’s direction; one config language to learn |
| Grafana Agent (legacy) | Familiar to existing Grafana users | EOL; Alloy supersedes it |
| Promtail + Prometheus Agent | Two well-known tools | EOL (Promtail) and adds collector sprawl |
| OpenTelemetry Collector | Vendor-neutral standard | Component coverage is broader, but more glue work for LGTM specifically |
| Telegraf only | Massive plugin library | Strong for niche protocols, weak for Kubernetes-native discovery |

For a Grafana-native stack, Alloy is the path of least resistance. Pick a tool that the upstream is investing in, and you spend less time fighting the runtime and more time tuning pipelines.

A few things to know before you write your first config:

  • Component-based architecture. You wire discrete blocks together into pipelines. Each block does one thing: discover, relabel, scrape, process, write.
  • Alloy syntax (formerly River). HCL-shaped, not YAML. If you have written Terraform you will feel at home. The learning curve was very small for us.
  • Built-in UI on port 12345. Live component graph, health, and debug — invaluable when a pipeline silently stops moving data.
  • OTLP first-class. Receive and send via OTLP without translation hops.
  • Spiritual successor to Promtail and Grafana Agent. Same job, less memory, single config file.

The Three-Deployment Topology

Most tutorials show a single Alloy instance. We run three. Different jobs, different scaling profiles, different blast radii.

+-------------------------------------------------------------------+
|                       KUBERNETES CLUSTER (per DC)                  |
|                                                                    |
|  +------------------------------+                                  |
|  |  Alloy DaemonSet             |   Pod logs, node metrics,        |
|  |  (1 pod / node)              |-> kubelet, audit logs,           |
|  |  hostPath: /var/log, etc.    |   Alloy self-metrics             |
|  +------------------------------+                                  |
|                                                                    |
|  +------------------------------+                                  |
|  |  Alloy Deployment: Network   |   Syslog (514, 6514, 1515),      |
|  |  HPA: 2-5 replicas @ 60% CPU |-> gNMI dial-in (57400),          |
|  |  MetalLB VIP                 |   SNMP scrape, Nutanix Prism     |
|  +------------------------------+   <-- Network devices, switches  |
|                                                                    |
|  +------------------------------+                                  |
|  |  Alloy Deployment: Traces    |   OTLP gRPC 4317                 |
|  |  2 replicas (fixed)          |-> OTLP HTTP 4318                 |
|  |                              |   --> Tempo                      |
|  +------------------------------+                                  |
|                                                                    |
|  +------------------------------+                                  |
|  |  Telegraf Deployment         |   NX-OS 10.x dial-out (gRPC/GPB),|
|  |  (where Alloy cannot)        |-> vSphere, Meraki                |
|  |  MetalLB VIP                 |   --> Mimir                      |
|  +------------------------------+   <-- Cisco NX-OS switches       |
+-------------------------------------------------------------------+

1. DaemonSet — Pod Logs and Node Metrics

One Alloy pod on every node. Anything that requires node-local access lives here:

  • Pod logs via loki.source.kubernetes, which streams container stdout and stderr through the Kubernetes API rather than reading files on the node
  • Node metrics via prometheus.exporter.unix — CPU, memory, disk, network
  • Kubelet metrics via prometheus.scrape — Kubernetes node agent internals
  • Pod annotation scraping — any pod with prometheus.io/scrape: "true" is collected automatically
  • Kubernetes audit logs — file tail on /var/log/kube-audit/audit.log with JSON parsing
  • Alloy self-metrics — scraping localhost:12345 for the collector’s own health

What we are deliberately not collecting yet: Kubernetes events (pod scheduling, image pulls, OOM kills, node conditions) via loki.source.kubernetes_events. They are valuable for incident archaeology and the component is trivial to add. We pulled it from the initial scope to keep the audit-log pipeline simple while we tuned cardinality and retention. It is on the roadmap. Calling it out so you do not assume it is silently happening — when you wire up your own Alloy, decide intentionally.
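
For reference, a sketch of what adding that component could look like. The component label and the Loki URL are assumptions. Because the component watches events through the Kubernetes API, every instance that runs it reports the same events, so it probably belongs in a single-replica deployment rather than in every DaemonSet pod.

```alloy
// Watch Kubernetes events through the API and ship them as log lines.
loki.source.kubernetes_events "cluster_events" {
  log_format = "logfmt"
  forward_to = [loki.write.events.receiver]
}

// Hypothetical Loki endpoint; replace with your own.
loki.write "events" {
  endpoint {
    url = "http://loki-gateway.loki.svc.cluster.local/loki/api/v1/push"
  }
}
```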

Three nodes per cluster, two clusters → six DaemonSet pods. Each writes logs to Loki and metrics to Mimir using dual-write (local DC plus remote DC endpoint) so a single DC outage cannot blackhole observability data.
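
Dual-write can be sketched with two `endpoint` blocks per writer, which both `loki.write` and `prometheus.remote_write` accept. The URLs below are placeholders for a local and a remote DC, and the component labels are made up for the example.

```alloy
// Logs: every entry is sent to both endpoints.
loki.write "dual" {
  endpoint {
    // Local DC (hypothetical URL)
    url = "http://loki-gateway.loki.svc.cluster.local/loki/api/v1/push"
  }

  endpoint {
    // Remote DC (hypothetical URL)
    url = "https://loki.dc2.example.internal/loki/api/v1/push"
  }

  external_labels = {
    cluster = "observability",
    dc      = env("DC_LABEL"),
  }
}

// Metrics: same idea, two remote_write endpoints.
prometheus.remote_write "dual" {
  endpoint {
    url = "http://mimir-gateway.mimir.svc.cluster.local/api/v1/push"
  }

  endpoint {
    url = "https://mimir.dc2.example.internal/api/v1/push"
  }
}
```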

2. Deployment — Network Receivers (Syslog, SNMP, gNMI)

A standard Deployment with HPA, fronted by a MetalLB VIP so network devices have a stable IP target. Everything that originates outside the Kubernetes cluster lands here:

  • Syslog — UDP/514, TCP/514, TLS/6514 for switches, firewalls, and ISE
  • Nutanix CVM syslog — 1515/UDP on its own port because Nutanix CVMs send ISO 8601 timestamps that strict RFC 3164 parsing rejects (raw mode required)
  • gNMI dial-in — 57400/TCP for IOS-XE devices
  • SNMP exporter scraping — for legacy NX-OS 9.3.x switches without dial-out support
  • Nutanix Prism Central scraping — custom exporter for inventory and DR readiness
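
The three standard syslog listeners above can be sketched with `loki.source.syslog` as follows. Certificate paths, label values, and the Loki URL are assumptions. The 1515/UDP Nutanix listener is left out on purpose, because how to accept non-conforming timestamps depends on the Alloy version you run; check the listener's `syslog_format` options for your release.

```alloy
loki.source.syslog "network" {
  // UDP/514 for devices that only speak plain UDP syslog.
  listener {
    address  = "0.0.0.0:514"
    protocol = "udp"
    labels   = { job = "syslog", transport = "udp" }
  }

  // TCP/514 with an explicit cap on message size.
  listener {
    address            = "0.0.0.0:514"
    protocol           = "tcp"
    max_message_length = 8192
    labels             = { job = "syslog", transport = "tcp" }
  }

  // TLS/6514; the certificate and key paths are hypothetical mounts.
  listener {
    address  = "0.0.0.0:6514"
    protocol = "tcp"
    labels   = { job = "syslog", transport = "tls" }

    tls_config {
      cert_file = "/etc/alloy/tls/tls.crt"
      key_file  = "/etc/alloy/tls/tls.key"
    }
  }

  forward_to = [loki.write.network.receiver]
}

// Hypothetical Loki endpoint; replace with your own.
loki.write "network" {
  endpoint {
    url = "http://loki-gateway.loki.svc.cluster.local/loki/api/v1/push"
  }
}
```

Binding to port 514 inside the container may need extra privileges. Listening on a higher container port and mapping it from the Service is a common alternative.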

HPA scales 2-5 replicas at 60% CPU. The threshold is intentionally lower than default — a misconfigured switch can flood the receiver, and you want headroom before pods start dropping data. Better to over-provision a little than to lose audit logs during a routing event.
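
As a rough illustration of why the target matters, the Kubernetes HPA documentation gives the scaling rule below. Utilization is measured against each pod's CPU request, and the 70% load figure is a made-up example, not a measurement from our receivers.

```latex
\[
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times \frac{\text{currentUtilization}}{\text{targetUtilization}} \right\rceil
\]

% Two replicas averaging 70% CPU, with a 60% target: scale out to 3.
\[
\left\lceil 2 \times \frac{70}{60} \right\rceil = \lceil 2.33 \rceil = 3
\]

% The same load with an 80% target: stay at 2.
\[
\left\lceil 2 \times \frac{70}{80} \right\rceil = \lceil 1.75 \rceil = 2
\]
```

With the lower target, the third replica is already running while a flood is still building.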

A VIP is non-negotiable. Network devices need a stable IP. They cannot reach a Kubernetes ClusterIP from outside the cluster, and they will not retry forever if the address behind the load balancer changes.

3. Deployment — OTLP Traces Receiver

A fixed 2-replica Deployment that receives OTLP traces on gRPC (4317) and HTTP (4318), then fans them out to Tempo. Tempo was pulled from scope in March and brought back in April, once the core stack had stabilized.

Trace traffic has different scaling characteristics than syslog — bursty, latency-sensitive, span-count rather than byte-count. Splitting it off keeps trace spikes from starving the syslog pipeline and vice versa.
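
A minimal sketch of the trace path, assuming the standard `otelcol` components and a Tempo distributor reachable over plaintext gRPC inside the cluster. The Tempo address and the component labels are assumptions.

```alloy
// Accept OTLP over gRPC and HTTP on the standard ports.
otelcol.receiver.otlp "apps" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }

  http {
    endpoint = "0.0.0.0:4318"
  }

  output {
    traces = [otelcol.processor.batch.default.input]
  }
}

// Batch spans before export to smooth out bursts.
otelcol.processor.batch "default" {
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

// Export to Tempo over OTLP gRPC (hypothetical in-cluster address).
otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "tempo-distributor.tempo.svc.cluster.local:4317"

    tls {
      insecure = true
    }
  }
}
```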

Why Not One Big Alloy?

Three reasons that show up the moment you scale:

  1. Blast radius. A syslog flood should not knock out pod log collection. A trace burst should not stall metric scraping. One Alloy means one OOM kill takes everything with it.
  2. Scaling models. Network receiver wants HPA. DaemonSet scales with nodes. Trace receiver is fixed at 2 replicas. Mashing them together is mashing three pizzas with three different bake times into one oven.
  3. Security boundaries. DaemonSet needs hostPath mounts and system-node-critical priority. Network receiver needs a LoadBalancer and external port exposure. Trace receiver needs neither. Splitting them means each gets only the privileges it actually requires.

Why Telegraf Still Has a Seat at the Table

If Alloy is universal, why is Telegraf next to it?

One word: Cisco.

Cisco NX-OS 10.x switches stream high-frequency interface telemetry via gRPC dial-out using GPB (Google Protocol Buffers) encoding. Alloy’s gNMI component is great for IOS-XE dial-in subscriptions, but on NX-OS 10.x the Device YANG path subscriptions silently fail. We chased that bug for a couple of days before accepting that gNMI subscribe is not viable on this code train. At some point we will revisit it. That is noted in our ADRs and is on the annual review list.

Telegraf’s cisco_telemetry_mdt input plugin handles NX-OS dial-out natively. It listens on a MetalLB VIP at port 57000/TCP, parses GPB, and writes metrics to Mimir. Battle-tested, in production at hundreds of large networks, and unlikely to lose data on us.

Telegraf also covers VMware vSphere and Meraki via their respective plugins — niches where Telegraf’s plugin ecosystem is deeper than Alloy’s component library. The principle: Alloy handles the 90% case. Telegraf is the specialty topping you add when the protocol calls for it.


Helm Deployment

We deploy Alloy through a wrapper chart pattern — a local Chart.yaml that pulls the upstream grafana/alloy chart as a dependency, with values-common.yaml plus per-DC overrides. Everything ships through ArgoCD; we never run helm install by hand against production.

DaemonSet Values

# values-common.yaml (DaemonSet)
alloy:
  configMap:
    create: true
    # Config sourced from observability/alloy/configs/daemonset.alloy
  mounts:
    varlog: true
    dockercontainers: true
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 1000m
      memory: 8Gi

controller:
  type: daemonset
  tolerations:
    - operator: Exists
  priorityClassName: system-node-critical

The 8 GiB memory limit looks generous because it is. Pod logs from a busy cluster, plus Kubernetes audit logs, plus node metrics, plus a hot reload during a config push — they add up. Memory WAS cheap; missing logs at 2 a.m. is not. Start at 8 GiB, watch usage in Grafana, tune down only when you have data.

Network Receiver Values

# values-common.yaml (Network Deployment)
controller:
  type: deployment
  replicas: 2

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 60

service:
  type: LoadBalancer
  annotations:
    metallb.universe.tf/loadBalancerIPs: "203.0.113.10"  # Replace with your MetalLB VIP
  ports:
    - name: syslog-udp
      port: 514
      protocol: UDP
    - name: syslog-tcp
      port: 514
      protocol: TCP
    - name: syslog-tls
      port: 6514
      protocol: TCP
    - name: nutanix-syslog
      port: 1515
      protocol: UDP
    - name: gnmi
      port: 57400
      protocol: TCP

Traces Receiver Values

# values-common.yaml (Traces Deployment)
controller:
  type: deployment
  replicas: 2

service:
  ports:
    - name: otlp-grpc
      port: 4317
      protocol: TCP
    - name: otlp-http
      port: 4318
      protocol: TCP

Alloy Configuration Basics

Alloy’s component model is the same for all three deployments. Pipelines look like this:

[Discovery] → [Relabeling] → [Collection] → [Processing] → [Writing]

Here is the simplest end-to-end pipeline — pod log discovery, relabeling, collection, and writing to Loki:

// Discover all pods in the cluster
discovery.kubernetes "pods" {
  role = "pod"
}

// Extract useful metadata into labels
discovery.relabel "pod_logs" {
  targets = discovery.kubernetes.pods.targets

  rule {
    source_labels = ["__meta_kubernetes_namespace"]
    target_label  = "namespace"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_name"]
    target_label  = "pod"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_container_name"]
    target_label  = "container"
  }
}

// Tail the logs
loki.source.kubernetes "pod_logs" {
  targets    = discovery.relabel.pod_logs.output
  forward_to = [loki.write.default.receiver]
}

// Write to Loki with cluster + DC labels
loki.write "default" {
  endpoint {
    url = "http://loki-gateway.loki.svc.cluster.local:3100/loki/api/v1/push"
  }
  external_labels = {
    cluster = "observability",
    dc      = env("DC_LABEL"),
  }
}

Each block exports a target the next block consumes. Adding a new source is mostly adding a new discovery block and wiring it in — no XML, no nested YAML hellscape, no custom DSL to learn. We will work through full production configs (syslog DC classification by source IP, audit log JSON parsing, per-stream retention labels, the Windows fleet module architecture) in the next article.
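
One caveat when the basic pipeline above runs as a DaemonSet: with cluster-wide discovery, every pod would tail every container. A common fix is to restrict discovery to the local node, sketched below. It assumes the `HOSTNAME` environment variable holds the node name (we believe the upstream chart sets it from `spec.nodeName`, but verify in your rendered manifests), and the component label is made up.

```alloy
// Discover only the pods scheduled on this node.
discovery.kubernetes "local_pods" {
  role = "pod"

  selectors {
    role  = "pod"
    field = "spec.nodeName=" + coalesce(env("HOSTNAME"), constants.hostname)
  }
}
```

Feeding `discovery.kubernetes.local_pods.targets` into the relabel block instead of the cluster-wide targets keeps each DaemonSet pod responsible for its own node only.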


Verifying the Deployment

After ArgoCD syncs, walk through the standard checks:

# Pods are running on every node (DaemonSet)
kubectl -n alloy get ds alloy

# Network and traces deployments are at the expected replica counts
kubectl -n alloy get deploy

# MetalLB assigned the VIP
kubectl -n alloy get svc

# Alloy UI is reachable from inside the cluster
kubectl -n alloy port-forward svc/alloy 12345:12345
# Then visit http://localhost:12345 in a browser

The built-in UI shows the live component graph. If a component is red, there is your starting point. If everything is green but no data is flowing, check the destination side first (Loki, Mimir, Tempo) — Alloy is happy to keep buffering for a long time before it complains.

We use the UI for real, not just during install. The off-cluster Windows agent runbook tells operators to hit http://server:12345 after a module sync to verify imports loaded cleanly. The Linux agent runbook does the same. Network policies in our cluster explicitly allow probes against port 12345 so the platform can blackbox-monitor the collectors. The UI is not a curiosity — it is the fastest path to “is this component actually doing what I think it is doing?”


Troubleshooting at the Door

A few issues you will hit early. None of them are showstoppers, all of them are searchable, but knowing them up front saves an afternoon.

| Symptom | Most Likely Cause | Quick Fix |
|---|---|---|
| Network receiver pods OOM during a syslog spike | No best_effort mode in Alloy syslog receiver; defaults are too generous | Set max_message_length and connection limits explicitly; raise memory limit |
| Nutanix CVM syslog parsed as garbage | RFC 3164 strict parser rejects ISO 8601 timestamps | Use raw mode for the 1515/UDP listener; classify post-hoc in a processing stage |
| loki.source.kubernetes not picking up pods | Missing RBAC for pods/log subresources | Apply the chart’s RBAC bundle, or extend it for your cluster’s policies |
| MetalLB VIP shows <pending> | No L2 advertisement or address pool defined for that subnet | Confirm IPAddressPool and L2Advertisement cover the requested IP |
| HPA does not scale during syslog flood | Metrics-server missing or default 80% threshold too high | Verify metrics-server health; lower target to 60% |

Each of these fits a “20 minutes once you know, 4 hours if you don’t” pattern. They are documented in Grafana Labs’ issues; pin the relevant ones in your runbook.


Security: An Ounce of Prevention

Get RBAC, network policies, and secrets right during deployment, not after. It is tempting to deploy with wide-open access and tighten later. That approach creates audit findings in regulated environments, and closing the gap afterwards means involving Audit, InfoSec, and a change control window.

What to walk your infosec team through before the first ArgoCD sync:

  • Service accounts — one per Alloy deployment, only the permissions each needs
  • hostPath mounts — DaemonSet only, scoped to /var/log and container log paths
  • Network policies — DaemonSet egress to Loki/Mimir/Tempo; network receiver ingress from device subnets only; trace receiver ingress from app namespaces only
  • MetalLB VIPs — documented in your IPAM, listed in change control with port maps
  • Secrets — pulled from your secret store via External Secrets Operator, never committed

Lock it down on the way in. An ounce of prevention beats a pound of cure, and the cure usually involves explaining to a regulator why ACLs are wider than they needed to be.


What Is Next

Article 3b goes deep on what these three Alloys are actually doing — the full DaemonSet config with audit log parsing, the syslog DC classification pipeline, the centralized module architecture for our Windows fleet agents, per-stream retention labels, and resource tuning under real load.

We will also cover the pain points: why the Alloy syslog receiver lacks a best_effort mode and what to do about it, how we worked around NX-OS gNMI subscribe being broken, and the day a single CVM convinced us that “RFC 3164 strict” is more of a suggestion than a standard.

Happy automating!


This is article 3a of 10 in the LGTM on Nutanix series. Next up: Article 3b — Alloy in Production: Logs, Metrics, and Scaling.