Kernloom IQ (kliq): Full reference
Kernloom IQ is the local Policy Decision Point (PDP). It reads telemetry from Shield, scores each source, and applies progressive enforcement through the active PEP adapter.
This page covers:
- Decision engine: how IQ reasons
- Profiles: starting configurations with real values
- Graph Learner: Zero Trust path enforcement
- Autotune: self-tuning baseline
- Exemptions: whitelist and feedback
- CLI flag reference: every argument
Decision engine
Per-tick inputs
Every tick (--interval, default 1s), IQ reads per-source deltas from Shield’s telemetry maps:
- PPS: packets per second
- SYN/s: TCP SYN packets per second
- scan/s: distinct destination ports contacted per second (port-scan signal)
- DropRL/s: packets dropped by Shield’s rate limiter per second
Severity score
Each signal is normalised by a trigger threshold and capped:
nPPS = min(PPS / trig-pps, sev-cap)
nSYN = min(SYN/s / trig-syn, sev-cap)
nSCAN = min(scan/s / trig-scan, sev-cap)
severity = w-pps × nPPS + w-syn × nSYN + w-scan × nSCAN
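The formulas above can be sketched in a few lines of Python. This is an illustration only, not kliq's code: the function name is hypothetical, the triggers are borrowed from the legacy public-web row, and the weights and cap are invented for the example.

```python
def severity(pps, syn, scan, *, trig_pps, trig_syn, trig_scan,
             w_pps, w_syn, w_scan, sev_cap):
    """Weighted sum of capped, trigger-normalised signals."""
    n_pps = min(pps / trig_pps, sev_cap)
    n_syn = min(syn / trig_syn, sev_cap)
    n_scan = min(scan / trig_scan, sev_cap)
    return w_pps * n_pps + w_syn * n_syn + w_scan * n_scan


# Assumed settings: weights of 1.0 and a cap of 4.0 are made up for this sketch.
params = dict(trig_pps=1200, trig_syn=250, trig_scan=20,
              w_pps=1.0, w_syn=1.0, w_scan=1.0, sev_cap=4.0)

quiet = severity(600, 25, 2, **params)      # 0.5 + 0.1 + 0.1
flood = severity(12000, 500, 0, **params)   # PPS term is capped at 4.0, SYN term is 2.0
print(quiet, flood)
```

The cap keeps a single extreme signal from dominating: the flood sends ten times the PPS trigger, yet its PPS term contributes only sev_cap.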
Strikes → levels
Severity is converted to strikes using step thresholds (--sev-step*, --sev-delta*).
Strike counts are mapped to enforcement levels:
| Threshold flag | Level reached |
|---|---|
| --soft-at | RATE_SOFT |
| --hard-at | RATE_HARD |
| --block-at | BLOCK (subject to block gate) |
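As a rough sketch of how severity could turn into strikes and then a level: the step values below are the documented --sev-step / --sev-delta defaults and the level thresholds come from the legacy ziti-router row. The rule "use the delta of the highest step reached" is an assumption, and strike decay and the block gate are left out.

```python
# (--sev-stepN, --sev-deltaN) pairs, highest step first.
SEV_STEPS = [(3.0, 3), (2.0, 2), (1.0, 1)]


def strike_delta(sev):
    """Strikes added this tick (assumed rule: highest step the severity reaches)."""
    for step, delta in SEV_STEPS:
        if sev >= step:
            return delta
    return 0


def level_for(strikes, soft_at=2, hard_at=5, block_at=12):
    """Map an accumulated strike count to an enforcement level."""
    if strikes >= block_at:
        return "BLOCK"
    if strikes >= hard_at:
        return "RATE_HARD"
    if strikes >= soft_at:
        return "RATE_SOFT"
    return "OBSERVE"


strikes = 0
for sev in [0.4, 1.2, 2.5, 3.1]:
    strikes += strike_delta(sev)
    print(f"sev={sev} strikes={strikes} level={level_for(strikes)}")
```

With these inputs the source stays in OBSERVE for two ticks, reaches RATE_SOFT at 3 strikes, and RATE_HARD at 6.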
Hysteresis
IQ prevents rapid oscillation using:
- --up-need / --down-need: consecutive high/low ticks required to change level
- --min-hold-soft / --min-hold-hard: minimum time in each level
- --cooldown: minimum time between any level changes
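A minimal sketch of the consecutive-tick part of this logic follows. The class, the one-level-per-change step size, and the example values for up_need and down_need are assumptions made for illustration; the hold times and cooldown are omitted to keep it short.

```python
class Hysteresis:
    """Change level only after a run of consecutive ticks that agree."""

    def __init__(self, up_need=3, down_need=5):
        self.up_need = up_need
        self.down_need = down_need
        self.level = 0      # 0=OBSERVE, 1=RATE_SOFT, 2=RATE_HARD, 3=BLOCK
        self.up_run = 0     # consecutive ticks asking for a higher level
        self.down_run = 0   # consecutive ticks asking for a lower level

    def tick(self, wanted):
        """Feed the level the strike count asks for; return the level applied."""
        if wanted > self.level:
            self.up_run, self.down_run = self.up_run + 1, 0
            if self.up_run >= self.up_need:
                self.level, self.up_run = self.level + 1, 0
        elif wanted < self.level:
            self.down_run, self.up_run = self.down_run + 1, 0
            if self.down_run >= self.down_need:
                self.level, self.down_run = self.level - 1, 0
        else:
            self.up_run = self.down_run = 0
        return self.level


h = Hysteresis(up_need=3, down_need=5)
trace = [h.tick(w) for w in [1, 0, 1, 1, 1, 0, 0, 0, 0, 0]]
print(trace)
```

The single high tick at the start is ignored because the run is broken on the next tick; the level only rises after three high ticks in a row and only falls after five low ones.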
Non-compliance escalation
If a source is in RATE_HARD and continues producing DropRL/s (it is hitting the limiter and not backing off), IQ accelerates escalation to BLOCK. Controlled by --noncomp-at, --noncomp-drop, --noncomp-sev.
Block gate
Blocking behind NAT is risky (one bad source can block many legitimate users). Gate it:
--block-min-sev 3.0 # only block if severity >= 3.0 ...
--block-min-dur 60s # ... for at least 60 seconds
If the gate fails, IQ stays at HARD instead of BLOCK.
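The gate can be pictured as a small timer on sustained severity. The sketch below is one plausible reading of the two flags, not kliq's implementation; the class and method names are hypothetical and times are plain seconds.

```python
class BlockGate:
    """Allow BLOCK only after severity has stayed >= min_sev for min_dur seconds."""

    def __init__(self, min_sev=3.0, min_dur=60.0):
        self.min_sev = min_sev
        self.min_dur = min_dur
        self.high_since = None  # time at which severity first reached min_sev

    def resolve(self, wanted, sev, now):
        """Return the level to apply, given the level the strike count asks for."""
        if sev >= self.min_sev:
            if self.high_since is None:
                self.high_since = now
        else:
            self.high_since = None  # assumption: any dip restarts the timer
        if wanted != "BLOCK":
            return wanted
        sustained = (self.high_since is not None
                     and now - self.high_since >= self.min_dur)
        return "BLOCK" if sustained else "RATE_HARD"


gate = BlockGate(min_sev=3.0, min_dur=60.0)
for t in (0.0, 30.0, 60.0):
    print(t, gate.resolve("BLOCK", 3.5, t))
```

In this run the source is held at RATE_HARD at 0s and 30s and only reaches BLOCK once the severity has been sustained for the full 60s.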
Configuration
IQ has two independent configuration axes in v0.2.0.
PDPConfig (--pdp-config): what to measure
A PDPConfig YAML file is the primary configuration mechanism. It controls signal engine thresholds, the autotune schedule, progressive enforcement parameters, and adapter parameters. It replaces the older --profile flag for most use cases.
--profile still works as a shorthand, but the PDPConfig approach is recommended: it is Forge-compatible and gives full control over the bootstrap schedule.
16 PDPConfig files ship in configs/pdp/ (8 bootstrap + 8 production):
| Bootstrap profile | Production profile | Role |
|---|---|---|
| ziti-controller-bootstrap | ziti-controller | Public Ziti controller |
| ziti-router-bootstrap | ziti-router | Public Ziti router |
| web-server-bootstrap | web-server | Public web server |
| reverse-proxy-bootstrap | reverse-proxy | Reverse proxy |
| idp-bootstrap | idp | Identity provider |
| database-bootstrap | database | Database server |
| api-server-bootstrap | api-server | Internal API |
| nas-bootstrap | nas | NAS / storage |
Public-facing profiles (ziti-controller, ziti-router, web-server, reverse-proxy) have graph.enabled: false, because graph learning is not useful when clients are unknown internet IPs and the flow map fills up immediately. Internal profiles (idp, database, api-server, nas) have graph.enabled: true.
Bootstrap profiles start with blocking disabled (block-at=999) and use max_down: 0.10 (10%/hour) so thresholds converge from the cold-start value within 48h. Switch to the production profile once you have observed stable operation.
Feature profile (--feature-profile): which subsystems are active
A feature profile controls which IQ subsystems run. It is auto-derived from the --graph flag when not set explicitly:
| Profile | Requires | What runs |
|---|---|---|
| klshield-light | klshield only, no kliq | XDP + static deny/allow. No learning. |
| dos-light | klshield + kliq | Source heuristic + autotune. No graph, no SQLite. |
| iq-learning | klshield + kliq | dos-light + per-source EWMA baseline. |
| graph-learning | klshield + kliq | iq-learning + flow telemetry + graph + SQLite. |
| graph-enforce | klshield + kliq | graph-learning + XDP tuple enforcement. |
Check which subsystems are active at runtime:
kliq runtime status graph-learning
kliq runtime status klshield-light # → explains no kliq needed
Legacy profile values
For reference: the older --profile flag seeds these initial values (all adapted by autotune):
SoftRate and HardRate are in packets per second. BlockAt=999 means blocking is effectively disabled.
| Profile | TrigPPS | TrigSyn | TrigScan | SoftAt | HardAt | BlockAt | SoftRate | HardRate |
|---|---|---|---|---|---|---|---|---|
| ziti-router | 8 000 | 200 | 30 | 2 | 5 | 12 | 3 000 | 800 |
| ziti-controller | 80 | 20 | 5 | 1 | 3 | 9 | 20 | 5 |
| ziti-router-bootstrap | 25 000 | 600 | 120 | 3 | 8 | 999 | 6 000 | 1 500 |
| ziti-controller-bootstrap | 400 | 120 | 30 | 2 | 6 | 999 | 60 | 20 |
| public-web | 1 200 | 250 | 20 | 2 | 5 | 12 | 500 | 120 |
| public-api | 2 500 | 500 | 30 | 2 | 4 | 10 | 1 000 | 300 |
| idp | 350 | 180 | 10 | 1 | 3 | 8 | 50 | 10 |
| internal-app | 800 | 150 | 8 | 3 | 6 | 999 | 200 | 50 |
| ssh-bastion | 60 | 25 | 5 | 1 | 2 | 6 | 5 | 1 |
Recommended rollout
1. Copy the right PDPConfig for your node to /opt/kernloom/attested/etc/pdp/node.yaml
2. Start with --dry-run=true and --whitelist-learn=true
3. Let IQ observe and autotune on real traffic for 7 to 14 days
4. Review state transitions; add whitelist entries for known-good sources
5. Enable enforcement: --dry-run=false
Graph Learner
The Graph Learner is an optional module that builds a baseline of observed communication paths and can enforce Zero Trust once the baseline is frozen.
Enable with --graph. Control the mode with --graph-mode.
Modes
| Mode | What happens |
|---|---|
| learn | Record source→destination flows as graph edges. No enforcement on unknown paths. |
| frozen-observe | Baseline is frozen. Unknown edges inject extra FSM strikes and emit signals, but enforcement is still gradual. Good for catching false positives before going strict. |
| frozen-enforce | Unknown edges force the source immediately to BLOCK, bypassing normal strike accumulation. Strict Zero Trust posture. |
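The difference between the three modes can be summarised as a single dispatch on an observed edge. This is a conceptual sketch only: the function, the edge tuple layout, the returned action names, and the extra_strikes value are all hypothetical.

```python
def on_edge(mode, edge, baseline, extra_strikes=2):
    """Return (action, strikes_to_add) for one observed edge."""
    if mode == "learn":
        baseline.add(edge)                # record; unknown paths are not enforced
        return ("record", 0)
    if edge in baseline:
        return ("allow", 0)               # known path: the graph has no objection
    if mode == "frozen-observe":
        return ("signal", extra_strikes)  # gradual: feeds the normal strike FSM
    if mode == "frozen-enforce":
        return ("block", 0)               # immediate BLOCK, no strike accumulation
    raise ValueError(f"unknown graph mode: {mode}")


baseline = set()
known = ("10.0.0.5", "10.0.0.9", 5432)      # (src, dst, dst_port), assumed layout
unknown = ("10.0.0.77", "10.0.0.9", 22)

print(on_edge("learn", known, baseline))
print(on_edge("frozen-observe", unknown, baseline))
print(on_edge("frozen-enforce", unknown, baseline))
```

Note that a known edge only means the graph stays silent; the source is still scored by the severity model.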
Edge lifecycle
candidate → learned → frozen
↓
approved (manual)
denied (manual, never overwritten)
| State | Meaning |
|---|---|
| candidate | Seen, evidence building (count, distinct time windows, age) |
| learned | Promoted: evidence criteria met |
| frozen | Locked into the baseline |
| approved | Manually confirmed; carries extra trust weight |
| denied | Explicitly blocked; never promoted, never overwritten |
Suspicious sources (currently RATE_SOFT, RATE_HARD, or BLOCK) are automatically excluded from the baseline to prevent polluting it with attack traffic.
Workflow
Step 1: learn your baseline
sudo /opt/kernloom/attested/kliq \
--pdp-config=/opt/kernloom/attested/etc/pdp/idp-bootstrap.yaml \
--graph --graph-mode=learn \
--dry-run=true --whitelist-learn=true
Run for several days to a week to capture representative traffic patterns.
Step 2: review and clean up
kliq graph export # export all edges
kliq graph export --sort=state # grouped by state
kliq graph edges --sort=state # overview with state counts
kliq graph baselines --sort=obs # per-edge EWMA stats (PPS/BPS peaks)
kliq graph approve-ip <ip> # mark an edge as explicitly approved
kliq graph deny-ip <ip> # mark an edge as denied
Step 3: check readiness before freezing
kliq graph freeze --dry-run
Reports how many edges would be frozen, how many candidates are still immature, and how many low-confidence edges exist. Does not write anything.
Step 4: freeze the baseline
kliq graph freeze
This locks all learned edges to frozen. New traffic after this point will be compared against the baseline.
Step 5: observe before enforcing
sudo /opt/kernloom/attested/kliq --graph --graph-mode=frozen-observe
Run for a few days. Watch for unexpected signals (legitimate sources you forgot to include). Add them with approve-ip if needed.
Step 6: full enforcement
sudo /opt/kernloom/attested/kliq --graph --graph-mode=frozen-enforce
Any source taking a path not in the frozen baseline is immediately forced to BLOCK.
Independence from behaviour-based enforcement
The graph enforces path-based Zero Trust only. A source with a known, frozen edge is still fully subject to IQ’s severity scoring.
If a known node starts sending a SYN flood, scanning ports, or generating unusual packet volume, progressive enforcement applies normally; the graph has no say in this. IQ asks “is this behaviour acceptable?” independently of whether the graph knows the path.
This matters in practice: a compromised workload may deliberately stay on known communication paths to avoid triggering the graph, but its traffic patterns will still be anomalous to IQ. The two mechanisms catch different things and neither exempts a source from the other.
The only exception is the IQ whitelist (--whitelist): sources explicitly whitelisted in IQ are exempt from all enforcement. Graph approval is not the same as an IQ whitelist entry; they are separate and independent.
Storage
The graph is stored in a unified SQLite database (kliq.db, default path: /var/lib/kernloom/iq/kliq.db). Since v0.2.0 this database also contains source baseline data and edge baseline EWMA stats in separate tables. It persists across restarts independently of state.json.
New in v0.2.0: learning improvements
Source baseline
Active when feature-profile ≥ iq-learning. IQ tracks per-source IP EWMA statistics separately. A known high-traffic source gets an effective trigger of max(global_trigger, source_peak × 1.2), so a source that normally sends 250 PPS does not trip a global trigger of 100 PPS. Unknown sources fall back to the global trigger.
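The effective-trigger rule is small enough to show directly. The function name and the use of None for a source without a baseline are assumptions of this sketch; the 1.2 headroom factor and the 100/250 PPS figures come from the text above.

```python
def effective_trigger(global_trigger, source_peak=None, headroom=1.2):
    """Per-source trigger: never below the global one, raised for known talkers."""
    if source_peak is None:   # unknown source: no baseline yet
        return global_trigger
    return max(global_trigger, source_peak * headroom)


known = effective_trigger(100, source_peak=250)   # 250 * 1.2 = 300
quiet = effective_trigger(100, source_peak=40)    # 48 is below the global trigger
fresh = effective_trigger(100)                    # falls back to the global trigger
print(known, quiet, fresh)
```

A source peaking at 250 PPS is judged against roughly 300 PPS, so its normal traffic no longer normalises to a value above 1.0.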
Edge baseline improvements
- Two-phase EWMA alpha: bootstrap alpha=0.10 while observations < 30, stable alpha=0.02 after. The baseline converges quickly but resists being moved by short spikes.
- Decaying peak (peak_decay_half_life): a single historical spike no longer permanently defines the ceiling. A peak from 14 days ago is worth 50% of its original value (peak_decay_half_life: "336h").
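Both improvements reduce to short formulas. The sketch below assumes a standard EWMA update and plain exponential decay for the peak (which is what a half-life implies); the function names are hypothetical, while the 30-observation switch, the two alpha values, and the 336h half-life are taken from the list above.

```python
def ewma_alpha(observations, bootstrap_n=30, bootstrap_alpha=0.10, stable_alpha=0.02):
    """Fast alpha while the edge is young, slow alpha once it has history."""
    return bootstrap_alpha if observations < bootstrap_n else stable_alpha


def ewma_update(mean, sample, observations):
    """Move the running mean a fraction alpha of the way toward the new sample."""
    return mean + ewma_alpha(observations) * (sample - mean)


def decayed_peak(peak, age_hours, half_life_hours=336.0):
    """Exponential decay: the peak halves every half_life_hours."""
    return peak * 0.5 ** (age_hours / half_life_hours)


young = ewma_update(100.0, 200.0, observations=5)    # moves 10% of the way
mature = ewma_update(100.0, 200.0, observations=50)  # moves 2% of the way
old_peak = decayed_peak(1000.0, age_hours=336.0)     # one half-life later
print(young, mature, old_peak)
```

The same 200 PPS spike shifts a young baseline five times further than a mature one, and a 1000 PPS peak recorded 14 days ago counts as 500.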
Anti-poisoning (three layers)
Three layers prevent attacks from corrupting baselines:
- TrigPPS cap: observations above the host-level trigger are never written
- SuspiciousRegistry: source-level and edge-level suspicious state are tracked separately; a freeze violation on one edge no longer blocks learning for all edges from that source
- 30s pending buffer: baseline updates are delayed 30s and dropped if the source or edge was flagged in that window
kliq graph baselines
kliq graph baselines [--all] [--sort=obs|state|src|port|pps|bps]
kliq graph baselines reset
Shows per-edge EWMA stats including PPS_PEAK and BPS_PEAK columns.
Autotune
IQ can learn your trigger thresholds (trig-pps, trig-syn, trig-scan) from observed traffic using Median + MAD statistics.
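For readers unfamiliar with Median + MAD, here is a minimal sketch of the idea. The threshold form median + k × MAD matches the --autotune-k note in the flag reference; the clamping helper is an assumption about how --autotune-max-change and --autotune-floor-pps might combine, and the smoothing from --autotune-alpha is left out.

```python
from statistics import median


def mad_threshold(samples, k=3.5):
    """median + k * MAD, where MAD is the median absolute deviation."""
    med = median(samples)
    mad = median(abs(x - med) for x in samples)
    return med + k * mad


def apply_update(current, proposed, max_change=0.05, floor=100.0):
    """Limit the move to +/- max_change of the current value, then apply the floor."""
    lo, hi = current * (1 - max_change), current * (1 + max_change)
    return max(floor, min(max(proposed, lo), hi))


pps = [90, 100, 110, 95, 105, 100, 400]    # one outlier among clean samples
proposed = mad_threshold(pps)              # median 100, MAD 5
new_trigger = apply_update(200.0, proposed)
print(proposed, new_trigger)
```

Because both the centre and the spread are medians, the single 400 PPS sample barely affects the proposal, and the per-update limit means the trigger drifts toward it in 5% steps instead of jumping.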
Learning only happens on clean ticks, that is, ticks where:
- the fraction of high-severity sources is below --learn-frac-gt
- no source is in BLOCK (if --learn-skip-if-blocks=true)
- global drop ratio is below --learn-max-drop-ratio
This prevents attack traffic from poisoning the baseline.
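The clean-tick conditions can be read as one boolean check. The sketch below uses the documented flag defaults as keyword arguments; the function itself, the strict comparisons at the boundaries, and the handling of an empty tick are assumptions.

```python
def is_clean_tick(severities, blocked_sources, drop_ratio, *,
                  learn_sev_gt=1.0, learn_frac_gt=0.005,
                  learn_skip_if_blocks=True, learn_max_drop_ratio=0.02):
    """True if this tick's samples may be used for autotune learning."""
    if not severities:
        return False  # assumption: nothing to learn from an empty tick
    dirty = sum(1 for s in severities if s > learn_sev_gt)
    if dirty / len(severities) >= learn_frac_gt:
        return False  # too many high-severity sources
    if learn_skip_if_blocks and blocked_sources > 0:
        return False  # something is currently in BLOCK
    return drop_ratio < learn_max_drop_ratio


calm = [0.1] * 1000
attack = [0.1] * 990 + [2.0] * 10   # 1% of sources are dirty

print(is_clean_tick(calm, blocked_sources=0, drop_ratio=0.0))
print(is_clean_tick(attack, blocked_sources=0, drop_ratio=0.0))
```

With the default 0.5% limit, even ten noisy sources out of a thousand are enough to exclude the tick from learning.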
Bootstrap schedule
The bootstrap schedule runs autotune more aggressively for the first ~14 days, then slows down to a steady-state interval. It has three phases with decreasing update rates and increasing conservatism.
| Phase | Duration | Autotune interval |
|---|---|---|
| Phase 1 | 0 to 48h | 1h |
| Phase 2 | 48h to 5d | 6h |
| Phase 3 | 5d to 14d | 24h |
| Steady-state | after 14d | 84h |
State is saved to /var/lib/kernloom/iq/state.json and reloaded on restart, so the schedule survives process restarts.
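The schedule amounts to a lookup from node age to autotune interval. The sketch below encodes the table with the default phase boundaries; the names are hypothetical and treating each boundary as the start of the next phase is an assumption.

```python
from datetime import timedelta

# (phase end, autotune interval) using the documented defaults.
SCHEDULE = [
    (timedelta(hours=48), timedelta(hours=1)),    # phase 1
    (timedelta(hours=120), timedelta(hours=6)),   # phase 2 (ends at 5d)
    (timedelta(hours=336), timedelta(hours=24)),  # phase 3 (ends at 14d)
]
STEADY_EVERY = timedelta(hours=84)


def autotune_interval(age):
    """Return how often autotune runs for a node that has been learning for `age`."""
    for phase_end, every in SCHEDULE:
        if age < phase_end:
            return every
    return STEADY_EVERY


for hours in (10, 72, 200, 400):
    print(hours, autotune_interval(timedelta(hours=hours)))
```

Since the age has to survive restarts for this lookup to work, the persisted state file is what keeps a restarted node from falling back into phase 1.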
Bug fix (v0.2.0): Versions before v0.2.0 had a bug where autotune could get stuck on quiet nodes and never apply updates (the timer reset on every skip, creating a permanent loop). This is fixed in v0.2.0. Additionally, bootstrap phase 1 previously used max_down: 0.02 (2%/hour), meaning triggers could take 70+ hours to converge from the cold-start value. The new PDPConfig profiles use max_down: 0.10 (10%/hour) so convergence happens within 48h.
Troubleshooting: If autotune triggers are not moving after 2 to 3 revisions, delete /var/lib/kernloom/iq/state.json and restart IQ.
Exemptions
Whitelist (permanent)
Sources in the whitelist are never scored and never enforced. Add IPv4 addresses, IPv6 addresses, or CIDRs, one per line:
# /opt/kernloom/attested/etc/whitelist.txt
203.0.113.7
203.0.113.0/24
2001:db8::1
Reloaded automatically every --whitelist-reload (default 10s).
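To check an address against a whitelist file before deploying it, a few lines with Python's standard ipaddress module are enough. This is a standalone helper, not part of kliq; treating # as a comment marker is an assumption about the file format.

```python
import ipaddress


def load_whitelist(text):
    """Parse one IPv4/IPv6 address or CIDR per line into network objects."""
    nets = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # assumed: '#' starts a comment
        if entry:
            nets.append(ipaddress.ip_network(entry, strict=False))
    return nets


def is_whitelisted(addr, nets):
    """True if addr falls inside any whitelisted address or range."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in nets)


WHITELIST = """\
# /opt/kernloom/attested/etc/whitelist.txt
203.0.113.7
203.0.113.0/24
2001:db8::1
"""

nets = load_whitelist(WHITELIST)
print(is_whitelisted("203.0.113.200", nets))  # inside 203.0.113.0/24
print(is_whitelisted("198.51.100.1", nets))   # not listed
```

A bare address is parsed as a /32 (or /128) network, so single hosts and ranges are handled by the same membership test.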
Feedback (temporary)
Use feedback for time-bound exemptions without permanently whitelisting:
[
{"target":"203.0.113.7","action":"forgive","ttl":"24h","notes":"partner NAT"},
{"target":"198.51.100.0/24","action":"whitelist","until":"2026-06-01T00:00:00Z"}
]
Reloaded every --feedback-reload (default 10s). Prefer until over ttl for stable expiry across restarts.
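One way to see why until is more stable than ttl: an absolute timestamp yields the same expiry no matter when the file is read, while a relative duration depends on its reference point. The sketch below assumes, purely for illustration, that ttl is counted from the moment the entry is loaded; the helper names and the limited s/m/h duration parser are hypothetical.

```python
import json
import re
from datetime import datetime, timedelta, timezone

UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_duration(text):
    """Parse simple durations such as '30s', '10m' or '24h'."""
    m = re.fullmatch(r"(\d+)([smh])", text)
    if not m:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{UNITS[m.group(2)]: int(m.group(1))})


def expiry(entry, loaded_at):
    """Absolute expiry of a feedback entry (assumes ttl counts from load time)."""
    if "until" in entry:
        return datetime.fromisoformat(entry["until"].replace("Z", "+00:00"))
    return loaded_at + parse_duration(entry["ttl"])


entries = json.loads("""[
  {"target":"203.0.113.7","action":"forgive","ttl":"24h","notes":"partner NAT"},
  {"target":"198.51.100.0/24","action":"whitelist","until":"2026-06-01T00:00:00Z"}
]""")

first_start = datetime(2026, 1, 1, tzinfo=timezone.utc)
after_restart = first_start + timedelta(hours=6)

for entry in entries:
    print(entry["target"], expiry(entry, first_start), expiry(entry, after_restart))
```

Under that assumption the ttl entry's expiry moves by six hours after the restart, while the until entry expires at the same instant in both cases.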
CLI flag reference
Core runtime
| Flag | Type | Default | Notes |
|---|---|---|---|
| --interval | duration | 1s | Poll and decision tick |
| --top | int | 200 | Evaluate top-N sources per tick |
| --min-pps | float | 10 | Skip sources below this PPS |
| --min-sev | float | 0 | Include candidates with severity ≥ this |
| --dry-run | bool | true | Never write enforcement maps |
Profile and persistence
| Flag | Type | Default | Notes |
|---|---|---|---|
| --pdp-config | string | (empty) | Path to PDPConfig YAML (recommended over --profile) |
| --feature-profile | string | (empty) | Override active subsystems: dos-light, iq-learning, graph-learning, graph-enforce |
| --profile | string | controller | Legacy seed profile. Aliases: router→ziti-router, controller→ziti-controller, internal→internal-app |
| --state-file | string | /var/lib/kernloom/iq/state.json | Persist tuned thresholds. Empty disables. |
| --max-state-age | duration | 336h | Ignore persisted state older than this |
| --state-history | int | 30 | Keep last N history entries |
Graph Learner
| Flag | Type | Default | Notes |
|---|---|---|---|
| --graph | bool | false | Enable the graph learner |
| --graph-mode | string | learn | One of: learn, frozen-observe, frozen-enforce |
| --graph-db | string | /var/lib/kernloom/iq/kliq.db | Unified SQLite database path (graph edges + baselines) |
Whitelist
| Flag | Type | Default | Notes |
|---|---|---|---|
| --whitelist | string | /opt/kernloom/attested/etc/whitelist.txt | IPv4/IPv6/CIDR, one per line |
| --whitelist-reload | duration | 10s | Auto-reload interval (0 disables) |
| --whitelist-learn | bool | false | Allow whitelisted sources to contribute to learning |
Feedback
| Flag | Type | Default | Notes |
|---|---|---|---|
| --feedback-file | string | /var/lib/kernloom/iq/feedback.json | JSON array of temporary exemptions |
| --feedback-reload | duration | 10s | Auto-reload interval |
| --feedback-learn | bool | false | Allow feedback-exempt sources in learning |
| --feedback-deenforce-cidr | bool | true | Actively scan and remove RL/deny entries for CIDR feedback |
| --feedback-cidr-every | duration | 30s | CIDR de-enforcement scan interval |
| --feedback-cidr-max | int | 5000 | Max map deletions per scan |
Bootstrap
| Flag | Type | Default | Notes |
|---|---|---|---|
| --bootstrap | bool | true | Enable bootstrap autotune schedule |
| --bootstrap-window | duration | 336h | Total bootstrap duration |
| --bootstrap-phase1-end | duration | 48h | End of phase 1 |
| --bootstrap-phase2-end | duration | 120h | End of phase 2 |
| --bootstrap-every1/2/3 | duration | 1h/6h/24h | Autotune interval per phase |
| --steady-every | duration | 84h | Post-bootstrap autotune interval |
| --bootstrap-k-start | float | 4.0 | k at start (higher = fewer false positives) |
| --bootstrap-k-final | float | 3.5 | k at bootstrap end |
| --bootstrap-allow-block | bool | false | Allow BLOCK during bootstrap. Default: cap at RATE_HARD to protect against bad learning |
| --bootstrap-min-windows | int | 0 | Minimum completed autotune cycles before allowing downscale (0 = disabled) |
Autotune
| Flag | Type | Default | Notes |
|---|---|---|---|
| --autotune | bool | true | Enable threshold learning |
| --autotune-k | float | 3.5 | k for median + k×MAD |
| --autotune-min-samples | int | 5000 | Minimum clean samples before applying |
| --autotune-max-change | float | 0.05 | Max relative change per update (±5%) |
| --autotune-alpha | float | 0.2 | Smoothing factor (0 disables) |
| --autotune-floor-pps | float | 100 | Minimum trig-pps |
| --autotune-floor-syn | float | 50 | Minimum trig-syn |
| --autotune-floor-scan | float | 20 | Minimum trig-scan |
Clean tick gates (anti-poison)
| Flag | Type | Default | Notes |
|---|---|---|---|
| --learn-sev-gt | float | 1.0 | Severity threshold for “dirty” source |
| --learn-frac-gt | float | 0.005 | Max fraction of dirty sources for a clean tick |
| --learn-max-sev | float | 0.8 | Only learn from sources with sev ≤ this |
| --learn-skip-if-blocks | bool | true | Skip learning if any IP is in BLOCK |
| --learn-max-drop-ratio | float | 0.02 | Skip if global drop ratio exceeds this |
Severity model
| Flag | Type | Default | Notes |
|---|---|---|---|
| --trig-pps | float | 0 | PPS trigger (0 → profile / state) |
| --trig-syn | float | 0 | SYN/s trigger |
| --trig-scan | float | 0 | scan/s trigger |
| --w-pps | float | 0 | PPS weight (0 → profile) |
| --w-syn | float | 0 | SYN weight |
| --w-scan | float | 0 | scan weight |
| --sev-cap | float | 0 | Normalisation cap (0 → profile) |
Strike mapping
| Flag | Type | Default | Notes |
|---|---|---|---|
| --sev-step1/2/3 | float | 1.0/2.0/3.0 | Severity thresholds for delta1/2/3 |
| --sev-delta1/2/3 | int | 1/2/3 | Strikes added at each step |
| --sev-decay-below | float | 0.25 | Allow strike decay below this severity |
Level thresholds
| Flag | Type | Default | Notes |
|---|---|---|---|
| --soft-at | int | 0 | Strikes ≥ this → RATE_SOFT (0 → profile) |
| --hard-at | int | 0 | Strikes ≥ this → RATE_HARD |
| --block-at | int | 0 | Strikes ≥ this → BLOCK |
Enforcement actions
| Flag | Type | Default | Notes |
|---|---|---|---|
| --soft-rate | uint64 | 0 | SOFT rate limit in PPS (0 → profile) |
| --soft-burst | uint64 | 0 | SOFT burst tokens |
| --soft-ttl | duration | 0 | SOFT level TTL |
| --hard-rate | uint64 | 0 | HARD rate limit in PPS |
| --hard-burst | uint64 | 0 | HARD burst tokens |
| --hard-ttl | duration | 0 | HARD level TTL |
| --block-ttl | duration | 0 | BLOCK TTL (0 → profile) |
| --cooldown | duration | 0 | Minimum time between level changes |
Block gate
| Flag | Type | Default | Notes |
|---|---|---|---|
| --block-min-sev | float | NaN | Minimum severity to allow BLOCK (NaN → profile; 0 disables) |
| --block-min-dur | duration | -1 | Severity must be sustained for this long (-1 → profile; 0 disables) |
Hysteresis
| Flag | Type | Default | Notes |
|---|---|---|---|
| --up-need | int | 0 | Consecutive high ticks before escalating (0 → profile) |
| --down-need | int | 0 | Consecutive low ticks before stepping down |
| --min-hold-soft | duration | 0 | Minimum time in SOFT before stepping down |
| --min-hold-hard | duration | 0 | Minimum time in HARD before stepping down |
Non-compliance escalation
| Flag | Type | Default | Notes |
|---|---|---|---|
| --noncomp-at | int | 0 | Non-compliance ticks before accelerating to BLOCK (0 → profile) |
| --noncomp-drop | float | 0 | DropRL/s threshold for non-compliance tick |
| --noncomp-sev | float | 0 | Severity threshold for non-compliance tick |
| --noncomp-reset-below | float | 0 | Reset counter when sev < this and DropRL/s = 0 |
Housekeeping
| Flag | Type | Default | Notes |
|---|---|---|---|
| --prev-ttl | duration | 10m | Forget delta snapshot if source not seen |
| --state-ttl | duration | 60m | Forget OBSERVE-only state if not seen |
See also
| Getting started | Install, bootstrap, and go from dry-run to enforcement |
| Shield reference | XDP commands, tuple enforcement, pinned maps |
| Architecture | How IQ and Shield fit into the PDP/PEP model |
| Integration Patterns | Real-world PDPConfig choices per node type |
| Operations | systemd units, log interpretation, troubleshooting |