Behavioral Phylogeny

How similar are attacker families, really? Each family is represented by its behavioral atom bag - the weighted distribution of primitive actions observed across all sessions. Pairwise cosine distances feed a UPGMA hierarchical clustering, producing this evolutionary tree. Families that cluster together share behavioral DNA. Families that stand alone are genuinely novel.

Canonical Families: 11
Sessions Analyzed: 3,464
Closest Merge Height: 0.055
Root Divergence: 0.465
Observation Nodes: 4

UPGMA Cladogram

Horizontal dendrogram. Root at left, leaf nodes at right. Branch position encodes merge height (behavioral distance). Leaf size scales with log(sessions). Color encodes phylum.

Phylum legend: Worm, Scanner, Botnet, Miner, Operator, Unknown

Behavioral Distance Matrix

Pairwise cosine distance (1 - similarity) between all family atom bags. Rows and columns are ordered by UPGMA leaf sequence, so clusters appear as dark blocks on the diagonal.
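As a rough illustration of why leaf ordering makes clusters visible, the sketch below arranges a small distance table in a given leaf order. The family names and distance values are placeholders chosen for the example, not values from the genome, and `ordered_matrix` is a hypothetical helper.

```python
def ordered_matrix(dist, leaf_order):
    """Build a square distance matrix whose rows and columns follow leaf_order."""
    return [
        [0.0 if a == b else dist[frozenset((a, b))] for b in leaf_order]
        for a in leaf_order
    ]

# Placeholder pairwise distances (illustrative only).
dist = {
    frozenset(("worm_a", "worm_b")): 0.11,
    frozenset(("worm_a", "scanner")): 0.90,
    frozenset(("worm_b", "scanner")): 0.85,
}

# Leaf order taken from a dendrogram: the two similar families sit side by side.
leaf_order = ["worm_a", "worm_b", "scanner"]
matrix = ordered_matrix(dist, leaf_order)

for name, row in zip(leaf_order, matrix):
    print(f"{name:8s}", " ".join(f"{d:.2f}" for d in row))
```

With this ordering the small worm-to-worm distance lands next to the diagonal, which is what shows up as a dark block in a heatmap.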

Phylogenetic Insights

Worm Clade (h = 0.055)
Three worm families share near-identical behavioral DNA. The families worm_network_recon_ssh_persistence and worm_ssh_persistence are so similar that they likely represent the same tool family run with slightly different operator configs.
Distance 0.109 - closest observed pair in the genome.

Scanner Isolation: Solana Scanner
The Solana Scanner is behaviorally isolated from every other family, at distance 0.78 or more from each. Its only action is a single uname recon atom, so its atom bag carries minimal behavioral entropy (all weight sits on one atom). It shares almost nothing with any worm, botnet, or operator family.
cce77f53 - 940 sessions across all 4 nodes.

Unknown Shadow: D929 / Dota
Unknown family d9298a10 merges with Dota/MDRFCKR at height 0.095 - the second-closest pair in the genome. It exhibits Dota-like resource recon and persistence behavior but lacks the ssh_persistence and competitor_cleanup genes, which suggests a stripped or early-stage variant.
73 sessions on node1-de and node3-fr.

Method

Each session produces a sequence of behavioral atoms via the SessionDNA pipeline. Atoms are deduplicated and counted into a weighted atom bag per family variant. Family-level bags are computed as session-weighted averages across all variants.
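One possible reading of the bag construction above, as a minimal Python sketch. It assumes atoms are deduplicated within a session and then counted across sessions, which the text does not spell out. The function names and atom labels are illustrative and are not part of the SessionDNA pipeline.

```python
from collections import Counter

def variant_bag(sessions):
    """Mean atom presence across one variant's sessions.

    Assumption: each session contributes an atom at most once (deduplication),
    and the bag weight is the fraction of sessions containing that atom.
    Returns (bag, number_of_sessions).
    """
    counts = Counter()
    for atoms in sessions:
        counts.update(set(atoms))  # deduplicate within the session
    n = len(sessions)
    return {atom: c / n for atom, c in counts.items()}, n

def family_bag(variants):
    """Session-weighted average of the variant bags of one family."""
    weighted = Counter()
    total_sessions = 0
    for sessions in variants:
        bag, n = variant_bag(sessions)
        for atom, w in bag.items():
            weighted[atom] += w * n  # weight each variant by its session count
        total_sessions += n
    return {atom: w / total_sessions for atom, w in weighted.items()}

# Hypothetical family with two variants (2 sessions and 1 session).
variants = [
    [["recon", "ssh_persistence", "recon"], ["recon"]],
    [["recon", "competitor_cleanup"]],
]
bag = family_bag(variants)
print(bag)
```

Weighting by session count means a variant seen in many sessions dominates the family profile, while a rarely seen variant only nudges it.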

Pairwise distance is defined as 1 - cosine(bag_i, bag_j). Distance 0 = identical atom proportions; distance 1 = completely disjoint behavior (no shared atoms). The hierarchical clustering uses UPGMA (Unweighted Pair Group Method with Arithmetic Mean): at each step the two closest clusters merge, and the distance from the merged cluster to every other cluster is the average of the two constituent distances, weighted by cluster size.
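The distance and clustering steps can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the site's actual code: bags are non-empty dicts mapping atom to weight, the family names are placeholders, and each merge is reported with the cluster distance at which it happens.

```python
import math
from itertools import combinations

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse bags (dict: atom -> weight)."""
    dot = sum(w * b.get(atom, 0.0) for atom, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return 1.0 - dot / (norm_a * norm_b)

def upgma(bags):
    """Return merges as (cluster_a, cluster_b, distance), in merge order.

    Clusters are tuples of family names.
    """
    clusters = {(name,): 1 for name in bags}  # cluster -> number of leaves
    dist = {
        frozenset((a, b)): cosine_distance(bags[a[0]], bags[b[0]])
        for a, b in combinations(clusters, 2)
    }
    merges = []
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        a, b = sorted(pair)
        d = dist.pop(pair)
        na, nb = clusters.pop(a), clusters.pop(b)
        merged = a + b
        for c in clusters:
            dac = dist.pop(frozenset((a, c)))
            dbc = dist.pop(frozenset((b, c)))
            # Size-weighted average of the two constituent distances.
            dist[frozenset((merged, c))] = (na * dac + nb * dbc) / (na + nb)
        clusters[merged] = na + nb
        merges.append((a, b, d))
    return merges

# Placeholder bags: two near-identical worms and one disjoint scanner.
bags = {
    "worm_a": {"recon": 1.0, "ssh_persistence": 1.0},
    "worm_b": {"recon": 1.0, "ssh_persistence": 0.8},
    "scanner": {"uname": 1.0},
}
merges = upgma(bags)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.3f}")
```

In this toy input the two worm bags merge first at a very small distance, and the scanner, which shares no atoms with them, joins last at distance 1.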

The tree is rebuilt automatically whenever the genome registry is updated. Last build: 2026-04-04.
