NullRoute Behavioral Atlas
Post-authentication SSH session data from live honeypot sensors: what attackers do after they get in, not just that they tried. Includes behavioral atoms, genome family classifications, and complexity scores.
This dataset is a structured record of post-authentication attacker behavior captured on NullRoute's live SSH honeypot network. It is not a raw sensor export, an IOC feed, or a credential list.
Every session in this dataset represents an attacker who successfully authenticated to one of four honeypot nodes. The data captures what they did next - the sequence of commands, normalized into behavioral atoms, and classified into genome families where a match exists.
Most sessions (~97%) are zero-command: automated scanners that authenticate and disconnect without running anything. The research value is in the remaining ~3% - use command_count > 0 to filter, and complexity_score to rank by behavioral richness.
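The filter-then-rank step described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not an official loader: the column names `command_count`, `complexity_score`, and `family` come from this card, while `session_id`, the CSV layout, and the sample rows are assumptions made up for the example.

```python
import csv
import io


def rank_interactive_sessions(rows):
    """Keep sessions that ran at least one command, richest behavior first."""
    interactive = [r for r in rows if int(r["command_count"]) > 0]
    return sorted(
        interactive,
        key=lambda r: float(r["complexity_score"]),
        reverse=True,
    )


# Illustrative rows only; these are NOT real records from the dataset.
# The `session_id` column and the family name are hypothetical.
SAMPLE_CSV = """session_id,command_count,complexity_score,family
s1,0,0,
s2,4,2.5,
s3,12,7.0,example_family
s4,0,0,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ranked = rank_interactive_sessions(rows)
print([r["session_id"] for r in ranked])  # ['s3', 's2']
```

Filtering before sorting keeps the zero-command majority out of the ranking entirely, so the sort only touches the small interactive subset.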
The family column is NULL for the majority of sessions. This is expected and not a data quality issue - it means the session did not match any known genome family, either because it had no commands or because its behavior has not yet been classified. Only sessions with a confirmed behavioral match carry a family name.
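When summarizing by family, it helps to handle the NULL case explicitly rather than dropping those rows. The sketch below assumes a NULL family reaches your code as `None` (for example from Parquet or JSON) or as an empty string (for example from CSV); the actual representation depends on the file format you load, and the `(unclassified)` label and sample rows are hypothetical.

```python
from collections import Counter

# Hypothetical label for sessions with no family match.
UNCLASSIFIED = "(unclassified)"


def family_counts(rows):
    """Count sessions per genome family, grouping NULL families together."""
    counts = Counter()
    for row in rows:
        family = row.get("family")
        # Assumption: NULL surfaces as None or as an empty string.
        counts[family if family else UNCLASSIFIED] += 1
    return counts


# Illustrative rows only; not real records from the dataset.
rows = [
    {"family": None},
    {"family": ""},
    {"family": "example_family"},
    {"family": None},
]
counts = family_counts(rows)
print(counts.most_common())
```

Keeping the unclassified bucket visible makes it clear how much of any sample is unmatched, which matters when comparing family frequencies.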
This dataset is licensed under Creative Commons Attribution 4.0 (CC BY 4.0). You are free to share and adapt it for any purpose, including commercial use, as long as you give appropriate credit. Full terms: creativecommons.org/licenses/by/4.0
Plain text:
BibTeX:
This dataset was collected on live honeypot sensors running specific personas (generic Linux, healthcare PACS, AI/ML research, security ops). Behavioral patterns observed here may reflect targeting of those personas specifically and should not be treated as representative of the global attacker population.
The sensors run Cowrie, a medium-interaction SSH honeypot. Its terminal emulation may lead sophisticated operators to detect the environment and abort, which introduces selection bias toward automated tools and less sophisticated attackers. Account for this bias before generalizing from the data.
NullRoute provides this dataset for research and educational use. No warranty is made regarding completeness, accuracy, or fitness for any particular purpose.