Open Dataset v0.1 - 2026

NullRoute Behavioral Atlas

Post-authentication SSH session data from live honeypot sensors. What attackers do after they get in - not just that they tried. Behavioral atoms, genome family classifications, complexity scores.

152,728 Sessions
47,384 Behavioral Atoms
11 Genome Families
4 Sensor Nodes
24 days Collection Window
Download: nullroute-atlas-v0.1.0.zip (~2.6 MB), containing sessions.parquet, atoms.parquet, families.parquet, manifest.json, SCHEMA.md, and LICENSE.
What this is

This dataset is a structured record of post-authentication attacker behavior captured on NullRoute's live SSH honeypot network. It is not a raw sensor export, an IOC feed, or a credential list.

Every session in this dataset represents an attacker who successfully authenticated to one of four honeypot nodes. The data captures what they did next - the sequence of commands, normalized into behavioral atoms, and classified into genome families where a match exists.

Most sessions (~97%) are zero-command: automated scanners that authenticate and disconnect without running anything. The research value is in the remaining ~3% - use command_count > 0 to filter, and complexity_score to rank by behavioral richness.
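The filter-and-rank step described above can be sketched with pandas. This is a minimal illustration, not the project's own tooling: the column names command_count, complexity_score, and session_id come from this page, while the function name active_sessions and the demo values (including the scale of complexity_score) are assumptions made up for the example.

```python
import pandas as pd


def active_sessions(sessions: pd.DataFrame) -> pd.DataFrame:
    """Keep sessions that ran at least one command, richest behavior first."""
    active = sessions[sessions["command_count"] > 0]
    return active.sort_values("complexity_score", ascending=False)


# Tiny stand-in frame with invented values. With the real file you would
# load the data with pd.read_parquet("sessions.parquet") instead.
demo = pd.DataFrame(
    {
        "session_id": ["a", "b", "c", "d"],
        "command_count": [0, 3, 0, 7],
        "complexity_score": [0.0, 1.5, 0.0, 4.2],  # hypothetical scores
    }
)

ranked = active_sessions(demo)
print(ranked)
```

Filtering before sorting keeps the zero-command majority out of the ranking entirely.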

The family column is NULL for the majority of sessions. This is expected and not a data quality issue - it means the session did not match any known genome family, either because it had no commands or because its behavior has not yet been classified. Only sessions with a confirmed behavioral match carry a family name.
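Because a NULL family can mean either "no commands" or "not yet classified", it may help to separate the two cases before analysis. The sketch below assumes pandas reads the NULLs as missing values; the function name family_breakdown, the family label fam_x, and the demo rows are hypothetical.

```python
import pandas as pd


def family_breakdown(sessions: pd.DataFrame) -> dict:
    """Count sessions by why their family value is (or is not) missing."""
    has_commands = sessions["command_count"] > 0
    no_family = sessions["family"].isna()
    return {
        # Sessions with a confirmed behavioral match.
        "classified": int((~no_family).sum()),
        # Ran commands, but the behavior matched no known family.
        "unclassified_with_commands": int((has_commands & no_family).sum()),
        # Authenticated and disconnected without running anything.
        "zero_command": int((~has_commands & no_family).sum()),
    }


# Invented demo rows; "fam_x" is a placeholder, not a real family name.
demo = pd.DataFrame(
    {
        "session_id": ["a", "b", "c", "d"],
        "command_count": [0, 3, 0, 7],
        "family": [None, "fam_x", None, None],
    }
)

counts = family_breakdown(demo)
print(counts)
```

The middle bucket is where unclassified but non-trivial behavior would show up.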

Files
sessions.parquet: One row per authenticated session. Node, timestamps, duration, command count, family classification, complexity score, novelty flag. (152,728 rows / 2.5 MB)
atoms.parquet: One row per behavioral atom within a session. Normalized command categories (e.g. recon_uname, inject_ssh_key, download_remote). Joins to sessions on session_id. (47,384 rows / 59 KB)
families.parquet: One row per genome family. Name, phylum, kingdom, session count, date range, observed nodes, and canonical behavioral sequence. (11 rows / 3 KB)
SCHEMA.md: Full column documentation, atom vocabulary, session_id decision, disclaimer, license, and citation. (documentation)
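The join between atoms and sessions on session_id could look roughly like this in pandas. Only session_id, family, and complexity_score are column names taken from this page; the atom column name, the function name atoms_with_family, and the demo rows are assumptions, so check SCHEMA.md for the real column names before adapting it.

```python
import pandas as pd


def atoms_with_family(atoms: pd.DataFrame, sessions: pd.DataFrame) -> pd.DataFrame:
    """Attach each atom's session-level family and complexity score."""
    session_cols = sessions[["session_id", "family", "complexity_score"]]
    # Many atoms map to one session; validate raises if session_id is
    # duplicated on the sessions side.
    return atoms.merge(
        session_cols, on="session_id", how="left", validate="many_to_one"
    )


# Invented demo frames. With the real files you would use
# pd.read_parquet("atoms.parquet") and pd.read_parquet("sessions.parquet").
atoms_demo = pd.DataFrame(
    {
        "session_id": ["b", "b", "d"],
        "atom": ["recon_uname", "download_remote", "inject_ssh_key"],  # assumed column name
    }
)
sessions_demo = pd.DataFrame(
    {
        "session_id": ["a", "b", "d"],
        "family": [None, "fam_x", None],  # "fam_x" is a placeholder
        "complexity_score": [0.0, 1.5, 4.2],  # hypothetical scores
    }
)

joined = atoms_with_family(atoms_demo, sessions_demo)
print(joined)
```

A left join keeps every atom row even when its session has no family, which matches the expectation that most sessions are unclassified.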
What was removed
Source IP addresses: Removed. Operational and privacy reasons.
Credentials: Removed permanently. Never included in any public release.
Raw commands: Removed. May contain honeypot filesystem paths or IP addresses. Replaced by normalized atoms.
Source geolocation: Deferred. Enrichment coverage below 5% in v0.1.
Inter-command timing: Deferred. Timing capture not yet consistent across all nodes.
Original session IDs: Replaced with salted SHA-256 hashes. The original Cowrie IDs appear in the live API; hashing prevents cross-reference joins that could reconstruct IPs. Hashed IDs are stable across atlas versions.
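To illustrate the idea of a salted SHA-256 session ID, here is a minimal sketch using Python's standard hashlib. The page does not say how the salt is combined with the ID, whether an HMAC is used, or whether the digest is truncated, so the prefix-salt construction, the function name hash_session_id, and the sample salt and ID are all assumptions rather than NullRoute's actual procedure.

```python
import hashlib


def hash_session_id(original_id: str, salt: bytes) -> str:
    """Return a hex SHA-256 digest of a secret salt followed by the ID."""
    # Assumed construction: salt bytes prepended to the UTF-8 encoded ID.
    return hashlib.sha256(salt + original_id.encode("utf-8")).hexdigest()


# Hypothetical values for illustration only.
salt = b"example-secret-salt"
original = "0123abcd4567"

hashed = hash_session_id(original, salt)
print(hashed)
```

Keeping the same secret salt from release to release is what would make hashed IDs stable across versions, while anyone without the salt cannot recompute the mapping from the IDs exposed elsewhere.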
License
Creative Commons Attribution 4.0 International (CC BY 4.0)

You are free to share and adapt this dataset for any purpose, including commercial, as long as you give appropriate credit. Full terms: creativecommons.org/licenses/by/4.0

Citation

Plain text:

NullRoute Behavioral Atlas v0.1 (2026). Live honeypot behavioral genome dataset. https://nullroute.live/data. Licensed under CC BY 4.0.

BibTeX:

@dataset{nullroute_atlas_2026,
  title   = {NullRoute Behavioral Atlas v0.1},
  author  = {NullRoute},
  year    = {2026},
  url     = {https://nullroute.live/data},
  license = {CC BY 4.0}
}
Disclaimer
Before using this data

This dataset was collected on live honeypot sensors running specific personas (generic Linux, healthcare PACS, AI/ML research, security ops). Behavioral patterns observed here may reflect targeting of those personas specifically and should not be treated as representative of the global attacker population.

Cowrie is a medium-interaction SSH honeypot. Its terminal emulation may cause sophisticated operators to detect the environment and abort, introducing selection bias toward automated tools and less sophisticated attackers. Factor this into any generalization claims.

NullRoute provides this dataset for research and educational use. No warranty is made regarding completeness, accuracy, or fitness for any particular purpose.

Context