Hiding Viruses Inside of Files

The danger with archives boils down to the limitations on doing a virus scan inside an archive (like taking a peek into Pandora’s Box before opening it, to see if it’s safe). It’s more complicated than people think. Enough preamble; let’s get down to brass tacks.

So, first things first: is it possible to do a virus scan inside an archive? Yes, BUT only if the scanner can understand the archive format.
Oh, that’s easy. How many different formats can there possibly be?  So many … So SOOOooooo many. To list a few formats, there are: ZIP, ARJ, TAR, GZIP, CAB, RAR, LHA, 7Z, JAR, ACE, BZ2… and that doesn’t even scratch the surface, so I’m just going to stop there. I think you get the idea. If you’re curious, check out

Each format may as well be a totally different language (comparing ZIP and TAR is like comparing English and Japanese). Oh! And don’t forget: there are also different versions of these formats. Archive formats get updated, improved and changed over time. This means that not all ZIP files are created equal (for example, the French spoken in Quebec is not the same as the French spoken in France).

What if the archive format is unreadable? That’s easy: a virus scan cannot be performed. This leads to the question: “What does the virus scanner do with files it fails to scan?” Automatically block them, automatically allow them, or make the behaviour configurable? It’s a hard choice that I won’t go into.

So, the first requirement is that the virus scanner can understand the archive. As you can hopefully see, that is not as easy as it sounds.
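
To make that first requirement a bit more concrete, here is a minimal Python sketch of how a scanner might recognise a format from the first few bytes of the file rather than from its name. The function name and the short signature table are my own illustration (not taken from any particular product), and a real scanner would know far more formats and versions than this.

```python
from typing import Optional

# Leading "magic" bytes for a handful of common archive formats.
# This table is illustrative and far from complete.
MAGIC_SIGNATURES = {
    b"PK\x03\x04": "zip",            # JAR files are ZIPs underneath
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"Rar!\x1a\x07": "rar",
    b"7z\xbc\xaf\x27\x1c": "7z",
    b"MSCF": "cab",
}


def sniff_archive_format(data: bytes) -> Optional[str]:
    """Guess the archive format from the file content, ignoring the file name."""
    for signature, name in MAGIC_SIGNATURES.items():
        if data.startswith(signature):
            return name
    # TAR has no leading signature; the "ustar" marker sits at offset 257.
    if data[257:262] == b"ustar":
        return "tar"
    return None  # unknown format: the scanner cannot look inside
```

Because the check looks at content, a ZIP that has been renamed to look like a TAR is still reported as a ZIP. A result of None is exactly the "cannot be scanned" situation described above.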

Assuming the archive can be understood, what is the second requirement?
The virus scan needs to finish in a reasonable amount of time.

People often forget about this, but it’s true. We’re talking about attachments to email and people NEED their email to be delivered quickly (if it’s not delivered quickly, some people can get very upset). While the attachment is being scanned, the email is being held up in limbo. These scans aren’t like virus scanning on your computer where you can just pause it if things get slow.

What if the virus scan takes “too long”? The scan needs to be terminated, and you have another situation where a file cannot be virus-scanned.

So, the second requirement is speed. You can’t take forever to do your scanning.
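
As a rough illustration of a time budget, here is a small Python sketch that scans a file piece by piece and gives up once a deadline has passed. Everything in it is hypothetical: the function names, the result strings, and the idea that the scanner works on chunks are assumptions made for the example, and the "detector" passed in is a stand-in for a real engine.

```python
import time
from typing import Callable, Iterable


def scan_with_deadline(
    chunks: Iterable[bytes],
    looks_malicious: Callable[[bytes], bool],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Scan chunk by chunk; stop as soon as the time budget is used up."""
    deadline = clock() + budget_seconds
    for chunk in chunks:
        if clock() > deadline:
            return "unscannable"  # out of time: treated like an unreadable file
        if looks_malicious(chunk):
            return "infected"
    return "clean"
```

The clock is a parameter only so the behaviour can be tested without actually waiting. Note that this is a cooperative check between chunks; a scanner that can get stuck inside a single step would need something stronger, such as running the scan in a separate process that can be killed.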

What about the third requirement? Well, the uncompressed file needs to fit in memory.

You need to think a bit like a bad guy here, but there are ways of setting up your archives so that they become incredibly large when decompressed. My favourite example of this is a time when an associate of mine created a small archive of about 12 KB in size that expanded to over 2 GB in memory. To be fair, it took some creative thinking to do this, but it was still a perfectly valid archive that anyone could create. If you are someone designing a virus scanner, you have to remember that you don’t have an infinite amount of space to work with.

What if it’s too big to fit in memory? I think you know where this is going: no virus scan.
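
Here is a minimal sketch of one way a scanner could guard against this for ZIP files, using Python’s standard zipfile module: add up the uncompressed sizes the archive declares and refuse to unpack anything over a limit. The function names and the limit are hypothetical, and this is only a first line of defence, since the declared sizes are written by whoever made the archive and can be wrong. A careful scanner would presumably also count bytes while decompressing.

```python
import io
import zipfile


def declared_uncompressed_size(archive_bytes: bytes) -> int:
    """Sum the uncompressed sizes recorded in a ZIP's central directory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return sum(info.file_size for info in zf.infolist())


def fits_in_budget(archive_bytes: bytes, max_bytes: int) -> bool:
    """True if the archive claims to expand to no more than max_bytes."""
    return declared_uncompressed_size(archive_bytes) <= max_bytes


def make_zeros_archive(size_bytes: int) -> bytes:
    """Build a ZIP holding one file of zero bytes, which compresses very well."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("zeros.bin", b"\0" * size_bytes)
    return buf.getvalue()
```

Calling make_zeros_archive with 10 MB of zeros gives an archive that is a tiny fraction of that size on disk, which is a tame version of the trick described above.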

And the fourth requirement? The file cannot be password-protected.

As soon as you put a password on any archive, you can’t open it without using the password. Even if the archive format could be understood, this means a virus scanner cannot scan it. Supplying the password to decompress an archive is a manual step that needs to be done by a person. There’s no way of automatically searching the email for the password, because who knows if it will even be there. Sometimes passwords get sent in separate emails, over the phone, etc.
This is a method I’ve seen used fairly often to bypass virus scanners. I’m sure many people have seen emails in the past from Purolator or FedEx or UPS about a parcel, with instructions to open an attached ZIP file using the password provided.
That scam was fairly popular a few years ago and was used as a way of delivering a malware payload through virus scanners.
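
A scanner can’t open a password-protected archive, but it can at least notice that one is protected. For ZIP files, each entry carries a flags field, and the lowest bit marks the entry as encrypted. Here is a minimal sketch using Python’s standard zipfile module; the function name is my own, and other archive formats mark encryption differently.

```python
import io
import zipfile

ENCRYPTED_FLAG = 0x1  # bit 0 of a ZIP entry's general purpose flags


def has_encrypted_entries(archive_bytes: bytes) -> bool:
    """True if any entry in the ZIP is marked as password-protected."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return any(info.flag_bits & ENCRYPTED_FLAG for info in zf.infolist())
```

Knowing that an attachment is encrypted doesn’t make it scannable; it only lets the scanner report the reason it failed, so a rule like “block password-protected archives” can be applied.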

To be fair, I am simplifying things a lot and there are more options and requirements that we could go into, but I think this is enough to get the point across.

So what does this mean for a virus scanner that is inspecting email and scanning attachments?

It means that a very serious design choice exists for the people making the scanner and there are really only 2 options:

Option 1: Make the configuration more complicated.
This means setting up options for what to do when the scanner runs into various situations. This leads to a lot of control on the part of the email administrator, but also means they need to have way more knowledge about what they are doing.

This can also lead to more headaches for the email admins, as users encounter behaviour that appears to be erratic: “Why did attachment A get blocked, but attachment B did not? Attachment B was smaller!”.

Option 2: Simplify the configuration/behaviour.
Basically this means that to avoid running into problems with your scanner (and your users), you don’t even try to scan archives.
This leaves an obvious security hole that requires blanket rules like “No archived attachments” in order to fix.
To be perfectly honest: Even if the scanner is capable of scanning archives, some email admins will use blanket rules anyway, because users can’t be made to understand the limitations.
While users still find these blanket policies annoying, the behaviour is easier to explain because it looks consistent from the outside.
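
To picture what Option 1 looks like in practice, here is a small Python sketch of a policy table that maps each scan outcome to an action. The outcome names, the actions, and the default choices are all hypothetical and only meant to show the shape of the configuration an email admin would have to reason about.

```python
from enum import Enum


class Outcome(Enum):
    CLEAN = "clean"
    INFECTED = "infected"
    UNKNOWN_FORMAT = "unknown_format"
    TIMED_OUT = "timed_out"
    TOO_LARGE = "too_large"
    ENCRYPTED = "encrypted"


class Action(Enum):
    DELIVER = "deliver"
    QUARANTINE = "quarantine"
    BLOCK = "block"


# One possible configuration; every row is a decision the admin must own.
EXAMPLE_POLICY = {
    Outcome.CLEAN: Action.DELIVER,
    Outcome.INFECTED: Action.BLOCK,
    Outcome.UNKNOWN_FORMAT: Action.QUARANTINE,
    Outcome.TIMED_OUT: Action.QUARANTINE,
    Outcome.TOO_LARGE: Action.BLOCK,
    Outcome.ENCRYPTED: Action.BLOCK,
}


def decide(outcome: Outcome, policy: dict = EXAMPLE_POLICY) -> Action:
    """Look up the action for an outcome; anything unlisted is blocked."""
    return policy.get(outcome, Action.BLOCK)
```

With a table like this, a small attachment that times out and a larger one that scans cleanly get different treatment, which is the kind of behaviour that looks erratic to users. Option 2 is the same table with every archive mapped to a single action.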

So, there you go! Today you’ve learned a little more about one small aspect of computers and the real-world security limitations that exist. So don’t open any archives that come from places you don’t completely trust, and never open password-protected archives.

Oh yeah, I didn’t even go into some interesting tricks and situations, like putting archives within archives within archives within archives (etc.), or renaming archives (so a ZIP looks like a TAR), or…….
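
Just to give a taste of the archives-within-archives problem, here is a minimal Python sketch that checks whether ZIP files are nested more deeply than a chosen limit. The function name and the limit are hypothetical, it only understands ZIP, and it reads each member fully into memory, so in practice it would have to be combined with size and time limits.

```python
import io
import zipfile


def exceeds_nesting_limit(archive_bytes: bytes, max_depth: int, depth: int = 1) -> bool:
    """True if ZIPs are nested more than max_depth levels deep."""
    if depth > max_depth:
        return True
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            member = zf.read(info)
            # Check the content, not the name, to spot an inner ZIP.
            if zipfile.is_zipfile(io.BytesIO(member)):
                if exceeds_nesting_limit(member, max_depth, depth + 1):
                    return True
    return False
```

An archive that goes over the limit ends up in the same place as everything else in this post: it can’t be scanned, and somebody has to decide what to do with it.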

Karl Buckley
Cyber Security Supervisor