It’s a mysterious thing, antivirus. I mean, how do you know if it’s even working? What’s it doing, anyway?
How Antivirus Programs Detect Malware
Antivirus programs detect more than just viruses. A virus is only one type of malware. Malware is a broader term for any devious program that you don’t want on your computer (including spyware, ransomware, rootkits, keyloggers, etc.).
There are basically two ways to recognize malware: how it looks and how it behaves.
Detection Method 1 – How it looks
This is the primary way that antivirus has historically worked. The signature database (sometimes called the “definitions” file) is just a database full of “fingerprints” of known malware. When antivirus companies discover new malware, they update the signature database on your computer so it remains current with the latest threats.
Now, every time a program runs on your computer or is downloaded from the internet, your antivirus will scan it to get the fingerprint. Then it checks the fingerprint against its database. If it finds a match, then it’s flagged as malware. It’s like identifying a criminal by the fingerprints left at the scene of a crime. The fingerprints are lifted then run through the police database of known criminals.
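To make the fingerprint idea concrete, here is a minimal sketch in Python. It assumes the fingerprint is simply a SHA-256 hash of the file's bytes and that the database is a small dictionary. The `KNOWN_MALWARE` table, the malware name, and the `scan` function are made up for illustration; real antivirus products use much larger databases and more flexible signature formats.

```python
import hashlib
from typing import Optional

# Hypothetical signature database: SHA-256 fingerprint -> malware name.
# The single entry is fabricated from a harmless byte string for the demo.
KNOWN_MALWARE = {
    hashlib.sha256(b"pretend this is malicious code").hexdigest(): "Example.Trojan.A",
}


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def scan(data: bytes) -> Optional[str]:
    """Return the malware name if the fingerprint is in the database, else None."""
    return KNOWN_MALWARE.get(fingerprint(data))


if __name__ == "__main__":
    print(scan(b"pretend this is malicious code"))
    print(scan(b"some program nobody has catalogued"))
```

The first call finds a match and returns the name from the database. The second returns `None`, which only means the program is not in the database, not that it is safe.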
However, you might be able to see a problem here. What about new malware that hasn’t been discovered yet? These are called zero-day threats: malware that hasn’t been seen before. Just because a program isn’t in the database doesn’t mean it isn’t malware, just as a first-time criminal’s fingerprints are not in the police database. This is where heuristics come in.
Detection Method 2 – How it behaves
Heuristics is a more modern, advanced, and far more difficult method of detecting malware, but arguably even more important than the database method.
Malware typically performs certain types of functions, which means it has common behavioral patterns. Heuristics is the practice of trying to detect those patterns. In theory, this is supposed to protect against zero-day threats because you don’t need to know what malware looks like if its behavior is giving it away.
Some common behaviors that can betray computer code as malware:
- Deleting or modifying main system files
- Replicating itself (how worms operate)
- Containing instructions to delete itself (to hide evidence)
- Recording your keystrokes (how keyloggers operate)
- Suppressing normal computer functions (like preventing a program from opening)
- Looking like a modified version of known malware (called a genetic signature)
- Attaching itself to another program (how file-infecting viruses operate)
- Containing suspicious text that is supposed to be displayed to the user (like if it contains the words “you’ve been infected”)
- Communicating with a known malicious server on the internet
- Attempting to hide itself from the system (how rootkits operate)
- Encrypting your files (how ransomware operates)
It might sound straightforward to find malware based on these criteria, but performing successful heuristics is actually very difficult. Many legitimate programs need to perform some of these same functions. So how can we tell the difference between the good and the bad? If the detection is too aggressive, it will start flagging normal programs as malware (called false positives). If it’s too passive, it will start missing threats.
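The trade-off between aggressive and passive detection can be sketched as a simple scoring scheme. This is only an illustration: the behavior names, the weights, and the thresholds below are all invented, and real heuristic engines are far more elaborate.

```python
# Hypothetical weights: how suspicious each observed behavior is considered.
BEHAVIOR_WEIGHTS = {
    "modifies_system_files": 4,
    "replicates_itself": 5,
    "records_keystrokes": 4,
    "hides_from_system": 5,
    "encrypts_user_files": 3,
    "deletes_itself": 2,
}


def suspicion_score(behaviors):
    """Add up the weights of every behavior the program was seen performing."""
    return sum(BEHAVIOR_WEIGHTS.get(b, 0) for b in behaviors)


def is_flagged(behaviors, threshold):
    """Flag the program when its score reaches the chosen threshold."""
    return suspicion_score(behaviors) >= threshold


# Made-up programs described by the behaviors they exhibit.
backup_tool = {"encrypts_user_files"}  # legitimate, score 3
keylogger = {"records_keystrokes"}     # malicious, score 4
ransomware = {"encrypts_user_files", "deletes_itself", "modifies_system_files"}  # score 9

AGGRESSIVE = 3  # low threshold: catches more, but flags the backup tool
PASSIVE = 8     # high threshold: spares the backup tool, but misses the keylogger
```

With the aggressive threshold, the legitimate backup tool is flagged (a false positive). With the passive threshold, the backup tool is left alone, but the keylogger slips through.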
To make matters worse, simply detecting these behaviors can be difficult. Malware can be written in many different programming languages, and within each language there are many different ways to perform any given function. So trying to decipher exactly what a program does based on its programming is a very complicated task.
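Here is a small sketch of why reading the code is not enough. It uses Python's standard `ast` module to look for calls that delete files. The `DELETE_CALLS` list and the `calls_delete` function are hypothetical and deliberately naive; the two sample snippets are only parsed, never executed.

```python
import ast

# Hypothetical, incomplete list of function names that delete files.
DELETE_CALLS = {"remove", "unlink", "rmtree"}


def calls_delete(source: str) -> bool:
    """Return True if the source contains a call whose name is in DELETE_CALLS."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Attribute):
                name = func.attr                    # e.g. os.remove -> "remove"
            else:
                name = getattr(func, "id", None)    # e.g. remove(...) -> "remove"
            if name in DELETE_CALLS:
                return True
    return False


# Two snippets that would do the same thing if they were run.
obvious = "import os\nos.remove('victim.txt')"
disguised = "import os\ngetattr(os, 're' + 'move')('victim.txt')"
```

The scanner catches the obvious version but not the disguised one, even though both would delete the same file. The function name in the second snippet is only assembled when the program runs, so it never appears in the code as written.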
But instead of just examining the code, it’s also possible for the heuristics scanner to execute the code and observe what it does. It can do this safely in what’s called a “sandbox”. A sandbox is a closed-off environment within the antivirus program that keeps the suspicious code from getting out into the rest of your computer.
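The idea of watching code run can be modeled with a toy example. This is not real isolation: an actual sandbox relies on emulation or virtualization, while this sketch simply hands the program a fake file system that records requests instead of carrying them out. The class, the function names, and the file paths are all made up for illustration.

```python
class FakeFileSystem:
    """Stand-in for the real file system: it logs requests instead of acting on them."""

    def __init__(self):
        self.log = []

    def write(self, path, data):
        self.log.append(("write", path))

    def delete(self, path):
        self.log.append(("delete", path))


def observe(program):
    """Run the program against the fake file system and return what it tried to do."""
    fs = FakeFileSystem()
    program(fs)
    return fs.log


def touches_system_files(log):
    """Very rough check: did the program try to touch anything under /system/?"""
    return any(path.startswith("/system/") for _, path in log)


def suspicious_sample(fs):
    """Hypothetical program under test."""
    fs.write("/tmp/notes.txt", b"hello")
    fs.delete("/system/kernel.sys")
```

Calling `observe(suspicious_sample)` returns the list of actions the program attempted, and `touches_system_files` can then judge that list without any real file having been touched.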
And to complicate the process further, more advanced malware may behave differently than normal when put under the microscope. This also needs to be recognized and compensated for.
Then add even more complicating factors, like trojans (malware hidden within legitimate programs), self-encrypting malware, self-altering malware, etc., and you can see why this is so difficult to do.
In the end, there is no foolproof way of detecting malware, which is becoming more devious and sophisticated every year. It takes a lot of smarts to do what antivirus engineers do, and I don’t envy their jobs. Someday there may be a better answer to our malware problem, but for now the antivirus companies will have to do.