Among cyber-attack techniques, what is a DGA?
Among all the many types of cyber attacks out there, you might be wondering: What is a DGA?
A domain generation algorithm (DGA) is a technique cyber attackers use to generate a steady stream of new domain names for their malware’s command and control servers. Because the names look random and change constantly, it is very hard for threat hunters to detect and contain the attack by blocking any single domain or IP address.
Think of DGAs this way: If someone throws you a single ball, it’s easy to catch it. But if someone throws 100 balls your way, what are the chances you’re going to catch all of them?
This post first explains what a DGA is and how it works. Next, it explores some of the security advances that counter DGAs. Finally, it touches on how BlueCat Edge can bolster your network against DGAs and other threats.
What is a DGA and how did it come to be?
Once cyber attackers send their malware out to do its dirty work, they need to both keep track of what it’s up to and send it instructions. Command and control (C&C) servers issue orders to the malware-infected systems, telling them what to do.
What, precisely, infected machines or botnets (a gaggle of infected machines) are instructed to do depends on the purpose of the malware. It may be to launch a denial-of-service attack, install other malware such as keyloggers, encrypt the hard drive, or extract sensitive data.
Bad actors compromise hosts and then have them connect to a command and control server at a specific domain. If defenders discover the domain name and IP address of the C&C server, they can block it or have it shut down.
However, as a counter tactic to evade detection, attackers use DGAs to periodically generate a large number of candidate domain names for the C&C server. Strictly speaking, a DGA produces names rather than IP addresses; the attacker registers some of those names and points them at whatever IP address the server currently uses. This makes it difficult to block or shut down a C&C server for long. Since the server’s operator and the malware both know the rules for generating those names, they can continue to communicate.
How a DGA works
DGAs generate domains over time that are used as rendezvous points where the infected hosts and the C&C server connect to keep the scheme going. At predetermined intervals, the DGA generates new names for its C&C server using one of several techniques.
It may create what looks like a random string of letters or digits and tack on a top-level domain suffix (e.g., .com or .org). The string is not truly random. Like a pseudo-random number generator, the DGA starts from a seed value (often the current date) and produces a sequence that only appears random. Anyone who knows the algorithm and the seed, including the attacker, can compute the same sequence.
Or, the DGA might build a mashup of words or construct hexadecimal strings. It doesn’t matter—as long as the characters used are acceptable as part of a domain name, it will work.
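To make the seed idea concrete, here is a minimal, purely illustrative sketch. It is not the algorithm of any real malware family: the function name generate_domains, the seed string, the use of SHA-256, and the reserved .example suffix are all assumptions chosen for the illustration. It only shows why two parties that share a seed and a date arrive at the same list of names without ever exchanging it.

```python
import hashlib
from datetime import date
from typing import List


def generate_domains(seed: str, day: date, count: int = 5,
                     tld: str = ".example") -> List[str]:
    """Derive `count` random-looking names from a shared seed and a date.

    Illustrative only: real DGAs differ in hash, alphabet, and length.
    """
    domains = []
    for index in range(count):
        # Same seed + same day + same index always hashes to the same digest.
        material = f"{seed}|{day.isoformat()}|{index}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits onto the letters a-p to get a valid label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    day = date(2024, 1, 1)
    operator_view = generate_domains("shared-secret", day)
    infected_host_view = generate_domains("shared-secret", day)
    # Both sides derive the same list independently.
    print(operator_view == infected_host_view)  # True
    # A different day yields a different list, so yesterday's blocklist goes stale.
    print(generate_domains("shared-secret", date(2024, 1, 2)))
```

The same property helps defenders: once researchers recover the algorithm and seed, they can compute upcoming domains ahead of time.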
Each piece of DGA malware will have distinct patterns in the domains it generates.
For example, simda alternates between vowels and consonants in its domains (e.g., puvecyq[.]info, digivehusyd[.]eu), while its top-level domain can vary. Meanwhile, suppobox joins two or more words from a word list (e.g., sharmainewestbrook[.]net, tablethirteen[.]net). Finally, tinynuke uses MD5 hashes, written as 32 hexadecimal characters, as its domain labels (e.g., 8c28b41611c50aa0494df096e4d0444b[.]com).
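The three patterns above can be expressed as simple checks. The sketch below is a rough illustration, not a production detector and not how any vendor classifies these families. The function names are hypothetical, the tiny word list is made up from the examples in this post, and treating "y" as a vowel is an assumption inferred from those examples.

```python
import re
from typing import Set

VOWELS = set("aeiouy")  # assumption: "y" counts as a vowel, as in puvecyq
HEX32 = re.compile(r"[0-9a-f]{32}")


def first_label(domain: str) -> str:
    """Return the leftmost label, accepting defanged names like foo[.]com."""
    return domain.replace("[.]", ".").split(".")[0].lower()


def alternates_vowels_consonants(label: str) -> bool:
    """True if every adjacent pair of letters switches vowel/consonant."""
    if len(label) < 2 or not label.isalpha():
        return False
    kinds = [ch in VOWELS for ch in label]
    return all(a != b for a, b in zip(kinds, kinds[1:]))


def looks_like_md5(label: str) -> bool:
    """True if the label is exactly 32 lowercase hexadecimal characters."""
    return HEX32.fullmatch(label) is not None


def is_word_mashup(label: str, words: Set[str]) -> bool:
    """True if the label is exactly two or more words from `words` joined."""
    # best[i] = most words that exactly cover label[:i], or -1 if impossible.
    best = [-1] * (len(label) + 1)
    best[0] = 0
    for end in range(1, len(label) + 1):
        for start in range(end):
            if best[start] >= 0 and label[start:end] in words:
                best[end] = max(best[end], best[start] + 1)
    return best[-1] >= 2


if __name__ == "__main__":
    sample_words = {"sharmaine", "westbrook", "table", "thirteen"}
    for name in ["puvecyq[.]info", "tablethirteen[.]net",
                 "8c28b41611c50aa0494df096e4d0444b[.]com"]:
        label = first_label(name)
        print(name, alternates_vowels_consonants(label),
              is_word_mashup(label, sample_words), looks_like_md5(label))
```

Checks like these are brittle on their own: plenty of legitimate names alternate vowels and consonants or join two dictionary words, which is one reason statistical and machine learning approaches are used instead.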
Usually, each run produces hundreds or thousands of domain names. Attackers only need to register one of them (registration is usually automated) to give the malware a fresh C&C DNS entry.
The domains are generated systematically and follow patterns that the malware or botnet understands. Bad actors can also configure the DGA to move to a new set of domains at whatever frequency is useful to them: every day, hour, or even minute.
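The rendezvous logic described above can be simulated without touching the network. In this sketch the candidate names, the registered set, and the function first_resolving are all hypothetical; the resolver is a stand-in callable rather than a real DNS lookup. It shows that registering one name out of many is enough, and that every candidate tried before it produces a failed lookup.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_resolving(candidates: Iterable[str],
                    resolves: Callable[[str], bool]) -> Tuple[Optional[str], int]:
    """Return the first candidate that resolves and the count of failed lookups."""
    failures = 0
    for domain in candidates:
        if resolves(domain):
            return domain, failures
        failures += 1
    return None, failures


if __name__ == "__main__":
    # Stand-in for one run of a DGA: 1,000 made-up candidate names.
    candidates = [f"name{i}.example" for i in range(1000)]
    # The attacker registered exactly one of them.
    registered = {"name742.example"}
    domain, failed = first_resolving(candidates, lambda d: d in registered)
    print(domain, failed)  # name742.example 742
```

Those failed lookups are a side effect defenders can look for in DNS logs.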
Prevalence among malware families and IoT devices
According to Netlab 360, at least 49 malware families use DGA domains. This includes the venerable Conficker, which is probably not the first but certainly among the most persistent. It is still wreaking havoc on vulnerable devices more than a decade after its release.
Internet of Things (IoT) devices are particularly vulnerable to attacks that use DGA techniques. Palo Alto Networks’ 2020 Unit 42 IoT Threat Report estimated infections in over half a million IoT devices alone, many of them infrequently patched medical devices.
A real challenge for threat hunters
When a DGA fuels malware attacks, the C&C server’s IP address and domain name can quickly switch. This presents a real challenge for defenders. Even if defenders figure out the C&C server’s current address, blocking it will only be effective for a short time.
For more on how DGAs work, BlueCat Chief Strategy Officer Andrew Wertkin whiteboarded how DGAs keep threat hunters guessing.
Security advances to address DGAs
By analyzing DNS logs, security teams can try to determine patterns in the rubbish DNS entries left behind by DGA-fuelled malware. However, manually combing through logs to find the flags is tedious and time-consuming. And once the entries are found, discerning their patterns is not a trivial task.
Data scientists, linguists, and cyber threat researchers, including experts from BlueCat, are tackling the DGA problem in a more systematic fashion. They perform statistical analysis and use machine learning and artificial intelligence to separate the good from the bad.
It works. Using two techniques from the neural network and machine learning worlds (LSTM and ELMo, if you’re curious), along with a dash of natural language processing research, BlueCat dramatically upped its DGA detection from 21.2% to 95.8%.
However, there’s a lot more to do. As quickly as researchers come up with DGA detection methods, adversaries develop new twists on the technology.
How BlueCat can help with DGAs
One approach to detecting and protecting against DGAs lies in monitoring your DNS data. BlueCat DNS Edge monitors all DNS queries, responses, and IP addresses on your network.
Yes, there is a lot of data to comb through. However, there are always hints you can leverage to find that malicious domain needle in the haystack of DNS data.
For example, if you see a client making a high volume of queries for non-existent domains (NXDOMAIN responses), you could be dealing with a DGA, because most of the names a DGA generates are never registered. Furthermore, malicious domains are often structured differently from legitimate ones. They tend to contain long, seemingly random strings of characters, whereas legitimate domains are usually shorter and easier to remember.
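Both hints can be turned into rough numbers. The sketch below is a minimal illustration under stated assumptions, not how BlueCat DNS Edge works: the (client, response code) log format and the function names are hypothetical, and character entropy is just one common stand-in for "looks random". It would miss word-based DGAs such as suppobox, so treat any cutoff you pick as something to tune against your own data.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple


def shannon_entropy(label: str) -> float:
    """Bits of entropy per character; higher means more random-looking."""
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def nxdomain_ratio(responses: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of each client's queries that came back NXDOMAIN.

    `responses` is an assumed log format: (client_ip, response_code) pairs.
    """
    totals: Counter = Counter()
    misses: Counter = Counter()
    for client, rcode in responses:
        totals[client] += 1
        if rcode == "NXDOMAIN":
            misses[client] += 1
    return {client: misses[client] / totals[client] for client in totals}


if __name__ == "__main__":
    print(shannon_entropy("bluecat"))
    print(shannon_entropy("8c28b41611c50aa0494df096e4d0444b"))
    log = [("10.0.0.5", "NXDOMAIN"), ("10.0.0.5", "NXDOMAIN"),
           ("10.0.0.5", "NOERROR"), ("10.0.0.9", "NOERROR")]
    print(nxdomain_ratio(log))
```

A client whose ratio stays far above the rest of the network, or whose queried labels are consistently high in entropy, is worth a closer look rather than an automatic block.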
Patterns eventually begin to arise. They might appear in the domains themselves or in the destinations the domains are trying to connect with.
With BlueCat DNS Edge, you can set policy-based rules to block, allow, or manually watch certain domains. And you can seamlessly integrate security intelligence from BlueCat’s own blocklists, CrowdStrike, and other third-party threat feeds to block new threats as they emerge. This screenshot illustrates how BlueCat’s policy-based rules block DNS queries from known DGA malware:
For more details, here’s a demo of how BlueCat Edge can further help network teams to detect, investigate, and remediate threats: