The Code Red Worm

link to published version in the Communications of the ACM, November, 2001

Hal Berghel

The concept of combining the "new Dew flavor of the summer" with "worms" seems to suggest a non-alcoholic variation of mescal rather than a major Internet security breach. However, this past August, "Code Red" took on an ominous significance. In this column, I'll discuss how this latest incarnation of "malware" drilled itself into cyberspace.

THE ALERT

The following FBI alert provides a useful measure of the potential threat of the Code Red Worm to the Internet Community:

"For Immediate Release: 3:00 PM (EDT) July 29, 2001"

"A very real and present threat to the Internet: July 31 deadline for action. SUMMARY: The Code Red Worm and mutations of the worm pose a continued and serious threat to internet users. Immediate Action is required to combat this threat. Users who have deployed software that is vulnerable to the worm (Microsoft IIS versions 4.0 and 5.0) must install ... a vital security patch."

"How big is the problem? On July 19, the Code Red Worm infected more than 250,000 systems in just 9 hours. The worm scans the Internet, identifies vulnerable systems, and infects these systems by installing itself. Each newly installed worm joins all the others causing the rate of scanning to grow rapidly. This uncontrolled growth in scanning directly decreases the speed of the Internet and can cause sporadic but widespread outages among all types of systems. Code Red is likely to start spreading again on July 31st, 2001 8:00 pm EDT and has mutated so that it may be even more dangerous. This spread has the potential to disrupt business and personal use of the Internet for applications such as electronic commerce, email and entertainment."

This alert was produced jointly by Microsoft together with the FBI National Infrastructure Protection Center, the Information Technology Association of America, the CERT Coordination Center, the SANS Institute, Internet Security Systems and the Internet Security Alliance - a "Who's Who" of the major agencies and organizations concerned with Internet security. But what is Code Red, and how did this happen?

A WORM BY ANY OTHER NAME

Code Red began as just another piece of malicious software ("malware" in modern techno-jargon). The two most common forms of malware are viruses and worms, and combinations thereof.

Computer viruses attach themselves to otherwise "healthy" host programs from which they launch their attack on, and infection of, computer systems. Viruses with nicknames like Jerusalem, Christmas, Michelangelo, Chernobyl, and sundry mutations thereof, have been widespread since the 1980s, initially spreading by disk sharing, thereafter by means of digital networks. Modern viruses frequently appear as executable content embedded in distributed data files (e.g., email attachments, spreadsheet and word processing macros). As of September 3, 2001, Symantec's Norton AntiVirus software checks for 52,911 known viruses.

Worms, on the other hand, run as autonomous, standalone programs. Worms achieve their malevolence without need of unsuspecting host programs for either infection or propagation. Passive worms propagate with data transmissions, such as email, as in the case of the VBS/AnnaKournikova spamming worm that used Visual Basic to exploit a hole in Microsoft Outlook to replicate itself to everyone in the host computer's email address book. Melissa and the love letter worms were passive.

Active worms, on the other hand, exploit security weaknesses in networking and operating systems software to aggressively gain entry into computer systems. The association of the terms "tapeworm" and "worm" with digital networks is derived from John Brunner's 1975 science fiction novel, The Shockwave Rider, while the initial discussion of the early experiences with worm programs was provided by John Shoch and Jon Hupp in 1982 (CACM, 25:3, March, 1982).

At this point, worm technology is so widespread that a lexicon could be (and perhaps has been) developed to describe each variety. Code Red is an active worm, as were Robert Morris's 1988 Cornell Internet Worm and the Linux Ramen worm. As of March, 2001, CNET reported that worms accounted for 80% of the invasive malware on the Internet (news.cnet.com/news/0-1003-201-5125673-0.html).

Not surprisingly, modern malware has become hybridized. Melissa, for example, is not only a virus and a worm, but also a Trojan Horse! In addition, worms, unlike viruses, typically reside in primary memory rather than on disk, and as such escape detection by most virus scanners.

Code Red also threw in a few unique twists.

(1) It propagated through TCP/IP Web port 80
(2) It identified itself by defacing English language Websites with "Welcome to http://www.worm.com! - Hacked by Chinese!"
(3) Self-propagation was controlled by means of a "random" IP address generator - that had a bug in it (more later)
(4) After the initial infection and incubation periods, Code Red was programmed to unleash a denial-of-service attack on the Whitehouse.gov Website by targeting the actual Whitehouse.gov IP address.

As it turned out, (3) and (4) were of far-reaching significance.

THE DISCOVERY OF CODE RED

The first indication that a new worm had been unleashed on the Internet came on Thursday, July 13, 2001, when Ken Eichman, senior security engineer for the Chemical Abstracts Service, noticed 611 attacks on the CAS Web servers from 27 different computers. By the following Saturday, Eichman noticed that the number of attack sources exceeded 1,000. By Sunday, the presence of a worm was confirmed by Dshield.org, and by Monday, July 16, eEye Digital Security programmers began reverse-engineering the malicious code.

With caffeine-induced fury from the new Mountain Dew soft drink, Code Red (hence the honorary nom de plume), the eEye team led by Marc Maiffret, eEye's CHO (Chief Hacking Officer), determined by Tuesday, July 17 (one week after Code Red was first deployed from a university in China) that the new worm exploited a security hole in Microsoft's Internet Information Services, which is deployed on millions of Microsoft Windows NT, Windows 2000 and beta Windows XP servers. Technically, the security hole is called an "index-server flaw." The flaw is that versions 4 and 5 of Microsoft's Internet Information Services, or IIS for short, use an indexing tool called an ISAPI filter that assigns data files to executable program environments automatically. However, this tool did not check for buffer overflow, and the undetected overflow was the access point for the Code Red worm.

By Wednesday, July 18, the eEye team determined that Code Red was designed to terminate propagation and launch a denial-of-service attack on the Whitehouse.gov server at midnight G.M.T., July 19. Denial-of-service attacks overwhelm Internet servers with so much useless data that they are unable to function properly. The eEye team also discovered that the d-o-s attack targeted the Whitehouse site by IP address rather than by URL. This was a pivotal moment in the discovery process, for the definitive repair involved nothing more than relocating the Whitehouse.gov server to another IP address. As predicted, the d-o-s attack began on time, but without significant effect on the Whitehouse Web server. According to CNET, by the time of the d-o-s launch, each of the more than 359,000 infected computers was set to unload 400MB of useless data on the Whitehouse server every 4.5 hours.
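To put the CNET figures in perspective, a back-of-the-envelope calculation can be sketched as follows. It assumes, purely for illustration, that every infected machine delivers its full 400MB within each 4.5-hour cycle and that 1MB means one million bytes. Neither assumption comes from the CNET report, and the function name is hypothetical.

```python
def aggregate_flood_rate(hosts, mb_per_host, cycle_hours):
    """Return (terabytes sent per cycle, average gigabits per second).

    Assumes every host sends its full quota each cycle and 1 MB = 10**6 bytes.
    """
    total_mb = hosts * mb_per_host          # megabytes sent by all hosts per cycle
    seconds = cycle_hours * 3600            # length of one cycle in seconds
    terabytes = total_mb / 1_000_000        # MB -> TB
    gbit_per_s = total_mb * 8 / 1000 / seconds  # MB -> megabits -> gigabits, per second
    return terabytes, gbit_per_s


tb, gbps = aggregate_flood_rate(359_000, 400, 4.5)
print(f"{tb:.1f} TB per cycle, about {gbps:.1f} Gbit/s on average")
```

Under these assumptions the total comes to roughly 143.6 terabytes per cycle, or about 70.9 gigabits per second on average, which suggests why moving the target to a different IP address mattered so much.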

The end? Not quite. A brief review of the FBI alert above will indicate that it is dated ten days later. By then, a new-and-improved Code Red had appeared, some have argued because of the Code Red security advisory posted by eEye. To re-visit this issue is counter-productive, but the reincarnation of an improved version of Code Red scheduled for a midnight G.M.T. July 31 deployment is beyond dispute. This new-and-improved version was the subject of the Code Red FBI alert above. Fortunately, by the time that the new Code Red triggered, the Windows patches were widely enough deployed to lessen the damage to specific targets. However, the spread of the second version was considerably wider than the original (cf. www.digitalisland.net/codered/). Figure 1 portrays the spread of the second Code Red worm through its first week of life. This second version eschewed defacing Web pages and put in a fix for its faulty random IP address generator.

Figure 1. The Spread of Code Red Version 2 -- August 1-8, 2001
(source: The Digital Island www.digitalisland.net/codered)

One last point. I mentioned that (3) and (4) in the previous section had far-reaching significance. I haven't yet touched on the significance of (3). A very interesting byproduct of the bug in the original Code Red was that instead of creating random paths to each infected server, Code Red infected each new server via the same path as its predecessor, thereby leaving a log of infected servers behind with each new infection. This was a failure of epidemic proportions in hiding one's tracks (should the perpetrator be identified, perhaps we should give him/her an honorary "F" in Software Design). In any event, bugs of this nature are the holy grail of Internet Forensics.
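One common way for a "random" generator to retrace the same path is a fixed seed. The column does not say that this was the precise cause, so the sketch below simply assumes it for illustration. The seed value, function name and address format are hypothetical, and the code only generates strings; it contacts nothing.

```python
import random


def scan_targets(seed, count):
    """Generate `count` pseudo-random dotted-quad addresses from a seed."""
    rng = random.Random(seed)  # private generator, so the seed fully determines output
    return [
        ".".join(str(rng.randrange(256)) for _ in range(4))
        for _ in range(count)
    ]


STATIC_SEED = 1234  # hypothetical value; stands in for a seed that never changes

# Two different "infected hosts" that share the static seed walk the same path.
host_a = scan_targets(STATIC_SEED, 5)
host_b = scan_targets(STATIC_SEED, 5)
print(host_a == host_b)

# Seeding each host differently makes the paths diverge.
host_c = scan_targets(1, 5)
host_d = scan_targets(2, 5)
print(host_c == host_d)
```

With a shared seed, every copy produces the identical target list in the identical order, which is the kind of repeatable trail a forensic analyst can follow.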

INDEX-SERVER FLAWS IN A NUTSHELL

We would be remiss were we to fail to provide at least a simple overview of the technical aspect of the Code Red problem.

The essence of the vulnerability is the way that Windows handles buffer overflows in its dynamic link libraries (DLLs). Incoming data with pre-defined filename extensions (e.g., .html) are automatically assigned to DLLs for interpretation and processing (e.g., ssinc.dll). If all goes well, the input string is "understood" with the help of the DLL file, and the appropriate action is taken on the data string currently in the buffer. But what happens if the data string is too long to fit in the buffer? If the DLL is executing within the system context, any anomalies can take on life-threatening proportions for the computer system.

Code Red accesses servers through the primary Web port (#80), regardless of host operating system. In the case of IIS servers from Microsoft, the invasion is more problematic, although invasion of other servers can cause unpredictable results as well. The worm itself is actually sent to the server as a chunk of data that follows one input buffer's worth of data in a Web query. The idea is pretty simple.

Assume that you're sending data to a Perl program on some Web server somewhere. One way to accomplish this is to enclose the program name and data in a query string entered through your browser. To illustrate, the Web query string

http://www.website.net/greeting.cgi?data=25nnn

will race through port 80 on the server named www.website.net, which will cause the server to execute the Perl program named greeting.cgi, accepting "25nnn" as its input.
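The anatomy of such a query string can be sketched with Python's standard urllib.parse module. This is only an illustration of how the pieces separate; it sends nothing over the network, and the variable names are my own.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.website.net/greeting.cgi?data=25nnn"

parts = urlsplit(url)
port = parts.port or 80              # no port in the URL, so the HTTP default applies
program = parts.path.lstrip("/")     # the program the server is asked to run
params = parse_qs(parts.query)       # the data handed to that program

print(parts.hostname, port, program, params)
# prints: www.website.net 80 greeting.cgi {'data': ['25nnn']}
```

Everything after the question mark is data supplied by the sender, which is exactly why the receiving program must treat it with suspicion.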

In order to accelerate processing, operating systems are designed to automatically associate files that have certain extensions with built-in programs. This happens on a workstation when one clicks on a .JPG image file from a directory index, and Photoshop or some other program automatically launches to render the graphic. On Microsoft servers, files with extensions like .IDA and .IDQ are automatically handled by IIS through the "ISAPI extension," IDQ.DLL, which runs within a core system program.

Therein lies the rub. Files like .IDA and .IDQ are expected to be "scripting files" that contain indexing information. The bug in the Windows IIS results from the fact that if the data assumed to be associated with an .IDA file is too long for the buffer, the data that overran the buffer will be executed by the operating system. It would be as if some rogue code were designed to accept only a two-digit integer input (e.g., "25") and then to transfer control to any binary executable that followed thereafter (e.g., "nnn").
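The "25nnn" analogy can be made concrete with a toy model. The sketch below is not how IIS or Windows lays out memory; it assumes a made-up five-byte region in which a two-byte input buffer sits directly in front of three bytes that the program trusts as "what to run next." All names are hypothetical.

```python
BUFFER_SIZE = 2  # room for a two-digit input such as "25"


def new_memory():
    # Toy layout: 2 bytes of input buffer, then 3 bytes of trusted control data.
    return bytearray(b"\x00\x00" + b"ret")


def unchecked_copy(memory, data):
    """Copy input byte by byte with no length check (the flaw)."""
    for i, byte in enumerate(data):
        memory[i] = byte


def checked_copy(memory, data):
    """Refuse any input that would not fit in the buffer (the fix)."""
    if len(data) > BUFFER_SIZE:
        raise ValueError("input longer than buffer; rejected")
    unchecked_copy(memory, data)


mem = new_memory()
unchecked_copy(mem, b"25nnn")
print(bytes(mem[:BUFFER_SIZE]), bytes(mem[BUFFER_SIZE:]))  # control data is now b'nnn'

safe = new_memory()
try:
    checked_copy(safe, b"25nnn")
except ValueError as err:
    print("rejected:", err)
print(bytes(safe[BUFFER_SIZE:]))  # control data is still b'ret'
```

The only difference between the two copy routines is a single length comparison, which is the kind of check the column describes as missing.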

The temporary solution that got us through the first iteration of the Code Red worm, and others of its ilk, simply disassociated the offending DLL from the offending file extensions. However, this is at most a patch unless the underlying logic has been changed.
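The idea behind the workaround can be sketched as a lookup table from file extensions to handler DLLs. The table below is illustrative only and is not an actual IIS configuration; the function names are hypothetical, and the .stm entry is included merely to show that unrelated mappings are left alone.

```python
import os

# Illustrative extension-to-handler table (not a real IIS configuration).
script_map = {
    ".ida": "idq.dll",
    ".idq": "idq.dll",
    ".stm": "ssinc.dll",
}


def handler_for(path, mappings):
    """Return the DLL mapped to the path's extension, or None if unmapped."""
    _, ext = os.path.splitext(path.lower())
    return mappings.get(ext)


def disassociate(mappings, extensions):
    """Return a copy of the table with the given extensions removed."""
    return {ext: dll for ext, dll in mappings.items() if ext not in extensions}


patched = disassociate(script_map, {".ida", ".idq"})

print(handler_for("/default.ida", script_map))  # idq.dll
print(handler_for("/default.ida", patched))     # None
print(handler_for("/page.stm", patched))        # ssinc.dll
```

Note that the vulnerable DLL itself is untouched in this sketch; requests simply stop reaching it, which is why the column calls the measure a patch at most.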

DÉJÀ VU ALL OVER AGAIN

To say that Code Red represents a serious threat is an understatement. The fact that a security hole as simple as the one described above allowed the infection of hundreds of thousands of Internet computers betrays a fundamental flaw in the way that we handle the standards for data exchange on the Internet. This is confirmed by the fact that within the first week after the attack of the revised version of Code Red, a second-generation worm, Code Red II, had emerged that targeted cable and DSL ISP networks. Unlike Code Red v1 and v2, Code Red II opens backdoors to infected servers through which subsequent attackers may pass (see sidebar for relevant links). We obviously haven't heard the last of Code Red.

What is more important is the economic impact of all of this derring-do. Figure 2 illustrates the volume of computer attacks broken out by number of computers engaged in the attack and the total number of incidents for the 20 leading TCP/IP ports. These numbers are staggering. But what is more staggering are the estimates of economic impact caused thereby.

Figure 2. Number of Internet Attacks by Port and Frequency
(source: the SANS Institute www.incidents.org)

Michael Erbschloe, Vice President of Research at Computer Economics, estimates that the Code Red worm cost society about $2.6 billion in July and August alone. Add to that $8.7 billion for the Love Bug, $1.2 billion for Melissa, $1 billion for Explorer, another $1 billion for SirCam, and we're talking serious money. Erbschloe's estimates attribute approximately equal losses to returning the computer systems to pre-infection operating status and to lost productivity. These losses are so considerable that Erbschloe has developed a rigorous model for measuring the economic impact of ten different types of information warfare (see reference).

What is wrong with this picture? Perhaps this is the time to add emphasis to courses in social issues in computing in our curriculum models while we simultaneously ratchet up the software standards of and disclosure requirements for software connected to the Internet. I'll have more to say about malware in a subsequent column.

SIDEBAR

Additional information on the Code Red Alert may be found in David Becker's CNET column at http://news.cnet.com/news/0-1003-200-6718987.html?tag=rltdnws. More information on Marc Maiffret's discovery of the IIS vulnerability, along with an extensive product line of protective software, is available online on the eEye website at http://eeye.com/html/index.html.

The original June 18 security advisory from eEye explaining the "buffer overflow" vulnerability is online at www.eeye.com/html/press/PR20010618.html. This caused a considerable stir, as Microsoft claimed that this advisory was directly responsible for the second, improved version of Code Red. For comparison, another variant from NSFOCUS is online at www.nsfocus.com/english/homepage/sa01-06.htm.

Microsoft's description of the problem of IIS 5 is online at www.microsoft.com/technet/itsolutions/security/tools/iis5chk.asp. Detailed instructions on accessing patches to Windows environments, including links to Microsoft's own download sites, are available on the Digital Island at www.digitalisland.net/codered. In addition, Digital Island includes an audio-enhanced slide presentation on Code Red by Jason Fossen of the SANS Institute (www.sans.org) that provides a good overview of the underlying technology issues.

For an overview of Code Red II, see the SANS Institute Emergency Incident Handler site at www.incidents.org/react/code_redII.php.

Michael Erbschloe's analytical approach to quantifying the costs of information attacks is to be found in his new book, Information Warfare: How to Survive Cyber Attacks, Osborne/McGraw-Hill, 2001.