A “buffer” is a place to hold something temporarily before it is used, in order to smooth out differences between input speed and output speed. In computer programming, data can be placed in a software buffer before it is processed. A software buffer is simply an area of memory (RAM) with a fixed capacity, allocated by the programmer or the program to store data. The data is held there until the computer is ready to process it or until it is moved to another location.
Buffers come in handy when there is a difference between the rate at which data is received and the rate at which it is processed. Here are a few examples of buffering that we see in everyday life:
- When you stream a movie from the internet, for instance, part of the movie downloads to a memory location (buffer) on your device so that playback can stay ahead of your viewing. “Buffering” delays occur when video data is processed faster than it is received.
- Your smartphone keyboard has a buffer that holds keystrokes before they are processed. You may have seen it in action without realizing it. Most of us have typed something on a keyboard and got no response, and then, all of a sudden, seen a burst of text appear on the screen. All the keystrokes were held in the buffer and, when the system unfroze, they were all released at once.
- Your printer buffer is used for spooling, which is really just saying that the text to be printed is sent to a buffer area or spool so that it can be printed from there. This frees up the computer to attend to other things.
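To make the keyboard example a little more concrete, here is a minimal sketch of a fixed-capacity keystroke buffer written as a first-in, first-out queue. The names (key_buffer, keybuf_push, keybuf_pop) and the capacity of 4 are assumptions chosen for illustration; real keyboard drivers are more involved.

```c
#include <assert.h>
#include <stddef.h>

#define KEYBUF_CAP 4  /* illustrative capacity, not taken from any real driver */

/* Hypothetical keystroke buffer: a small circular FIFO queue. */
struct key_buffer {
    char keys[KEYBUF_CAP];
    size_t head;   /* index of the oldest stored key */
    size_t count;  /* number of keys currently stored */
};

void keybuf_init(struct key_buffer *kb) {
    assert(kb != NULL);
    kb->head = 0;
    kb->count = 0;
}

/* Stores one key. Returns 0 on success, or -1 if the buffer is full,
   in which case the key is dropped instead of being written out of bounds. */
int keybuf_push(struct key_buffer *kb, char key) {
    assert(kb != NULL);
    if (kb->count == KEYBUF_CAP) {
        return -1;
    }
    kb->keys[(kb->head + kb->count) % KEYBUF_CAP] = key;
    kb->count++;
    return 0;
}

/* Removes the oldest key into *key. Returns 0 on success, -1 if empty. */
int keybuf_pop(struct key_buffer *kb, char *key) {
    assert(kb != NULL && key != NULL);
    if (kb->count == 0) {
        return -1;
    }
    *key = kb->keys[kb->head];
    kb->head = (kb->head + 1) % KEYBUF_CAP;
    kb->count--;
    return 0;
}
```

While the system is busy, keystrokes pile up through keybuf_push(); once it recovers, repeated calls to keybuf_pop() release them in the order they were typed. The capacity check in keybuf_push() is the part that matters for the rest of this article: without it, a fifth keystroke would be written past the end of the array.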
The video buffering example shows what happens when data is processed faster than it is received. But occasionally, the opposite happens: the amount of data received is larger than the assigned buffer capacity. The overflowing data can cause the program to freeze, malfunction, or even crash. This is what computer scientists commonly refer to as a buffer overflow or buffer overrun. In this piece, we will explain buffer overflow vulnerabilities and attacks in detail. Specifically, we’ll be covering the following areas:
- What is buffer overflow?
- Buffer overflow vulnerabilities and attacks
- Notable examples of buffer overflow attacks
- How to detect buffer overflow
- How to prevent and mitigate buffer overflow
What is buffer overflow?
A buffer overflow, just as the name implies, is an anomaly in which a program, while writing data to a buffer, overruns the buffer’s capacity and spills into adjacent memory, corrupting or overwriting the legitimate data stored there. Imagine a container designed to hold 8 liters of liquid into which someone suddenly pours more than 10 liters. Yes, you guessed it right! With not enough space to hold the extra liquid, the contents overflow the bounds of the container. A buffer overflow is similar in principle.
For example, let’s pretend that Joe has written a web application that requires users to enter a username to access the app. While developing it, Joe allocates an 8-byte buffer to store the username. After all, he doesn’t expect anybody to enter a username longer than 8 characters. Now, a user named Jane decides to enter the letter “J” repeated 10 times instead of the username Jane. To her surprise, the web application freezes and refuses to accept new connections from anyone else, resulting in a denial of service.
The scenario described above is a typical buffer overflow. The 10-character username entered by Jane has overrun the bounds of the 8-byte buffer and overwritten adjacent memory in the vulnerable function, causing the application to misbehave. In some cases, a flaw like this can be exploited to execute arbitrary code in the web application.
By entering data crafted to cause a buffer overflow, an attacker can write into areas known to hold executable code and replace it with malicious code, or selectively overwrite data pertaining to the program’s state, thereby causing behavior that the original programmer did not intend. The overwritten data can also make the application perform unauthorized activities or behave erratically, producing memory access errors, incorrect results, or crashes. This is commonly referred to as a buffer overflow attack, and it is a well-known class of security exploit. A common example is when cybercriminals exploit a buffer overflow to alter the execution path of an application.
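The following is a rough sketch of Joe's 8-byte username buffer, written as a simulation so that it stays safe to run. A single 16-byte array stands in for adjacent memory: the first 8 bytes are the username buffer and the next 8 hold a neighbouring "role" value. The layout, the role field, and all function names are assumptions made for illustration and do not describe any real application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define USERNAME_CAP 8  /* Joe's 8-byte username buffer */
#define ROLE_CAP     8  /* hypothetical neighbouring data */

/* One flat region standing in for adjacent memory:
   bytes 0..7 hold the username, bytes 8..15 hold the role. */
struct memory {
    unsigned char bytes[USERNAME_CAP + ROLE_CAP];
};

void memory_init(struct memory *m) {
    memset(m->bytes, 0, sizeof m->bytes);
    memcpy(m->bytes + USERNAME_CAP, "guest", 5);
}

const char *role_of(const struct memory *m) {
    return (const char *)(m->bytes + USERNAME_CAP);
}

/* No check against USERNAME_CAP: input longer than 8 bytes spills into
   the role bytes. The assert only keeps the simulation inside its array. */
void store_username_unchecked(struct memory *m, const char *input, size_t len) {
    assert(len <= sizeof m->bytes);
    memcpy(m->bytes, input, len);
}

/* Bounds-checked version: rejects input that does not fit.
   Returns 0 on success, -1 if the input is too long. */
int store_username_checked(struct memory *m, const char *input, size_t len) {
    if (len > USERNAME_CAP) {
        return -1;
    }
    memcpy(m->bytes, input, len);
    return 0;
}
```

In this sketch, passing Jane's ten J characters to store_username_unchecked() changes the neighbouring role from guest to JJest, because the last two bytes land outside the username slot. The checked version returns -1 for the same input and leaves the role untouched.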
Buffer overflow vulnerabilities and attacks
The buffer overflow problem is one of the oldest and most common problems in software development, dating back to the introduction of interactive computing. Certain programming languages such as C and C++ are vulnerable to buffer overflow because they have no built-in bounds checking or protection against accessing or overwriting data outside a buffer. More modern high-level languages such as Java, Python, and C# have built-in features that reduce the chances of buffer overflow, but they may not eliminate it completely.
Many cyber attacks exploit buffer overflow vulnerabilities to compromise or take control of target applications or systems. Attackers exploit buffer overflow issues by overwriting the memory of an application in order to change the execution path of the program, for example to trigger a response that exposes private data. If attackers know the memory layout of a program, they can deliberately feed it input that injects new instructions and gain unauthorized access to the application.
Buffer overflow attacks come in different forms, and employ different tactics to target vulnerable applications. The two most common attack tactics are:
- Stack overflow attack: A stack-based buffer overflow occurs when a program writes more data to a buffer located on the stack than is allocated for that buffer. This almost always results in the corruption of adjacent data on the stack. This is the most common type of buffer overflow attack.
- Heap overflow attack: A heap-based buffer overflow occurs when a program writes more data to a buffer allocated on the heap than that buffer can hold. Exploitation is performed by corrupting the adjacent stored data in specific ways that cause the application to overwrite internal structures such as linked-list pointers. This type of attack targets data in the open memory pool known as the heap.
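As a simplified illustration of how a stack-based overflow can change a program's execution path, the sketch below models a stack frame as a local buffer followed directly by a saved return address. This is a simulation under stated assumptions: real frame layouts depend on the compiler and platform, and the names (fake_frame, build_payload), the 32-bit address width, and the example addresses are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_CAP 8  /* size of the local buffer in the simulated frame */

/* Simplified stand-in for a stack frame: a local buffer followed
   directly by the saved return address. */
struct fake_frame {
    unsigned char bytes[BUF_CAP + sizeof(uint32_t)];
};

void frame_init(struct fake_frame *f, uint32_t return_address) {
    memset(f->bytes, 0, sizeof f->bytes);
    memcpy(f->bytes + BUF_CAP, &return_address, sizeof return_address);
}

uint32_t frame_return_address(const struct fake_frame *f) {
    uint32_t addr;
    memcpy(&addr, f->bytes + BUF_CAP, sizeof addr);
    return addr;
}

/* Copy into the local buffer with no check against BUF_CAP.
   The assert only keeps the simulation inside its own array. */
void frame_copy_unchecked(struct fake_frame *f, const unsigned char *input,
                          size_t len) {
    assert(len <= sizeof f->bytes);
    memcpy(f->bytes, input, len);
}

/* An attacker who knows the layout pads the buffer with filler and then
   appends the address they want. Returns the payload length. */
size_t build_payload(unsigned char *out, uint32_t target) {
    memset(out, 'A', BUF_CAP);
    memcpy(out + BUF_CAP, &target, sizeof target);
    return BUF_CAP + sizeof target;
}
```

If the frame is initialised with a return address of 0x1000 and the payload built for 0x2000 is copied in unchecked, frame_return_address() afterwards reports 0x2000. In a real program, that is the point at which control would transfer to a location chosen by the attacker.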
Notable examples of buffer overflow attacks
It is somewhat surprising that buffer overflow continues to rank as one of the most common security vulnerabilities in software, despite being known to the security community for many years. Buffer overflow attacks have been responsible for some of the biggest data breaches in history. Some notable examples include:
Morris Worm: The Morris worm of 1988 was one of the first internet-distributed computer worms, and the first to gain significant mainstream media attention. It exploited a buffer overflow vulnerability in the Unix finger daemon, along with weaknesses in sendmail and rsh/rexec, and is commonly estimated to have infected about 6,000 machines, roughly 10% of the internet at the time, within two days. It has sometimes been referred to as the “Great Worm”, or the “Grand Daddy” when it comes to buffer overflows, because of the devastating impact it had on the internet at that time, both in overall system downtime and in psychological impact on the perception of security and reliability of the internet.
SQL Slammer: SQL Slammer is a 2003 computer worm that exploited a buffer overflow bug in Microsoft’s SQL Server and Desktop Engine database products. It is a small piece of code that does little other than generate random IP addresses and send itself out to those addresses. If a selected address happens to belong to a host that is running an unpatched copy of the Microsoft SQL Server Resolution Service listening on UDP port 1434, the host immediately becomes infected and begins spraying the internet with more copies of the worm program. A patch had been available from Microsoft for six months prior to the worm’s launch, but many installations had not been patched. SQL Slammer caused a denial of service on some internet hosts, ISPs, and ATMs and dramatically slowed general internet traffic. It spread rapidly, infecting 90% of vulnerable hosts (about 75,000 victims) within 10 minutes, according to Silicon Defense.
Heartbleed: Heartbleed is a widely publicized security bug in OpenSSL that came to light in 2014. It is a buffer over-read vulnerability in the OpenSSL cryptography library, which is used to implement the Transport Layer Security (TLS) protocol. The root cause is the same as that of a buffer overflow: a lack of bounds checking. Although evaluating the total cost of Heartbleed is difficult, several attacks and data breaches during that period, including some involving VPN products, were linked to it. Experts estimated that as many as two-thirds of HTTPS-enabled websites worldwide (millions of sites) were affected. eWEEK estimated $500 million in damages as a starting point. The vulnerability is resolved by updating OpenSSL to a patched version.
Adobe Flash Player: In 2016, a buffer overflow vulnerability was found in Adobe Flash Player for Windows, macOS, Linux, and Chrome OS. The vulnerability was due to an error in Adobe Flash Player while parsing a specially crafted SWF (Shockwave Flash) file. Malicious entities could exploit this vulnerability to bypass security restrictions, execute arbitrary code, and obtain sensitive information by enticing users to open SWF files or Office documents with embedded malicious Flash Player content distributed via email. Adobe responded by releasing security updates that addressed and resolved the issue.
WhatsApp VoIP: In May 2019, Facebook announced a vulnerability affecting its WhatsApp products. The vulnerability was a buffer overflow in WhatsApp’s VoIP stack on smartphones that allowed remote code execution via a specially crafted series of SRTCP (Secure Real-time Transport Control Protocol) packets sent to a target phone number. An exploit for the vulnerability was used to infect over 1,400 smartphones with malware simply by calling the target phone via WhatsApp voice, even if the call wasn’t picked up. In particular, the spyware infection of the phone of a UK-based attorney involved in a high-profile lawsuit generated a lot of media attention. Facebook responded by releasing security updates that fixed the buffer overflow issue.
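Of the examples above, Heartbleed is the one that reads past a buffer rather than writing past it. The sketch below is a heavily simplified simulation of that idea, not OpenSSL's actual code: a 4-byte request payload sits next to unrelated secret data, and an echo routine either trusts the length claimed by the sender or checks it first. The memory contents and function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simulated process memory: a 4-byte request payload ("PING") sits right
   next to unrelated secret data. The contents are hypothetical. */
static const unsigned char process_memory[] = "PINGsecret-key";
#define PAYLOAD_LEN 4

/* Trusts the length claimed by the sender and echoes that many bytes.
   The assert only keeps the simulation inside its own arrays. */
size_t echo_unchecked(unsigned char *reply, size_t reply_cap, size_t claimed_len) {
    assert(claimed_len <= reply_cap);
    assert(claimed_len <= sizeof process_memory - 1);
    memcpy(reply, process_memory, claimed_len);
    return claimed_len;
}

/* Checks the claimed length against the real payload size first.
   Returns the number of bytes echoed, or 0 if the request is rejected. */
size_t echo_checked(unsigned char *reply, size_t reply_cap, size_t claimed_len) {
    if (claimed_len > PAYLOAD_LEN || claimed_len > reply_cap) {
        return 0;
    }
    memcpy(reply, process_memory, claimed_len);
    return claimed_len;
}
```

A request that claims a length of 14 makes echo_unchecked() return the payload followed by the neighbouring secret bytes, while echo_checked() rejects the same request and returns 0.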
How to detect buffer overflow
The main reason buffer overflows occur is that software developers fail to perform bounds checking. Programmers need to pay special attention to sections of code where buffers are used, especially functions dealing with user-supplied input. Consider the following lines of code:
variable $username[8]
print "Enter Username:"
getstring($username)
The above program displays (prints) “Enter Username:” on screen, accepts username input from the user, and stores it in the $username variable, which is 8 bytes (characters) long. It is clear from the above code that no bounds checking is performed. The programmer declared $username to be 8 bytes long but does not perform bounds checking on the input read by the getstring() function, making the program susceptible to a buffer overflow attack. The programmer assumes the user will type a proper name such as “Jones”. But what happens if the user enters something like “JonesXXXXXXXXXXXXXXXXXXXXXXX”? The program will likely crash rather than ask the user for valid input. Unfortunately, as stated earlier, programming languages such as C and C++ provide no built-in bounds checking. The first 8 bytes will be copied to the memory allocated for the $username variable. The remaining 20 characters will overwrite the next 20 bytes of memory. This is called smashing the stack.
In order to detect buffer overflows in source code, you first need to understand how the code works. Secondly, you need to pay careful attention to external input, buffer manipulations, and functions susceptible to buffer overflow, especially gets(), strcpy(), strcat(), and the printf() family (sprintf() in particular). These functions, if not carefully applied, can open the door to buffer overflow attacks.
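As a hedged sketch of what more careful handling of the username prompt could look like in C, the two helpers below use the standard fgets() and snprintf() functions, both of which take an explicit size limit. The helper names (copy_username, read_username) and the policy of rejecting over-long input are assumptions for illustration. Note that an 8-byte C string buffer holds at most 7 visible characters plus the terminating '\0'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define USERNAME_CAP 8  /* includes the terminating '\0' */

/* Bounded alternative to strcpy(): dst is always NUL-terminated.
   Returns 0 on success, -1 if src did not fit and was truncated. */
int copy_username(char *dst, size_t cap, const char *src) {
    assert(dst != NULL && src != NULL);
    int needed = snprintf(dst, cap, "%s", src);
    return (needed >= 0 && (size_t)needed < cap) ? 0 : -1;
}

/* Bounded alternative to gets(): reads one line, storing at most cap - 1
   characters. Returns 0 on success, -1 on end of input, read error, or a
   line that was too long (the rest of that line is discarded). */
int read_username(FILE *in, char *dst, size_t cap) {
    assert(in != NULL && dst != NULL);
    if (cap == 0 || fgets(dst, (int)cap, in) == NULL) {
        return -1;
    }
    size_t len = strlen(dst);
    if (len > 0 && dst[len - 1] == '\n') {
        dst[len - 1] = '\0';  /* strip the newline kept by fgets() */
        return 0;
    }
    int c = getc(in);
    if (c == EOF || c == '\n') {
        return 0;  /* the line fit exactly */
    }
    while ((c = getc(in)) != '\n' && c != EOF) {
        /* discard the remainder of the over-long line */
    }
    return -1;
}
```

With these helpers, a call such as read_username(stdin, name, sizeof name) accepts “Jones” but reports an error for the long “JonesXXX...” input instead of writing past the end of name, so the program can ask the user for valid input.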
Vulnerability assessment and software testing methodologies can be employed to detect buffer overflow errors in those functions and other parts of the source code. There are two main approaches available in software testing: static and dynamic testing. Code reviews, proofreading, or inspections are referred to as static testing. However, manually combing through thousands of lines of source code looking for potential buffer overflow errors can be a herculean task. Besides, there is always the possibility of missing critical errors through oversight. Fortunately, static application testing tools such as Checkmarx, Coverity, and others automatically check for buffer overflow bugs by analyzing the source code of a target program, without executing the program.
Executing code with a given set of test cases (manual or automated) is referred to as dynamic testing. Dynamic application testing tools such as Appknox, Veracode Dynamic Analysis, or Invicti automatically execute the target program and check whether the program’s runtime behavior satisfies some expected security characteristics. These tools can be used for the detection of buffer overflow vulnerabilities during and/or after development, and for the enforcement of expected code quality (quality assurance).
How to prevent and mitigate buffer overflow
There are several approaches to mitigating and preventing buffer overflows. They include training software developers in secure coding, enforcing secure coding practices, using safe buffer handling functions, reviewing code, statically analyzing source code, detecting buffer overflows at runtime, and halting exploits via the operating system. Each approach has its limitations and constraints. For instance, code reviews, no matter how thorough, may miss bugs. This is where static analysis comes into play; however, static analysis can produce false positives, false negatives, or both. Dynamic testing, on the other hand, reports problems that have actually been observed at runtime, but it requires test input selection and program execution, which can be difficult and time-consuming.
The easiest way to prevent buffer overflow vulnerabilities is simply to avoid programming languages that are prone to them. Languages such as C and C++ allow these vulnerabilities through direct access to memory and a lack of built-in bounds checking. Languages such as Java, Python, and C# (along with other .NET languages) do not share these characteristics and are therefore far less susceptible to buffer overflows. In reality, however, switching to a completely different programming language may not always be feasible. When this is the case, the following secure practices for handling buffers become necessary:
- Bounds checking: Bounds checking in abstract data type libraries can limit the occurrence of buffer overflows. Where possible, avoid standard library functions such as gets(), strcpy(), and strcat() that are susceptible to buffer overflows, and prefer alternatives that take an explicit size limit, such as fgets() and snprintf().
- Executable space protection: Designate or mark memory regions as non-executable to prevent the execution of machine code in these areas.
- Use modern operating systems: Most modern operating systems have built-in runtime protections such as address space layout randomization (ASLR), which randomly rearranges the locations of the main data areas of a process, and data execution prevention, which blocks code from running in memory regions marked non-executable. These built-in runtime protections help mitigate buffer overflow attacks.
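To illustrate the first practice in the list above, here is a minimal sketch of a bounds-checked buffer written as a small abstract data type. Every write and read goes through a function that checks the stored length first. The type and function names (bounded_buf, bbuf_append, bbuf_get) and the 16-byte capacity are hypothetical and are not taken from any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BBUF_CAP 16  /* illustrative fixed capacity */

/* Hypothetical bounds-checked buffer: callers never touch data directly. */
struct bounded_buf {
    size_t len;                    /* bytes currently stored */
    unsigned char data[BBUF_CAP];
};

void bbuf_init(struct bounded_buf *b) {
    assert(b != NULL);
    b->len = 0;
}

/* Appends n bytes. Returns 0 on success, or -1 if they would not fit,
   in which case the buffer is left unchanged. */
int bbuf_append(struct bounded_buf *b, const void *src, size_t n) {
    assert(b != NULL && src != NULL && b->len <= BBUF_CAP);
    if (n > BBUF_CAP - b->len) {   /* phrased so that len + n cannot wrap */
        return -1;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Bounds-checked read. Returns 0 and stores the byte in *out,
   or -1 if index is outside the stored data. */
int bbuf_get(const struct bounded_buf *b, size_t index, unsigned char *out) {
    assert(b != NULL && out != NULL);
    if (index >= b->len) {
        return -1;
    }
    *out = b->data[index];
    return 0;
}
```

Appending “hello, ” and then “world” succeeds and leaves 12 bytes stored; a further attempt to append the 8-byte string “overflow” returns -1 because only 4 bytes of capacity remain. The design choice here is to fail loudly rather than truncate silently, which leaves the decision about what to do with oversized input to the caller.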
Buffer overflow vulnerabilities can be difficult to spot, especially when the software is very large and complicated. However, through the use of secure coding practices, safe buffer handling functions, and appropriate security features of the compiler and operating system, a strong defense against buffer overflows can be built. In addition to these preventive measures, consistently scanning for and identifying these flaws is a critical step in preventing an exploit.