What are path traversal attacks and how you can defend against them

A path traversal or directory traversal attack can allow an attacker to access arbitrary files (i.e., files that they should not be able to access) on a web server by manipulating and exploiting how the server handles file paths. In some cases, the attacker may even be able to upload or create files on the server, modify application data or behavior, and ultimately take control of the server.

A web server or application serves files from a webroot directory and follows its configuration. Normally, external users should not be able to navigate outside of the webroot and its subdirectories. The exact configuration and the location of the webroot directory depend on the application and the web server, but common default locations are:

  • /var/www for Apache on Linux
  • C:\Inetpub\wwwroot for IIS on Windows

Path traversal attacks are perpetrated by manipulating file reference variables and using “dot-dot-slash” (../) sequences (relative file paths) or by using absolute file paths. A relative file path is resolved from the current directory, which here is the directory the application is working in. An absolute file path starts at the system’s root folder. Either way, the attack’s goal is to break out of the webroot and access potentially sensitive files.

While path traversal attacks may not be as common as SQL injection or cross-site scripting attacks, they still pose a significant risk to the integrity and security of your web server/application.

As is the case with many web vulnerabilities, the web servers that are vulnerable to path traversal attacks are those that accept unvalidated user input, in this case in file paths. Let’s look at some examples.

How path traversal attacks work

There are multiple ways an attacker can attack your system. As path traversal is usually an HTTP attack, it can come through any HTTP method like GET, POST, PUT, etc. Here’s a typical path traversal attack flow.

  1. Say the vulnerable site’s dynamic URL is:
    https://vulnerablewebsite.com/show.asp?view=homepage.html
  2. When a user accesses the URL through a web browser, the server receives a request for the show.asp page, along with the parameter:
    view=homepage.html
  3. The show.asp page reads homepage.html from the server’s file system and returns its contents to the browser.
  4. The attacker sees that show.asp retrieves whatever file is named in the URL’s view parameter.
  5. The attacker crafts a URL with either a relative file path, using ../, for example:
    https://vulnerablewebsite.com/show.asp?view=../../etc/passwd

    or an absolute file path, directly referencing the file location, for example:

    https://vulnerablewebsite.com/show.asp?view=/etc/passwd

    Both URLs request the /etc/passwd file.

  6. Because the vulnerable server does not validate user input, the attacker can break out of the webroot and access your system files; in this case, the passwd file.
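
The flow above can be reproduced safely on a local machine. The following Python sketch is a hypothetical stand-in for show.asp: the show function, the temporary directory layout, and the file names are all assumptions made for illustration. It joins the view value to the webroot without any checks, which is the flaw described in step 6.

```python
import os
import tempfile


def show(webroot: str, view: str) -> str:
    """Hypothetical stand-in for show.asp: return the file named by `view`."""
    # VULNERABLE: the user-controlled value is joined to the webroot unchecked.
    path = os.path.join(webroot, view)
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Throwaway directory tree: <tmp>/secret.txt sits OUTSIDE <tmp>/webroot.
with tempfile.TemporaryDirectory() as tmp:
    webroot = os.path.join(tmp, "webroot")
    os.mkdir(webroot)
    with open(os.path.join(webroot, "homepage.html"), "w", encoding="utf-8") as f:
        f.write("<h1>Home</h1>")
    with open(os.path.join(tmp, "secret.txt"), "w", encoding="utf-8") as f:
        f.write("not for visitors")

    legit = show(webroot, "homepage.html")   # the intended request
    leaked = show(webroot, "../secret.txt")  # one ../ escapes the webroot

print(legit)   # <h1>Home</h1>
print(leaked)  # not for visitors
```

The second call returns a file the site never meant to publish, even though the handler only ever intended to read from the webroot.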

Path traversal attack examples

Here are some more specific examples of the attack.

Relative file path

Here’s a simple example of a path traversal attack using a relative file path to download a file via a URL parameter on a vulnerable server.

In this scenario, the user provides the file name ‘document.pdf’ in their supplied URL, and the website downloads the PDF to the user’s computer. The URL in question would be:

https://www.vulnerablewebsite.com/download_file.php?file=document.pdf

If the web server is hosted on a Linux system, the web server’s files would typically be located in /var/www, two directory levels below the root directory (/). The attacker can attempt to break out of the webroot directory and access the /etc/passwd file by submitting the following URL to the server:

https://www.vulnerablewebsite.com/download_file.php?file=../../etc/passwd

Because the application in our example does not sanitize inputs, it uses the attacker’s string directly when it builds the file path. The two ../ sequences climb from /var/www up to the root folder, so the path resolves to /etc/passwd, and the application returns the sensitive passwd file to the attacker.
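
The path arithmetic can be checked with a few lines of Python. This is only a sketch: WEBROOT and resolve are hypothetical names, and posixpath.normpath collapses ../ segments purely as text, without touching the file system or following symbolic links.

```python
import posixpath

WEBROOT = "/var/www"  # webroot assumed in the example above


def resolve(user_value: str) -> str:
    """Join the value to the webroot and collapse any ../ segments."""
    return posixpath.normpath(posixpath.join(WEBROOT, user_value))


print(resolve("document.pdf"))               # /var/www/document.pdf
print(resolve("../../etc/passwd"))           # /etc/passwd
print(resolve("../../../../../etc/passwd"))  # /etc/passwd
```

The last line shows why attackers often send more ../ sequences than strictly needed: once the path reaches the root folder, the surplus sequences have nowhere further to climb and are simply dropped.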

Note that the same attack can be perpetrated on a Windows server using ..\ instead of ../

Absolute file path

Suppose the following URL is vulnerable to path traversal attacks:

http://vulnerablewebsite.com/get.asp?f=test

It would be vulnerable for two reasons:

  1. It allows file access using the HTTP parameter: f (?f=)
  2. The server accepts user input without adequately validating the input.

The attacker could exploit the above with something like this:

http://vulnerablewebsite.com/get.asp?f=/etc/passwd
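
Whether an absolute path works depends on how the application builds the final path. As one illustration, Python’s path-joining function discards the base directory whenever the second argument is absolute. In the sketch below, BASE_DIR and build_path are hypothetical names; other languages, and plain string concatenation, behave differently.

```python
import posixpath

BASE_DIR = "/var/www/files"  # hypothetical directory the application serves from


def build_path(user_value: str) -> str:
    # posixpath.join throws away BASE_DIR if user_value starts with "/".
    return posixpath.join(BASE_DIR, user_value)


print(build_path("test"))         # /var/www/files/test
print(build_path("/etc/passwd"))  # /etc/passwd
```

No ../ sequence is needed in this case, which is why filtering out ../ alone is not a sufficient defense.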

It’s important to understand that such an attack will require quite a bit of trial and error. That means that the attacker will try different combinations and variations of either ../ sequences or various absolute paths and filenames until they find a directory/file they can access. And the damage done will depend on what those potential files or directories are.

Using cookies to mount a path traversal attack

In some applications, a cookie value names a file or directory that the web server loads in order to display the web page. That exposes the web server to a path traversal attack. Here’s an example of PHP code that reads a cookie to decide which template file to load for a website:

<?php
// VULNERABLE: the TEMPLATE cookie is used in the file path without validation.
$template = 'template1.php';
if (isset($_COOKIE['TEMPLATE'])) {
    $template = $_COOKIE['TEMPLATE'];
}
include("/var/www/phpstuff/templates/" . $template);
?>

By default, the resource that is loaded is template1.php. Its location on the server is /var/www/phpstuff/templates/, which is four levels below the root directory. Because there is no validation of the $template variable, an attacker could send an HTTP request that sets the TEMPLATE cookie to ../../../../etc/passwd.

For example:

GET / HTTP/1.1
Host: www.vulnerablewebsite.com
Cookie: TEMPLATE=../../../../etc/passwd

The web server would then execute the following include statement, loading the passwd file instead of the design template.

include("/var/www/phpstuff/templates/../../../../etc/passwd");

The ../ sequence points to the current folder’s parent folder on Unix systems. By combining multiple ../ sequences in the file path, an attacker could potentially navigate their way out of the server’s webroot directory (/var/www/) and gain access to the /etc directory, which is not meant to be accessible over the internet.
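
The number of ../ sequences an attacker needs follows from how deep the template directory sits. The Python sketch below counts the levels and shows the resulting path; included_file, depth, and payload are hypothetical names, and the function only mimics the string concatenation in the PHP snippet rather than running it.

```python
import posixpath

TEMPLATE_DIR = "/var/www/phpstuff/templates/"


def included_file(cookie_value: str) -> str:
    """Path the include statement would end up loading for a TEMPLATE cookie."""
    return posixpath.normpath(TEMPLATE_DIR + cookie_value)


# One ../ is needed per directory level between the template folder and /.
depth = len([part for part in TEMPLATE_DIR.split("/") if part])
payload = "../" * depth + "etc/passwd"

print(depth)                           # 4
print(payload)                         # ../../../../etc/passwd
print(included_file("template1.php"))  # /var/www/phpstuff/templates/template1.php
print(included_file(payload))          # /etc/passwd
```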

Path traversal vulnerability examples

In August 2023, a report by VulnCheck revealed that thousands of Openfire XMPP servers were susceptible to a path traversal vulnerability in the setup environment of their administrative consoles, which are web-based applications. The vulnerability could allow unauthenticated attackers to access restricted pages.

In May 2023, GitLab rushed out a patch after a researcher identified a critical path traversal issue that could enable attackers to read arbitrary files on GitLab Community Edition (CE) and Enterprise Edition (EE) servers. The flaw could only be exploited when an attachment existed in a public GitLab project that was nested within at least five groups.

Also in 2023, Cisco issued a security advisory regarding a vulnerability in the CryptoService function of the Cisco Duo Device Health Application for Windows. This could “allow an authenticated, local attacker with low privileges to conduct directory traversal attacks and overwrite arbitrary files on an affected system,” Cisco said.

Payloads of interest

Path traversal is certainly not limited to accessing the /etc/passwd file. An attacker can obtain a tremendous amount of information about a vulnerable application just by reading certain files on the system. And with the right information, an attacker could even take control of the entire web server/application. Here are a few examples of different payloads that an attacker might go after using a path traversal attack.

/proc/version

The /proc/version file contains the Linux kernel version currently running on the system. An attacker could use this information to determine which version of the OS is currently installed and whether the system is missing any critical security updates.

/proc/mounts

The /proc/mounts file provides a list of mounted file systems and can be used by an attacker to find the location of any interesting and potentially sensitive files.

/proc/net/arp

The /proc/net/arp file lists the system’s Address Resolution Protocol (ARP) table, which an attacker could use to discover other systems connected to the current web server/application.

/proc/net/tcp & /proc/net/udp

The /proc/net/tcp and /proc/net/udp files can provide an attacker with a list of active connections. This information can be used to determine which ports are open on the server and deduce which services it is likely to be running.

Preventing path traversal attacks

You can do a few things to prevent path traversal attacks, and they come down to how your web server and application are configured and how they handle user input. But the first thing you should do is check whether your web server/application is vulnerable to path traversal attacks by using a web vulnerability scanner, which scans your server/application and can detect security risks and logical flaws.

Sanitize filename parameters

If you need to allow access to files from user input, make sure the input is properly validated and that the server does not allow access to restricted files or directories. Wherever possible, do not use user-supplied data as a filename or part of a filename when performing operations on files or folders. If the user needs to choose the file, have them pick from predefined options (for example, an ID that the server maps to a file name) instead of passing the name directly.
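
One way to avoid direct input is to let the user choose an identifier that the server maps to a file name. A minimal Python sketch follows; the DOCUMENTS table, its entries other than document.pdf, and the filename_for helper are hypothetical.

```python
from typing import Optional

# Hypothetical catalogue: the user picks an ID, never a file name.
DOCUMENTS = {
    "1": "document.pdf",
    "2": "pricing.pdf",
}


def filename_for(doc_id: str) -> Optional[str]:
    """Return the server-side file name for a document ID, or None if unknown."""
    return DOCUMENTS.get(doc_id)


print(filename_for("1"))                 # document.pdf
print(filename_for("../../etc/passwd"))  # None
```

Because the request value is only ever used as a dictionary key, nothing the user sends becomes part of a file path.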

Perform whitelist checks

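You should set up whitelist checks when working with files or directories coming from user-controlled input: compare each name against a list of known-good values or permitted characters, and reject everything else.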
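
A whitelist can be expressed as a pattern of permitted characters. The Python sketch below accepts only names made of letters, digits, underscores, and hyphens, followed by a single dot and an extension; the exact pattern and the is_allowed_name helper are assumptions, and a real application would tune them to its own naming scheme.

```python
import re

# Whitelist pattern: a simple base name, one dot, then an alphanumeric extension.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def is_allowed_name(name: str) -> bool:
    """True only if the WHOLE name matches the whitelist pattern."""
    return SAFE_NAME.fullmatch(name) is not None


print(is_allowed_name("document.pdf"))      # True
print(is_allowed_name("../../etc/passwd"))  # False
print(is_allowed_name("/etc/passwd"))       # False
print(is_allowed_name("..\\boot.ini"))      # False
```

Slashes and backslashes are not in the permitted character set, so both relative and absolute paths are rejected before any file operation takes place.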

Hard-code allowed file extensions

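Only provide access to allowed file extensions. You can hard-code the extension so that the user never supplies it. Appending an extension does not by itself stop ../ sequences, though, so validate the name as well:

<?php
// Allow only a bare name (no dots or slashes); the extension is fixed in code.
$file = $_GET['file'] ?? '';
if (!preg_match('/\A[A-Za-z0-9_-]+\z/', $file)) exit('Invalid file name');
include($file . '.html');
?>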
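
When the user does supply the full file name, the extension can be checked against a fixed set instead. This Python sketch assumes a hypothetical ALLOWED_EXTENSIONS set and has_allowed_extension helper.

```python
import posixpath

ALLOWED_EXTENSIONS = {".html", ".pdf"}  # hypothetical list for this sketch


def has_allowed_extension(filename: str) -> bool:
    # splitext returns the extension of the final path component only.
    extension = posixpath.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS


print(has_allowed_extension("index.html"))         # True
print(has_allowed_extension("shell.php"))          # False
print(has_allowed_extension("../../etc/passwd"))   # False
print(has_allowed_extension("../../secret.html"))  # True
```

The last result is the important caveat: an extension check limits which file types can be served, but it does not stop traversal on its own, so it needs to be combined with the other checks in this section.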

Implement the principle of least privilege

The principle of least privilege is an IT security policy that states that only the minimum necessary rights should be assigned to a subject that requests access to a resource. The principle also states that those rights should be in effect for the shortest possible duration.

Make sure only to provide users with the minimum necessary permissions to legitimately interact with the web server/application.

Sandboxing

Use sandbox environments (e.g., a chroot jail) that enforce strict boundaries between the processes and the operating system.

In addition to these major steps, here are other things you can do to enhance your resilience to path traversal attacks:

  • Don’t accept user input when working with system calls
  • Ensure users cannot supply every part of a file path
  • Only process URI requests that do not result in a file request
  • Ensure that no files are served outside the webroot directory
  • Keep your web server software up-to-date with the most recent security patches
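
The requirement that no files are served outside the webroot can be enforced in code by resolving the requested path to its canonical form and confirming that it still sits under the webroot. The Python sketch below is one way to do this, with safe_resolve as a hypothetical helper name; it is not a complete defense by itself.

```python
import os


def safe_resolve(webroot: str, user_value: str) -> str:
    """Return the canonical path for user_value, or raise if it leaves webroot."""
    root = os.path.realpath(webroot)
    # realpath collapses ../ segments and follows symbolic links before the check.
    candidate = os.path.realpath(os.path.join(root, user_value))
    # commonpath compares whole path components, so /var/www2 is not "inside" /var/www.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("path escapes the webroot")
    return candidate
```

A caller would open the returned path only if no exception is raised. A request such as ../../etc/passwd or /etc/passwd fails the check because its canonical form no longer shares the webroot as a prefix.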

For even more security, consider hosting your confidential documents on a different and hardened web server. That will add a second layer of protection to your web server/application.

Conclusion

So there you have it. That’s the lowdown on path traversal attacks. And while they may not get as much attention as other online vulnerabilities/attacks, they’re nonetheless pretty nasty. And, depending on the files the attacker manages to access, the damage caused can be more than significant and even lead to a complete takeover of the server.

As is the case with many other online attacks that target web servers/applications, while multi-faceted, the biggest culprit here is allowing unvalidated user input. Doing so is like leaving your front door unlocked. Someone could come in and rummage through your papers (as well as steal your stuff). And depending on what you leave lying around, you could really come to regret not simply having locked your door.

Stay safe (and don’t allow unvalidated user input).
