A fresh look on reverse proxy related attacks | Acunetix
In recent years, several pieces of research have been published about attacks directly or indirectly related to reverse proxies. While implementing various reverse-proxy checks in the scanner, I started analyzing implementations of reverse proxies.
Initially, I wanted to analyze how both reverse proxies and web servers parse requests, find inconsistencies between them, and use this knowledge for some kind of bypass. Unfortunately, I got stuck analyzing web servers and application servers because there are too many possible variations. For example, the Apache web server behaves differently depending on how you connect it with PHP. Also, the implementation of a web application, framework or middleware used by a web application can influence the request parsing process as well. In the end I realized that some attacks are still little-known or completely unknown.
The goal of this research is to portray the bigger picture of potential attacks on a reverse proxy or the backend servers behind it. In the main part of the article, I will show some examples of vulnerable configurations and exploitation of attacks on various reverse proxies, but the second goal of the research is to share the raw data about various implementations of reverse proxies so you can find your own ways/tricks (depending on the backend server in each specific situation).
Terms
Actually, the research is not only about reverse proxies, but also about load balancers, cache proxies, WAFs and other intermediate servers between a user and a web application which parse and forward requests. However, I haven’t found a good term which correctly describes such a server and is well-known in the community, so I will use “reverse proxy” even when I talk about load balancers or cache proxies. I will call a web application behind a reverse proxy a back-end server. Be aware that a backend server is also called an origin server (this will make sense when we start talking about caching).
What is a reverse proxy?
How proxies work
The basic idea of a reverse proxy is quite simple. It’s an intermediate server between a user and a back-end server. The purpose of it can be quite different: it can route requests depending on the URL to various backends or it can just be there “to protect” against some attacks or simply to analyze traffic. The implementations can be different too, but the main sequence of steps is quite the same.
A reverse proxy must receive a request, process it, perform some action on it and forward it to a backend.
Processing of a request consists of several main steps:
Parsing
When a reverse proxy receives a request, it must parse it to get the verb, the path, the HTTP version, the Host header, the other headers and the body.
GET /path HTTP/1.1
Host:
Header: something
Everything may look quite simple, but if you dive into details, you will see implementations are different.
Some examples:
– If a reverse proxy supports Absolute-URI, how will it parse it? Does Absolute-URI have a higher priority than the Host header?
GET other_host_header/path HTTP/1.1
– A URL consists of scheme:[//authority]path[?query][#fragment], and browsers don’t send #fragment. But how must a reverse proxy handle #fragment?
Nginx throws the fragment away, Apache returns a 400 error (due to # in the path), and some others handle it as an ordinary symbol.
– How does it handle symbols which must be URL-encoded?
GET /[0x01] HTTP/1.1
URL decoding
According to the standards, symbols with a special meaning in the URL must be URL-encoded (%-encoding), like the double quote (") or the “greater than” sign (>). In practice, any symbol can be URL-encoded and sent in the path part. Many web servers perform URL-decoding while processing a request, so they treat the following two requests in the same way.
GET /index.php HTTP/1.1
GET %2f%69%6e%64%65%78%2e%70%68%70 HTTP/1.1
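To check what a decoding server would see, the percent-encoded path from the second request line can be decoded by hand. This is only a small sketch using Python's standard library; the variable names are illustrative.

```python
from urllib.parse import unquote

# The percent-encoded path from the request line above.
encoded_path = "%2f%69%6e%64%65%78%2e%70%68%70"

# A server that URL-decodes before routing would work with this value.
decoded_path = unquote(encoded_path)
print(decoded_path)
```

Every byte of the path is encoded here, including the slash and the dot, which is exactly the kind of input on which implementations may disagree.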
Path normalization
Many web servers support path normalization. Main cases are well-known:
/long/../path/here -> /path/here
/long/./path/here -> /long/path/here
But what about /..? For Apache, it’s equivalent to /../, but for Nginx it means nothing.
/long/path/here/.. -> /long/path/ – Apache
/long/path/here/.. -> /long/path/here/.. – Nginx
The same with // (“empty” directory). Nginx converts it to just one slash /, but, if it’s not the first slash, Apache treats it as a directory.
//long//path//here -> /long/path/here – Nginx
//long/path/here -> /long/path/here – Apache
/long//path/here -> /long//path/here – Apache
Some web servers also support additional (weird) features, for example path parameters (/..;/ is valid for Tomcat and Jetty) or traversal with a backslash (\..\).
b) Applying rules and performing actions on a request
Once a request is processed, the reverse proxy can perform some actions on it according to its configuration. It is important to note that in many cases the rules of a reverse proxy are path (location) based: if the path is pathA, do one thing; if it is pathB, do another.
Depending on the implementation or the configuration, a reverse proxy applies rules based on a processed (parsed, URL-decoded, normalized) path or, more rarely, on an unprocessed path. It’s also important for us to note whether matching is case-sensitive. For example, will the following paths be treated as equal by a reverse proxy?
/path1/ == /Path1/ == /p%61th1/ == /lala/../path1/
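The question above can be made concrete with a toy model. The sketch below assumes a hypothetical proxy that URL-decodes, normalizes and (optionally) lowercases a path before matching; `processed` is an illustrative helper, not the behavior of any particular product.

```python
import posixpath
from urllib.parse import unquote


def processed(path: str, case_insensitive: bool = True) -> str:
    """Toy 'processed path': URL-decode, normalize, optionally lowercase."""
    decoded = unquote(path)
    # posixpath.normpath resolves "." and ".." and drops the trailing slash.
    normalized = posixpath.normpath(decoded)
    return normalized.lower() if case_insensitive else normalized


candidates = ["/path1/", "/Path1/", "/p%61th1/", "/lala/../path1/"]
views = {p: processed(p) for p in candidates}
case_sensitive_views = {processed(p, case_insensitive=False) for p in candidates}
```

Under this model all four paths collapse to one value when matching is case-insensitive, and to two values when it is case-sensitive. A real proxy may skip any of these steps, which is where the inconsistencies come from.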
c) Forwarding to a back-end
The reverse proxy has processed a request, found appropriate rules for it and performed the necessary actions. Now it must send (forward) it to a backend. Will it send the processed request or the initial request? Obviously, if it has modified the request, then it sends the modified version, but in this case it must perform all the necessary steps, for example URL-encoding of special symbols. But if the reverse proxy just forwards all requests to a single backend, maybe forwarding the initial request is a good idea?
As you can see all these steps are quite obvious and there are not so many variations. Still, there are differences in implementations, which we, as attackers, can use for our goals.
Therefore, the idea of all the attacks described below is that a reverse proxy processes a request, finds and applies rules and forwards it to a backend. If we find an inconsistency between the way a reverse proxy processes a request and the way a backend server processes it, we are able to create a request (path) which is interpreted as one path by the reverse proxy and as a completely different path by the backend. So, we will be able to bypass or to forcefully apply some rules of the reverse proxy.
Here are some examples
Nginx
Nginx is a well-known web server, but is also very popular as a reverse proxy. Nginx supports Absolute-URI with an arbitrary scheme and higher priority than a Host header. Nginx parses, URL-decodes and normalizes a request path. Then it applies location-based rules depending on the processed path.
But it looks like Nginx has two main behaviors and each of them has its own interesting features:
With trailing slash
location / {
    proxy_pass backend_server/;
}
In this configuration, Nginx forwards all requests to the `backend_server`. It sends the processed request to the backend, meaning that Nginx must URL-encode the necessary symbols. The interesting thing for an attacker is that Nginx doesn’t encode all the symbols which browsers usually do. For example, it doesn’t URL-encode ‘ ” < >.
Even if there is a web application (back-end server) which takes a parameter from a path and which is vulnerable to XSS, an attacker normally cannot exploit it, because modern browsers (except for dirty tricks with IE) URL-encode these symbols. But if Nginx sits in front as a reverse proxy, an attacker can force a user to send a URL-encoded XSS payload in the path. Nginx decodes it and sends the decoded version to the backend server, which makes exploitation of the XSS possible.
Browser -> -> Nginx -> backend_server/path/<"xss_here">/ -> WebApp
Without trailing slash
location / {
    proxy_pass backend_server;
}
The only difference between this config and the previous one is the lack of the trailing slash. Although seemingly insignificant, it forces Nginx to forward an unprocessed request to the backend. So if you send /any_path/../to_%61pp#/path2, after processing the request, Nginx will try to find a rule for `/to_app`, but it will send /any_path/../to_%61pp#/path2 to the backend. Such behavior is useful for finding inconsistencies.
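The two behaviors can be summarized in a toy model. The sketch below is an assumption-laden simplification of what the text describes, not Nginx's real algorithm: `nginx_like` is a hypothetical helper, query strings are ignored, and the re-encoding step simply leaves `' " < >` untouched.

```python
import posixpath
from urllib.parse import quote, unquote


def nginx_like(raw_target: str, trailing_slash: bool) -> tuple[str, str]:
    """Return (path used for location matching, path forwarded upstream)."""
    # Processing: drop the fragment, URL-decode, normalize.
    without_fragment = raw_target.split("#", 1)[0]
    processed = posixpath.normpath(unquote(without_fragment))
    if trailing_slash:
        # Processed path is forwarded; ' " < > are left unencoded.
        forwarded = quote(processed, safe="/'\"<>")
    else:
        # Request target is forwarded exactly as it was received.
        forwarded = raw_target
    return processed, forwarded


raw = "/any_path/../to_%61pp#/path2"
with_slash = nginx_like(raw, trailing_slash=True)
without_slash = nginx_like(raw, trailing_slash=False)
```

In both modes the rule is looked up for the processed path, but only the variant without the trailing slash hands the backend the original, unprocessed bytes.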
Haproxy
Haproxy is a load balancer (with HTTP support). It doesn’t make much sense to compare it to Nginx, but it will give you an idea of a different approach.
Haproxy performs minimal processing of a request, so there is no “real” parsing, URL-decoding or normalization. It doesn’t support Absolute-URI either.
Therefore, it takes everything (with a few exceptions) between the verb and the HTTP version (GET !i<@>?lala=#anything HTTP/1.1) and, after applying rules, forwards it to a backend server. However, it supports path-based rules and allows you to modify requests and responses.
How proxies are used
While I was working on this research, analyzing various configurations of reverse proxies, I came to the conclusion that we can both bypass and apply rules of a reverse proxy. Therefore, to understand the real potential of reverse proxy related attacks, we must have a look at their abilities.
First of all, a reverse proxy has access to both a request and a response (including those which it sends/receives from a backend server). Secondly, we need a good understanding of all the features which a reverse proxy supports and how people configure them.
How can a reverse proxy handle a request?
1. Routing to endpoint. A reverse proxy receives a request on one path (/app1/) but forwards it to a completely different one (/any/path/app2/) on a backend. Or it forwards the request to a specific backend depending on the Host header value.
2. Rewriting path/query. This is similar to the previous one, but usually involves different internal mechanisms (regexp).
3. Denying access. A reverse proxy blocks requests to a certain path.
4. Headers modification. In some cases, a reverse proxy may add or change headers of the request. It could be a cool feature for an attacker, but it’s hard to exploit with a black box approach.
How can a reverse proxy handle a response?
1. Cache. Many reverse proxies support caching of responses.
2. Headers modification. Sometimes a reverse proxy adds or modifies response headers (even security-related ones), because it cannot be done on a backend server.
3. Body modification. Reverse proxies will sometimes modify the body too. Edge Side Includes (ESI) is an example of when this can happen.
All of this is important for seeing more potential attacks, but also for understanding that in many cases we don’t need to bypass rules but to apply them. This leads to a new type of attack on reverse proxies: proxy rules misuse.
Server-Side attacks
Bypassing restriction
This is the best-known case of reverse proxy related attacks.
When someone restricts access (3. Denying access), an attacker needs to bypass the restriction.
Here is an example.
Let’s imagine that Nginx is used as a reverse proxy and Weblogic as a backend server. Nginx blocks access to the administrative interface of Weblogic (everything that starts with /console/).
Configuration:
location /console/ {
    deny all;
    return 403;
}
location / {
    proxy_pass weblogic;
}
As you can see, `proxy_pass` here is without trailing slash, which means that a request is forwarded unprocessed. Another important thing to bypass the restriction is that Weblogic treats `#` as a usual symbol. Therefore, an attacker can access the administrative interface of Weblogic by sending such a request:
GET /#/../console/ HTTP/1.1
When Nginx starts processing the request, it throws away everything after #, so the /console/ rule is not applied. It then forwards the same unprocessed path (/#/../console/) to Weblogic. Weblogic processes the path and, after path normalization, we are left with /console/.
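The disagreement can be reproduced with two tiny path views. This is a minimal sketch of the scenario as described, not of the real products: `proxy_view`, `backend_view` and `blocked` are hypothetical helpers, and `posixpath.normpath` drops the trailing slash, so /console/ shows up as /console.

```python
import posixpath


def proxy_view(raw_path: str) -> str:
    # Toy proxy: discard everything from '#', then normalize.
    return posixpath.normpath(raw_path.split("#", 1)[0])


def backend_view(raw_path: str) -> str:
    # Toy backend: '#' is an ordinary character, then normalize.
    return posixpath.normpath(raw_path)


def blocked(raw_path: str) -> bool:
    # The deny rule is evaluated against the proxy's view only.
    view = proxy_view(raw_path)
    return view == "/console" or view.startswith("/console/")


attack = "/#/../console/"
print(blocked(attack), backend_view(attack))
```

The proxy sees only the root path and lets the request through, while the backend resolves the same bytes to the console path.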
Request Misrouting
This is about “1. Routing to endpoint” and, in some cases, “2. Rewriting path/query”.
When a reverse proxy forwards requests only to one endpoint, it can create the illusion that an attacker cannot reach other endpoints on a backend or a completely different backend.
Example 1.
Let’s have a look at a similar combination: Nginx+Weblogic. In this case, Nginx proxies requests only to a certain endpoint of Weblogic (weblogic/to_app). So only requests which come to the path /to_app on Nginx are forwarded to the same path on Weblogic. In this situation, it may look like Weblogic’s administrative interface (console) or other paths are not accessible to an attacker.
location /to_app {
    proxy_pass weblogic;
}
In order to misroute requests to other paths, we need to know two things again. Firstly, the same as in the example above – `proxy_pass` is without a trailing slash.
Secondly, Weblogic supports “path parameters”. For example, /path/to/app/here;param1=val1, and param1 will be accessible in a web app through the API.
I think many are aware of this feature (especially after Orange Tsai’s presentation at BlackHat) in the context of Tomcat. Tomcat allows really “weird” traversals like /..;/..;/. But Weblogic treats path parameters differently: it treats everything after the first ; as a path parameter. Does it mean that this feature is useless for an attacker?
Nope. Let’s have a look at this “magic” which allows accessing any path on Weblogic in this configuration.
GET /any_path_on_weblogic;/../to_app HTTP/1.1
When Nginx receives such a request, it normalizes the path. From /any_path_on_weblogic;/../to_app it gets /to_app, which matches the rule. But Nginx forwards /any_path_on_weblogic;/../to_app, and Weblogic, during parsing, treats everything after ; as a path parameter, so Weblogic sees /any_path_on_weblogic. If necessary, an attacker can go “deeper” by increasing the number of /../ after ;.
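The same trick can be sketched with two toy views of the path. The helpers below are hypothetical and only model the behavior described in the text: the proxy normalizes the whole path for rule matching, while the backend cuts everything from the first `;`.

```python
import posixpath


def proxy_match_path(raw_path: str) -> str:
    # Toy proxy: ';' is an ordinary character, so "name;" is just a
    # directory that the following ".." removes again.
    return posixpath.normpath(raw_path)


def backend_path(raw_path: str) -> str:
    # Toy backend: everything after the first ';' is a path parameter.
    return raw_path.split(";", 1)[0]


shallow = "/any_path_on_weblogic;/../to_app"
# Going "deeper": one extra "../" per extra directory level before ';'.
deeper = "/dir/any_path_on_weblogic;/../../to_app"
```

Both requests match the /to_app rule on the proxy side, yet the backend path is whatever the attacker placed before the semicolon.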
Example 2.
This one is about a “bug” of Nginx. But this “bug” is just a consequence of how Nginx works (so will not be fixed)
A rule location /to_app means that all paths which start with /to_app (prefix) fall under the rule. So, /to_app, /to_app/, /to_app_anything (including special symbols) fall under it. Also, everything after this prefix(/to_app) will be taken and then concatenated with value in proxy_pass.
Look at the following config. After processing `/to_app_anything`, Nginx will forward the request to server/any_path/_anything.
location /to_app {
    proxy_pass server/any_path/;
}
If we put both features together, we will see that we can go to any path one level higher on almost any backend. We just need to send:
GET /to_app../other_path HTTP/1.1
Nginx applies the /to_app rule, takes everything after the prefix (../other_path), concatenates it with the value from `proxy_pass`, and so forwards server/any_path/../other_path to the backend. If the backend normalizes the path, we can reach a completely different endpoint.
Actually, this trick is similar to a well-known alias trick. However, the idea here is to show an example of possible misusing of reverse proxy’s features.
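The prefix-and-concatenate behavior from Example 2 can be modeled in a few lines. This is a sketch under the assumptions stated in the text (prefix match on the normalized path, remainder appended to the upstream path); `forwarded_path` and the constants are illustrative names, not Nginx internals.

```python
import posixpath
from typing import Optional

LOCATION_PREFIX = "/to_app"    # from "location /to_app"
UPSTREAM_PATH = "/any_path/"   # path part of the proxy_pass value


def forwarded_path(request_path: str) -> Optional[str]:
    """Return the upstream path, or None if the location does not match."""
    normalized = posixpath.normpath(request_path)
    if not normalized.startswith(LOCATION_PREFIX):
        return None
    # Everything after the prefix is appended to the upstream path.
    return UPSTREAM_PATH + normalized[len(LOCATION_PREFIX):]


upstream = forwarded_path("/to_app../other_path")
# A backend that normalizes paths resolves the ".." one level up.
backend_sees = posixpath.normpath(upstream)
```

The segment "to_app.." is not a traversal for the proxy, so normalization leaves it alone; the ".." only becomes meaningful after the prefix has been cut off.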
Example 3.
As I mentioned before, it’s a common case when a reverse proxy routes requests to different backends depending on the Host header in a request.
Let’s have a look at a Haproxy configuration which says that all requests with a particular value in the Host header must be proxied to the backend example1_backend (192.168.78.1:9999).
frontend -in
acl host_example1 hdr(host) -i
use_backend example1_backend if host_example1
backend example1_backend
server server1 192.168.78.1:9999 maxconn 32
Does such a configuration mean that an attacker cannot access other virtual hosts of a backend server? It may look like that, but an attacker can easily do it. As mentioned above, Haproxy doesn’t support Absolute-URI, but most web servers do. When Haproxy receives an Absolute-URI, it forwards it unprocessed to a backend. Therefore, just by sending the following request, we can easily access other virtual hosts of the backend server.
GET unsafe-value/path/ HTTP/1.1
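The routing mismatch can be sketched as two functions that pick a host from the same request. Everything here is hypothetical: the host names `allowed.example` and `internal.example` are placeholders, and the functions only model the described behavior (a proxy that looks at the Host header only, a backend that prefers the host in an absolute URI).

```python
from urllib.parse import urlsplit


def proxy_routing_host(request_target: str, host_header: str) -> str:
    # Toy proxy without Absolute-URI support: only Host is consulted.
    return host_header.lower()


def backend_vhost(request_target: str, host_header: str) -> str:
    # Toy backend: a host inside an absolute URI wins over the Host header.
    parts = urlsplit(request_target)
    if parts.scheme and parts.netloc:
        return parts.netloc.lower()
    return host_header.lower()


target = "http://internal.example/path/"   # hypothetical absolute URI
host = "allowed.example"                   # hypothetical allowed Host value
```

The proxy routes on the Host header and is satisfied, while the backend serves the virtual host named in the request line.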
Is it possible to force a reverse proxy to connect to an arbitrary backend server? I’d say that in most cases (Nginx, Haproxy, Varnish), this cannot be done, but Apache (in some configurations/versions) is vulnerable to it. As Apache “parses” a host value from ProxyPass, we can send something like GET HTTP/1. 1, so Apache sees a value “ and sends the request to “ (SSRF). Here you can see an example of such vulnerability.
Client-Side attacks
If we have a look at reverse proxy features again, we can see that all the response-related ones have potential for client-side attacks only. That doesn’t make them useless; I’d say quite the opposite. But client-side attacks place additional limits on the usable inconsistencies between the reverse proxy and the web server, because the browser processes a request before sending it.
Browser processing
In a client-side attack, an attacker needs to force a victim’s browser to send a special request, which will influence a response, to a server. But the browser follows the specifications and processes the path before sending it: the browser parses the URL (e.g. throws away the fragment part), URL-encodes all the necessary symbols (with some exceptions) and normalizes the path. Therefore, to perform such attacks, we can only use a “valid” request which must fit into the inconsistency between three components (browser, reverse proxy, backend server).
Of course, there are differences in browser implementations, plus some features which still allow us to find such inconsistencies:
For example, Chrome and IE don’t decode `%2f`, so a path like /path/anything/..%2f../ will not be normalized.
Older versions of Firefox didn’t URL-decode special symbols before normalization, but now it behaves in a similar way to Chrome.
There is information that Safari doesn’t URL-decode a path, so we can force it to send a path like /path/%2e%2e/another_path/.
Also, IE, as usual, has some magic: it doesn’t process a path when it’s redirected with a Location header.
Misusing Header modification
A common task for reverse proxy is to add, delete or modify headers from a response of a backend. In some situations, it’s much easier than modification of the backend itself. Sometimes it involves modification of security-important headers. So as attackers, we may want to force a reverse proxy to apply such rules to wrong responses (from wrong backend locations) and then use it for attacks on other users.
Let’s imagine that we have Nginx and Tomcat as a backend. Tomcat, by default, sets header `X-Frame-Options: deny`, so a browser cannot open it in an iframe. For some reason, a part of the web application (/iframe_safe/) on the Tomcat must be accessible through iframe, so Nginx is configured to delete the header `X-Frame-Options` for this part. However, there is no potential for clickjacking attacks on iframe_safe. Here is the configuration:
location /iframe_safe/ {
    proxy_pass tomcat_server/iframe_safe/;
    proxy_hide_header "X-Frame-Options";
}
location / {
    proxy_pass tomcat_server/;
}
However, as attackers, we can make a request which falls under the iframe_safe rule, but it will be interpreted by Tomcat as a completely different location. Here it is:
HTTP Request Smuggling: Abusing Reverse Proxies – SANS …
HTTP request smuggling is a special web application attack that tries to exploit differences between web servers and their reverse proxies. When successful, it can allow an attacker to submit an HTTP request in the context of another user’s session. In a way, it’s analogous to sneaking malicious traffic past a firewall with overlapping fragments when the firewall and target host interpret that kind of traffic differently – but let’s back up for a minute.
Any website that serves a lot of visitors (for certain values of $A_LOT) uses a reverse proxy. Website visitors hit this reverse proxy first, whether they realize it or not. This device may block attacks, load balance across a set of web servers, modify request parameters, or shape traffic in some other way. During normal business, no one even notices that they’re communicating with two distinct servers.
A typical reverse proxy, web server relationship
This all works very well under normal conditions, but what do hackers care about “normal conditions”? Specifically, we’re interested here in what happens with malformed headers in HTTP requests. A normal request might look like one of these:
POST /admin HTTP/1.1
Host:
User-Agent: curl/7.68.0
Accept: */*
Content-Length: 9
Content-Type: application/x-www-form-urlencoded

action=do

POST /admin HTTP/1.1
Transfer-Encoding: chunked

9
action=do

Here we see a POST request to a web server using Content-Length: 9 alongside a POST request to the same web server using Transfer-Encoding: chunked
The first uses Content-Length to tell the server that there are 9 bytes of data after the last header. The second uses Transfer-Encoding to say that a chunk of 9 bytes follows. They effectively do the same thing. But what happens when we send both?
POST /admin HTTP/1.1
Content-Length: 18
Transfer-Encoding: chunked

9
action=dothings

A POST request to a web server using Content-Length: 18 and Transfer-Encoding: chunked
So what happens? The answer here is our attack path: “it depends.” By Nerd Law 2616 (sometimes called RFC 2616), only Content-Length or Transfer-Encoding should be used. Since developers, logically, expect only one or the other, the behavior of any given web server or proxy is anyone’s guess. It may obey Content-Length and see “9\r\naction=dothings”. It may obey Transfer-Encoding and only see “action=do”.
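The two readings of that body can be reproduced directly. This is a minimal sketch with hypothetical helper names; the chunked reader only handles the first chunk, which is all the example needs.

```python
def body_by_content_length(stream: bytes, content_length: int) -> bytes:
    # A parser that trusts Content-Length takes exactly that many bytes.
    return stream[:content_length]


def body_by_chunked(stream: bytes) -> bytes:
    # A parser that trusts Transfer-Encoding reads a hex size line,
    # then that many bytes of chunk data (first chunk only).
    size_line, _, rest = stream.partition(b"\r\n")
    size = int(size_line, 16)
    return rest[:size]


# Bytes that follow the headers in the ambiguous request.
stream = b"9\r\naction=dothings"

seen_with_content_length = body_by_content_length(stream, 18)
seen_with_chunked = body_by_chunked(stream)
```

The same 18 bytes yield two different bodies, and the six bytes that the chunked reader leaves behind are the seed of the smuggling attack.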
The fun starts when the reverse proxy and the web server disagree!
In our previous example, we might get something mildly interesting to happen when the two servers interpret our message differently. But what about a request like this?
POST /admin HTTP/1.1
Cookie: session=8675309jigyn
Content-Length: 38
Transfer-Encoding: chunked

0
GET /admin?action=dothings HTTP/1.1

A POST request where Content-Length and Transfer-Encoding have differing lengths
If we’re lucky, the reverse proxy and the web server will each interpret this request in a different way. If we’re extra lucky, our request will be followed up by an authenticated request from someone with privileges. If we’re really, super lucky, part of our request will be interpreted as part of that authenticated request!
GET /admin?action=dothings HTTP/1.1

GET /admin?action=nothing HTTP/1.1
User-Agent: SafariMaybe/77
Content-Type: application/x-www-form-urlencoded

Here we see a malicious web request with an orphan line and the authenticated user’s web request
Now what happens with our POST request? We don’t care. We don’t have a session, so we don’t expect to be able to accomplish anything directly. But that second request? We’re hoping the last line of our request gets orphaned and then adopted by the authenticated request, resulting in an effective request that looks like this:
GET /admin?action=dothings HTTP/1.1
GET /admin?action=nothing HTTP/1.1
Content-Type: application/x-www-form-urlencoded

Effective request processed by web server
Extra GET request smuggled into an otherwise legitimate web request
Neat! Now we have a GET to do our things, and it’s going in with the authenticated user’s cookie! As long as that web server doesn’t drop the request for having two GET lines (Nerd Law illegal in 147 countries), we just might get our action to fire!
A smuggled request associating with a legitimate request
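The byte arithmetic behind the smuggled request can be checked with a toy chunked reader. This is a simplified sketch, not a compliant HTTP parser: like the example request, it treats the "0" size line alone as the end of the body, and `split_chunked` is a hypothetical helper.

```python
def split_chunked(stream: bytes) -> tuple[bytes, bytes]:
    """Return (decoded body, bytes left unread) for a chunked stream."""
    body = b""
    while True:
        size_line, _, stream = stream.partition(b"\r\n")
        size = int(size_line, 16)
        if size == 0:
            # Zero-sized chunk: the body ends, the rest stays unread.
            return body, stream
        body += stream[:size]
        stream = stream[size + 2:]  # skip chunk data and its CRLF


smuggled = b"GET /admin?action=dothings HTTP/1.1"
attacker_body = b"0\r\n" + smuggled

# A front end trusting Content-Length forwards all 38 bytes ...
forwarded = attacker_body[:38]
# ... while a back end trusting Transfer-Encoding stops at the "0" chunk.
body, leftover = split_chunked(forwarded)
```

The leftover bytes stay on the connection, so the back end would treat them as the beginning of whatever request arrives next.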
So there you have it! HTTP request smuggling isn’t as cut and dry as Shell Shock or other branded vulnerabilities with theme songs, but, at the same time, the automated scanner is not going to beat you to finding it. Happy hunting!
What is a Reverse Proxy Server? | NGINX
A proxy server is a go‑between or intermediary server that forwards requests for content from multiple clients to different servers across the Internet. A reverse proxy server is a type of proxy server that typically sits behind the firewall in a private network and directs client requests to the appropriate backend server. A reverse proxy provides an additional level of abstraction and control to ensure the smooth flow of network traffic between clients and servers.
Common uses for a reverse proxy server include:
Load balancing – A reverse proxy server can act as a “traffic cop,” sitting in front of your backend servers and distributing client requests across a group of servers in a manner that maximizes speed and capacity utilization while ensuring no one server is overloaded, which can degrade performance. If a server goes down, the load balancer redirects traffic to the remaining online servers.
Web acceleration – Reverse proxies can compress inbound and outbound data, as well as cache commonly requested content, both of which speed up the flow of traffic between clients and servers. They can also perform additional tasks such as SSL encryption to take load off of your web servers, thereby boosting their performance.
Security and anonymity – By intercepting requests headed for your backend servers, a reverse proxy server protects their identities and acts as an additional defense against security attacks. It also ensures that multiple servers can be accessed from a single record locator or URL regardless of the structure of your local area network.
How Can NGINX Plus Help?
NGINX Plus and NGINX are the best-in-class load‑balancing solutions used by high‑traffic websites such as Dropbox, Netflix, and Zynga. More than 400 million websites worldwide rely on NGINX Plus and NGINX Open Source to deliver their content quickly, reliably, and securely.
As a software‑based reverse proxy, not only is NGINX Plus less expensive than hardware‑based solutions with similar capabilities, it can be deployed in the public cloud as well as in private data centers, whereas cloud infrastructure vendors generally do not allow customer or proprietary hardware reverse proxies in their data centers.
Frequently Asked Questions about reverse proxy attack
Why do you need a reverse proxy?
Reverse proxies help increase performance, reliability, and security. They provide load balancing for web applications and APIs. They can offload services from applications to improve performance through SSL acceleration, caching, and intelligent compression.