Detecting the use of "curl | bash" server side

Installing software by piping from curl to bash is obviously a bad idea, and a knowledgeable user will most likely check the content first. So wouldn't it be great if a malicious payload would only render when piped to bash? A few people have tried this before by checking for the curl user agent, which is by no means fail-safe - the user may simply curl the URL on the command line, revealing your malicious code. Luckily the behaviour of curl (and wget) changes subtly when piped into bash. This allows an attacker to present two different versions of their script depending on the context :)

It's not that the HTTP requests from curl when piped to bash look any different from those piped to stdout; in fact, for all intents and purposes they are identical:

# curl -vv http://pluver.xqi.cc/setup.bash
* Hostname was NOT found in DNS cache
*   Trying 69.28.82.189...
* Connected to xqi.cc (69.28.82.189) port 80 (#0)
> GET /setup.sh HTTP/1.1
> User-Agent: curl/7.35.0
> Host: xqi.cc
> Accept: */*
>

The key difference is in the time it takes for the contents of large HTTP responses to be ingested by bash.

Passive detection using a short delay

Execution in bash is performed line by line, so the speed at which bash can ingest data is limited by the speed of execution of the script. This means that if we return a sleep at the start of our script, the TCP send stream will pause while we wait for the sleep to execute. This pause can be detected and used to render different content streams.

Unfortunately it's not just a simple case of wrapping a socket.send("sleep 10") in a timer and waiting for a send call to block. The send and receive TCP streams in Linux are buffered on a per-socket basis, so we have to fill up these buffers before the call to send data will block. We know the buffer is full when the receiving client replies to a packet with a window size of 0 (Win=0 in Wireshark).
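To get a feel for how much data a socket swallows before a send stalls, here is a minimal sketch. It is only an illustration: it uses a local socket pair from socket.socketpair() rather than a real TCP connection to a curl client, and the function name and chunk size are made up for the example. The peer never reads, so the non-blocking sender eventually reports that it would block.

```python
import socket


def bytes_until_send_blocks(chunk_size=4096):
    """Count how many bytes a socket accepts before send() would block.

    Assumption: a local socket pair stands in for the server/client
    connection; real TCP buffers behave similarly but differ in size.
    """
    sender, receiver = socket.socketpair()
    sender.setblocking(False)  # raise BlockingIOError instead of stalling
    chunk = b"\x00" * chunk_size
    total = 0
    try:
        while True:
            try:
                # send() may accept only part of the chunk once buffers fill
                total += sender.send(chunk)
            except BlockingIOError:
                # Both buffers are full: this is where a blocking send stalls
                return total
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print("bytes accepted before blocking:", bytes_until_send_blocks())
```

The number printed depends on the operating system and its buffer settings, which is exactly why the amount of padding needed varies between clients.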

Filling the TCP buffers

To detect a pause in execution we need to fill all the buffers before the pipe to bash. The flow of data from the HTTP response looks like this:

TCP buffer flow chart

Both the send and receive buffer sizes on Linux are "auto tuned" per socket, which means their size can vary (check /proc/sys/net/ipv4/tcp_rmem and /proc/sys/net/ipv4/tcp_wmem to see just how much). We can control the send buffer as it's on the server side, but we can't do anything about the receive buffer. By fixing the size of the send buffer we can reduce the overall variance in the amount of data we need to send before we receive a window size of 0 from the client. A smaller fixed-size send buffer helps to prevent the TCP receive buffer from growing.

The buffer size can be set like so (87380 is the default for Ubuntu 14.04 LTS):

sck.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 87380)

When tested with Ubuntu you need to send about 1 MB of data to fill up the receive buffer - remove the restriction on the send buffer and this easily doubles. The question is what to fill the TCP buffer up with, as a curl request without a pipe to bash will render in the console.
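As a rough sketch of fixing the send buffer, the helper below applies the setsockopt call from above to a socket and reads the value back. The helper name is hypothetical. On Linux the value reported by getsockopt is typically double what was requested, because the kernel reserves space for bookkeeping, so the read-back is only a sanity check.

```python
import socket

SEND_BUFFER_BYTES = 87380  # the value quoted in the text above


def fix_send_buffer(sock, size=SEND_BUFFER_BYTES):
    """Pin the send buffer size of a socket and return what the OS reports.

    Assumption: setting SO_SNDBUF explicitly stops the kernel from
    auto-tuning this socket's send buffer (the behaviour on Linux).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print("reported send buffer:", fix_send_buffer(sck))
    finally:
        sck.close()
```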

Hiding data from the terminal

The only character you can really use to fill the buffer is a null byte, as it won't render in most consoles. It also won't render in Chrome when the content type text/html is specified. As we don't know the content length, data is transferred with chunked encoding, with each chunk being a string of null bytes the same size as the TCP send buffer.
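For reference, a chunk in HTTP/1.1 chunked encoding is the payload length in hexadecimal, a CRLF, the payload, and another CRLF. The sketch below shows one way the chunks could be framed; the function names are invented for the example and are not taken from the original server.

```python
def encode_chunk(payload: bytes) -> bytes:
    """Frame a payload as one HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    return b"%x\r\n" % len(payload) + payload + b"\r\n"


def null_padding_chunk(size: int = 87380) -> bytes:
    """A chunk of null bytes; the default size mirrors the send buffer above."""
    return encode_chunk(b"\x00" * size)


# A zero-length chunk marks the end of a chunked response body.
LAST_CHUNK = b"0\r\n\r\n"

if __name__ == "__main__":
    print(encode_chunk(b"sleep 10\n"))
    print(len(null_padding_chunk()))
```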

In the end we have an HTTP server that generates responses looking like this:

HTTP/1.1 200 OK
Host: xqi.cc
Transfer-Encoding: chunked
Content-type: text/html; charset=us-ascii

sleep 10                                <-- chunk #1
0x0000000000000000000000000000000000... <-- chunk #2
0x0000000000000000000000000000000000... <-- chunk #3
0x0000000000000000000000000000000000... <-- chunk #4
...

Detecting bash

If you chart the time at which each chunk is sent, and do this for both scenarios, it becomes easy to determine which outputs were piped through bash. For curl | bash you can see a clear jump of just under 10 seconds when the sleep command is executed (the exact location varies according to the size of the TCP receive buffer on the client side).

This works well, and as long as the connection between the server and the client is stable you could happily reduce the sleep command to less than a second. You can also disguise the delay as another slow command (ping, find, etc.). The exact time the command takes to run doesn't matter as long as the server is able to detect a sudden jump in the cumulative transmission time.

This distinctive pattern can be identified by taking the differences in times between chunk transmissions, finding the maximum value, removing it from the list, calculating the variance of the remaining data, and ensuring both that the variance is low (this implies a stable connection) and that the maximum difference is high. If this pattern is identified you can send another HTTP chunk containing your malicious script.
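The check described above could be sketched roughly as follows. The function name and both thresholds are assumptions picked for illustration (a 10 second sleep and a quiet link); they are not values from the original server and would need tuning.

```python
import statistics


def looks_like_bash(send_times, min_jump=5.0, max_variance=0.1):
    """Guess whether the response is being piped into bash.

    send_times: timestamps (seconds) at which each chunk send completed.
    min_jump and max_variance are hypothetical thresholds.
    """
    # Differences in time between consecutive chunk transmissions
    gaps = [later - earlier for earlier, later in zip(send_times, send_times[1:])]
    if len(gaps) < 3:
        return False  # not enough samples to judge stability
    largest = max(gaps)
    rest = list(gaps)
    rest.remove(largest)  # drop a single occurrence of the maximum
    # Low variance in the rest implies a stable connection;
    # a large maximum implies the sleep was actually executed.
    return largest >= min_jump and statistics.pvariance(rest) <= max_variance


if __name__ == "__main__":
    piped = [0.0, 0.1, 0.2, 10.2, 10.3, 10.4]
    plain = [0.0, 0.1, 0.2, 0.3, 0.4]
    print(looks_like_bash(piped), looks_like_bash(plain))
```

Requiring low variance as well as a large jump means an unreliable connection falls through to the harmless payload rather than producing a false positive.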

Demo

Putting everything together, you end up with a small Python-based server that can deliver different payloads based on what the content is piped to (this also works with "wget -o /dev/null -O -"). If for some reason the connection is unreliable (the variance is high) or you request the file via a browser, the non-malicious payload will display:

Demo (animated GIF)

Download source code

Detecting a server detecting curl | bash

So how do you detect if a server is doing this? If the detection is done via a simple delay then you could try either looking for large scripts containing lots of padding or running:

curl https://example.com/setup.bash | (sleep 3; cat)

However, this is by no means foolproof, as an attacker can use other methods (e.g. HTTP/DNS callbacks) or set multiple passive delays. The better solution is never to pipe untrusted data streams into bash. If you still want to run untrusted bash scripts, a better approach is to pipe the contents of the URL into a file, review the contents on disk and only then execute it.
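When reviewing a script saved to disk, the null-byte padding described earlier is easy to miss in a terminal or editor. A small sketch like the one below can flag it; the function name and the idea of using a simple ratio are assumptions made for the example, and a clean result proves nothing about other detection methods.

```python
def null_byte_ratio(data: bytes) -> float:
    """Fraction of a downloaded script that consists of null bytes.

    Ordinary shell scripts contain none, so anything above zero deserves
    a closer look. This only catches the padding trick described here.
    """
    if not data:
        return 0.0
    return data.count(b"\x00") / len(data)


if __name__ == "__main__":
    import sys

    # Usage: python check_padding.py downloaded_script
    with open(sys.argv[1], "rb") as handle:
        print("null byte ratio:", null_byte_ratio(handle.read()))
```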