How can wkhtmltopdf be used without introducing a security vulnerability?

Background

Per the project website, wkhtmltopdf is a "command line tool to render HTML into PDF using the Qt WebKit rendering engine. It runs entirely "headless" and does not require a display or display service."

The website also states that "Qt 4 (which wkhtmltopdf uses) hasn’t been supported since 2015, the WebKit in it hasn’t been updated since 2012."

And finally, it makes the recommendation "Do not use wkhtmltopdf with any untrusted HTML – be sure to sanitize any user-supplied HTML/JS, otherwise it can lead to complete takeover of the server it is running on!"


Context

My intention is to provide wkhtmltopdf as part of an application to be installed on a Windows computer. This may or may not be relevant to the question.


Qualifiers / Additional Information

  • A flag is provided by wkhtmltopdf to disable JavaScript (--disable-javascript). This question assumes that this flag functions correctly and therefore treats all <script> tags as benign. They are of no concern.
  • This question is not related to the invocation of wkhtmltopdf – the source HTML will be provided via a file (not the CLI / STDIN) and the actual command to run wkhtmltopdf has no chance of being vulnerable.
  • Specifically, this question relates to "untrusted HTML" and "sanitize any user-supplied HTML/JS".
  • Any malicious user that is able to send "untrusted" HTML to this application will not receive the resultant PDF back. That PDF will only temporarily exist for the purpose of printing and then be immediately deleted.
  • Even someone with complete working knowledge of the wkhtmltopdf/WebKit/Qt source code cannot state with certainty that the code has zero vulnerabilities, or how to safeguard against unknown ones. This question is not seeking guarantees, just an understanding of the known approaches to compromising this or similar software.
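
For reference, the invocation assumed by the qualifiers above might look like the following minimal sketch. The helper names (build_command, render) and the 30-second timeout are hypothetical. The flags other than --disable-javascript are ones I believe are listed in the wkhtmltopdf 0.12.x usage text, but they should be confirmed against `wkhtmltopdf --extended-help` for the build being shipped. None of them protects against bugs in the rendering engine itself.

```python
import subprocess


def build_command(html_path, pdf_path, binary="wkhtmltopdf"):
    """Return an argument list with the restrictive flags switched on."""
    return [
        binary,
        "--disable-javascript",         # the flag named in the question
        "--disable-local-file-access",  # assumption: blocks file:// resources referenced by the page
        "--disable-plugins",            # assumption: turns off installed plugins
        "--no-images",                  # assumption: only wanted if images are ruled out entirely
        html_path,
        pdf_path,
    ]


def render(html_path, pdf_path, timeout_seconds=30):
    """Run wkhtmltopdf without a shell; the child is killed if it exceeds the timeout."""
    subprocess.run(build_command(html_path, pdf_path), check=True, timeout=timeout_seconds)
```

Passing the arguments as a list (no shell) keeps the command itself out of scope, as stated above; the open question is only what the HTML file contains.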

Questions

What is the goal of sanitization in this context? Is the goal to guard against unexpected external resources? (e.g. <iframe>, <img>, <link> tags). Or are there entirely different classes of vulnerabilities that we can’t even safely enumerate? For instance, IE6 could be crashed with a simple line of HTML/CSS… could some buffer overflow exist that causes this old version of WebKit to be vulnerable to code injection?

What method of sanitizing should be employed? Should we whitelist HTML tags/attributes and CSS properties/values? Should we remove all references to external URI protocols (http, https, ftp, etc.)?
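
To make the whitelisting idea concrete, here is a minimal sketch using only the Python standard library. The tag and attribute lists are assumptions chosen for illustration, not a vetted policy, and the names (AllowlistSanitizer, sanitize) are hypothetical. It keeps only listed tags and attributes, escapes all text, and discards the contents of <script> and <style>. It does nothing about parser or renderer bugs in the old WebKit, and a maintained sanitizer library would likely be a better choice in production.

```python
from html import escape
from html.parser import HTMLParser

# Assumed policy for illustration only: no tag that can load an external resource.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "br",
                "h1", "h2", "h3", "table", "tr", "td", "th"}
ALLOWED_ATTRS = {"colspan", "rowspan"}
VOID_TAGS = {"br"}
DROP_CONTENT = {"script", "style"}  # drop these tags together with their contents


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags (img, iframe, link, ...) are dropped
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED_ATTRS
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text reduced to the allowlisted tags and attributes."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Because everything not explicitly listed is dropped, references to http, https, ftp, or file URIs disappear as a side effect: no allowed attribute can carry a URL.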

Does rendering of images in general provide an attack surface? Perhaps the document contains an inline/data-uri image whose contents are somehow malicious but this cannot reasonably be detected by an application whose scope is to simply trade HTML for a rendered PDF. Do images need to be disabled entirely to safely use wkhtmltopdf?

Can a header image be added in wkhtmltopdf using Django?

Hello everyone. I was wondering whether I can add an image as a header or footer with pdfkit and wkhtmltopdf.

options = {
    '--page-size': 'Letter',
    '--no-outline': None,
    '--quiet': '',
    '--header-center': 'Write text',
    '--footer-center': 'Write text',
}

I know that the header and footer text can be set this way, but I don't know how to display an image in either of them. Maybe one of you can help me. Thank you very much in advance!
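
One possible approach, sketched under assumptions: wkhtmltopdf has a --header-html option that takes an HTML file instead of plain text, so the image can be placed in a small header document. The helper name build_header_options, the 40px image height, and the 25mm top margin are hypothetical choices. The --enable-local-file-access flag is, as far as I know, only needed on builds that block local files by default.

```python
import tempfile
from pathlib import Path


def build_header_options(image_path):
    """Write a one-image header document and return pdfkit options pointing at it."""
    image_uri = Path(image_path).resolve().as_uri()  # file:// URI to the image
    header_html = (
        "<!DOCTYPE html>\n"  # header documents should start with a doctype
        '<html><body style="margin:0">'
        f'<img src="{image_uri}" style="height:40px">'
        "</body></html>"
    )
    # The .html suffix matters: the file is loaded like any other page.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(header_html)
    return {
        "--page-size": "Letter",
        "--no-outline": None,
        "--quiet": "",
        "--margin-top": "25mm",              # assumption: leave room for the header
        "--header-html": handle.name,
        "--enable-local-file-access": None,  # assumption: needed for local images on some builds
    }

# Usage (requires pdfkit and wkhtmltopdf to be installed):
#   import pdfkit
#   pdfkit.from_file("page.html", "out.pdf", options=build_header_options("logo.png"))
```

A footer image would work the same way with --footer-html. The temporary header file is not deleted automatically, so it should be removed once the PDF has been written.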

No module named wkhtmltopdf even after installing in Django

I have installed wkhtmltopdf using PyCharm for my Django project and put ‘wkhtmltopdf’ in settings. I tried the sample code in their documentation, but when I run the server, I get this error:

  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'wkhtmltopdf'

I tried uninstalling and reinstalling, as suggested in another question, and I still get the same error. In IDLE I can import wkhtmltopdf without problems. I also tried xhtmltopdf and easy_pdf, and I get the same "no module" error when I run the server. I am using Django 2.1. What am I doing wrong?
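
Since the import works in IDLE but not under the server, one thing worth checking is whether both are using the same Python interpreter. The following is a small diagnostic sketch (the function name describe_module is hypothetical); it reports which interpreter is running and whether that interpreter can locate the module.

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and where (if anywhere) it finds a top-level module."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        "found": spec is not None,
        "location": spec.origin if spec else None,
    }


if __name__ == "__main__":
    print(describe_module("wkhtmltopdf"))
```

Running this once from IDLE and once from `python manage.py shell` in the project should show whether the two interpreter paths differ, for example a PyCharm virtual environment versus the system-wide Python36-32 install.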