Hacker Remix

Show HN: Doom (1993) in a PDF

366 points by vk6 5 days ago | 74 comments

I made a Doom source port that runs within a PDF file.

I was inspired by the recent HN post about Tetris in a PDF (https://news.ycombinator.com/item?id=42645218) and I wondered if I could get Doom to run using a similar method.

It turns out that old versions of Emscripten can compile C to asm.js code that will happily run inside the limited JS runtime of the PDF engine. I used the doomgeneric (https://github.com/ozkl/doomgeneric) fork of the original Doom source, as that made writing the IO fairly easy. All I had to do was implement a framebuffer and keyboard inputs.

Unlike previous interactive PDF demos, DoomPDF produces its output by creating a text field for each row of pixels on the screen, then setting each field's contents to various ASCII characters. This gives me a 6-color monochrome display that can be updated reasonably quickly (80ms per frame).
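
To illustrate the row-per-text-field idea, here is a minimal sketch in Python. It is not the actual DoomPDF code (which runs as JavaScript inside the PDF); the palette characters and the name `frame_to_rows` are assumptions made up for this example. Each grayscale pixel is quantized to one of 6 levels, and each row of pixels becomes one string that could be written into that row's text field.

```python
# Hypothetical 6-character palette, darkest to brightest.
# The characters DoomPDF really uses may differ.
PALETTE = " .:-=#"

FRAME_MS = 80                 # frame time quoted in the post
FPS = 1000 / FRAME_MS         # 12.5 frames per second


def frame_to_rows(frame):
    """Convert a grayscale frame (rows of 0-255 ints) to one string per row."""
    levels = len(PALETTE)
    rows = []
    for row in frame:
        chars = []
        for pixel in row:
            pixel = max(0, min(255, pixel))      # clamp out-of-range values
            chars.append(PALETTE[pixel * levels // 256])
        rows.append("".join(chars))
    return rows


if __name__ == "__main__":
    demo = [[0, 255], [128, 43]]
    for line in frame_to_rows(demo):
        print(repr(line))
```

In a PDF, each returned string would be assigned to the text field for that row; a monospace font keeps the columns aligned.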

The source code is available at: https://github.com/ading2210/doompdf

Note that this PDF can only run in Chromium-based browsers that use the PDFium engine.

ThomasRinsma 5 days ago

Author of "PDF Tetris" here.

Great work! We had the same idea at the same time, here's my version of PDF Doom:

Source: https://github.com/thomasRinsma/pdfdoom

Playable here: https://th0mas.nl/downloads/doom.pdf

Yours is neater in many ways though!

OnionBlender 5 days ago

"There was a problem with this document". Is the problem me, or the document?

wingi 5 days ago

This is just awesome!

daredevil49 4 days ago

This is pretty fascinating stuff

pavo-etc 5 days ago

> limited JS runtime of the PDF engine

humanity has gone too far

miki123211 5 days ago

Seriously though, is there another format that:

1. Can be easily and freely shared by email / cloud drive, including assets, images and fonts.

2. Supports form filling and saving the form data in the file directly (as opposed to sending it somewhere over HTTP). Basically the electronic equivalent of a paper form that can be filled, sent by email, and stay filled.

3. Supports (cryptographic) signatures that are again part of the document, and can easily and securely be verified by end users. This is a very important use case in the EU, where electronic signatures are based on cryptography, not "I pinky swear I'm John Smith" DocuSign.

4. Has perfect print fidelity.

We keep complaining about PDF (and rightly so), but there's truly no other format to replace it. The W3C / WHATWG / whatever could probably come up with one based on web technologies, but they haven't yet.

There's EPUB, which solves a very narrow use case of PDF (electronic book distribution where perfect control over presentation is not required), but nothing that solves the "business" use cases.

kragen 5 days ago

Adding JS to PDF seriously undermines these benefits. If Turing-complete logic can draw arbitrary images on the document, you can no longer have any print fidelity at all, and what you signed cryptographically may have said things you didn't know it said. It may start interfering with #1 if email systems start blocking "malicious" PDF features, too. Only benefit #2 survives.

I have no idea what the folks at Adobe were thinking when they decided to add this feature that could eventually eliminate most of the benefits of their product.

None of this is to say that the Doom implementation is anything less than a very cool hack.

knome 5 days ago

probably the same thing that Netscape was thinking when adding javascript to the web. "now we can add some basic client-side validation to these forms". PDFs can be used as form templates, so having some basic validation is reasonable.

quotemstr 5 days ago

:-) I'll never quite appreciate why people say things like this. Having some kind of embedded scripting is useful for all sorts of things, often form validation. A sufficiently complex validation system becomes Turing complete, so you might as well skip the hassle of a custom language and go right to JavaScript. Once you have JavaScript, input, and some way of updating a graphical pixel grid, you're at Doom-completeness. I think it's a wonderful, not terrible, thing that computation and programmability are so cheap they've become ubiquitous even in the most mundane applications.

llm_trw 5 days ago

We had that language, it was PostScript.

Then PDF came along and said: no, this is too dangerous; the only thing in a document should be layout information not arbitrary code.

And here we are two decades later.

My hatred of pdf has no end. It killed postscript for dynamic pages and djvu for static pages.

weinzierl 5 days ago

This is very misleading thinking. We've come a very long way from PS security-wise and this is a good thing and should be appreciated.

The fallacy I see in many comments - either directly or between the lines - is to think that since we can run Doom in PDF, hell's gates must have opened and we can do literally anything, especially anything malicious.

This is not the case.

PDF basically consists of immutable parts and interactive elements that user agents are supposed to render as visibly distinct. Also, user agents are not supposed to run any code without explicit user interaction.

Contemporary user agents do a good job in both respects.

PDFtris and the Doom example are possible because they live in a very small niche of features that enable relatively unobtrusive yet still interactive form processing. Forms allow code, but do not stick out as much as other interactive elements do, and they are relatively flexible. Having found that feature niche is the real genius of PDFtris and related exploits.

Still, they need user interaction. There is no way to do anything behind your back in PDF.

Another fallacy I see in this and the related threads is that Adobe Acrobat vulnerabilities are PDF vulnerabilities. Yes, Adobe did a terrible job with Acrobat, but in my opinion not at all with the format and specification of PDF - especially not when it comes to security.

jcelerier 4 days ago

> And here we are two decades later.

The conclusion to draw from this is that the hypothesis "the only thing in a document should be layout information not arbitrary code." is wrong and misguided, since whatever the format is, in the end "nature" (us) will make it evolve in a way that has some amount of arbitrary scriptability. If it's not JS in PDFs it will be ActiveX controls, a government-mandated proprietary app, having to do a trip to the city hall to have the clerk play an algorithm step-by-step by hand, or something else, but something will always eventually come up to fill that void and you will have to use it whether you like it or not.

gorkish 5 days ago

> My hatred of pdf has no end. It killed postscript for dynamic pages and djvu for static pages.

Interesting to see someone invoke DjVu.

With the exception of IW44 wavelet compression, basically everything the DjVu file format supports has a PDF equivalent. I built a tool to convert DjVu to PDF that preserves the image layers and file structure with nearly equivalent compression.

My tool did expose some edge cases in the PDF standard, which was frustrating. For instance, PDF supports applying a bitonal mask to an image, but it does not specify how to apply it if the two images have different resolutions (DPI). It took many years to get Apple to bring their implementation into consistency.
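
As a rough illustration of the ambiguity, here is one possible way a renderer could apply a lower-resolution bitonal mask to an image: resample the mask to the image's pixel grid with nearest-neighbour lookup, then mask pixel by pixel. This is only a sketch under assumptions, not what the PDF standard or any particular reader does; the function names, the "1 keeps the pixel" convention, and the white background value are all made up for the example.

```python
def resample_mask(mask, width, height):
    """Nearest-neighbour scale a bitonal mask (rows of 0/1) to width x height."""
    src_h = len(mask)
    src_w = len(mask[0])
    return [
        [mask[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def apply_mask(image, mask, background=255):
    """Keep image pixels where the scaled mask is 1, else use the background.

    Assumption: 1 means "show the pixel"; a real format may use the opposite.
    """
    height = len(image)
    width = len(image[0])
    scaled = resample_mask(mask, width, height)
    return [
        [image[y][x] if scaled[y][x] else background for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    image = [[7, 7, 7, 7], [7, 7, 7, 7]]   # 4x2 grayscale image
    mask = [[1, 0]]                        # 2x1 mask at lower resolution
    print(apply_mask(image, mask))
```

A different resampling rule (for example, rounding instead of flooring, or interpolating) would put the mask edges in slightly different places, which is the kind of inconsistency between implementations described above.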

DiggyJohnson 5 days ago

This is a very concise explanation, thanks for putting it so clearly. It’s not the features or requirements that are the focus of the scorn, per se, but how we got here. I still prefer and use PDF all the time, but between overly dynamic crap and the mainstream tooling, well… “hate” is a reasonable hyperbole.

llm_trw 5 days ago

Hate is too weak a term for what I feel for Adobe.

Adobe kept PDF as a proprietary format from 1992 to 2008. You got the reader for free ... on Windows, with a single executable. You didn't get an editor and had to pay through the nose for one from Adobe.

It wasn't until the late 2010s that it actually became a free-ish standard, if you think that a 3,500 page document is a 'standard'.

The only reason why Adobe did it is because DjVu was eating their lunch: between 2002 and 2008 it was the de facto standard for scanned documents in academia. The documents were easy to edit. The image compression is still better than the native compression on PDF.

To add insult to injury, after displacing PostScript on Windows in the name of security, not only did they add a scripting language to PDF, they added one written in two weeks, at a time when it was so bad no one used it for anything but pop-ups, and with more security vulnerabilities than you could shake a stick at. I suppose we should be happy Adobe didn't put Flash in. Oh wait, they did: https://www.reddit.com/r/Adobe/comments/yqisho/flash_content...

p_ing 5 days ago

JS is what made these file types into the Pretty Dangerous Format. Numerous vulnerabilities in Adobe Acrobat surfaced thanks to the embedded JS engine.

Updating the Acrobat client across an enterprise used to be quite burdensome.

quotemstr 5 days ago

The flip side is that because the industry has converged on just a few embedded scripting systems (JS, Lua, etc.) we can concentrate our security hardening efforts on these few engines and benefit everyone. If PDF, like PostScript, were its own custom thing, it wouldn't have been able to benefit from this hardening. In the end, JS was a fine choice.

lolinder 5 days ago

The concern isn't that it was JS, the concern is that there's a scripting system inside of PDF at all. Why? What? Form validation is a lousy excuse because forms themselves were a bridge too far for the format. Why do we need to be able to validate them?

I knew PDFs could be dangerous, but I didn't realize it was because they're intentionally designed to allow embedded scripts.

danieldk 5 days ago

I don't think forms are a bridge too far; it was very common that forms were provided as PDF, and it is more convenient for the sender and receiver to fill the fields on a computer for readability, etc. before printing.

However, forms could be handled by a very simple DSL that would be easy to write a safe interpreter for.

quotemstr 5 days ago

JavaScript is already a simple language. There's no requirement to have a JIT even. What makes you believe a custom language would be any safer or better in another way?

bandie91 2 days ago

IMO the parent commenter leans toward a validation-specific DSL, as opposed to JS, not only because of the language complexity itself, but also due to the usually wide range of objects the script engine gets access to, like the title bar, URL box, window decoration, placement, mouse pointer, keystrokes, etc. in web browsers. I worry about what it has got or will get access to in documents.

hardwaresofton 5 days ago

That’s the only way we know how to go

datavirtue 5 days ago

This. I'm eagerly awaiting the replicators that will explore the cosmos and spread the knowledge of our existence. If we can get them done before we poison ourselves.

alganet 5 days ago

You assume that we are the thing being replicated.

Nature is crafty. It could be the case that we humans are the replicators, not the main show.

Terr_ 3 days ago

"A chicken is an egg's way of making another egg."

ieidkeheb 5 days ago

You mean as long as they can run doom, or create a pocket universe that simulates doom.

krunck 4 days ago

And they'll be able to run Doom too.

luismedel 5 days ago

Pandora's box has been opened.

Next step: embed Bellard's JSLinux (https://bellard.org/jslinux/) and have a full-blown OS with development environment, office suite and all inside a PDF.

khaledh 5 days ago

Portable Doom Format

takeda 5 days ago

As long as it is in Chrome

extraduder_ire 3 days ago

I think it should work in Acrobat also, since that implements more of PDF's embedded JavaScript. I don't know of any other PDF readers which implement JavaScript. I assume Mozilla’s PDF.js reader doesn't support it for technical reasons, rather than ideological ones.

ikari_pl 5 days ago

oh so that's why neither version worked for me in any reader

Narishma 5 days ago

Not that portable since it only works on a single PDF engine.