90 points by abhi9u 4 days ago | 18 comments
egberts1 2 days ago
The TLB got double-invalidated, yet some say it was never invalidated. The crux is a glitch within a single, entire TLB invalidation operation, which negates the XOR opcode's ability to self-modify the neighboring IMUL operand. (Double ROT13, anyone?) I assert double-invalidation because, within the same TLB invalidation stroke, the XOR operation got performed ... twice, as opposed to the original IMUL operand value being retrieved and restored after the invalidation; the XOR-computed result gets negated EITHER WAY.
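To make the "double ROT13" remark concrete, here is a minimal sketch of the arithmetic only: XOR-ing an operand byte with the same key twice gives back the original value. The names `xor_patch`, `original` and `key` are hypothetical, and this models nothing about QEMU's translation cache or TLB handling.

```python
def xor_patch(operand: int, key: int) -> int:
    """Return the operand byte after one XOR patch (hypothetical helper)."""
    return (operand ^ key) & 0xFF


original = 0x05   # stand-in for the IMUL immediate operand (assumed value)
key = 0x03        # stand-in for the XOR mask (assumed value)

once = xor_patch(original, key)    # the intended self-modification
twice = xor_patch(once, key)       # applying it a second time undoes it

print(hex(original), hex(once), hex(twice))
```

If the patch really ran twice, the code would look unmodified afterwards, which is indistinguishable from the patch never having run.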
A failure of self-modifying code within QEMU amd64/x86 emulation mode could be a useful test to determine if one is under QEMU emulation mode, of course if the page allows read-write-execute as often found in JavaScript, Java, Python and Dalvik (Android) bytecode memory regions.
Fabrice Bellard, author of QEMU, acknowledged the basics of the above but failed amd64/x86 IMUL/XOR self-modify premise in emulation (not KVM) mode of QEMU.
dealbreaker 2 days ago
egberts1 2 days ago
No, really, thank you for the update.
I will give it another spin using the latest Unicorn-patched QEMU. We have a patched/updated QEMU, ya know, just for Unicorn, don't we?
loeg 2 days ago
nonrandomstring 2 days ago
caspper69 2 days ago
Basically, if you don't handle the TLB properly, the CPU will not know that page mappings and/or page permissions have changed. So if you had a page mapped RW, and then changed the mapping to a RO page (such as setting up COW), but failed to flush the TLB (or at least call INVLPG to flush the entry), the CPU might use those stale permissions and grant write access on that page when it shouldn't. The same could happen for changing a region of the VA space to use a different physical page, where the next bit of code would hit the old page (and who knows what state it might be in or what it could be being used for).
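The stale-permission case can be sketched with a toy model. This is not how any real MMU is implemented: `ToyMMU`, `map_page`, `invlpg` and `can_write` are hypothetical names, and the "TLB" is just a dictionary caching page-table entries.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    frame: int       # physical frame number
    writable: bool   # page permission


class ToyMMU:
    """Toy model: a page table plus a cache (the 'TLB') of its entries."""

    def __init__(self):
        self.page_table = {}
        self.tlb = {}

    def map_page(self, vpage, frame, writable):
        # Updates the page table only; deliberately does NOT touch the TLB.
        self.page_table[vpage] = Mapping(frame, writable)

    def invlpg(self, vpage):
        # Drop a single cached translation, loosely like x86 INVLPG.
        self.tlb.pop(vpage, None)

    def translate(self, vpage):
        # On a TLB miss, walk the page table and cache a copy of the entry.
        if vpage not in self.tlb:
            entry = self.page_table[vpage]
            self.tlb[vpage] = Mapping(entry.frame, entry.writable)
        return self.tlb[vpage]

    def can_write(self, vpage):
        return self.translate(vpage).writable


mmu = ToyMMU()
mmu.map_page(0x10, frame=7, writable=True)
first = mmu.can_write(0x10)                  # fills the TLB with the RW entry

mmu.map_page(0x10, frame=7, writable=False)  # e.g. setting up COW
stale = mmu.can_write(0x10)                  # still answered from the old TLB entry

mmu.invlpg(0x10)
fresh = mmu.can_write(0x10)                  # re-walks the page table

print(first, stale, fresh)
```

The middle check is the bug: the page table says read-only, but the cached entry still grants write access until the entry is invalidated.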
The TLB is not super-complicated, but it has some quirks (it's been so long since I've done anything with it, the PCID handling rules were new to me; didn't even support it back when).
adrian_b 23 hours ago
rybosworld 2 days ago
https://googleprojectzero.blogspot.com/2019/01/taking-page-f...
caspper69 2 days ago
Unless you're like Microsoft (from your link) and accidentally leave the page tables writable from userspace for 2 months. But that's not really a TLB error, that's just L-O-L, wow!
jcalvinowens 23 hours ago
Imagine I'm a user with local shell access trying to read a secret owned by root. Maybe I can't read the secret, but I can do something which makes another program read the secret. If I can make that program swap (perhaps by wasting a bunch of RAM to create memory pressure), and swapping has some probability of triggering a TLB invalidation bug that lets me see the old page, I win, although it might take a while.
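As a rough back-of-the-envelope for "it might take a while": if each forced swap independently triggers the bug with probability p, the number of attempts follows a geometric distribution. The value of `p` below is made up purely for illustration, and independence between attempts is an assumption.

```python
def expected_attempts(p: float) -> float:
    """Mean number of tries until the first success (geometric distribution)."""
    return 1.0 / p


def success_within(p: float, n: int) -> float:
    """Probability of at least one success in n independent tries."""
    return 1.0 - (1.0 - p) ** n


p = 0.001  # hypothetical per-swap chance of hitting the bug
mean_tries = expected_attempts(p)
chance_1000 = success_within(p, 1000)

print(mean_tries, chance_1000)
```

Even a very unlikely trigger becomes practical if the attacker can retry cheaply and indefinitely.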
egberts1 2 days ago
Retr0id 2 days ago
CPython also no longer appears to create RWX mappings even for ctypes, although you can of course still mmap them manually.
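A minimal sketch of requesting such a mapping manually with the standard `mmap` module on a Unix-like system. The function name `try_rwx_mapping` is hypothetical, and whether the request succeeds depends on the OS and its hardening policy (some systems refuse W+X mappings), so the failure case is handled instead of assumed away. Nothing is executed from the mapping.

```python
import mmap


def try_rwx_mapping(size: int = mmap.PAGESIZE) -> bool:
    """Return True if an anonymous read-write-execute mapping could be created."""
    prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    try:
        region = mmap.mmap(
            -1,  # anonymous mapping, no backing file
            size,
            flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
            prot=prot,
        )
    except (OSError, ValueError):
        # The kernel or a security policy refused the RWX request.
        return False
    try:
        region[0:1] = b"\xc3"             # write one byte (x86 `ret`); never executed here
        return region[0:1] == b"\xc3"
    finally:
        region.close()


rwx_allowed = try_rwx_mapping()
print("RWX mapping allowed:", rwx_allowed)
```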
egberts1 2 days ago
I had thought that such V8 optimizations were still occurring (as of Chrome Blink81/SparkPlug) during JavaScript execution of untouched bytecode, as a way of reducing its startup overhead.
https://egbert.net/blog/articles/javascript-jit-engines-time...
Retr0id 2 days ago
fn-mote 2 days ago
Perhaps “hacker” should be “crazy bug debugger”, but anybody who is working with TLB issues is a hacker in my book.
There is no “CVE” vulnerability in the slides, for sure.