Hacker Remix
Gemini 2.5: The First LLM That Understands PDF Layouts
16 points by serjester 2 weeks ago | 1 comment
simonw 2 weeks ago
This example uses bounding boxes, but it turns out Gemini 2.5 (both Pro and Flash) takes that a step further and can also return complex-shaped segmentation masks identifying objects:
https://simonwillison.net/2025/Apr/18/gemini-image-segmentat...
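For anyone wanting to draw the boxes themselves, here is a minimal sketch of the coordinate conversion. It assumes the model returns each box as [y_min, x_min, y_max, x_max] normalized to a 0-1000 range; the `box_2d` and `label` key names and the sample values are illustrative assumptions, not output copied from a real response, so check them against what the model actually returns for you.

```python
def box_to_pixels(box_2d, width, height, scale=1000):
    """Convert a normalized [y_min, x_min, y_max, x_max] box to pixel (left, top, right, bottom).

    Assumes coordinates are normalized to 0..scale (1000 here, an assumption).
    """
    y_min, x_min, y_max, x_max = box_2d
    return (
        round(x_min / scale * width),
        round(y_min / scale * height),
        round(x_max / scale * width),
        round(y_max / scale * height),
    )


# Hypothetical item, shaped the way a detection result might look.
item = {"label": "table", "box_2d": [100, 250, 500, 750]}

# For a page rendered at 1200x800 pixels:
print(item["label"], box_to_pixels(item["box_2d"], width=1200, height=800))
```

The result is in (left, top, right, bottom) order, which is what most drawing libraries expect for a rectangle. Note the y-before-x ordering in the input, which is easy to get backwards.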