187 points by giuliomagnifico 6 months ago | 53 comments
Sharlin 6 months ago
throwaway314155 6 months ago
dcreater 6 months ago
vidarh 6 months ago
Drawing of the inside of a cylinder.
That's usually bad enough. Then try to specify size, and specify things you want to place inside the cylinder relative to the specified size.
(e.g. try to approximate an O'Neill cylinder)
I love generative AI models, but they're really bad at that, and this one is no exception, but the speed makes playing around with prompt variations to try to see if I get somewhere a lot easier (I'm not getting anywhere...)
James_K 6 months ago
numpad0 6 months ago
vidarh 6 months ago
test6554 6 months ago
jonplackett 6 months ago
You walk up to a glass door. It has 'push' written on it in mirror writing. What should you do and why.
Very few can get it right, even fewer can get it right and explain the right reason. They’ll start going on about how mirror writing is secret writing and push written backwards is code for pull, rather than just that it’s a message for the person on the other side.
No version of Gemini has ever passed.
m741 6 months ago
'You should pull the door.
Since the word "push" is written in mirror writing, it's intended to be read from the other side of the door. Therefore, the intended action for someone on your side of the door is the opposite of "push," which is "pull."'
Seems right to me.
jonplackett 6 months ago
ryankrage77 6 months ago
> Here's a simple logic puzzle: You walk up to a glass door. It has 'push' written on it in mirror writing. What should you do and why?
> ChatGPT said:
> If the word "push" is written in mirror writing on the glass door, it means the writing is reversed as if reflected in a mirror. When viewed correctly from the other side of the door, it would read "push" properly.
> This implies that you are meant to pull the door from your side, because the proper "push" instruction is for someone on the other side of the door. Mirror writing is typically used to convey instructions to the opposite side of a glass surface.
jonplackett 6 months ago
thot_experiment 6 months ago
A clever sign!
Since the word "push" is written in mirror writing, that means it's intended to be read from the other side of the door. In other words, if you were on the other side of the door, the text would appear normally and say "push".
Given this, I should... pull the door open!
The reasoning is that the sign is instructing people on the other side of the door to push it open, which implies that from my side, I need to pull it open.
jonplackett 6 months ago
Some of the large Mistral ones can get it too and I think 8xMixtral can too.
Kiro 6 months ago
jonplackett 6 months ago
*Gemini Flash: The word "push" is written in reverse, which means it is meant to be read in a mirror. Therefore, you should pull the door open.
The mirror writing is a visual cue indicating the opposite action of what the word suggests.
So it gets it, but with the wrong reasoning.
*Gemini Flash 2:
You should pull the door.
The mirror writing is a clue. If the word "push" is written backward as if reflected in a mirror, it means the door is designed to be pulled. The reflection indicates that the intended action is the opposite of what the word appears to say at first glance.
Again, right answer, wrong reasons.
*Gemini Advanced:
You should push the door to open it. Here's why:
Mirror Writing: The word "push" being in reverse mirror writing is a deliberate design choice. It's meant to be read correctly from the opposite side of the door (where someone would be exiting). Standard Practice: Most glass doors are designed to swing outward for safety reasons (in case of emergencies where people need to evacuate quickly). Therefore, even though the writing might seem confusing at first, it's actually a helpful indicator that you need to push the door to enter.
Which is worse than Flash! Complete nonsense.
mft_ 6 months ago
Kiro 6 months ago
vidarh 6 months ago
I think these kinds of unusual requests will eventually need synthetic data, or possibly some way to give the model an "inner eye" by letting it build a 3D model of described scenes and "look at it". There are a lot of things like this that you can construct a mental idea of if you just work through it in your mind or draw it, but that most people won't have many conscious memories of unless you try to describe it in terms of something else.
E.g. for the cylinder example, you get better results if you ask for a tunnel, which can often be "almost" a cylinder. But try to then nudge it toward an O'Neill cylinder, and it fails to grasp the scale or that there isn't a single "down", and starts putting in openings.
mensetmanusman 6 months ago
quikoa 6 months ago
iLoveOncall 6 months ago
Stable Diffusion Turbo has been able to do this for more than a year, even on my "mere" RTX 3080.
vidarh 6 months ago
ajdjspaj 6 months ago
whynotmaybe 6 months ago
From the paper:
> We train using the AdamW [26] optimizer with a batch size of 5 and gradient accumulation over 20 steps on a single NVIDIA A100 GPU
So it's "consumer-grade" because it's available to anyone, not just businesses.
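For anyone unfamiliar with the term in that quote, here is a minimal sketch of what gradient accumulation does, assuming the usual meaning: gradients from 20 micro-batches of 5 are averaged before a single optimizer update, giving an effective batch of 100. The toy model, the data, and the helper names below are hypothetical and are not taken from the paper; plain gradient averaging stands in for the real AdamW training loop.

```python
import random

MICRO_BATCH = 5    # batch size quoted from the paper
ACCUM_STEPS = 20   # gradient accumulation steps quoted from the paper
EFFECTIVE_BATCH = MICRO_BATCH * ACCUM_STEPS  # samples contributing to one update


def grad_mse(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data):
    """Average the gradients of ACCUM_STEPS micro-batches before one update."""
    total = 0.0
    for i in range(ACCUM_STEPS):
        micro = data[i * MICRO_BATCH:(i + 1) * MICRO_BATCH]
        # Scale each micro-batch gradient so the running sum ends up a mean.
        total += grad_mse(w, micro) / ACCUM_STEPS
    return total


# Hypothetical toy data: y = 3x plus a little noise.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(EFFECTIVE_BATCH)]
data = [(x, 3.0 * x + random.gauss(0, 0.1)) for x in xs]

w = 0.5
full = grad_mse(w, data)           # gradient over all 100 samples at once
accum = accumulated_grad(w, data)  # same gradient, built 5 samples at a time
print(EFFECTIVE_BATCH, abs(full - accum) < 1e-9)
```

The point is that only 5 samples need to be in memory at a time, which is how a large effective batch can fit on a single GPU.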
spott 6 months ago
whynotmaybe 6 months ago
Found on Yi-Zhe Song's LinkedIn:
> Runs on a single NVIDIA 4090
https://www.linkedin.com/feed/update/urn:li:activity:7270141...
ajdjspaj 6 months ago
ericra 6 months ago
What am I missing?
nomel 6 months ago
dcreater 6 months ago
betenoire 6 months ago
I'm unable to get anything that looks as good as the images in the README, what's the trick for good image prompts?
deckar01 6 months ago
https://gist.github.com/deckar01/7a8bbda3554d5e7dd6b31618536...
betenoire 6 months ago
avereveard 6 months ago
paper https://i.imgur.com/l90WYrT.png
replication on hf https://i.imgur.com/MqN1Qwc.png
betenoire 6 months ago
(I had asked for a rock climber dangling from a rope, eating a banana, and they were wildly nonsensical images)
speerer 6 months ago
wruza 6 months ago
tgsovlerkhgsel 6 months ago
LeoPanthera 6 months ago
It supports all major models and has a native Mac UI, and as far as I can tell there's nothing faster for generation.
The "best" models, and a bunch more, are built-in. The state of the art is FLUX.1, "dev" version for quality, "schnell" version for speed.
SDXL is an older, but still good model, and is faster.
yk 6 months ago
For models, Flux [2] is pretty good and quite straightforward to use. (In general, you will have a runtime and then you have to get the model weights separately.) Which Flux variant to use depends on your graphics card; Flux.1 schnell should work for most decently modern ones. (And the website civitai.com is a repository for models and other associated tools.)
[0] https://github.com/comfyanonymous/ComfyUI
Multicomp 6 months ago
cut3 6 months ago
Flux has been very popular lately.
Pony is popular especially for adult content.
SDXL is still great as it has lots of folks tweaking it. I chose it to make a comic as it worked well with LoRAs trained on my drawings. (article on using it for a comic here https://www.classicepic.com/p/faq-what-are-the-steps-to-make...)
qclibre22 6 months ago
Download the models and all VAE files for each model, put them in the right place, run the batch file, configure it correctly, and then generate images using the browser.
LZ_Khan 6 months ago
A1111 is a good place to start. Very beginner-friendly UI. You can look up some templates on Runpod to get started if you don't have a GPU.
Someone else mentioned a local setup, which might be even easier.
42lux 6 months ago
nprateem 6 months ago
NikkiA 6 months ago
wruza 6 months ago
It was called LCM/Turbo in SD and it generated absolute crap most of the time, just like this one. Which is likely yet another “ground-breaking” finetune of SD.
musicale 6 months ago
Kind of like what you can do on an iPhone?
smusamashah 6 months ago
gloosx 6 months ago