
Hacker Remix

Beyond Text: On-Demand UI Generation for Better Conversational Experiences

71 points by fka 16 hours ago | 37 comments

sheo 7 hours ago

I think the example in the article is not a good use case for this technology. It would be better, cheaper, and less error-prone to have prebuilt forms that the LLM can call like tools, at least for things like changing a shipping address.

Shipping forms usually need address verification; sometimes they even include a map.

Especially if, on the other end, the data entered in this form would be stored in a traditional DB.

A much better use case would be something that is dynamic by nature, for example an advanced prompt generator for image generation models (sliders for the size of objects in a scene; dropdown menus with background or style variants, instead of the usual lists).
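Roughly the shape I have in mind, treating the prebuilt form as a tool the model can call (a sketch -- every name here is made up, not any real API):

    // Hypothetical tool schema: the LLM never builds the form itself,
    // it just asks the client to open a prebuilt, validated one.
    const tools = [
      {
        name: "open_shipping_address_form",
        description: "Open the prebuilt shipping address form, optionally prefilled.",
        parameters: {
          type: "object",
          properties: {
            prefill: {
              type: "object",
              properties: {
                street: { type: "string" },
                city: { type: "string" },
                postalCode: { type: "string" },
              },
            },
          },
        },
      },
    ];

    interface ToolCall {
      name: string;
      arguments: { prefill?: Record<string, string> };
    }

    // Stub for the existing, battle-tested form: address verification,
    // the map widget, etc. all live in here, not in the LLM.
    function openShippingForm(prefill?: Record<string, string>): void {
      console.log("opening shipping form with", prefill ?? {});
    }

    // Client side: map tool calls to prebuilt components.
    function handleToolCall(call: ToolCall): void {
      switch (call.name) {
        case "open_shipping_address_form":
          return openShippingForm(call.arguments.prefill);
        default:
          throw new Error(`Unknown tool: ${call.name}`);
      }
    }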

cjcenizal 5 hours ago

You make a good point! There are many common input configurations that will come up again and again, such as forms and other types of input (like the maps you mentioned). How can we solve for that?

Maybe a solution would look like the server expressing a more general intent -- "shipping address" -- and leaving it to the client to determine the best UI component for capturing that information. Then the server would need to do its own validation of the user's input, perhaps asking for confirmation that it understood correctly.
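Something like this, maybe (a sketch with invented names; the component implementations are assumed to exist client-side):

    // Server sends an abstract intent instead of concrete UI.
    type Intent =
      | { kind: "shipping_address"; prefill?: Record<string, string> }
      | { kind: "date_range"; min?: string; max?: string };

    // Client-side stubs standing in for real, locally-chosen components.
    function renderAddressForm(prefill?: Record<string, string>): void {
      console.log("address form (with verification, maybe a map)", prefill);
    }
    function renderDateRangePicker(min?: string, max?: string): void {
      console.log("date range picker", min, max);
    }

    // The client, not the server, decides which component fits the intent.
    function resolveIntent(intent: Intent): void {
      switch (intent.kind) {
        case "shipping_address":
          return renderAddressForm(intent.prefill);
        case "date_range":
          return renderDateRangePicker(intent.min, intent.max);
      }
    }

    // The server then validates whatever the user submits, and can ask
    // for confirmation if it isn't sure it understood correctly.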

jFriedensreich 11 hours ago

I was working on exactly this back in the GPT-3 days, and I still believe that ad hoc generation of super-specific, contextually relevant UIs will solve a lot of the problems and friction that purely textual or speech-based conversational interfaces pose -- especially if UI elements like sliders provide some form of live feedback of their effect, and can be scrolled back to, pinned, and changed at any time.

WillAdams 10 hours ago

This always felt like something the LCARS interface addressed, at least conceptually (though I've never seen an implementation that was more than just a skin).

I'd love to see folks find the same sort of energy and innovation that drove early projects such as Momenta and PenPoint.

bhj 7 hours ago

Yes, there’s a video where Michael Okuda (with Adam Savage, I think?) recalls the TNG cast being worried about where to tap, and his response was essentially “you can’t press a wrong button”.

jFriedensreich 5 hours ago

Thanks for bringing this up -- I'd totally forgotten the connection, even though I looked at it before and also remember the Adam Savage interview.

ActionHank 11 hours ago

I really believe this is the future.

Conversations are error prone and noisy.

UI distills down the mode of interaction into something defined and well understood by both parties.

Humans have been able to speak to each other for a long time, but we fill out forms for anything formal.

aziaziazi 7 hours ago

> this is the future

For sure! UIs have also been the main way to interact with a computer, past and present, offline or online. Even Hacker News -- which is mostly text -- has some UI to vote, navigate, flag…

Imagine the mess of a text-field-only interface where you had to type "upvote the upper ActionHank message" or "open the third article's comments on the front page, the one that talks about on-demand UI generation…" and then press enter.

Don’t get me wrong: LLMs are great and it’s fascinating to see experiments with them. Kudos to the author.

visarga 11 hours ago

> Conversations are error prone and noisy.

I thought you'd say that not being able to reload the form at a later time from the same URL is bad. This would be a "quantum UI", slightly different every time you load it.

ActionHank 10 hours ago

I think that there will be ways to achieve this.

If you look at many of the current innovations around working with LLMs and agents, they are largely about constraining and tracking context in a structured way. There will likely be emergent patterns for these sorts of things over time; I'm implementing my own approach for now, with hopefully good abstractions to allow future portability.
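For the reload problem specifically, one pattern I can imagine (an illustrative sketch, made-up names): persist the generated spec the first time and key it to the conversation state, so the same URL replays the same form instead of regenerating it.

    import { createHash } from "node:crypto";

    type UISpec = { fields: { name: string; type: string; label: string }[] };

    const specStore = new Map<string, UISpec>(); // stand-in for a real DB

    // Deterministic key derived from the state that produced the UI.
    function specKey(conversationId: string, turn: number): string {
      return createHash("sha256").update(`${conversationId}:${turn}`).digest("hex");
    }

    // Generate once, then always serve the stored spec for that key,
    // so reloading the same URL no longer yields a "quantum UI".
    async function getOrCreateSpec(
      conversationId: string,
      turn: number,
      generate: () => Promise<UISpec>,
    ): Promise<UISpec> {
      const key = specKey(conversationId, turn);
      const cached = specStore.get(key);
      if (cached) return cached;
      const fresh = await generate();
      specStore.set(key, fresh);
      return fresh;
    }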

fka 11 hours ago

Exactly! LLMs can generate UIs according to user needs -- e.g. simplified or translated ones, on demand. No need for preset forms or long ones, just the required fields.
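The flow I mean, roughly (a sketch; `llmGenerate` is a stand-in for whatever model call you use, and the field vocabulary is invented):

    // Constraining the spec keeps generation safe: the model can only
    // pick from known field types, and we validate before rendering.
    type Field =
      | { type: "text"; name: string; label: string }
      | { type: "select"; name: string; label: string; options: string[] }
      | { type: "slider"; name: string; label: string; min: number; max: number };

    type FormSpec = { title: string; fields: Field[] };

    // Stub: replace with a real model call (structured output helps here).
    async function llmGenerate(_prompt: string): Promise<string> {
      return JSON.stringify({
        title: "Shipping address",
        fields: [{ type: "text", name: "street", label: "Street" }],
      });
    }

    async function generateForm(userNeed: string): Promise<FormSpec> {
      const raw = await llmGenerate(
        `Return JSON for a form that captures: ${userNeed}. ` +
          `Allowed field types: text, select, slider.`,
      );
      const spec = JSON.parse(raw) as FormSpec;
      // Minimal validation: reject anything outside the allowed vocabulary.
      for (const f of spec.fields) {
        if (!["text", "select", "slider"].includes(f.type)) {
          throw new Error(`Unsupported field type: ${f.type}`);
        }
      }
      return spec;
    }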

wddlz 9 hours ago

Related to this: here is some recently published research we did at Microsoft Research on generating UX for prompt refinements based on the user's prompt and other context (case study: https://www.iandrosos.me/promptly.html; paper link also in the intro).

We found it lowered barriers to providing context to AI, improved user perception of control over AI, and provided users guidance for steering AI interactions.