Hacker Remix

Teaching LLMs how to solid model

289 points by wgpatrick 22 hours ago | 99 comments

alnwlsn 20 hours ago

The future: "and I want a 3mm hole in one side of the plate. No the other side. No, not like that, at the bottom. Now make it 10mm from the other hole. No the other hole. No, up not sideways. Wait, which way is up? Never mind, I'll do it myself."

I'm having trouble understanding why you would want to do this. A good interface between what I want and the model I will make is to draw a picture, not write an essay. This is already (more or less) how Solidworks operates. AI might be able to turn my napkin sketch into a model, but I would still need to draw something, and I'm not good at drawing.

The bottleneck continues to be having a good enough description to make what you want. I have serious doubts that even a skilled person will be able to do it efficiently with text alone. Some combo of drawing and point+click would be much better.

This would be useful for short enough tasks like "change all the #6-32 threads to M3" though. To do so without breaking the feature tree would be quite impressive.
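A bulk edit like that is tractable precisely because it's a mechanical substitution over structured data. A toy sketch of the idea — the feature-dictionary format here is invented for illustration, not any real CAD API:

```python
# Sketch: swapping imperial thread callouts for metric ones in a
# parametric model, assuming (hypothetically) that tapped holes are
# stored as structured parameters rather than baked-in geometry.

THREAD_EQUIVALENTS = {
    "#6-32": "M3",   # roughly comparable diameters (3.51 mm vs 3.0 mm)
    "#8-32": "M4",
    "#10-24": "M5",
}

def convert_threads(features):
    """Return a copy of the feature list with thread specs swapped."""
    converted = []
    for feat in features:
        feat = dict(feat)
        if feat.get("type") == "tapped_hole" and feat.get("thread") in THREAD_EQUIVALENTS:
            feat["thread"] = THREAD_EQUIVALENTS[feat["thread"]]
        converted.append(feat)
    return converted

model = [
    {"type": "tapped_hole", "thread": "#6-32", "depth_mm": 8},
    {"type": "fillet", "radius_mm": 2},
]
print(convert_threads(model))
```

The hard part the comment alludes to — doing this without breaking the feature tree — is exactly what this sketch glosses over: real CAD features reference each other, so edits can invalidate downstream geometry.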

abe_m 19 hours ago

I think this is along the lines of the AI horseless carriage[1] topic that is also on the front page right now. You seem to be describing the current method as operated through an AI intermediary. I think the power of AI for CAD will sit at a higher level than lines, faces, and holes. It will be more along the lines of "make a bracket between these two parts", "make this part bolt to that other part", "attach this pump to this gear train" (where the AI determines the pump uses an SAE 4-bolt flange of a particular size and a splined connection, then adds the required features to the housing and shafts).

I think it will operate on higher structures than current CAD typically works with, and I don't think it will be history-tree and sketch based like Solidworks or Inventor. I suspect it will be more of a direct modelling approach.

I also think integrating FEA to let the AI check its work will be part of it. When you tell it to make a bracket between two parts, it can check the weight of the two parts and some environmental specification from a project definition, then auto-configure FEA to determine the correct number of bolts, material thickness, etc. If it made the bracket from folded sheet steel, you could then tell it you want a cast aluminum bracket, and it could redo the work.

[1] https://news.ycombinator.com/item?id=43773813
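The auto-sizing loop described above ultimately bottoms out in simple checks. A toy sketch of one such rule — the load ratings and the sizing rule itself are made up for illustration, not taken from any real standard:

```python
import math

def bolts_required(load_n, bolt_rated_n, safety_factor=3.0):
    """Toy sizing rule: bolts needed to carry a load with a given
    safety factor, never fewer than two. Not a real engineering
    standard; illustrative only."""
    return max(2, math.ceil(load_n * safety_factor / bolt_rated_n))

# Bracket carrying an 80 kg pump (~785 N) on bolts rated 600 N each
print(bolts_required(785, 600))
```

The point of the comment is that a system could run checks like this (or full FEA) automatically and re-run them when you change the material or geometry.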

jillesvangurp 10 hours ago

It's also going to be about diagnosing issues. "This part broke right here, explain why and come up with a solution", "Evaluate the robustness of this solution", "Can I save some material and reduce the weight", etc.

Those are the kind of high level questions that an LLM with a decent understanding of CAD and design might be able to deal with soon and it will help speed up expensive design iterations.

A neat trick with current LLMs is to give them screenshots of web pages and ask open questions about the design, information flow, etc. They will spot things that expert designers would comment on as well, and point out things that are unclear. You can go far beyond micromanaging incremental edits to something.

Mostly the main limitation with LLMs is the imagination of the person using it. Ask the right questions and they get a lot more useful. Even some of the older models that maybe weren't that smart were actually quite useful.

For giggles, I asked chatgpt to critique the design of HN. Not bad. https://chatgpt.com/share/6809df2b-fc00-800e-bb33-fe7d8c3611...

Nevermark 9 hours ago

> Mostly the main limitation with LLMs is the imagination of the person using it. Ask the right questions and they get a lot more useful.

Completely agree.

We get waves of comments on HN downplaying model abilities or their value.

Many people don’t seem to explore and experiment with them enough. I have 3 screens. The left one has two models on it. The right one has a model & a web browser for quick searches. I work on the largest middle screen.

Extreme maybe, but I use them constantly resulting in constant discovery of helpful new uses.

I web search maybe 10% of what I did six months ago.

The quirks are real, but the endless upsides models deliver when you try things were unobtainium, from humans or machines, until LLMs.

wavefrontbakc 8 hours ago

I've only found useful answers when I'm starting from completely zero. On things I already have even a reasonable grasp of, I start to spot large, dangerous holes in what the model very confidently claims is the right solution. Stuff like suggesting you bring the camping stove inside the tent to avoid rainy weather, never mind that doing so can kill you.

>I web search maybe 10% of what I did six months ago.

Me too, though this is more driven by the total cliff-fall of web search result quality

sharemywin 3 hours ago

I'm not sure what prompt you used to get it to tell you that?

what should I do with my camping stove if I want to avoid the rain

ChatGPT said: If you want to avoid the rain while using your camping stove, here are your best options:

Shelter Options: Use a Tarp: Set up a tarp high above your stove area with good airflow. Make sure it's not enclosed and water runs off away from you.

Camp Kitchen Tent: Lightweight pop-up canopies or dedicated camp kitchen shelters are perfect. Just make sure it’s ventilated.

Picnic Table Shelter: Many campgrounds have covered areas you can cook under — check for those.

Vestibule (Cautiously): If you have a tent with a large vestibule, you can use the stove there, only if it's very well-ventilated and you're extremely cautious (fire and carbon monoxide risks are real).

Important Safety Tips: Never use a stove inside a closed tent — risk of carbon monoxide poisoning and fire.

Keep the stove on a stable, flat surface.

Avoid cooking right under trees with low-hanging branches.

Ensure your shelter is tall enough — heat rises.

Would you like a quick sketch or image of a tarp setup that works well in rain?

anthonyrstevens 1 hour ago

I'm starting to think that many of the "But the AIs tell me I should drive my car off a cliff!!" posters are just making stuff up.

ben_w 45 minutes ago

I've seen enough weird output from some models to not think quite so negatively about nay-sayers.

If a "stupid response" happens 1% of the time, and a first attempt at using a model involves four rounds of prompt-and-response, then I'd expect about 1 in 25 people to anchor on the models being extremely dumb and/or "just autocomplete on steroids". The first time I tried a local model (IIRC it was Phi-2), I asked for a single-page Tetris web app; it started off bad, and halfway in it became a Python machine-learning script. The first time I used NotebookLM, I had it summarise one of my own blog posts, and it missed half and made up clichés about half the rest.
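The 1-in-25 figure holds up: with an independent 1% chance of a bad response per round, four rounds give roughly a 4% chance of seeing at least one.

```python
# Probability of at least one "stupid response" across four
# independent rounds, each with a 1% failure rate.
p_bad = 0.01
rounds = 4
p_at_least_one = 1 - (1 - p_bad) ** rounds
print(f"{p_at_least_one:.4f}")    # 0.0394
print(round(1 / p_at_least_one))  # 25, i.e. about 1 in 25
```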

And driving off, if not a cliff then a collapsed bridge, has gotten in the news even with AI of the Dijkstra era: https://edition.cnn.com/2023/09/21/us/father-death-google-gp...

xhkkffbf 1 hour ago

No! A friend of a friend asked an AI and the AI said they were real. Honest. But it was the other AIs. Not the one the friend asked.

wavefrontbakc 8 hours ago

I think the cost of mistakes is the major driving force behind where you can adopt tools like these. Generating a picture of a chair with five legs? No big deal. Generating supports for a bridge that'll collapse next week? Big problem.

> It will point out things that are unclear, etc. You can go far beyond just micro managing incremental edits to some thing.

When prompted, an LLM will also point things out when they're perfectly clear. An LLM is just text prediction, not magic.

ben_w 7 hours ago

> I think the cost of mistakes is the major driving force behind where you can adopt tools like these. Generating a picture of a chair with five legs? No big deal. Generating supports for a bridge that'll collapse next week? Big problem

Yes, indeed.

But:

Why can LLMs generally write code that even compiles?

While I wouldn't trust current setups, there's no obvious reason why even a mere LLM cannot be used to explore the design space when the output can be simulated to test its suitability as a solution — even in physical systems, this is already done with non-verbal genetic algorithms.
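A hedged sketch of what that generate-and-simulate loop could look like; `propose_design` and `simulate` are stand-in stubs, not real LLM or FEA APIs, and the scoring rule is invented for illustration:

```python
import random

# Design-space exploration loop: propose candidates, score them in a
# simulator, keep the best. This mirrors how genetic algorithms are
# already used with physical simulation.

def propose_design(rng):
    """Stand-in for an LLM proposing bracket parameters."""
    return {"thickness_mm": rng.uniform(2, 8), "bolts": rng.randint(2, 6)}

def simulate(design):
    """Stand-in for an FEA run: score is mass (lower is better),
    infeasible designs (too little stiffness) score infinity."""
    stiffness = design["thickness_mm"] * design["bolts"]
    mass = design["thickness_mm"] * 10 + design["bolts"] * 3
    return mass if stiffness >= 12 else float("inf")

rng = random.Random(0)  # seeded for reproducibility
candidates = [propose_design(rng) for _ in range(50)]
best = min(candidates, key=simulate)
print(best, simulate(best))
```

The interesting part is that the simulator, not the LLM, provides the ground truth — the LLM only has to be good at proposing plausible candidates.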

> LLM is just text prediction, not magic

"Sufficiently advanced technology is indistinguishable from magic".

Saying "just text prediction" understates how big a deal that is.

wavefrontbakc 7 hours ago

>While I wouldn't trust current setups, there's no obvious reason why even a mere LLM cannot be used to explore the design space when the output can be simulated to test its suitability as a solution

Having to test every assertion sounds like a not particularly useful application, and the more variables there are, the more it seems like throwing completely random things at the wall and hoping something sticks.

You should use a tool for its purpose. Relying on text prediction to judge clarity is like relying on Teams icons being green as a measure of actual productivity: a very vague, only incidentally coinciding signal.

You could use a text predictor for things that rely on "how would this sentence usually complete" and get right answers. But that is a very narrow field; I can mostly imagine entertainment benefiting a lot.

You could misuse a text predictor for things like "is this <symptom> alarming?" and get a response that is statistically likely given the training material but could be the complete inverse for the person asking, again at very high cost for failing at something it was never meant to do. You can often demonstrate the trap by re-rolling your answer to any question a couple of times and seeing how the response varies from mildly different to completely reversed, depending on whatever seed you land on.

ben_w 37 minutes ago

> Having to test every assertation sounds like a not particularly useful application, and the more variables there are the more it seems to be about throwing completely random things at the wall and hoping it works

That should be fully automated.

Instead of anchoring on "how do I test what ChatGPT gives me?", think "Pretend I'm Ansys Inc.*, how would I build a platform that combines an LLM to figure out what to make in the first place, with all our existing suite of simulation systems, to design a product that actually meets those requirements?"

* Real company which does real sim software

aredox 5 hours ago

>Saying "just text prediction" understates how big a deal that is.

Here on HN we often see posts insisting on the importance of "first principles".

Your embrace of "magic" - an unknown black box who does seemingly wonderful things that usually blow up to one's face and have a hidden cost - is the opposite of that.

LLMs are just text prediction. That's what they are.

>Why can LLMs generally write code that even compiles?

Why can I copy-paste code and it compiles?

Try to use LLM on code there is little training material about - for example PowerQuery or Excel - and you will see it bullshit and fail - even Microsoft's own LLM.

ben_w 33 minutes ago

> Why can I copy-paste code and it compiles?

I think phrasing it like that is called "begging the question": you've already skipped past all the intelligence you had to apply to figure out which part of the entire internet constituted "code".

And not just any code, but code in correct language. If I copy-paste C64 Basic into the middle of a .swift file (and not as a string), it isn't going to compile.

And not just in the correct language, but a complete block of it, rather than a fragment.

> even Microsoft's own LLM.

"even" suggests you hold them in higher regard than I do.

sharemywin 3 hours ago

Isn't it closer to concept prediction layered on top of text prediction, because of the multiple levels? It compresses text into concepts using layers of embeddings and neural encoding, then predicts the next concept using multiple areas of attention, then decompresses that to find the right words to convey the concept.

baq 8 hours ago

The text of every Nobel winning physics theory was predicted in someone’s head, too

dmd 9 hours ago

> Not bad

It reads like a horoscope to me.

krige 8 hours ago

> Not bad.

I'm sorry, but it's pretty bad. The hierarchy complaint is bogus, so is the navigation-overload one, it hallucinates the background as white, and the rest is very generic.

alnwlsn 19 hours ago

You're right, but I think we have a long way to go. Even our best CAD packages today don't work nearly as well as advertised. I dread to think what Dassault or Autodesk would charge per seat for something that could do the above!

abe_m 18 hours ago

I agree. I think a major hindrance to the current pro CAD systems is being stuck with the feature history tree and rather low-level features. Considerable amounts of requirements data are just added to a drawing free-form, without semantic machine-readable meaning: lots of tolerancing, fits, GD&T, datums, etc. are just lines in a PDF.

There is the move to MBD/PMI and the NIST-driven STEP digital thread, but the state of CAD is a long way from that being common. I think we need to get to the data being embedded in the model à la MBD/PMI, and then go beyond it. The definitions of threads, gear or spline teeth, ORB and other hydraulic ports don't fit comfortably into the current system. There needs to be a higher-level machine-readable capture, and I think that is where the LLMs may be able to step in.

I suspect the next step will be such a departure that it won't be Siemens, Dassault, or Autodesk that do it.

coderenegade 19 hours ago

I think this is correct, especially the part about how we actually do modelling. The topological naming problem is really born from the fact that we want to do operations on features that may no longer exist if we alter the tree at an earlier point. An AI model might find it easier to work directly with boolean operations or meshes, at which point, there is no topological naming problem.

seveibar 18 hours ago

Most likely you won’t be asking for specific things like “3mm hole 3in from the side”, you’ll say things like “Create a plastic enclosure sized to go under a desk, ok add a usb receptacle opening, ok add flanges with standard screw holes”

In the text to CAD ecosystem we talk about matching our language/framework to “design intent” a lot. The ideal interface is usually higher level than people expect it to be.
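One way to picture a "design intent" interface is as an API whose operations are named at the level users speak, not at the level of sketches and extrudes. A hypothetical sketch — the class, method names, and dimensions are all invented for illustration:

```python
# Hypothetical "design intent" layer a text-to-CAD system might target.
# Each call maps one user-level request to a bundle of low-level features.

class Enclosure:
    def __init__(self, w_mm, d_mm, h_mm):
        self.size = (w_mm, d_mm, h_mm)
        self.features = []

    def add_usb_c_cutout(self, face="front"):
        # USB-C receptacle opening: nominal 9.0 x 3.3 mm plus clearance
        self.features.append(("cutout", face, 9.4, 3.7))
        return self

    def add_mounting_flanges(self, screw="M4", count=4):
        # Flanges with clearance holes for standard screws
        self.features.append(("flange", screw, count))
        return self

# "Create an enclosure... add a USB opening... add flanges with screw holes"
box = Enclosure(120, 80, 30).add_usb_c_cutout().add_mounting_flanges()
print(box.features)
```

The LLM's job then reduces to translating utterances into calls like these, with the domain knowledge (receptacle dimensions, clearance-hole sizes) living in the API rather than in the prompt.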

mediaman 13 hours ago

The problem is that this isn't very useful except for the very earliest ideation stages of industrial design, which hardly need CAD anyway.

Most parts need to fit with something else, usually some set of components. Then there are considerations around draft, moldability, size of core pins, sliders, direction of ejection, wall thickness, coring out, radii, ribs for stiffness, tolerances...

LLMs seem far off from being the right answer here. There is, however, lots to make more efficient. Maybe you could tokenize breps in some useful way and see if transformers could become competent speaking in brep tokens? It's hand-wavy but maybe there's something there.

Mechanical engineers do not try to explain models to each other in English. They gather around Solidworks or send pictures to each other. It is incredibly hard to explain a model in English, and I don't see how a traditional LLM would be any better.

esperent 12 hours ago

You may or may not be right, but your arguments sound like echos of what software developers were saying four or five years ago. And four or five years ago, they were right.

Don't dismiss an AI tool just because the first iterations aren't useful, it'll be iterated on faster than you can believe possible.

littlestymaar 9 hours ago

While LLMs are a useful tool for software development, if you try asking them the software equivalent of “Create a plastic enclosure sized to go under a desk, ok add a usb receptacle opening, ok add flanges with standard screw holes” you'll end up with the equivalent of “No the other side. No, not like that, at the bottom. Now make it 10mm from the other hole. No the other hole. No, up not sideways. Wait, which way is up? Never mind, I'll do it myself” a lot.

What works is asking them to implement a micro-feature that you specify well enough on the first try, not asking them to write the entire piece of software from top to bottom. The tech is clearly not there yet for the latter.

The main difference between code and CAD is that code is already language you write for the machine to execute, so it's natural to swap in a more abstract natural language for the formal one. CAD, by contrast, is a visual, almost physical task, and it's more pleasant to do the task than to describe it in depth with words.

esperent 9 hours ago

> Create a plastic enclosure sized to go under a desk, ok add a usb receptacle opening, ok add flanges with standard screw holes

With vague specifications like these, you'd get garbage from a human too.

What works for software, and I suspect for other technical fields like CAD too, is to treat it like a junior developer who has an extreme breadth of knowledge but not much depth. You will need to take care to clearly specify your requirements.

littlestymaar 7 hours ago

> With vague specifications like these, you'd get garbage from a human too.

You'll never get better input than this at the beginning of a project from the person who brings the use-case. Helping them define their needs more precisely is a full-time job in itself. And if you always work from clear specifications, it's only because someone upstream has already turned the loose business requirement into a spec.

> You will need to take care to clearly specify your requirements

Yes, but as I discussed above, for such tasks it's going to be very frustrating and less efficient than doing things yourself. The only reason you'd accept that kind of effort with an intern is that you expect them to learn and eventually become autonomous. With current tech, an LLM will forever remain as clueless as it started.

esperent 6 hours ago

> You'll never have better input than this at the beginning of the project from the person that brings the use-case

That's as may be, but again, it's not much different to being a software developer.

Someone might ask you to create a website for their business. It's your job, as the expert, to use the available tools, including AI, to turn their requirements into working code. They might say "put a button for the shopping cart in the top right". It's then your job, as the technical expert, to get that done. Even the laziest of devs wouldn't expect to just copy/paste that request into an AI tool and get a working result.

It takes time to learn to use these tools.

When I'm using AI to help me write code, depending on the complexity of what I'm working on, I generally write something very similar to what I'd write if I was asking other developers for help (although I can be much terser). I must specify the requirements very clearly and in technical language.

Usually I keep a basic prompt for every project that outlines the technical details, the project requirements, the tools and libraries being used, and so on. It's exactly the same information I'd need to document for another human working on the project (or for myself a year later) so there's no wasted work.

eurekin 18 hours ago

I have come across a significant number of non-engineers who want to do what ultimately amounts to some basic CAD modelling. Some stall on such tasks for years (home renovation) or just don't do them at all. After some brief research, the main cause is not wanting to sink over 30 hours into learning the basics of a CAD package.

For some reason they imagine it as a daunting, complicated, impenetrable task with many insurmountable pitfalls: the interface, the general idea of how it operates, fear of unknown details (tolerances, clearances).

It's easy to underestimate the knowledge required to use a cad productively.

One piece of anecdata near me: high schools buy 3D printers thinking pupils will naturally want to print models. After a few days of initial fascination, the printers stop being used at all. I've heard from a person close to education that it's a country-wide phenomenon.

Back to the point though: maybe there's a group of users who want to create but just can't do CAD at all, and such text descriptions seem perfect for them.

Animats 13 hours ago

There's a mindset change needed to use a feature tree based constructive solid geometry system. The order in which you do things is implicit in the feature tree. Once you get this, it's not too hard. But figuring out where to start can be tough.
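That implicit ordering can be shown with a toy model where solids are sets of unit cells: the same three "features" combined in a different order yield a different part.

```python
# Toy illustration of why order is implicit in a CSG/feature tree.
# Solids are modelled as sets of (x, y) unit cells.

A = {(x, 0) for x in range(4)}   # base plate
B = {(x, 1) for x in range(4)}   # boss on top of the plate
C = {(1, 0), (1, 1)}             # hole to be cut

drill_last = (A | B) - C         # hole cut after the union: goes through both
drill_mid = A | (B - C)          # hole cut before the union: only cuts the boss

print(sorted(drill_last))
print(sorted(drill_mid))
print(drill_last == drill_mid)   # False: same features, different order
```

Real kernels operate on B-reps rather than cell sets, but the order-dependence is the same, which is why "where to start" matters so much in a feature tree.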

I miss the TechShop days, from when the CEO of Autodesk liked the maker movement and supplied TechShop with full Autodesk Inventor. I learned to use it and liked it. You can still get Fusion 360, but it's not as good.

The problem with free CAD systems is that they suffer from the classic open source disease - a terrible user interface. Often this is patched by making the interface scriptable or programmable or themeable, which doesn't help. 3D UI is really, really hard. You need to be able to do things such as change the viewpoint and zoom without losing the current selection set, using nothing but a mouse.

(Inventor is overkill for most people. You get warnings such as "The two gears do not have a relatively prime number of teeth, which may cause uneven wear.")

phkahler 17 hours ago

>> I have come across a significant number of non engineers wanting to do, what ultimately involves some basic CAD modelling.

I very much want Solvespace to be the tool for those people. It's very easy to learn and do the basics. But some of the bugs still need to get fixed (failures tend to be big problems for new users, because without experience it's hard to explain what's going wrong or find a workaround), and we need a darn chamfer and fillet tool.

Animats 13 hours ago

> I very much want Solvespace to be the tool for those people.

Probably not. "Copyright 2008-2022 SolveSpace contributors. Most recent update June 2 2022."

phkahler 3 hours ago

Most recent commit to the repository: last week. Development has slowed over the last couple of years, but it's still going.

itissid 19 hours ago

> and I want a 3mm hole in one side of the plate. No the other side. No, not like that, at the bottom. Now make it 10mm from the other hole. No the other hole. No, up not sideways.

One interesting thing here: you can absorb info faster by reading than by listening to TTS, but you can speak much faster than you can type. So is all that typing the problem, or is it just an interface problem? In your example, you could also draw with your hand (wrist sensor) and talk at the same time.

I've been using agents to code this way. It's way faster.

voidUpdate 2 hours ago

And I can think faster than I can speak, which means it's easier for me to think about what I need to do and then do it, rather than type or speak to an LLM so they can work out what I need to do instead

alnwlsn 19 hours ago

Feels a bit like being on a call with someone at the hardware store, about something that you both don't know the name for. Maybe the person on the other end is confused, or maybe you aren't describing it all that well. Isn't it easier to take a picture of the thing or just take the thing itself and show it to someone who works there? Harder again to do that when the thing you want isn't sold at the store, which is probably why you're modeling it in the first place.

Most of the mechanical people I've met are good at talking with their hands. "take this thing like this, turn it like that, mount it like this, drill a hole here, look down there" and so on. We still don't have a good analog for this in computers. VR is the closest we have and it's still leagues behind the Human Hand mk. 1. Video is good too, but you have to put in a bit more attention to camerawork and lighting than taking a selfie.

bdcravens 26 minutes ago

Most of the 3D printing model repositories offer financial incentives for model creators, as they are usually owned by manufacturers who want to own as much of the ecosystem as possible. (Makerworld, Printables, etc)

Widespread AI generation obviously enables abuse of those incentives, so it'll be interesting to see how they adjust to this. (It's already a small problem, with modelers using AI renderings that are deceptive in terms of quality)

spmcl 21 hours ago

I did this a few months ago to make a Christmas ornament. There are some rough edges with the process, but for hobby 3D printing, current LLMs with OpenSCAD is a game-changer. I hadn't touched my 3D printer for years until this project.

https://seanmcloughl.in/3d-modeling-with-llms-as-a-cad-luddi...
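For readers curious what this workflow produces: the LLM typically emits small parametric OpenSCAD programs. A sketch of the shape of that output, generated here by a plain Python function standing in for the model (dimensions invented for illustration):

```python
# Generate OpenSCAD source for a simple plate with a through-hole —
# the kind of part an LLM-driven OpenSCAD session produces.

def plate_with_hole(w=40, d=20, t=3, hole_d=3, hole_x=10, hole_y=10):
    """Return OpenSCAD source for a w x d x t plate with one hole.
    The cylinder overshoots the plate by 1 mm on each side so the
    boolean difference cuts cleanly."""
    return f"""\
difference() {{
    cube([{w}, {d}, {t}]);
    translate([{hole_x}, {hole_y}, -1])
        cylinder(h={t} + 2, d={hole_d}, $fn=64);
}}
"""

scad = plate_with_hole()
print(scad)
# Save as plate.scad, then render with: openscad -o plate.stl plate.scad
```

Because OpenSCAD is plain text, it sidesteps the interface questions raised elsewhere in the thread: the LLM reads and writes the whole model, and you review the diff like code.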

0_____0 13 hours ago

As a MCAD user this makes me feel more confident that my skills are safe for a bit longer. The geometry you were trying to generate (minus bayonet lock, which is actually a tricky thing to make because it relies on elastic properties of the material) takes maybe a few minutes to build in Solidworks or any modern CAD package.

dgacmu 18 hours ago

This matches my experience having Claude 3.5 and Gemini 2.0-flash generate openSCAD, but I would call it interesting instead of a game changer.

It gets pretty confused about the rotation of some things and generally needs manual fixing. But it kind of gets the big picture sort of right. It mmmmayybe saved me time the last time I used it but I'm not sure. Fun experiment though.