Hacker Remix

Show HN: Steiner – An open-source reasoning model inspired by OpenAI o1

68 points by peakji 13 hours ago | 17 comments

Steiner is a series of reasoning models trained on synthetic data using reinforcement learning. These models can explore multiple reasoning paths in an autoregressive manner during inference and autonomously verify or backtrack when necessary, enabling a linear traversal of the implicit search tree.

Blog: https://medium.com/@peakji/a-small-step-towards-reproducing-...

Hugging Face: https://huggingface.co/collections/peakji/steiner-preview-67...
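One way to picture the "linear traversal of the implicit search tree" mentioned in the post is a depth-first search whose visits and backtracks are written out as a single flat sequence. The sketch below is only a toy illustration under that assumption, not Steiner's actual training or inference code; the tree, the node names, and the `linear_trace` helper are all made up for the example.

```python
from typing import Callable, Dict, List

# Hypothetical toy search tree: each reasoning step maps to its possible next steps.
TREE: Dict[str, List[str]] = {
    "start": ["try A", "try B"],
    "try A": ["A1"],
    "A1": [],
    "try B": ["B1"],
    "B1": [],
}


def linear_trace(
    node: str,
    is_goal: Callable[[str], bool],
    tree: Dict[str, List[str]],
    trace: List[str],
) -> bool:
    """Depth-first search that records every visit and backtrack in one flat list."""
    trace.append(node)
    if is_goal(node):
        return True
    for child in tree[node]:
        if linear_trace(child, is_goal, tree, trace):
            return True
        # The child branch failed, so note the backtrack explicitly in the sequence.
        trace.append(f"backtrack to {node}")
    return False


trace: List[str] = []
found = linear_trace("start", lambda n: n == "B1", TREE, trace)
print(" -> ".join(trace))
```

The printed sequence contains the dead-end branch and the backtrack steps alongside the successful path, which is the sense in which a tree search can be emitted token by token as one linear output.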

schmeichel 10 hours ago

This seems promising! Great work! Any chance there will be an Ollama Modelfile for the masses?

peakji 10 hours ago

GGUF files are available on HF: https://huggingface.co/peakji/steiner-32b-preview-gguf

I haven't personally used an Ollama Modelfile, but I think it should be relatively easy to create one from the GGUF?
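For anyone wanting to try this, a minimal sketch: an Ollama Modelfile can point at a local GGUF file with a `FROM` line. The path `./model.gguf` below is a placeholder for whichever file you download from the HF repo, and `build_modelfile` is a hypothetical helper, not part of any library. Whether Steiner also needs a custom prompt template in the Modelfile is not something this thread confirms.

```python
def build_modelfile(gguf_path: str) -> str:
    """Return the text of a minimal Ollama Modelfile that loads a local GGUF file."""
    return f"FROM {gguf_path}\n"


# Placeholder path: substitute the GGUF file you actually downloaded.
modelfile_text = build_modelfile("./model.gguf")
print(modelfile_text, end="")
```

After saving that text to a file named `Modelfile`, `ollama create steiner -f Modelfile` followed by `ollama run steiner` should register and start the model (the name `steiner` is an arbitrary choice).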

nxobject 10 hours ago

As someone without specific background in the subfield (I do embedded programming) – thanks for spelling out what people "in the know" seem to understand about o1's functioning!

zby 12 hours ago

Can it be mixed with the sampling-based approaches from optillm (https://github.com/codelion/optillm)?

peakji 12 hours ago

Approaches like best-of-n sampling and majority voting are definitely feasible. But I don't recommend trying CoT-related techniques, as they might interfere with the internalized reasoning patterns.
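As a rough sketch of the two approaches named above: majority voting picks the most frequent final answer across several samples, and best-of-n picks the sample with the highest score under some scoring function. The sampled answers and the scorer here are hard-coded assumptions for illustration; in practice they would come from calling the model n times and from a reward model or verifier. This is not optillm's API.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most common final answer among the sampled completions."""
    return Counter(answers).most_common(1)[0][0]


def best_of_n(candidates: List[str], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest score under the given scorer."""
    return max(candidates, key=score)


# Hypothetical final answers extracted from five sampled completions.
sampled_answers = ["42", "41", "42", "42", "40"]
print(majority_vote(sampled_answers))

# Hypothetical scorer: string length stands in for a real reward model.
print(best_of_n(["short", "a longer answer", "mid one"], score=len))
```

Both wrap the model from the outside and leave the prompt untouched, which is why they fit with the advice above to avoid adding CoT-style prompting.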

swyx 11 hours ago

advice to OP - you hurt your own credibility posting on medium dot com. just blog on huggingface or substack or hashnode.

mdaniel 2 hours ago

peakji 10 hours ago

I'm new here. Just curious, why avoid Medium? Is it a Hacker News thing, or did I miss something?

whatshisface 9 hours ago

Medium doesn't "hurt your credibility" nearly as much as revealing that one's arsenal of litmus tests is suffering from such a paucity of real knowledge that one bases it on the web design, but Medium has a lot of annoying popups. A lot of people like Substack better and they have a paid subscriber thing that works well.

(realistically speaking, experts tend to know less about the blog hosting ecosystem the more they know about their domain)

swyx 9 hours ago

it's just a "tell" that you don't mind the poor reader experience and being associated with the rest of the low quality slop that is on medium. many of us here have simply given up clicking on anything medium related
