Hacker Remix

The program is the database is the interface

204 points by tosh 4 months ago | 61 comments

hilti 4 months ago

What the author demonstrates here is a powerful principle that dates back to LISP's origins but remains revolutionary today: the collapse of artificial boundaries between program, data, and interface creates a more direct connection to the problem domain.

This example elegantly shows how a few dozen lines of Clojure can replace an entire accounting application. The transactions live directly in the code, the categorization rules are simple pattern matchers, and the "interface" is just printed output of the transformed data. No SQL, no UI framework, no MVC architecture - yet it solves the actual problem perfectly.
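
A rough sketch of that shape, in Python rather than the post's Clojure. The transactions, keywords, and function names here are made up for illustration and are not taken from the article:

```python
from collections import defaultdict

# Hypothetical transactions, stored directly in the source file.
TRANSACTIONS = [
    ("2023-01-03", "Joe's Coffee Hut", -4.50),
    ("2023-01-05", "City Transit", -2.75),
    ("2023-01-09", "Joe's Coffee Hut", -3.25),
    ("2023-01-12", "Corner Bookshop", -18.00),
]

# Hypothetical categorization rules: substring -> tag.
KEYWORD_TO_TAG = {"Coffee": "eating out", "Transit": "transport"}


def text_to_tag(text):
    """Return the tag of the first rule whose keyword occurs in text."""
    for keyword, tag in KEYWORD_TO_TAG.items():
        if keyword in text:
            return tag
    return None


def totals_by_tag(transactions):
    """Sum amounts per tag; unmatched rows land under 'untagged'."""
    totals = defaultdict(float)
    for _date, text, amount in transactions:
        totals[text_to_tag(text) or "untagged"] += amount
    return dict(totals)


if __name__ == "__main__":
    # The "interface" is just the printed summary.
    for tag, total in sorted(totals_by_tag(TRANSACTIONS).items()):
        print(f"{tag:12} {total:8.2f}")
```

Adding a transaction or a rule means editing the file and rerunning it, which is the whole workflow.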

The power comes from removing indirection. In a conventional app, you have:

- Data model (to represent the domain)

- Storage layer (to persist the model)

- Business logic (to manipulate the model)

- UI (to visualize and interact)

Each boundary introduces translation costs, impedance mismatches, and maintenance burden.

In the LISP approach shown here, those boundaries disappear. The representation is the storage is the computation is the interface. And that direct connection to your problem is surprisingly empowering - it's why REPLs and notebooks have become so important for data work.

Of course, there are tradeoffs. This works beautifully for personal tools and small-team scenarios. It doesn't scale to massive collaborative systems where you need rigid interfaces between components. But I suspect many of us are solving problems that don't actually need that complexity.

I'm reminded of Greenspun's Tenth Rule: "Any sufficiently complicated program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." The irony is that by embracing LISP principles directly, you can often avoid building those complicated programs in the first place.

Rich Hickey's "Simple Made Easy" talk explores this distinction perfectly - what the industry calls "easy" (familiar tools and patterns) often creates accidental complexity. The approach shown here prioritizes simplicity over easiness, and the result speaks for itself.

jahewson 4 months ago

Over time I’ve come to see LISP less as the natural collapse of artificial boundaries than as the artificial collapse of natural ones. Where and how data is stored is a real concern, but where and how the program is stored isn’t. Security boundaries around data and executable code are of paramount importance. Data storage concerns don’t benefit from being mixed with programming language concerns but from computer and storage architecture concerns (e.g. column stores).

In toy programs, such as this one, those concerns can all be discarded, so LISP is a good fit. But in serious systems it’s soon discovered that what is offered is in fact “simplistic made easy”. That’s not to say that traditional systems don’t suffer from all the ills Hickey diagnoses in them, but that we differ on what the cure is.

jamii 4 months ago

> it solves the actual problem perfectly

The whole post was about how that doesn't solve the problem perfectly - there is no way to interactively edit the output.

> by embracing LISP principles directly

This could just as easily have been javascript+json or erlang+bert. There's no lisp magic. The core idea in the post was just finding a way for code to edit its own constants so that I don't need a separate datastore.

Eventually I couldn't get this working the way I wanted with clojure and I had to write a simple language from scratch to embed provenance in values - https://news.ycombinator.com/item?id=43303314.

cogman10 4 months ago

> It doesn't scale to massive collaborative systems where you need rigid interfaces between components. But I suspect many of us are solving problems that don't actually need that complexity.

Here's the issue. Starting out you almost certainly don't need that rigid interface. However, the longer the app grows the more that interface starts to matter and the more costly retrofitting it becomes.

The company I currently work at started out with a "just get it done" approach, which led to things like any app reaching into any database directly just to get what it needs. That has created a large maintenance issue that to this day we are still trying to deal with. Modifying the legacy database schema in any way takes multiple months of effort due to how it might break the 20 systems that reach into it.

crq-yml 4 months ago

My take is that the issue lies primarily in the ramifications of Conway's law and how our social structures map to systems.

When the system is small, it makes a great deal of sense to be an artisan and design simple automations that work for exactly that task, which for the most common things is always supported by any production-oriented programming environment - there's a lot of ways in which you can't go wrong because the problem is so small relative to the tools that any approach will crush it. "Just get it done" works because no consequence is felt, and on the time scale of "most businesses fail within five years", it might never be.

When it's large, everyone would prefer to defer to a common format and standard tools. The problems are now complex, have to be discussed and handled by many people, they need documentation and clear boundaries on roles and responsibilities. But common formats and standards are a pyramid of scope creep - eventually it has to support everyone - and along the way, monopolistic organizations vie for control over it in hopes of selling the shovels and pickaxes for the next gold rush. So we end up with a lot of ugly compatibility issues.

In effect, the industry is always on this treadmill of hacking together a simple thing, blowing out the complexity, then picking up the pieces and reassembling them into another, slightly cleaner iteration.

Maintenance can be done successfully - there are always examples of teams and organizations that succeed - but like with a lot of infrastructure, there's an investment bias towards new builds.

phkahler 4 months ago

>> No SQL, no UI framework, no MVC architecture - yet it solves the actual problem perfectly.

No SQL, but in its place is some code. The point of SQL was to standardize a language for querying data. This is just using a language other than the standard. A UI is a way for people to avoid writing code.

Sure doing your own custom thing results in something easy for the programmer. Nothing new about that.

rlupi 4 months ago

> Of course, there are tradeoffs. This works beautifully for personal tools and small-team scenarios. It doesn't scale to massive collaborative systems where you need rigid interfaces between components. But I suspect many of us are solving problems that don't actually need that complexity.

Spreadsheets. If you squint the right way, they embody the lisp principles for the non-programmers' world.

kazinator 4 months ago

MS Excel has IF(this, then, else), AND(expr, expr, ...), OR(expr, expr, ...) and more recently LAMBDA.

deterministic 4 months ago

> "Any sufficiently complicated program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp."

I have never seen that in practice (30+ years of industry experience working on very large applications).

I think it is one of those statements that fans of Lisp love to quote (a lot) without having any empirical data to back it up.

And yes I am sure that there are examples out there. You can probably find examples of anything if you look hard enough. But that doesn't make it a general rule.

wruza 4 months ago

This makes no sense to me. CL is not an astral technology. It’s just a parentheses-rich language that is worse than almost any other until you get to macros, continuations and psychotic polymorphism. Which are cool to talk about at hacker parties, no /s, but don’t do much business-wise.

fuzzfactor 4 months ago

>the collapse of artificial boundaries between program, data, and interface creates a more direct connection to the problem domain.

I always figured that was one of the reasons that Excel can tackle so many different problems.

lgrapenthin 4 months ago

Was this written by a LLM?

jamii 4 months ago

This is an old prototype. I ended up making a language for it from scratch so that I could attach provenance metadata to values, making them directly editable even when far removed from their original source.

https://www.scattered-thoughts.net/log/0027#preimp

https://x.com/sc13ts/status/1564759255198351360/video/1

I never wrote up most of that work. I still like the ideas though.

Also if I had ever finished editing this I wouldn't have buried the lede quite so much.

jim_lawless 4 months ago

This reminds me of home computing in the late 70's when we used to keep our "database" info in DATA statements embedded in a given BASIC program.

https://jimlawless.net/images/basic_data.png

julesallen 4 months ago

Glad I'm not the only lunatic who would do something like this. At least until I got my paws on dBASE.

I worked with somebody with your name in the early 90s on a Sequent/Dynix system, that wasn't you by chance was it?

jim_lawless 4 months ago

Nope. Not me.

w10-1 4 months ago

That reminds me of a dream:

1. Write queries in datalog

2. Compose them like reusable functions

3. Stream that data between SQL db and dataframe columns and graph db

4. Both also supported with reusable composable function interfaces

5. With IDE/notebook feedback and content-assist including not just syntax but the referenced data model.

6. With distinct modes for exploring datasets and for generating code and tests for hardened pipelines.

7. With abstract accounting for operations of the db, transforms, and transfers.

Data and language will always have ultra-specialized forms. I’m looking for a low-overhead way to explore solutions using different combinations of baseline paradigms before generating code for the one to productize.

kaeland 4 months ago

This might be what you're looking for... https://gtoolkit.com/

bob1029 4 months ago

I am a huge proponent of ideas like using SQL to directly implement business logic. When you bring the logic and the data together under the same abstraction, things will begin to click that you didn't even know existed. Business logic expressed as SQL commands can be stored as yet more data within the same schema. This opens the door for reflection and other features you would ordinarily only think to look for in advanced backend languages.
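
A toy illustration of that last point, using Python's built-in sqlite3 module. The table names (`orders`, `rules`), the columns, and the two rules are invented for the example; running SQL text read from a table is only reasonable when you control who can write to that table:

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding both data and rules."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
        CREATE TABLE rules (name TEXT PRIMARY KEY, sql_text TEXT NOT NULL);
        INSERT INTO orders (customer, total)
        VALUES ('ada', 120.0), ('bob', 30.0), ('ada', 15.0);
        """
    )
    # Business logic is stored as rows, next to the data it describes.
    conn.executemany(
        "INSERT INTO rules (name, sql_text) VALUES (?, ?)",
        [
            ("large_orders",
             "SELECT id, customer FROM orders WHERE total >= 100 ORDER BY id"),
            ("spend_per_customer",
             "SELECT customer, SUM(total) FROM orders "
             "GROUP BY customer ORDER BY customer"),
        ],
    )
    return conn


def run_stored_rules(conn):
    """Execute every stored rule and return {rule name: result rows}."""
    rules = conn.execute(
        "SELECT name, sql_text FROM rules ORDER BY name"
    ).fetchall()
    return {name: conn.execute(sql).fetchall() for name, sql in rules}


if __name__ == "__main__":
    for name, rows in run_stored_rules(build_demo_db()).items():
        print(name, rows)
```

Because the rules are ordinary rows, listing, auditing, or versioning them is just another query.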

eitland 4 months ago

I sometimes deal with such code.

I believe it works for you, and I can even say I think most projects have (a little) too few lines of embedded sql in them.

But arguing for code that isn't checked out with the rest of the code when I clone the repo, isn't greppable, and radically departs from everything most people are used to, that isn't something I'd accept.

tmountain 4 months ago

You can greatly improve DX with something like this and get the best of both worlds.

https://github.com/t1mmen/srtd

zabzonk 4 months ago

Looks horrible to me. Why not a spreadsheet?

ajross 4 months ago

Without bothering to try to enumerate minutiae, here's the One Thing that would control such a decision in any such "why not just" argument:

Complicated formats can't be meaningfully version controlled.

TZubiri 4 months ago

OTOH

Excel guy: Boom, got it done in 5 minutes. *sips coffee* What's next?

10x engineer: I have devised a new programming paradigm to tell me I need to spend less on coffee, wrote a blog post about it and missed a call from a client and a recruiter.

antonvs 4 months ago

I once made over $1 million developing a program to rescue a small financial services company from its unmanageable mess of Excel sheets.

While I was working on that, the company was busy trying to figure out which other companies the millions of unaccounted excess dollars in their fiduciary account belonged to.

Turns out the boom-got-it-done-in-5-minutes types don't always have a plan for making the big picture work, beyond their current 5-minute problem.

TZubiri 4 months ago

Sounds like business as usual to me.

Non computer people try to solve a problem internally with existing tooling, but the scale gets out of control and they hire a specialist, whether accounting or computer scientist. Most of the jobs I have consist of a company already hacking together a solution from canned software or having a prototype of what they wanted, don't think this is weird at all.

Are you suggesting they should have built a well architected Java program from the start? I'm not so sure.

1 - It's easy to say with Monday's newspaper; companies can move fast and then worry about the consequences later if they grow and are successful. Internal projects or the whole company can be unsuccessful, so you can tack on tech debt and worry about it if it needs to survive, happens all the time. As a curious note, the legal system supports this: companies have limited liability, so courts are ok with letting companies that go too deep into the red (tech debt or cash) collapse into 0.

2 - Overengineering is also a risk, although usually the failure mode is that it never gets off the ground. But what's to say they couldn't have hired a -10x engineer to build their accounts system and then needed to hire an actual 10x engineer to fix the mess?

3- This seems like an accounting problem domain, I think if a company has classically trained accountants you avoid this problem, and if you don't then you get this problem regardless of whether using excel or a programming language. Double entry bookkeeping has been around for centuries. (I'm guessing an accountant helped build the specs for whatever you built for 1M$). Good accountants can handle huge (+1M) companies with pen and paper, and regularly do so with Excel, so I'm not buying that excel was the culprit here.

That said 1M$ is a lot, congrats!

conductr 4 months ago

This isn't some spaghetti ERP on Excel or some other blatant overuse of the application. This is tagging some lines on a CSV file to categorize expenses. A few filters, a few formulas, create a summary sheet with SUMIFS or just create a pivot table, done.

Evaluating the problem tells you how complex the solution needs to be. This problem of categorizing basic financial transactions is pretty basic and I'd take the basic approach every time.

switchbak 4 months ago

Well in this particular case - yeah, excel 100% of the time.

But I think the point here was exploring different paradigms and their trade-offs. I appreciate that a lot, as I think we're stuck in various local optima as an industry - and spreadsheets seem to be one of them (once they pass a certain complexity threshold).

I'm not sold on the "everything in clojure" model here, I think you could accomplish all this does with a script written in $LANG and a little DuckDB (and probably a half dozen other similarly suitable approaches), but again - I appreciate the goal of exploring the solution space, especially when it's using approaches that I'm not that familiar with.

hilti 4 months ago

Fair point, and I've certainly been on both sides of this equation!

The Excel approach is absolutely more efficient for one-off tasks or where the goal is simply "get numbers, make decision, move on." No argument there.

Where the programmatic approach shines is when:

1. The same task repeats annually/monthly (like the author's accounting example)

2. The rules or categorizations evolve over time

3. You need an audit trail of how you arrived at conclusions

4. The analysis grows more complex over time

I've seen plenty of Excel wizards whose spreadsheets eventually become their own form of programming - complete with complex macros, VBA, and data models that only they understand. At that point, they've just created an ad-hoc program with a different syntax.

There's a sweet spot for each approach. Sometimes the 5-minute Excel solution is exactly right. Other times, spending an hour on a reusable script saves you 10 hours next year. The real 10x move is knowing which tool fits which situation.

And yes, sometimes we programmers do overthink simple problems because playing with new approaches is fun. I'll cop to that!

endofreach 4 months ago

But then, who'd invent excel? Maybe we need the compromise and let some people be the 10xcel engineer...

TZubiri 4 months ago

For sure, I just don't see it coming out of a place of "I need to analyze my credit card statement"

Excel wasn't built out of a small personal project, it was a huge multi-engineer project built at Microsoft (built upon decades of spreadsheet software tradition and arguably centuries of accounting tradition), by a company that had a huge userbase that acted as stakeholders for managing the data and accounts of multi-million-dollar companies and small businesses alike.

I think that complex contraptions arise more from complex necessities than overengineering a solution from a simple problem

ajross 4 months ago

The VisiCalc/Lotus erasure is very strong in this comment.

TZubiri 4 months ago

Wouldn't they fit within the "decades of spreadsheet software tradition"?

bitwize 4 months ago

> Excel wasn't built out of a small personal project,

But another program with a similar impact -- dBase, the precursor to Microsoft Access, was. Cecil Ratliff wrote it to manage football statistics so he could win his office football pool.

zabzonk 4 months ago

> Complicated formats can't be meaningfully version controlled.

Well, at the risk of simply saying "why not" - why not? Preferably with an example.

ajross 4 months ago

How can you diff the formulas in two spreadsheet templates?

jodrellblank 4 months ago

XLSX files are zipped XML.

Unzip them, strip the formulae out of the XML, and diff them?

ajross 4 months ago

How do you "strip the formulae out of the XML" then? You sort of hide a lot of complexity there. The Excel file format isn't complicated because it's stored as a zip file, it's complicated because it's complicated.

Again, the use case is straightforward. You have an already-solved accounting problem. But, say, last year's accounting solution is giving different answers than the one you just ran. So what changed? With the script in the linked article, the answer is trivially findable via the git history (and in larger software searchable via tools like "git bisect", etc...). And the reason for this is that source code is intended to be read by human beings, often including things like "comments" and "style guidelines" and "literate programming" to help the process. None of that exists for ad hoc GUI tools, and the result is that you can't meaningfully develop them as software.

That kind of task maps very poorly to "strip the formulae out of the XML, and diff them".

jodrellblank 4 months ago

That's a mixed bag of claims all jumbled together.

> How do you "strip the formulae out of the XML" then?

Use any XML parser, or any pre-existing Excel library which can do that, e.g. the ImportExcel PowerShell module mentioned earlier:

    PS C:\test> $x = Open-ExcelPackage book1.xlsx
    PS C:\test> $x.Workbook.Worksheets[1].Cells["B3"].Formula
    SUM(B1:B2)
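
The "any XML parser" route might look roughly like this in Python, standard library only. It is a sketch under assumptions: it reads every part under `xl/worksheets/`, keys results by the part name (not the sheet's display name, which lives in `xl/workbook.xml`), and skips cells whose `<f>` element is empty, which is how dependent cells of a shared formula appear. The function name is mine:

```python
import xml.etree.ElementTree as ET
import zipfile

# Main SpreadsheetML namespace used by worksheet parts in .xlsx files.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def extract_formulas(xlsx_file):
    """Return {(worksheet part, cell reference): formula text}.

    xlsx_file may be a path or a binary file-like object.
    """
    formulas = {}
    with zipfile.ZipFile(xlsx_file) as archive:
        for part in sorted(archive.namelist()):
            if not (part.startswith("xl/worksheets/") and part.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(part))
            for cell in root.iterfind(".//m:c", NS):
                formula = cell.find("m:f", NS)
                # Cells without a formula, or with an empty <f/>, are skipped.
                if formula is not None and formula.text:
                    formulas[(part, cell.get("r"))] = formula.text
    return formulas


if __name__ == "__main__":
    import sys

    for (part, ref), text in sorted(extract_formulas(sys.argv[1]).items()):
        print(f"{part}!{ref}\t={text}")
```

The tab-separated output is plain text, so it can be committed next to the workbook and diffed like any other file.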

> "trivially findable via the git history (and in larger software searchable via tools like "git bisect", etc...)".

So put a pre-commit hook which dumps the spreadsheet as some kind of text file instead of a binary blob.

> "And the reason for this is that source code is intended to be read by human beings"

That's not "the reason" that's an unrelated popular saying. If it were true, reverse engineering, maintenance, refactoring, and porting between different languages would be easy. It isn't. Instead source code appears to be intended to be read by the compiler/interpreter the programmer is using, and if anyone else can make anything out of it, good luck to them.

> "often including things like "comments" and "style guidelines" and "literate programming" to help the process".

If you allow helpers external to the source code as part of the development process, that's good because it cuts off your incoming reply saying that a pre-commit hook writing the formulae from spreadsheet to text is too hard/too much work/unreasonable.

> "None of that exists for ad hoc GUI tools"

Ad-hoc gui tools aren't programmable. Notepad isn't programmable. Calculator isn't programmable. Excel isn't an ad-hoc tool, Excel is one of the most famous, most used, GUI tools on the planet with some of the largest ecosystem and community around it, and one of the most pluggable, scriptable, documented, standardised, systems going.

> "and the result is that you can't meaningfully develop them as software".

Humans wrote Microsoft Excel itself, wrote Windows search indexers for searching inside Excel documents, wrote SharePoint which can index and work with Excel content, wrote the Microsoft Graph API and M365 cloud which can integrate with Excel spreadsheets, wrote the OpenOffice/LibreOffice Excel importers/exporters, wrote the ImportExcel module and the DLLs it's based on, rewrote Excel in TypeScript for Office365. Claiming that humans can't meaningfully write code to work with ... weird formats? Excel? spreadsheets? Files that were once touched by a GUI? ... is such a throwing-hands-up-and-giving-up-without-trying take on things. People could if they wanted to. People running $16Bn departments with Excel sheets could if they took it seriously and invested an appropriate amount of money in doing so.

It just occurred to me to google "diff excel spreadsheet" and look, there's one built-in: https://support.microsoft.com/en-us/office/compare-two-versi...

And there's a SaaS product which does it: https://www.diffchecker.com/excel-compare/

And another SaaS product: https://xlcompare.com/

And another SaaS product: https://www.textcompare.org/excel/

and more I didn't click on.

ajross 4 months ago

And what happens if the author moved the formula that used to be in B3? How does your diff utility detect that?

I genuinely can't believe you're staking your argument on the idea that you can somehow track code deltas between versions of a Microsoft Excel spreadsheet when literally no one in the world does this.

jodrellblank 4 months ago

B3 would be empty. This is the same as someone renaming any function or method and you asking "but how would the diff utility detect the rename?". The diff would show that the method used to exist and now doesn't. And it would show that the new method used not to exist and now does. The Excel-diff would show that B3 used to have a value and now doesn't. That B4 used to be blank and now has a value. It could show it in a rendering of the spreadsheet, even.
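
As a sketch of what such a diff could report, assuming each version has already been reduced to a plain mapping of cell reference to formula text (the function name and tuple layout are made up for illustration):

```python
def diff_formulas(old, new):
    """Compare two {cell reference: formula} mappings.

    Returns a list of (kind, ref, before, after) tuples, where kind is
    "removed", "added", or "changed". References are sorted as plain
    strings, so "B10" sorts before "B3".
    """
    changes = []
    for ref in sorted(old.keys() | new.keys()):
        before, after = old.get(ref), new.get(ref)
        if before == after:
            continue
        if after is None:
            changes.append(("removed", ref, before, None))
        elif before is None:
            changes.append(("added", ref, None, after))
        else:
            changes.append(("changed", ref, before, after))
    return changes


if __name__ == "__main__":
    last_year = {"B3": "SUM(B1:B2)"}
    this_year = {"B4": "SUM(B1:B2)"}
    for change in diff_formulas(last_year, this_year):
        print(change)
```

A moved formula shows up as one "removed" entry and one "added" entry with the same text, much like a renamed function in a source diff.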

Your argument is that it's not possible. My argument is that it is possible. Nobody is arguing the strawmen positions that you claim I am arguing [it's a good idea, it's a way to build reliable software, everyone does it, etc.]

jerryhomelab 4 months ago

I version control them by duplicating sheets

AlienRobot 4 months ago

Nor can a single file.

If you're keeping records as files, I think you should just use a million files instead.

1. Easy to version control

2. You can sync on the cloud with fewer conflicts if each record is isolated from the rest

3. Sounds unique and cool

TZubiri 4 months ago

Single files can easily be versioned. Consider git, which uses blob based tracking, not file based tracking. It doesn't matter if you split file A into A and B, git tracks it just fine.
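
The content-addressing part is easy to check. A small sketch, assuming Git's default SHA-1 object format: a blob's ID is a hash of a short header plus the file's bytes, so the file name plays no part in it. Whether Git then reports content that moved between files is a separate matter, handled by similarity heuristics at diff time, and this sketch does not cover it:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content.

    Assumes the SHA-1 object format: sha1(b"blob <size>\\0" + content).
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    record = b"2023-01-03,Joe's Coffee Hut,-4.50\n"
    # The same bytes yield the same ID no matter which file holds them.
    print(git_blob_id(record))
```

The result should match what `git hash-object` prints for a file with the same bytes.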

codr7 4 months ago

Lisp makes a lovely data format imo, much nicer to work with than CSV/JSON/YAML/XML.

https://github.com/codr7/whirlog

moi2388 4 months ago

    (defn text->tag [text]
      (first
        (for [[keyword tag] keyword->tag
              :when (clojure.string/includes? text keyword)]
          tag)))

    “Joe’s Coffee Hit”

    text->tag

    :error #object[TypeError TypeError: a.indexOf is not a function. (In 'a.indexOf(b)', 'a.indexOf' is undefined)]

    If you type "Joe's Coffee Hut" (including the "!) into the textbox above and hit the text->tag button, you'll see the result of running the text->tag function on that input.

Looks great xD

pmkary 4 months ago

The moment I saw the title I knew it has something to do with EVE. Those people were revolutionary and it is a shame EVE never made it, what a glorious masterpiece of an article.

nullpoint420 4 months ago

The engineers yearn for org-mode

AlienRobot 4 months ago

And to think this could all have been solved in LibreOffice Calc if they spent similar amounts of time learning it...

Hammershaft 4 months ago

tbf learning clojure is a more valuable spend of time than learning LibreOffice Calc.

amelius 4 months ago

Nice. In the same vein, I also like "the program is the config file".

dustingetz 4 months ago

(2023), i think - https://news.ycombinator.com/item?id=34761768

Kwpolska 4 months ago

And it's still a "!!!DRAFT!!!" ending with a TODO that ended up on the front page?

pianoben 4 months ago

I think they just invented Microsoft Access :)

hilti 4 months ago

Ha! I see the resemblance, but they're actually opposites. Access adds visual layers between you and your data, while this LISP approach removes them. Access makes databases more approachable to non-programmers through GUIs, while this example makes programming more direct by collapsing all those abstraction layers into pure code. Same problems, completely different philosophies!

pianoben 4 months ago

The way I see it, the problem being addressed here is that there is structured data (for some definition of "structured"), and we are in search of a useful visual representation of that data in various aggregates.

Lisp is a fantastic tool for accomplishing this, but to me, Access is tailor-made for exactly this problem (and a whole lot more besides, it's a beast). If one follows the trail of the original problem far enough, I think one is likely to end up with an Access-like solution regardless of implementation language.

almosthere 4 months ago

everything is trying to reinvent ms access these days, with a bit of RBAC on top. It was the original low code environment.

TZubiri 4 months ago

These thoughts are a bit too scattered imo, maybe collect your thoughts a bit.

JBiserkov 4 months ago

This was posted on scattered-thoughts.net ;-)

If this was a joke, I'm sorry for explaining it.

If this was not a joke, I point you to the giant

    <h1>!!! DRAFT !!!</h1>

as the first element of the <article>

ZeroTalent 4 months ago

The tests are the program

throwaway173738 4 months ago

They just re-invented Jupyter.