161 points by stereoabuse 9 months ago | 47 comments
nick0garvey 9 months ago
His code here demonstrates why. It is both shorter and much easier to understand than anything the LLMs generated. It is not always as efficient as the LLMs' versions (which often skip the third loop by calculating the last factor), but it is definitely the code I would prefer to work with in most situations.
[1] https://www.udacity.com/course/design-of-computer-programs--...
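(For the curious, the "skip the third loop" trick is roughly this: once a and b are fixed, the last factor is forced to be n // (a * b). Just an illustrative sketch, not Norvig's or the models' actual code.)

def products_of_three(n=108):
    """All triples a < b < c of distinct positive integers with a * b * c == n."""
    triples = []
    for a in range(1, n + 1):
        if n % a:
            continue
        for b in range(a + 1, n // a + 1):
            if (n // a) % b:
                continue
            c = n // (a * b)  # the last factor is determined; no third loop needed
            if c > b:
                triples.append((a, b, c))
    return triples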
fifilura 9 months ago
WITH all_numbers AS
(
  SELECT generate_series AS n
  FROM generate_series(1, 108) AS n
),
divisors AS
(
  SELECT *
  FROM all_numbers
  WHERE 108 % n = 0
),
permutations AS
(
  SELECT a.n AS n1, b.n AS n2, c.n AS n3
  FROM divisors AS a
  CROSS JOIN divisors AS b
  CROSS JOIN divisors AS c
)
SELECT *
FROM permutations
WHERE n1 * n2 * n3 = 108
  AND n1 < n2 AND n2 < n3
ORDER BY n1, n2, n3
https://codapi.org/embed/?sandbox=duckdb&code=data%3A%3Bbase...
upghost 9 months ago
It's amazing how he never once used the word "AI" in this course, despite the fact that it is a straight-up AI course.
I revisit the course notes at least once a year and I still walk away with something new every time.
abecedarius 9 months ago
upghost 9 months ago
But it's especially great if you want to appreciate PN's code more. He actually explains the choices behind his distinctive coding style here. His 6 rules for good code are particularly good. He says they are for good Lisp, but I think they broadly apply to any language:
https://github.com/norvig/paip-lisp/blob/main/docs/chapter3....
*Edit:* After reading this book I am often really surprised that it is not referenced as often as, or more often than, SICP. Not the time or place for a soapbox rant, but SICP is often cited as a landmark functional programming text when in fact it largely advocates OO practices implemented in Scheme, whereas PAIP really puts full-power FP programming on display (independent of any type system), and where OO practices such as CLOS are used he is quite explicit about it.
abecedarius 9 months ago
Speculating: the sheer bulk of the book might also push people away. It's usually an anti-signal for riches within.
JohnKemeny 9 months ago
upghost 9 months ago
https://web.archive.org/web/20140122073352/https://www.udaci...
Yes, I know that you can get the Q value via RL with the Bellman equation or non-parametric evolutionary learning, but at the time I didn't know that and I'm glad I didn't.
Depending on how you count, this code features either 2 or 4 mutually recursive functions, something that is quite rare in Python (especially outside the context of an FSM) but is a signature of PN's code.
I had to learn so much about so many things to untangle it.
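(For anyone unfamiliar with the Bellman-style update mentioned above, the generic tabular version is roughly this; a sketch, not the course's code.)

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max over a' of Q(s', a')."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)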
TaurenHunter 9 months ago
l33t7332273 9 months ago
I think this makes sense. LLMs are trained, on average, on average code. This means they produce average code, on average.
olliemath 9 months ago
underdeserver 9 months ago
from math import prod
from itertools import combinations

def find_products(k=3, N=108) -> set[tuple[int, ...]]:
    """All ways in which `k` distinct positive integers have a product of `N`."""
    factors = {i for i in range(1, N + 1) if N % i == 0}
    return {ints for ints in combinations(factors, k) if prod(ints) == N}

find_products()
superlopuh 9 months ago
Edit: I see that he imported the deprecated typing.Set, instead of collections.abc.Set, which is still useful, so your comment is correct.
earslap 9 months ago
You can increase the compute by allowing more tokens for it to use as a "scratch pad", so the total compute available becomes num_tokens * ops_per_token, but there is still an infinite number of problems you can ask that will not be computable within that constraint.
But you can offload computation by asking for a description of the computation instead of asking the LLM to compute it. I'm no mathematician, but I would not be surprised to learn that the above limit applies here as well in some sense (maybe there are solutions to problems that can't be represented in a reasonable number of symbols given our constraints - Kolmogorov complexity and all that), but still, for most practical (and beyond) purposes this is a huge improvement and should be enough for most things we care about. Letting the system describe the computation steps needed to solve a problem, executing that computation separately offline, and then feeding the result back if necessary is a necessary component if we want to do more useful things.
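The pattern I mean is roughly this (a sketch; generate_code() is a hypothetical stand-in for whatever model call you use):

def generate_code(problem: str) -> str:
    """Hypothetical stand-in for an LLM call that returns Python source describing the computation."""
    raise NotImplementedError

def solve_by_offloading(problem: str):
    source = generate_code(problem)  # ask the model to describe the computation as code
    namespace = {}
    exec(source, namespace)          # run that computation offline, outside the model
    return namespace.get("answer")   # convention (assumed here): the generated code assigns `answer`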
segmondy 9 months ago
llama3.1-70b, llama3.1-405b, deepseekcoder2.5, gemma-27b, mistral-large, qwen2.5-72b. https://gist.github.com/segmond/8992a8ec5976ff6533d797caafe1...
I like how the solution varies somewhat across most of them, though mistral and qwen look really similar.
assadk 9 months ago
ryandv 9 months ago
riku_iki 9 months ago
There is a mathematical term for this, "formal language", and both math and programming fit it perfectly: https://en.wikipedia.org/wiki/Formal_language
nurettin 9 months ago
ryandv 9 months ago
> Sometimes a natural language such as English is a good choice, sometimes you need the language of mathematical equations, or chemical equations, or musical notation, and sometimes a programming language is best. Written language is an amazing invention that has enabled human culture to build over the centuries (and also enabled LLMs to work). But human ingenuity has divised [sic] other notations that are more specialized but very effective in limited domains.
bbor 9 months ago
Of course, this is what Chomsky calls a “terminological dispute”, so I mean no offense and you’re ofc free to stand your ground that the only language is what appears in human brains! But if mathematical notation and programming languages aren’t languages, what are they…? Protocols? Recursively generative patterns? Maybe just grammars?
The best move in any terminological dispute is "my terms are more useful", so this seems like a good reason to keep language as it's defined by generative linguistics. Or, more broadly:
Saussure approaches the essence of language from two sides. For the one, he borrows ideas from Steinthal and Durkheim, concluding that language is a 'social fact'. For the other, he creates a theory of language as a system in and for itself which arises from the association of concepts and words or expressions. Thus, language is a dual system of interactive sub-systems: a conceptual system and a system of linguistic forms. Neither of these can exist without the other because, in Saussure's notion, there are no (proper) expressions without meaning, but also no (organised) meaning without words or expressions.
https://en.wikipedia.org/wiki/Theory_of_language
ryandv 9 months ago
But if I must, I suppose I am indeed assuming a stance by taking potshots at this narrower (albeit more precise) use of the word language by (obliquely) pointing to counterexamples that could be considered languages in their own right; the sweeping claim that "language is not essential for thought" seems far broader than the narrow sense in which the term is construed in the actual paper.
Tainnor 9 months ago
I don't think that this is actually true at all (and modern neuroscience, sociolinguistics, etc. disagree pretty heavily with Chomsky), but it is impressive how far you can get with modelling natural language grammar with formal methods.
tmsh 9 months ago
fifilura 9 months ago
klibertp 9 months ago
import itertools

def factors(n):
    result = set()
    for i in range(1, int(n**0.5) + 1):
        if n % i == 0:
            result.add(i)
            result.add(n // i)
    return sorted(result)

def main():
    n = 108
    factors_list = factors(n)
    triplets = []
    for a_idx in range(len(factors_list) - 2):
        a = factors_list[a_idx]
        for b_idx in range(a_idx + 1, len(factors_list) - 1):
            b = factors_list[b_idx]
            for c_idx in range(b_idx + 1, len(factors_list)):
                c = factors_list[c_idx]
                if a * b * c == n and a < b < c:
                    triplets.append((a, b, c))
    print("All ways in which three distinct positive integers have a product of 108:")
    for triplet in triplets:
        print(triplet)

if __name__ == "__main__":
    main()
Run: -▶ python a.py
All ways in which three distinct positive integers have a product of 108:
(1, 2, 54)
(1, 3, 36)
(1, 4, 27)
(1, 6, 18)
(1, 9, 12)
(2, 3, 18)
(2, 6, 9)
(3, 4, 9)
Too smart for its own good, maybe? Still, apparently correct on the first try.
There's an import for `itertools` which isn't used, just as noted in the article for 4o.
Can someone who knows what it's about say how optimal this version is, compared to other answers?
fifilura 9 months ago
Algorithmically it is optimal; many corner cases are handled so that it does as few loop iterations as possible.
But it is suboptimal in the sense that it uses for loops, so it is not possible for an underlying engine to execute them in parallel, or to optimize the looping in native (e.g. C) code.
In short: the for loops are the 'how'. Without loops it is much more the 'what', which leaves it up to the computer to optimize.
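For example, the same search in the 'what' style (a sketch along the lines of the comprehension-based answer elsewhere in this thread; it still loops under the hood in CPython, but the intent is declarative, which is what gives an engine like SQL or a vectorized library room to optimize):

from math import prod
from itertools import combinations

def triples(n=108):
    # state what we want: ascending 3-combinations of divisors whose product is n
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    return [t for t in combinations(divisors, 3) if prod(t) == n]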
klibertp 9 months ago
Looking at https://news.ycombinator.com/item?id=41897526 caused me to try asking for a solution in a language that doesn't have `for` loops. Since SQL was taken, I tried Prolog:
% Define a predicate to find all positive divisors of N
divisors(N, Divisors) :-
    findall(D, (between(1, N, D), N mod D =:= 0), Divisors).

% Define a predicate to generate all combinations of three distinct elements from a list
combination3(List, X, Y, Z) :-
    select(X, List, Rest1),
    select(Y, Rest1, Rest2),
    select(Z, Rest2, _),
    X < Y, Y < Z.

% Define a predicate to find triplets (X, Y, Z) such that X*Y*Z = N
find_triplets(N, X, Y, Z) :-
    divisors(N, Divisors),
    combination3(Divisors, X, Y, Z),
    X * Y * Z =:= N.

% Define a predicate to list all the triplets
list_triplets(N) :-
    findall((X, Y, Z), find_triplets(N, X, Y, Z), Triplets),
    writeln(Triplets).
Run: ?- [a]. list_triplets(108).
[(1,2,54),(1,3,36),(1,4,27),(1,6,18),(1,9,12),(2,3,18),(2,6,9),(3,4,9)]
I'm in love. I both understand what is happening (I think I would understand it even if I didn't know the task description!) and could easily port optimizations from the previous solution.
Maybe this is the point where being a polyglot developer - knowing tens of programming languages and their characteristics - will finally start paying off.
tmsh 9 months ago
Note that the section added compared to the LeetCode question (https://leetcode.com/problems/two-sum-ii-input-array-is-sort...) is "And note also that index1 + 1 == index2."
If you enter that into 4o it'll just keep repeating leetcode 167 answers and insist binary search has nothing to do with it.
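(If the added clause really does mean the two indices must be adjacent, the problem collapses to a single scan over neighbouring pairs, which is presumably why parroting the LeetCode 167 two-pointer/binary-search answer misses the point. A sketch under that reading:)

def two_sum_adjacent(numbers, target):
    """1-based (index1, index2) with index2 == index1 + 1 and the pair summing to target."""
    for i in range(len(numbers) - 1):
        if numbers[i] + numbers[i + 1] == target:
            return (i + 1, i + 2)
    return None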
Tangent: o1-preview isn't going to invent things quite yet (I tried to get it to come up with something original in https://chatgpt.com/share/6715f3fe-cfb8-8001-8c7c-10e970c7d3... out of curiosity after watching https://www.youtube.com/watch?v=1_gJp2uAjO0), but it can sort of reason about things that are more well-established.
fifilura 9 months ago
bytebach 9 months ago
owenpalmer 9 months ago
On a dark note, I wonder if increasing AI reasoning capability could have dangerous results. Currently, LLMs are relatively empathetic and seem to factor the complex human experience into their responses. Would making LLMs more logical, cold, and calculating result in them stepping on things which humans care about?
seanhunter 9 months ago
There's definitely anecdata supporting this. For some time, ChatGPT was better on a lot of arithmetic/logic-type tasks if you asked it to generate a Python program to do X than if you just asked it to do X, for example. I haven't tested this specifically on the latest generation, but my feeling is it has caught up a lot.
Jianghong94 9 months ago
albert_e 9 months ago
downboots 9 months ago
Because the "intelligence" is borrowed from language (lower entropy)
ainiriand 9 months ago
downboots 9 months ago
dandanua 9 months ago
okibry 9 months ago
__0x01 9 months ago
I would place English (or all spoken/written languages in general) first, Math (as discovered) second, and programming languages last.
YeGoblynQueenne 9 months ago
If I understand correctly, Peter Norvig's argument is about the relative expressivity and precision of Python and natural language with respect to a particular kind of problem. He's saying that Python is a more appropriate language to express factorisation problems, and their solutions, than natural language.
Respectfully -very respectfully- I disagree. The much simpler explanation is that there are many more examples, in the training set of most LLMs, of factorisation problems and their solutions in Python (and other programming languages), than in natural language. Examples in Python etc. are also likely to share more common structure, even down to function and variable names [1], so there are more statistical regularities for a language model to overfit to during training.
We know LLMs do this. We even know how they do it, to an extent. We've known since the time of BERT. For example:
Probing Neural Network Comprehension of Natural Language Arguments
https://aclanthology.org/P19-1459/
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
https://aclanthology.org/P19-1334/
Given these and other prior results, Peter Norvig's single experiment is not enough, and not strong enough, evidence to support his alternative hypothesis. Ideally, we would be able to test an LLM by asking it to solve a factorisation problem in a language in which we can ensure there are very few examples of a solution, but that is unfortunately very hard to do.
______________
[1] Notice for instance how Llama 3.1 immediately identifies the problem as "find_factors", even though there's no such instruction in the two prompts. That's because it's seen that kind of code in the context of that kind of question during training. The other LLMs seem to take terms from the prompts instead.
zazaulola 9 months ago
itronitron 9 months ago
jll29 9 months ago
PaulDavisThe1st 9 months ago
Not immediately clear what this means or if it is good or bad or something else.
9 months ago
akira2501 9 months ago
"But some of them forgot that 1 could be a factor of 108"
I struggle to take them seriously. The anthropomorphization of AI into something that can "know" or "forget" is ridiculous and shows a serious lack of caution and thinking when working with them.
Likewise it leads people into wasting time on "prompt engineering" exercises that produce overfit and worthless solutions to trivial problems because it makes them feel like they're revealing "hidden knowledge."
Genuinely disappointing use of human time and of commercial electricity.