161 points by stereoabuse 9 months ago | 47 comments
nick0garvey 9 months ago
His code here demonstrates why. It is both shorter and much easier to understand than anything the LLMs generated. It is not always as efficient as the LLMs' versions (which often skip the third loop by calculating the last factor), but it is definitely the code I would prefer to work with in most situations.
[1] https://www.udacity.com/course/design-of-computer-programs--...
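(For the curious, the "skip the third loop" trick is roughly this: once a and b are fixed, the last factor is forced to be n // (a * b). Just an illustrative sketch, not Norvig's or the models' actual code.)

def products_of_three(n=108):
    """All triples a < b < c of distinct positive integers with a * b * c == n."""
    triples = []
    for a in range(1, n + 1):
        if n % a:
            continue
        for b in range(a + 1, n // a + 1):
            if (n // a) % b:
                continue
            c = n // (a * b)  # the last factor is determined; no third loop needed
            if c > b:
                triples.append((a, b, c))
    return triples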
fifilura 9 months ago
WITH all_numbers AS
(
  SELECT generate_series AS n
  FROM generate_series(1, 108) AS n
),
divisors AS
(
  SELECT *
  FROM all_numbers
  WHERE 108 % n = 0
),
permutations AS
(
  SELECT a.n AS n1, b.n AS n2, c.n AS n3
  FROM divisors AS a
  CROSS JOIN divisors AS b
  CROSS JOIN divisors AS c
)
SELECT *
FROM permutations
WHERE n1 * n2 * n3 = 108
  AND n1 < n2 AND n2 < n3
ORDER BY n1, n2, n3
https://codapi.org/embed/?sandbox=duckdb&code=data%3A%3Bbase...
upghost 9 months ago
It's amazing how he never once used the word "AI" in this course, despite the fact that it is a straight-up AI course.
I revisit the course notes at least once a year and I still walk away with something new every time.
abecedarius 9 months ago
upghost 9 months ago
But it's especially great if you want to appreciate PN's code more. He actually explains the choices behind his distinctive coding style here. His 6 rules for good code are particularly good. He says they are for good Lisp, but I think they broadly apply to any language:
https://github.com/norvig/paip-lisp/blob/main/docs/chapter3....
*Edit:* After reading this book I am often really surprised that it is not referenced as often as, or more often than, SICP. Not the time or place for a soapbox rant, but SICP is often cited as a landmark functional programming text when in fact it largely advocates OO practices implemented in Scheme, whereas PAIP really puts full-power FP programming on display (independent of any type system), and where OO practices such as CLOS are used he is quite explicit about it.
abecedarius 9 months ago
Speculating: the sheer bulk of the book might also push people away. It's usually an anti-signal for riches within.
JohnKemeny 9 months ago
upghost 9 months ago
https://web.archive.org/web/20140122073352/https://www.udaci...
Yes, I know that you can get the Q value via RL with the Bellman equation or non-parametric evolutionary learning, but at the time I didn't know that and I'm glad I didn't.
Depending on how you count, this code features either 2 or 4 mutually recursive functions, something that is quite rare in Python (especially outside the context of an FSM) but is a signature of PN's code.
I had to learn so much about so many things to untangle it.
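(For anyone unfamiliar with the Bellman-style update mentioned above, the generic tabular version is roughly this; a sketch, not the course's code.)

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max over a' of Q(s', a')."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)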
TaurenHunter 9 months ago
l33t7332273 9 months ago
I think this makes sense. LLMs are trained, on average, on average code. This means they produce average code, on average.
olliemath 9 months ago
underdeserver 9 months ago
from math import prod
from itertools import combinations

def find_products(k=3, N=108) -> set[tuple[int, ...]]:
    """All ways in which `k` distinct positive integers have a product of `N`."""
    factors = {i for i in range(1, N + 1) if N % i == 0}
    return {ints for ints in combinations(factors, k) if prod(ints) == N}

find_products()
superlopuh 9 months ago
Edit: I see that he imported the deprecated typing.Set, instead of collections.abc.Set, which is still useful, so your comment is correct.
earslap 9 months ago
You can increase the compute by allowing more tokens for it to use as a "scratch pad", so the total compute available becomes num_tokens * ops_per_token, but there is still an infinite number of problems you can ask that will not be computable within that constraint.
But you can offload computation by asking for a description of the computation instead of asking the LLM to compute it. I'm no mathematician, but I would not be surprised to learn that the above limit applies here as well in some sense (maybe there are solutions to problems that can't be represented in a reasonable number of symbols given our constraints - Kolmogorov complexity and all that), but still, for most practical (and beyond) purposes this is a huge improvement and should be enough for most things we care about. Letting the system describe the computation steps needed to solve a problem, executing that computation separately offline, and then feeding the result back if necessary is a necessary component if we want to do more useful things.
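The pattern I mean is roughly this (a sketch; generate_code() is a hypothetical stand-in for whatever model call you use):

def generate_code(problem: str) -> str:
    """Hypothetical stand-in for an LLM call that returns Python source describing the computation."""
    raise NotImplementedError

def solve_by_offloading(problem: str):
    source = generate_code(problem)  # ask the model to describe the computation as code
    namespace = {}
    exec(source, namespace)          # run that computation offline, outside the model
    return namespace.get("answer")   # convention (assumed here): the generated code assigns `answer`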
segmondy 9 months ago
llama3.1-70b, llama3.1-405b, deepseekcoder2.5, gemma-27b, mistral-large, qwen2.5-72b. https://gist.github.com/segmond/8992a8ec5976ff6533d797caafe1...
I like how the solution varies somewhat across most of them, though mistral and qwen look really similar.
assadk 9 months ago
ryandv 9 months ago
riku_iki 9 months ago
There is a mathematical term for this, "formal language", and both math and programming fit it perfectly: https://en.wikipedia.org/wiki/Formal_language
nurettin 9 months ago
ryandv 9 months ago
> Sometimes a natural language such as English is a good choice, sometimes you need the language of mathematical equations, or chemical equations, or musical notation, and sometimes a programming language is best. Written language is an amazing invention that has enabled human culture to build over the centuries (and also enabled LLMs to work). But human ingenuity has divised [sic] other notations that are more specialized but very effective in limited domains.
bbor 9 months ago
Of course, this is what Chomsky calls a “terminological dispute”, so I mean no offense and you’re ofc free to stand your ground that the only language is what appears in human brains! But if mathematical notation and programming languages aren’t languages, what are they…? Protocols? Recursively generative patterns? Maybe just grammars?
The best move in any terminological dispute is "my terms are more useful", so this seems like a good reason to keep language as it's defined by generative linguistics. Or, more broadly:
Saussure approaches the essence of language from two sides. For the one, he borrows ideas from Steinthal and Durkheim, concluding that language is a 'social fact'. For the other, he creates a theory of language as a system in and for itself which arises from the association of concepts and words or expressions. Thus, language is a dual system of interactive sub-systems: a conceptual system and a system of linguistic forms. Neither of these can exist without the other because, in Saussure's notion, there are no (proper) expressions without meaning, but also no (organised) meaning without words or expressions.
https://en.wikipedia.org/wiki/Theory_of_language
ryandv 9 months ago
But if I must, I suppose I am indeed assuming a stance by taking potshots at this narrower (albeit more precise) use of the word language by (obliquely) pointing to counterexamples that could be considered languages in their own right; the sweeping claim that "language is not essential for thought" seems far broader than the narrow sense in which the term is construed in the actual paper.
Tainnor 9 months ago
I don't think that this is actually true at all (and modern neuroscience, sociolinguistics, etc. disagree pretty heavily with Chomsky), but it is impressive how far you can get with modelling natural language grammar with formal methods.
tmsh 9 months ago
fifilura 9 months ago
klibertp 9 months ago
import itertools

def factors(n):
    result = set()
    for i in range(1, int(n**0.5) + 1):
        if n % i == 0:
            result.add(i)
            result.add(n // i)
    return sorted(result)

def main():
    n = 108
    factors_list = factors(n)
    triplets = []
    for a_idx in range(len(factors_list) - 2):
        a = factors_list[a_idx]
        for b_idx in range(a_idx + 1, len(factors_list) - 1):
            b = factors_list[b_idx]
            for c_idx in range(b_idx + 1, len(factors_list)):
                c = factors_list[c_idx]
                if a * b * c == n and a < b < c:
                    triplets.append((a, b, c))
    print("All ways in which three distinct positive integers have a product of 108:")
    for triplet in triplets:
        print(triplet)

if __name__ == "__main__":
    main()
Run: -▶ python a.py
All ways in which three distinct positive integers have a product of 108:
(1, 2, 54)
(1, 3, 36)
(1, 4, 27)
(1, 6, 18)
(1, 9, 12)
(2, 3, 18)
(2, 6, 9)
(3, 4, 9)
Too smart for its own good, maybe? Still, apparently correct on the first try.
There's an import for `itertools` which isn't used, just as noted in the article for 4o.
Can someone who knows what it's about say how optimal this version is, compared to other answers?
fifilura 9 months ago
Algorithmically it is optimal; many corner cases are handled so that it does as few loop iterations as possible.
But it is suboptimal in the sense that it uses for loops, so it is not possible for an underlying engine to execute them in parallel, or to optimize the looping in native (e.g. C) code.
In short: the for loops are the 'how'. Without loops it is much more the 'what', which leaves it up to the computer to optimize.
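For example, the same search in the 'what' style (a sketch along the lines of the comprehension-based answer elsewhere in this thread; it still loops under the hood in CPython, but the intent is declarative, which is what gives an engine like SQL or a vectorized library room to optimize):

from math import prod
from itertools import combinations

def triples(n=108):
    # state what we want: ascending 3-combinations of divisors whose product is n
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    return [t for t in combinations(divisors, 3) if prod(t) == n]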
klibertp 9 months ago
Looking at https://news.ycombinator.com/item?id=41897526 caused me to try asking for a solution in a language that doesn't have `for` loops. Since SQL was taken, I tried Prolog:
% Define a predicate to find all positive divisors of N
divisors(N, Divisors) :-
    findall(D, (between(1, N, D), N mod D =:= 0), Divisors).

% Define a predicate to generate all combinations of three distinct elements from a list
combination3(List, X, Y, Z) :-
    select(X, List, Rest1),
    select(Y, Rest1, Rest2),
    select(Z, Rest2, _),
    X < Y, Y < Z.

% Define a predicate to find triplets (X, Y, Z) such that X*Y*Z = N
find_triplets(N, X, Y, Z) :-
    divisors(N, Divisors),
    combination3(Divisors, X, Y, Z),
    X * Y * Z =:= N.

% Define a predicate to list all the triplets
list_triplets(N) :-
    findall((X, Y, Z), find_triplets(N, X, Y, Z), Triplets),
    writeln(Triplets).
Run: ?- [a]. list_triplets(108).
[(1,2,54),(1,3,36),(1,4,27),(1,6,18),(1,9,12),(2,3,18),(2,6,9),(3,4,9)]
I'm in love. I both understand what is happening (I think I would understand it even if I didn't know the task description!) and could easily port optimizations from the previous solution.
Maybe this is the point where being a polyglot developer - knowing tens of programming languages and their characteristics - will finally start paying off.
tmsh 9 months ago
Note that the section added compared to the LeetCode question (https://leetcode.com/problems/two-sum-ii-input-array-is-sort...) is "And note also that index1 + 1 == index2."
If you enter that into 4o it'll just keep repeating leetcode 167 answers and insist binary search has nothing to do with it.
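(If the added clause really does mean the two indices must be adjacent, the problem collapses to a single scan over neighbouring pairs, which is presumably why parroting the LeetCode 167 two-pointer/binary-search answer misses the point. A sketch under that reading:)

def two_sum_adjacent(numbers, target):
    """1-based (index1, index2) with index2 == index1 + 1 and the pair summing to target."""
    for i in range(len(numbers) - 1):
        if numbers[i] + numbers[i + 1] == target:
            return (i + 1, i + 2)
    return None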
Tangent: o1-preview isn't going to invent things quite yet (I tried to get it to come up with something original in https://chatgpt.com/share/6715f3fe-cfb8-8001-8c7c-10e970c7d3... out of curiosity after watching https://www.youtube.com/watch?v=1_gJp2uAjO0), but it can sort of reason about things that are more well-established.
fifilura 9 months ago
bytebach 9 months ago
owenpalmer 9 months ago
On a dark note, I wonder if increasing AI reasoning capability could have dangerous results. Currently, LLMs are relatively empathetic and seem to factor the complex human experience into their responses. Would making LLMs more logical, cold, and calculating result in them stepping on things which humans care about?
seanhunter 9 months ago
There's definitely anecdata supporting this. For some time, ChatGPT was better on a lot of arithmetic/logic-type tasks if you asked it to generate a Python program to do X than if you just asked it to do X, for example. I haven't tested this specifically on the latest generation, but my feeling is it has caught up a lot.
Jianghong94 9 months ago
albert_e 9 months ago
downboots 9 months ago
Because the "intelligence" is borrowed from language (lower entropy)
ainiriand 9 months ago
downboots 9 months ago
dandanua 9 months ago
okibry 9 months ago
__0x01 9 months ago
I would place English (or all spoken/written languages in general) first, Math (as discovered) second, and programming languages last.
YeGoblynQueenne 9 months ago
If I understand correctly, Peter Norvig's argument is about the relative expressivity and precision of Python and natural language with respect to a particular kind of problem. He's saying that Python is a more appropriate language to express factorisation problems, and their solutions, than natural language.
Respectfully -very respectfully- I disagree. The much simpler explanation is that there are many more examples, in the training set of most LLMs, of factorisation problems and their solutions in Python (and other programming languages), than in natural language. Examples in Python etc. are also likely to share more common structure, even down to function and variable names [1], so there are more statistical regularities for a language model to overfit to during training.
We know LLMs do this. We even know how they do it, to an extent. We've known since the time of BERT. For example:
Probing Neural Network Comprehension of Natural Language Arguments
https://aclanthology.org/P19-1459/
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
https://aclanthology.org/P19-1334/
Given these and other prior results, Peter Norvig's single experiment is not enough, and not strong enough, evidence to support his alternative hypothesis. Ideally, we would be able to test an LLM by asking it to solve a factorisation problem in a language in which we can ensure there are very few examples of a solution, but that is unfortunately very hard to do.
______________
[1] Notice for instance how Llama 3.1 immediately identifies the problem as "find_factors", even though there's no such instruction in the two prompts. That's because it's seen that kind of code in the context of that kind of question during training. The other LLMs seem to take terms from the prompts instead.
zazaulola 9 months ago
itronitron 9 months ago
jll29 9 months ago
PaulDavisThe1st 9 months ago
Not immediately clear what this means or if it is good or bad or something else.
9 months ago
akira2501 9 months ago
"But some of them forgot that 1 could be a factor of 108"
I struggle to take them seriously. The anthropomorphization of AI into something that can "know" or "forget" is ridiculous and shows a serious lack of caution and thinking when working with them.
Likewise it leads people into wasting time on "prompt engineering" exercises that produce overfit and worthless solutions to trivial problems because it makes them feel like they're revealing "hidden knowledge."
Genuinely disappointing use of human time and of commercial electricity.