BusyBeaver(6) Is Quite Large

https://scottaaronson.blog/?p=8972

tromp
People on the bbchallenge Discord server are keen to speculate on how many Turing Machine states are needed to surpass Graham's Number, which is vastly larger than the 2^^2^^2^^9 achieved by the latest BB(6) champion.

We know from the functional busy beaver [1] that Graham behaviour can come surprisingly early; a 49-bit lambda term suffices. There are only 77519927606 closed lambda terms of at most that size [2], compared to 4^12*23836540=399910780272640 unique 6-state Turing Machines [3].

With the achievement of pentation in only 6 states, several people now believe that 7 states should suffice to surpass Graham's. I would still find that rather surprising. A few days ago, I made a large bet with one of them on whether we would see proof of BB(7)>Graham's within the next 10 years.

What do people here think?

[1] https://oeis.org/A333479

[2] https://oeis.org/A114852

[3] https://oeis.org/A107668

gpm
I can't pretend to be an expert, but I'll argue BB(7) is probably larger than Graham's number.

BB has to grow faster than any computable sequence. What exactly that means concretely for BB(7) is... nothing other than handwaving... but it sort of means it needs to walk up the "operator strength" ladder very quickly... it eventually needs to grow faster than any computable operator we define (including, for example, up-arrow^n, and up-arrow^f(n) for any computable f).

My gut feeling is that the growth between 47 million and 2^^2^^2^^9 is qualitatively larger than the growth between 2^^2^^2^^9 and graham's number in terms of how strong the operator we need is (with gramah's number being g_64 and g here being roughly one step "above" up_arrow^n). So probably we should have BB(7)>Graham's number.

pinkmuffinere
I can’t edit my original comment anymore, but replying to say I went on a Wikipedia binge and I think you’re right. Thanks for humoring me and helping me learn!
gpm
:)
pinkmuffinere
Apologies if this feels adversarial, but I think your informal proof has an error, and I think I can explain it!

Your proof rests primarily on this assertion:

> BB has to grow faster than any computable sequence.

This is almost true! BB(n) has to grow faster than any computable sequence _defined by an n-state Turing machine_. That last part is really important. (Note that my restatement is probably incorrect too, it is just correct enough to point out the important flaw I saw in your statement). This means that up-arrow^f(n) _can_ be larger than BB(n) — up-arrow^f(n) is not restricted by a Turing machine at all. As an easy example, consider f(n) = BB(n)^2.

You may still be right about BB(7) being bigger than Graham’s number, even if your proof is not bulletproof

btilly
No, the original was correct.

Any computable sequence S(n) must be computed by a specific finite program of fixed length.

Once n gets big enough, BB(n) will include the function S(2^n), and therefore will exceed that computable sequence.

Given computable sequences may exceed BB(n) for a finite number of terms. But eventually BB(n) will outgrow them, and will never look back.

gpm
Math's a cooperative endeavor, I want to know if I'm wrong!

I'm not sure I understand the distinction you're trying to make though, and I'm not sure it's right...

The argument that BB has to grow faster than any computable sequence is that if we have a computable f(n) where for all n f(n) > BB(n) then we can solve the halting problem by simulating turing machines of size n for f(n) steps and checking if they halt. Even if we can't prove f(n) > BB(n) the mere existence of this f would mean we could solve the halting problem (even though we couldn't prove we had done so).

I agree my "proof" (intuition really) rests on that assertion.

> As an easy example, consider f(n) = BB(n)^2.

This, like BB(n), isn't computable?

Scarblac
It boggles my mind that a number (an uncomputable number, granted) like BB(748) can be "independent of ZFC". It feels like a category error or something.
tromp
What makes BB(748) independent of ZFC is not its value, but the fact that one of the 748-state machines (call it TM_ZFC_INC) looks for an inconsistency (proof of FALSE) in ZFC and only halts upon finding one.

Thus, any proof that BB(748) = N must either show that TM_ZF_INC halts within N steps or never halts. By Gödel's famous results, neither of those cases is possible if ZFC is assumed to be consistent.

Diggsey
I think what's most unintuitive is that most (all?) "paradoxes" or "unknowables" in mathematics involve infinities. When limiting ourselves to finite whole numbers, paradoxes necessarily disappear.

BB(748) is by definition a finite number, and it has some value - we just don't know what it is. If an oracle told us the number, and we ran TM_ZFC_INC that many steps we would know for sure whether ZFC was consistent or not based on whether it terminated.

The execution of the turing machine can be encoded in ZFC, so it really is the value of BB(748) that is the magic ingredient. Somehow even knowledge of the value of this finite number is a more potent axiomatic system than any we've developed.

edanm
> BB(748) is by definition a finite number, and it has some value - we just don't know what it is. If an oracle told us the number, and we ran TM_ZFC_INC that many steps we would know for sure whether ZFC was consistent or not based on whether it terminated.

This doesn't sound right to me.

You can prove that ZFC is consistent. You could do it today, with or without the magic number, using a stronger axiom system. If an Oracle told you that BB(748) = 100 or whatever, that would constitute proof that ZFC is consistent.

But it wouldn't negate the fact that BB(748) is independent of ZFC, because you haven't proved within the axioms of ZFC that ZFC is consistent, which is what makes it independent.

> I think what's most unintuitive is that most (all?) "paradoxes" or "unknowables" in mathematics involve infinities. When limiting ourselves to finite whole numbers, paradoxes necessarily disappear.

I might be missing something, but all of these assertions deal with finite whole numbers, not infinity. Unless you count a Turing machine running forever an infinity, in which case, it seems counterintuitive to me that encoding a while loop that runs forever somehow makes paradoxes appear.

waluigi
It’s even more counterintuitive than you let on! If you are working in ZFC along with the axiom “ZFC is consistent” then there’s no issue: just a normal number[1]. Where things get really strange is in ZFC plus the axiom “ZFC is inconsistent”.

This already sounds like an inconsistent theory, but surprisingly isn’t: Godel’s second incompleteness theorem directly gives us that Con(ZFC) is independent, so there are models that validate both Con(ZFC) and ~Con(ZFC). The models that validate ~Con(ZFC) are very confused about what numbers are: from the models perspective, there is a number corresponding to a Godel code for the supposed proof of inconsistency, but from the external view this is a “nonstandard number”: it’s not not a finite numeral!

Getting back to BB(748): what does this look like in a model of ZFC + ~Con(ZFC)? We can prove that the machine internal to the model will find that astronomically large Godel code, so BB(748) will be a nonstandard number. In other words, you can tell if a 748 state machine will terminate in this model: you’ve just got to run it for a number of steps that’s larger than every finite numeral!

[1]: unless there’s some machine that with 748 that enumerates theorems of ZFC+Con(ZFC) but that’s a different discussion.

edanm
> Thus, any proof that BB(748) = N must either show that TM_ZF_INC halts within N steps or never halts. By Gödel's famous results, neither of those cases is possible if ZFC is assumed to be consistent.

Isn't it more accurate to say that any proof that BB(748) = N in ZFC must either show that TM_ZF_INC halts within N steps, or never halts?

Meaning, it's totally possible to prove that BB(748) = N, it just can't be done within the axioms of ZFC?

woopsn
Does the fact that BB(k)=N is provable up to some k < 748 mean that all halting problems for machines with k states are answered by a proof in ZFC?
Xcelerate
It boggles my mind that we ever thought a small amount of text that fits comfortably on a napkin (the axioms of ZFC) would ever be “good enough” to capture the arithmetic truths or approximate those aspects of physical reality that are primarily relevant to the endeavors of humanity. That the behavior of a six state Turing machine might be unpredictable via a few lines of text does not surprise me in the slightest.

As soon as Gödel published his first incompleteness theorem, I would have thought the entire field of mathematics would have gone full throttle on trying to find more axioms. Instead, over the almost century since then, Gödel’s work has been treated more as an odd fact largely confined to niche foundational studies rather than any sort of mainstream program (I’m aware of Feferman, Friedman, etc., but my point is there is significantly less research in this area compared to most other topics in mathematics).

hyperpape
This ignores the fact that it is not so easy to find natural interesting statements that are independent of ZFC.

Statements that are independent of ZFC are a dime a dozen when doing foundations of mathematics, but they're not so common in many other areas of math. Harvey Friedman has done interesting work on finding "natural" statements that are independent of ZFC, but there's dispute about how natural they are. https://mathoverflow.net/questions/1924/what-are-some-reason...

In fact, it turns out that a huge amount of mathematics does not even require set theory, it is just a habit for mathematicians to work in set theory. https://en.wikipedia.org/wiki/Reverse_mathematics.

Xcelerate
Yeah, I’m quite familiar with Friedman’s work. I mentioned him and his Grand Conjecture in another comment.

> This ignores the fact that it is not so easy to find natural interesting statements that are independent of ZFC.

I’m not ignoring this fact—just observing that the sheer difficulty of the task seems to have encouraged mathematicians to pursue other areas of work beside foundational topics, which is a bit unfortunate in my opinion.

azan_
> As soon as Gödel published his first incompleteness theorem, I would have thought the entire field of mathematics would have gone full throttle on trying to find more axioms.

But why? Gödel's theorem does not depend on number of axioms but on them being recursively enumerable.

Xcelerate
Right, Hilbert’s goal was (loosely speaking) to “find a finitely describable formal system” sufficient to “capture all truths”. When Gödel showed that can’t be done, that shouldn’t imply we just stop with the best theory we have so far and call it a day—it means there are an infinite number of more powerful theories (with necessarily longer minimal descriptions) waiting to be discovered.

In fact, both Gödel and Turing worked on this problem quite a bit. Gödel thought we might be able to find some sort of “meta-principle” that could guide us toward discovering an ever increasing hierarchy of more powerful axioms, and Turing’s work on ordinal progressions followed exactly this line of thinking as well. Feferman’s completeness theorem even showed that all arithmetical truths could be discovered via an infinite process. (Now of course this process is not finitely axiomatizable, but one can certainly extract some useful finite axioms out of it — the strength of PA after all is equivalent to the recursive iteration up to ε_0 of ‘Q_{n+1} = Q_n + Q_n is consistent’ where Q_0 is Robinson arithmetic).

tliltocatl
Gödel's theorem shows that you need an infinite number of axioms to describe reality (given that available reality isn't finite), so any existing axiomatic system isn't enough.
throwaway81523
> It boggles my mind that we ever thought a small amount of text that fits comfortably on a napkin (the axioms of ZFC) would ever be “good enough” to capture the arithmetic truths or approximate those aspects of physical reality that are primarily relevant to the endeavors of humanity.

ZFC is way overpowered for that. https://mathoverflow.net/questions/39452/status-of-harvey-fr...

Xcelerate
I don’t understand your post. You’re linking to a discussion about the same conjecture I mentioned in another comment 11 hours prior to your comment. Did you mean to link something else?
red75prime
> It boggles my mind that a number (an uncomputable number, granted) like BB(748) can be "independent of ZFC".

It's BB(n) that is incomputable (that is there's no algorithm that outputs the value of BB(n) for arbitrary n).

BB(748) is computable. It's, by definition, a number of ones written by some Turing machine with 748 states. That is this machine computes BB(748).

> It feels like a category error or something.

The number itself is just a literally unimaginably large number. Independence of ZFC comes in when we try to prove that this number is the number we seek. And to do that you need theory more powerful than ZFC to capture properties of a Turing machine with 748 states.

ChadNauseam
The number itself is not independent of ZFC. (Every integer can be expressed in ZFC.) What's independent of ZFC is the process of computing BB(748).
bo1024
I think the more correct statement is that there are different models of ZFC in which BB(748) are different numbers. People find that weird because they don't think about non-standard models, as arguably they shouldn't.
Strilanc
Isn't that incompatible with the models being consistent?

Suppose model A proves BB(748) = X and model B proves BB(748) = Y > X. But presumably the models can interpret running all size 748 Turing machines for Y steps. Either one of the machines halts at step Y (forming a proof within A that BB(748) >= Y contradicting the assumed proof within A that BB(748) = X < Y) or none of the machines halts at step Y (forming a proof within B that BB(748) != Y contradicting the assumed proof within B that BB(748) = Y).

I'm guessing the only way this could ever work would be some kind of nastiness like X and Y aren't nailed down integers, so you can't tell if you've reached them or not, and somehow also there's a proof they aren't equal.

wat10000
How is that possible? That implies there’s at least one specific program whose execution changes based on the ZFC model. The rules of program execution are so simple, it doesn’t make sense that they’d change based on anything like that.
Straw
Sure, if someone just gives you the number, ZFC can represent it. But ZFC cannot prove that the value is correct, so how do you know you have the right number? Use a stronger proof system? Go a bit bigger and same issue.
ajkjk
Not an expert, but I've read about this a bit because it bothered me also and I think this is the answer:

Most of these 'uncomputable' problems are uncomputable in the sense of the halting problem: you can write down an algorithm that should compute them, but it might never halt. That's the sense in which BB(x) is uncomputable: you won't know if you're done ever, because you can't distinguish a machine that never halts from one that just hasn't halted yet (since it has an infinite number of states, you can't just wait for a loop).

So presumably the independence of a number from ZFC is like that also: you can't prove it's the value of BB(745) because you won't know if you've proved it; the only way to prove it is essentially to run those Turing machines until they stop and you'll never know if you're done.

I'm guessing that for the very small Turing machines there is not enough structure possible to encode whatever infinitely complex states end up being impossible to deduce halting from, so they end up being Collatz-like and then you can go prove things about them using math. As you add states the possible iteration steps go wild and eventually do stuff that is beyond ZFC to analyze.

So the finite value 745 isn't really where the infinity/uncomputability comes from-it comes from the infinite tape that can produce arbitrarily complex functions. (I wonder if over a certain number of states it becomes possible to encoding a larger Turing machine in the tape somehow, causing a sort of divergence to infinite complexity?)

thechao
We need to distinguish between a computer that's equivalent to BB(n), and a computer big enough to compute the value of the number that is BB(n). By (terrible) analogy: a 4004 can be made to write a finite loop that describes how many FLOPs the number 1 supercomputer can compute without, itself, being able to usefully perform the computations of that supercomputer. (The 4004 will run out of memory/addressable disk space.) Similarly, we can no longer build decidable programs in ZFC that can compute the number BB(748). Scott is saying that they now think this "disassociation" might occur at BB(7)!
nyrikki
To try and help people digging into this, the following helped me.

Two lenses for trying to understand this are potentially Chastain's limits on output of a lisp program being more complex than the program itself [1] or Markov's proof that you can't classify manifolds in d>= 4.

If you try the latter and need/want to figure out how the Russian school is so different this is helpful [2]

IMHO the former gives an intuition why, and the latter explains why IMHO.

In ZFC, C actually ends up implying PEM, which is why using constructionism as a form of reverse math helped it click for me .

This is because in the presence of excluded middle, every sequentially complete metric space is a complete space, and we tend to care about useful things, but for me just how huge the search space grows was hidden due to the typical (and useful) a priori assumption of PEM.

If you have a (in my view) dislike for the constrictive approach or don't want/have to invest in learning an obscure school of it, This recent paper[3] on the limits for finding a quantum theory of everything is another lens.

Yet another path is through Type 2 TMs and the Borel hierarchy, where while you can have a uncomputable number on the input tape you algorithms themselves cannot use them, while you can produce uncomputable numbers by randomly selecting and/or changing an infinite sequence.

Really it is the difference between expressability and algorithms working within what you can express.

Hopefully someone else can provide more accessible resources. I think a partial understanding of the limits of algorithms and computation will become more important in this new era.

[1] https://arxiv.org/abs/chao-dyn/9407003 [2] https://arxiv.org/abs/1804.05495 [3] https://arxiv.org/abs/2505.11773

drdeca
Looking at [3], they seem to argue that the system isn’t complete for the usual Gödel reasons, which, sure, it isn’t, but then they call the claim that the system fails to decide, which is a statement about probability, a “scientific fact”. This seems to me like a mistake?

Like, a TOE is not expected to decide all statements expressible in the theory, only to predict particular future states from past states, with as much specificity as such past states actually determine the future states. It should not be expected to answer “given a physical setup where a Turing machine has been built, is there a time at which it halts?” but rather to answer “after N seconds, what state is the machine (as part of the physical system) in?” (for any particular choice of N).

Whether a particular statement expressed in the language of the theory is provable in the theory, is not a claim about the finite-time behavior of a physical system, unless your model of physics involves like, oracle machines or something like that.

Edit: it later says: “ Chaitin’s theorem states that there exists a constant K_{ℱ_{QG}} , determined by the axioms of ℱ_{QG} , such that no statement S with Kolmogorov complexity K(S) > K_{ℱ_{QG}} can be proven within ℱ_{QG} .”

But this, unless I’m badly misinterpreting it, seems very wrong? Most formal systems of interest have infinitely many distinct theorems. Given an infinite set of strings, there is no finite universal upper bound on the Kolmogorov complexity of the strings in that set.

Maybe this was just a typo or something?

They do then mention something about the Bekenstein bound, which I haven’t considered carefully yet but seems somewhat more promising than the parts of the article that preceded it.

MichaelDickens
It's known that BB(14) is bigger than Graham's number, but this new finding leads me to believe that BB(7) is probably bigger than Graham's number. Intuitively, the technology required to go from pentation to Graham's number feels simpler than the technology required to go from `47,176,870` to `2 <pentate> 5`.
tromp
Thanks for sharing; your post would fit well as an answer to mine about Graham's number...
sedatk
> Also, the left-superscript means tetration, or iterated exponentiation: for example, 1510 means 10 to the 10 to the 10 and so on 15 times.

I thought it was a typo. First time I encounter tetration.

lbourdages
I've seen it before, but it was using Knuth's up-arrow notation [1], which I like because it generalizes easily.

[1] https://en.wikipedia.org/wiki/Knuth's_up-arrow_notation

griffzhowl
Continuing the theme of iteration: it was the first time I encountered pentation
tialaramex
One of the reasons I like the use of the number line in schools is that on the line it's more obvious when you're shown addition and multiplication and then later exponeniation that this is a pattern. With the number line, two natural questions arise and, hopefully by the time you're taught exponentiation the Math teacher knows enough math to confidently affirm the answer to both. Yes, it keeps going like this forever, that's called Hyperoperation. And yes, we did (probably) skip one, it's known as Successor-of and you were probably not explicitly shown this operator but it's the near end of that infinite succession.

When arithmetic is introduced just as a way to, for example, count money, it's more directly practical in the moment, but you're not seeing the larger pattern.

Nevermark
Don't forget identity. Its range is small but important!
vismit2000
Scott Aaronson | How Much Math Is Knowable? [Harward CMSA]: https://www.youtube.com/watch?v=VplMHWSZf5c

Recently on HN (couple of months ago): https://news.ycombinator.com/item?id=43776477

seeknotfind
> So I said, imagine you had 10,000,000sub10 grains of sand. Then you could … well, uh … you could fill about 10,000,000sub10 copies of the observable universe with that sand.

I don't get this part. Is it really rounding away the volume of the observable universe divided by the average volume of a grain of sand? That is many more orders of magnitude than the amount of mass in the universe, which is a more usual comparison.

Straw
Yes, that's right, dividing by that ratio essentially barely affects the number in a sense that 'adjacent' numbers in that notation give a much bigger change.

10↑↑10,000,000 / (sand grains per universe) is vastly larger than, say, 10↑↑9,999,999

So on system we're using to write these numbers, there's really no better way to write (very big)/ (only universally big) than by writing exactly that, and then in the notation for very big, it pretty much rounds to just (very big).

mckeed
With tetration you're not dealing with orders of magnitude anymore, but orders of magnitude of orders of magnitude.
Chirono
Exactly. This number is so so much bigger than 10^100000 or however many grains of sand would fit, that dividing by that amount doesn’t really change it, certainly not enough to bring it down closer to 9,999,999sub10
Scarblac
Yes, that's only some normal number amount of orders of magnitude. Even 10,000,000^10,000,000 is already so large that it doesnt matter, let alone after exponentiating _the exponent_ nine times more.
wizzwizz4
It's the other way around: we're talking about 10^(10^(10^(10^…))) (which is vastly bigger).
ryandrake
> For those tuning in from home, here BB(6) is the 6th Busy Beaver number, i.e. the maximum number of steps that a 6-state Turing machine with a {0,1} alphanet can take before halting, when run on an initially all-0 input tape.

Oh! Of course! That sure clears things up for this non-expert. This is clearly a hardcore blog for people who have been doing this kind of research for decades. Kind of awesome to stumble upon something so unapologetically dense and jargony and written for a very specific audience!

Bjartr
That should be enough for someone with an undergrad CS education to at least get a sense of what's going on if they haven't encountered the busy beaver problem before.

Is it niche jargon, absolutely, but to say it's only accessible to people who have put in decades is selling yourself short.

ryandrake
Hmm, interesting. It’s been 30 years since my engineering degree (not CS) and I’d have to look up what a Turing machine is. I think I remember one professor briefly mentioned it as “This is something the CS majors care deeply about but nobody else in the industry does.” Where I was, the CS degree was essentially a math degree dressed up in a hoodie.
Bjartr
> This is something the CS majors care deeply about but nobody else in the industry does

Correct, the industry cares a lot more about Software Engineering than Computer Science.

> CS degree was essentially a math degree dressed up in a hoodie.

To a first approximation, that's what it's supposed to be. CS is a field of mathematics. It's not a trade school course.

clbrmbr
The definition there is standard undergraduate computer science theory. Maybe not standard for software engineering though.
phs
So what is the richest logic whose proofs can be enumerated with only a five state TM?
LegionMammal978
While that question depends on what you count as an 'enumeration', there's the related question of "What's the richest logic that cannot prove the halting status of all 5-state TMs?" That is, what's the richest logic that some 5-state TM's halting status is independent of?

I've pondered that version of the question a bit, but I couldn't get very far due to my lack of expertise in first-order logic. What I do know is that Skelet #17 [0] is one of the toughest machines to prove non-halting on a mathematical level [1], so any theory sufficient to prove that Skelet #17 doesn't halt is likely sufficient to decide the rest of the 5-state machines.

[0] https://bbchallenge.org/1RB---_0LC1RE_0LD1LC_1RA1LB_0RB0RA

[1] https://arxiv.org/abs/2407.02426

tromp
That entirely depends on how you want to interpret a finite binary string as an enumeration of logic proofs?!
fjfaase
I wonder if the visible universe is large enough to write down the exact value of BB(6).
aeve890
If you treat the observable universe as a closed system, you could try to apply the Bekenstein bound using - R ≈ 46.5 billion light-years (radius of the observable universe) - E ≈ total mass-energy content of the observable universe

The mass-energy includes ordinary matter, dark matter, and dark energy. Current estimates suggest the observable universe contains roughly 10^53 kg of mass-energy equivalent.

Plugging these into S ≤ 2πER/ℏc gives someting on the order of 10^120 bits of maximum information content.

S ≤ 2πER/ℏc

S ≤ (2 × 3.141593 × 3.036e+71 × 4.399e+26)/(1.055e-34 × 299792458)

S ≤ 2.654135e+124

S ≤ 10^120

So, no.

kaashif
It definitely isn't. The amount of information you can store in the universe is something like 10^120 bits. Even if I'm off by a trillion orders of magnitude it doesn't matter.
layer8
You’re probably referring to a state where all parts of the complete representation exist at the same time. Because if they don’t have to exist at the same time, then it might be possible to “write it down” if the universe has unbounded duration (“might” because I don’t know how the heat death plays into that). However, “at the same” time isn’t well-defined in relativistic spacetime. The sibling comments are definitely right with respect to the reference frame implied by the CMB. But I’m wondering if it wouldn’t be possible to slice spacetime in a way that actually makes a representation possible “at the same time” in some reference frame?
Dylan16807
Just the starting number in the article is ¹⁵10. That means it's 10^(¹⁴10). That means it has ¹⁴10 digits. So no, you can't.
d_silin
Any time I see such results from computation complexity theory, I realize that any current zeitgeist of "super-intelligent AI are gods" is complete bullshit.

You can convert every atom of observable Universe into a substrate for supercomputer, you can harness energies of supermassive black holes to power it, but running a humble BB(6) to halting state would be forever out of its reach.

istjohn
That strawman never stood a chance.