The Split-Brain Problem of LLMs - The key to improved agents and mega prompts

EXTRA EXTRA, NEW SUPER PROMPT DEVELOPED TO BRING YOU ETERNAL LIFE. JUST PASTE INTO CHATGPT AND LET IT DO THE REST

…or maybe your timeline is also filled with:

WOAH, i just made a 100 step agent in N8N that completely removes the need for a marketing team, sign up now!

Those town criers are quite loud and distracting, aren’t they? And while we know much of what they promote is low quality… we’ve all seen that shining star before. That super prompt that really did work wonders. That workflow you saw on YouTube that made your eyes pop out. Or maybe, more simply, a friend who somehow always seems to get way better results than you when talking to ChatGPT.

So while I can’t promise to make you into that all-star in one fell swoop, I can offer you a principle that is absolutely necessary to understand in order to build vastly improved agents, super prompts, and more. It’s something I call the split-brain problem of LLMs.

You already know this to be true

Before we dig into the technical side, let me give an example of this split-brain problem. Let’s imagine ourselves as partners in a video game company. There are a lot of tasks to be done: programming, designing the mechanics, art, music, and more. We’re both pretty good at breaking the work down… but now we have to actually do it.

So, partner, if I gave you a huge list of instructions that’d help you execute all of those to perfection, would you feel comfortable doing them? I figure some tasks you’d be down for, while for others you’d have a nagging voice in your head saying “We really need a creative to do these parts,” or “Someone more technically inclined would be better for these.”

You see, we as humans seem to inherently understand that some tasks are very difficult to do at the same time, and that by trying to do them both at once, the quality of the overall product suffers.

For example, my friend Marty has described this very problem when he composes music. Composing purely from inspiration and applying music theory at the same time was far too difficult. So he now completely separates the two. One day for inspiration, another for analysis. Splitting these two “mindsets” helped immensely.

This is the split-brain problem.

LLMs are 1 billion different people merged into one machine

Yes, this is a drastic simplification, so put the pitchforks away. But for illustration’s sake, it’s true enough. Behold below, The Ethereal LLM Brain:

[Interactive visualization: The Ethereal LLM Brain. Click the buttons to see how selecting similar vs. dissimilar points affects the overall brain structure.]

This is a 4D visualization (time is the fourth dimension here) of a vector embedding space, which I think of as the brain of an LLM. You can think of each point you see moving around as the representation of a fact, a piece of knowledge the model holds. The closer one point is to another, the more closely related they are.

When you prompt an LLM, you’re essentially telling it which points should be included in order to build a “persona” for you to talk to. If the points you choose are close together, then when it draws a bounding box around them, it does a great job of including only what you wanted. But what happens when those points are far away from each other? You start including a lot of unnecessary points, don’t you?
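
If you want a feel for how “close together” a set of tasks is, you can measure it yourself with any off-the-shelf embedding model. Here’s a minimal sketch, assuming the sentence-transformers package and the all-MiniLM-L6-v2 model (both arbitrary choices, and certainly not ChatGPT’s actual embedding space): related tasks cluster tightly, dissimilar ones spread out.

```python
# A rough way to see the "spread" of points a set of tasks activates: embed a
# few related tasks and a few unrelated tasks, then compare how tightly each
# set clusters. Library and model name are just examples.
from itertools import combinations

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

related = [
    "Write unit tests for the inventory system",
    "Refactor the inventory system for readability",
    "Document the inventory system's public API",
]
unrelated = [
    "Write unit tests for the inventory system",
    "Compose a melancholy piano melody for the title screen",
    "Draft a press release announcing our launch date",
]

def average_similarity(texts: list[str]) -> float:
    """Mean pairwise cosine similarity; higher means a tighter cluster."""
    vectors = model.encode(texts, normalize_embeddings=True)
    return float(np.mean([a @ b for a, b in combinations(vectors, 2)]))

print("related tasks:  ", average_similarity(related))    # relatively high
print("unrelated tasks:", average_similarity(unrelated))  # noticeably lower
```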

THIS is the split-brain problem of LLMs. Much like with humans, when you ask a model in the same “chat thread” to accomplish tasks that are “far apart” from one another, performance suffers [1] [2].

Do I split this super prompt into multiple agents? Do I start a new chat?

Now that we’re thinking about how an LLM’s brain is structured (albeit in thousands of dimensions instead of the 4 I used above), we can dig into one reason why you’re not getting the same results from LLMs as your colleagues: you’re not splitting tasks up into multiple agents, or starting brand-new chats, when you ought to be.

Do I start a new chat?

Let’s start with the basic one. You’ve had a chat going on in ChatGPT (or my favorite: t3.chat) for a long time. You’ve talked about all sorts of things. Asked it to do 20 different things for you already.

Before you type another message in that thread, pause. Think: given everything we’ve talked about here, everything we’ve already done, is there a lot of extraneous information that’s probably distracting at this point? If I were talking to a human, would it be better to start fresh so they could focus on just this one thing?

If yes, start a new chat. It’s that simple. Heck, ask your current chat to summarize itself and help you build the initial message for the next one. Oh, and please don’t think you can say “hey, forget that xyz thing we talked about” and be done. That doesn’t work at all [1]. Proven. It’s over. Done.
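
If you work through an API rather than the ChatGPT UI, the “summarize, then start fresh” move looks roughly like this. A minimal sketch using the OpenAI Python client; the model name, prompts, and chat history are all placeholders for your own.

```python
# Minimal sketch: ask the old thread to distill itself, then use that summary
# as the opening of a fresh chat. Model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # swap in whatever model you actually use

old_thread = [
    {"role": "user", "content": "Help me brainstorm names for the game."},
    {"role": "assistant", "content": "Here are ten name ideas..."},
    # ...the rest of your long, meandering conversation...
]

# 1. Have the existing chat distill only what matters for the next task.
summary = client.chat.completions.create(
    model=MODEL,
    messages=old_thread + [{
        "role": "user",
        "content": "Summarize only the decisions and facts relevant to "
                   "writing the game's store page description.",
    }],
).choices[0].message.content

# 2. Start a brand new chat seeded with that summary and nothing else.
fresh = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a concise marketing copywriter."},
        {"role": "user", "content": f"Context:\n{summary}\n\n"
                                    "Write the store page description."},
    ],
)
print(fresh.choices[0].message.content)
```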

Should I make a new agent?

Phew. Since you just read the above, the answer to this one is simple. Look at the list of tasks you want this one super prompt to accomplish: would you want a single person working on all of it? Or would splitting the focus among multiple agents be far better (as Marty had to do when composing his music)? Analyzing data at the same time as writing a haiku is quite challenging!
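
In code, “splitting the focus among multiple agents” can be as simple as separate calls with separate, narrow system prompts. A minimal sketch, again with the OpenAI client; the model name and prompts are placeholders:

```python
# Two narrowly focused "agents" (really just separate calls with separate
# system prompts) instead of one do-everything mega prompt.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder

def run_agent(system_prompt: str, task: str) -> str:
    """One focused persona, one focused task."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

analysis = run_agent(
    "You are a meticulous data analyst. Be precise and show your work.",
    "Summarize the key trends in last month's sales numbers: ...",
)
haiku = run_agent(
    "You are a playful poet.",
    "Write a haiku celebrating those sales trends.",
)
```

Each call lights up a small, tight cluster of points instead of stretching one sprawling bounding box across the whole brain.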

But wait… sometimes it’s better to do multiple tasks simultaneously

No, I’m not trying to confuse you here. There are indeed times when it’s better to have a very long ongoing chat, or one super mega prompt. I’m sure you’ve heard the saying “context is king” many times, and that’s what we need to briefly discuss here.

Let’s think about the example of a journalist. There are a lot of tasks here, from research, to interviews, to writing the resulting article. While there’s a case to be made that you could split this amongst multiple people… why is it that, to this day, jobs like this are usually done by a single person?

You already know the answer. Because how they are going to write the article, where it will end up, and who will be reading it… all have a huge impact on the types of questions they ask in the interviews. It shapes which sources they prioritize in their research. The context of knowing what the end result will be drastically affects (and often improves) each step along the way.

This same phenomenon has also been observed in LLMs [3]. More on that below if you’re interested.

Should I keep my existing chat?

Have all the tasks you’ve asked it to do so far been highly related? For the next task you’re asking it to do, is the context of what you’ve already done helpful? Would you give the same context to a contractor you hired to do this very thing for you? If yes, you should absolutely keep the chat going.

By keeping it going, you are lighting up more of those points in that Ethereal Brain above, building a more complete picture of the type of “person” you want working on this very task.

Should I merge these agents into one super prompt?

Technical limitations aside, the answer to this is the same as for the question above. Instead of working on these tasks in silos… would it be beneficial for each one to know what the end result (or next step) will be, in shaping what “really” needs to be done here?

If yes, experiment with merging those two agents together. You may be quite surprised at the results. Not to mention the speed and cost improvements you’ll see.
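
To make that concrete, here’s what merging two closely related steps into one prompt might look like. The wording is purely illustrative, not a recipe:

```python
# Contrast with the split-agent sketch earlier: here two closely related steps
# share one prompt, so the end goal can shape the earlier step. Purely
# illustrative wording; adapt to your own tasks.
merged_prompt = """
You are preparing a feature article on indie game studios for a general
tech-news audience.

1. Draft 8 interview questions for the studio's founder.
2. Then outline the article you would write from the answers.

Keep the final audience in mind while writing the questions, so every
question earns its place in the eventual article.
""".strip()

# Send merged_prompt as a single user message, exactly as in the earlier
# sketches, instead of running two separate agents.
```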

A lasting thought…

I believe the previous two sections have already summarized the split-brain problem quite well. So instead, I’d like to leave you with a thought, in the hope that it will continue to expand your thinking.

With every interaction you have with an LLM, every message you send, you are slowly building the type of individual you want to talk with. Each message changes the strange, weird shape that is drawn to contain all the different points lighting up in its brain.

Yes, it’s quite strange to think of a machine as a human. Especially one you mold to your liking… but those are the times we live in now!

Referenced Papers

[1] Unable to Forget: Proactive Interference Reveals Working Memory Limits in LLMs Beyond Context Length [link]

If you give an LLM lots of the same, but conflicting, information (key-value updates, opinions changing rapidly in the same conversation, etc.), it loses its ability to direct attention where needed. Which of the many “correct answers” you have provided it ends up choosing becomes closer to random chance.

That holds even if the most correct answer was explicitly given at the end of the prompt. It’s as if 20 voices are all claiming to speak the truth, and the model has to choose which one to listen to.
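
To make that setup concrete, here’s a toy reconstruction of the key-value update paradigm described in [1]. This is my own simplified version, not the paper’s exact protocol:

```python
# Toy reconstruction of the key-value interference setup from [1] (not the
# paper's exact protocol): the same key is updated repeatedly, and the model
# is asked for the latest value at the end.
updates = [
    ("meeting_room", "B12"),
    ("meeting_room", "C04"),
    ("meeting_room", "A07"),
    ("meeting_room", "D19"),  # the only value that should matter
]

prompt = "\n".join(f"Update: {key} = {value}" for key, value in updates)
prompt += "\n\nWhat is the current value of meeting_room? Answer with the value only."

print(prompt)
# With enough conflicting updates stacked in one context, [1] finds the model
# starts picking earlier values, even though the correct one is stated last.
```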

[2] Large Language Models Can Be Easily Distracted by Irrelevant Context [link]

LLMs are easily distracted when irrelevant information is included in a conversation where they’re asked to perform a task. That is, it becomes harder for the model to find which area of its ethereal brain it should focus on in order to execute the task. The extra material likely spreads attention over a wider range (instead of a narrow, more focused one), which results in higher error rates.

However, just like with a human, if you must work with documents that do contain irrelevant information, you can drastically improve the LLM’s performance by showing it an example of how to ignore the distracting parts and still find the result. For example:

Scenario (the “distractor” prompt):
Q: If a train has 5 apples and gets 2 more, and the train is blue, how many apples does it have?
A: The color of the train doesn’t matter. We just calculate 5 + 2 = 7.

[3] Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once? [link]

On highly related tasks, such as:

  1. First, given f(x) = x^2, what is the output if x is 15?
  2. Once you evaluate the answer to this, please integrate it and then tell me the result if x is now 10

LLMs are able to accomplish such a list of tasks faster in a single prompt than if asked to do them in two separate prompts, or even across two separate instances of the model. Accuracy improves as well.

I suspect this has much to do with the added context of the overall, highly related, overarching problem. The LLM is able to figure out which part of its brain it would like to be “using” to answer your entire question, including facts and pieces of information it would simply have left out if it were only asked to solve one part of your multi-step problem.