The so-called Chain of Thought (CoT) or Tree of Thought reasoning with LLMs has very little to do with reasoning. In CoT the problem is broken down into smaller reasoning problems to guide the LLM's response. All CoT does is trigger the right sequence of latent states in the Transformer. When a new, similar problem is entered as a prompt, the same sequence of hidden states is triggered. Those hidden states generate the right token based on the earlier occurrence of a similar token from a broken-down problem and what appeared after that token. A latent state is a representation of a token and its relationships with earlier tokens. The same relationship chain is reused for a new, similar problem. The only mechanism at play is co-occurrence pattern learning and matching. There are no mystical neural circuits for reasoning formed inside the Transformer. This is precisely why LLMs fail badly at complex and compositional reasoning tasks. But the narrative you will hear from LLM gurus is that a problem is broken down into smaller problems because that's how humans reason and solve problems. #ai #llm #cot #reasoning https://lnkd.in/gjshg9Ag
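To make the decomposition the post is talking about concrete, here is a minimal sketch of the two prompt styles — plain strings only, no particular LLM API assumed, and the shop question is an invented example:

```python
# Direct prompt: the model must map question -> answer in one shot.
direct_prompt = (
    "Q: A shop sells pens at $3 each and pads at $5 each. "
    "How much do 4 pens and 2 pads cost? A:"
)

# CoT prompt: the same question, but the exemplar spells out the
# intermediate sub-problems, nudging the model to emit similar
# intermediate tokens before committing to a final answer.
cot_prompt = (
    "Q: A shop sells pens at $3 each and pads at $5 each. "
    "How much do 4 pens and 2 pads cost?\n"
    "A: Let's think step by step.\n"
    "Step 1: 4 pens cost 4 * 3 = 12 dollars.\n"
    "Step 2: 2 pads cost 2 * 5 = 10 dollars.\n"
    "Step 3: 12 + 10 = 22 dollars.\n"
    "The answer is 22."
)
```

On the post's own account, the CoT prompt works not by "reasoning" but by steering the model through a token sequence it has seen co-occur before; on the standard account, the intermediate steps let the model condition each decision on the previous ones.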
I feel like when we are talking about models of a sufficient size, this myopic view isn't really accurate at all. CoT and similar methods work primarily because of two factors: 1. Memoization of intermediate decisions/state. If you need to make several decisions to output a correct response, then by outputting intermediate tokens the model can apply more computation to each intermediate decision. Going forward, the attention layers can then focus on the pre-determined intermediate results on the following inference passes. This gets less useful (and less necessary) as the amount of state a model can hold in flight during a single inference pass increases. 2. Increasing raw computation. This is a secondary effect: as you pad the length of the outputs, you are also increasing the width of your active context window for models that use masking. This effect can be seen in the recent paper https://arxiv.org/abs/2310.02226 Larger models tend to operate at higher levels of abstraction anyway, though certainly some completion biases are still present. It isn't simply looking at token probabilities; it is looking at high-level concept relations.
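Factor 2 above — longer outputs buying more raw computation — can be sketched with a back-of-the-envelope count. Assuming (as a simplification) a decoder-only model with causal attention, each newly generated token attends to the whole context so far, so total attention work grows roughly quadratically with output length; the prompt and output lengths below are made-up numbers for illustration:

```python
def attention_cost(prompt_len: int, output_len: int) -> int:
    """Relative attention work to generate `output_len` tokens after a
    prompt of `prompt_len` tokens: token t attends to prompt_len + t
    prior positions, so we sum that over all generated tokens."""
    return sum(prompt_len + t for t in range(output_len))

# Same question, two answer styles: a terse answer vs. a step-by-step one.
direct = attention_cost(prompt_len=50, output_len=5)   # "The answer is 22."
cot = attention_cost(prompt_len=50, output_len=80)     # worked-out steps

assert cot > direct  # more output tokens = more forward passes = more compute
```

This is only a proportionality argument, not a FLOP count for any real model, but it captures why padding the output with intermediate tokens gives the model strictly more computation to spend on the same problem.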
There are roughly two schools of thought when it comes to explaining LLM performance. One draws an analogy with the human thought process and uses lingo from cognitive science to explain the model's behavior. The other treats it as a "stochastic parrot" and thus uses lingo from statistics. The truth is we don't know enough to say which is right, but obviously the first approach tends to hype digital intelligence while the second downplays some of the amazing things transformers can do that were not possible with any previous generation of statistical models. I do want to point out that it is not fair to say that prompt engineering pushes the model into a set of "hidden states". In fact, we should not think of the Transformer as a representation learning model. Instead, it calculates token interactions. I am not sure we can get a meaningful interpretation of "latent space" from the layers of a transformer (mechanistic interpretation of Transformers has been far less successful than for CNNs). Transformers likely don't have the concept of latent states; they are more like calculations across the whole input.
This is debatable...
Humans don't have to break down higher-level tasks once they have learned them. They learn them by combining lower-level tasks, but once the complex task is established it functions like another low-level task.
Simply put: it's correlation (tokens) mistaken for causation (meanings)
How does this compare with reductionist cognitive-science theories in which no free will exists, i.e., a deterministic, reductionist account of human cognition?
A must read. Thanks Pranab Ghosh for sharing
A better name for LLM Chain of Thought is Chain of Co-occurrence.