home.social

#incontextlearning — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #incontextlearning, aggregated by home.social.

  1. In-context learning has repeatedly been shown to outperform hand-crafted neural learning algorithms across a wide range of tasks.

    But it's limited by the length of the context. Even neural architectures that let the context grow without bound come with high costs and scaling problems.

    Is there a way to incorporate new knowledge learned in-context back into neural network weights?

    Of course there is!

    Let's imagine we have a lot of data, sequences of instructions and outputs where in-context learning happens.

    From this data we can produce a synthetic dataset which represents the newly learned knowledge. We can continually train the model with this dataset.

    Of course this is super slow and inconvenient. But as a result we'll get a dataset that pairs the contexts where in-context learning happened with the old model weights and the new model weights.

    We can use this data to train a neural programmer model directly!

    That model would take in the raw context, and if in-context learning has happened in those interactions, it can predict the changes to the neural network weights which would happen if the long and heavy synthetic data pipeline had been run.

    Instead of the heavy pipeline, we can just use the neural programmer model to directly update the large model weights based on the in-context learning it experienced, to crystallize the learnings into its long-term memory, not unlike what the hippocampus does in the human brain.

    #AI #LLMs #NeuralNetworks #InContextLearning
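
The idea in post 1 can be illustrated with a toy sketch. This is not the poster's implementation: the "large model" is assumed to be a single linear layer, the "heavy pipeline" is assumed to be plain gradient descent on the context examples, and the "neural programmer" is assumed to be a linear map fit by least squares. All names (`slow_pipeline`, `programmer_update`, `X_ctx`, and so on) are hypothetical. In this linear toy the weight delta is an affine function of the context outputs, so the programmer can match the slow pipeline almost exactly; nothing like that is guaranteed for a real network.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_OUT, N_CTX = 4, 3, 8               # toy sizes, chosen arbitrarily

W_base = rng.normal(size=(D_IN, D_OUT))    # stands in for the large model's weights
X_ctx = rng.normal(size=(N_CTX, D_IN))     # fixed "instructions" shown in context


def slow_pipeline(W, X, Y, steps=50, lr=0.1):
    """Heavy path (assumption): continued training on the context examples.

    Returns the weight delta, i.e. new weights minus old weights.
    """
    W_new = W.copy()
    for _ in range(steps):
        W_new -= lr * X.T @ (X @ W_new - Y) / len(X)
    return W_new - W


def make_episode():
    """One context: noisy outputs produced by a new, unseen mapping."""
    W_task = rng.normal(size=(D_IN, D_OUT))
    return X_ctx @ W_task + 0.1 * rng.normal(size=(N_CTX, D_OUT))


def features(Y):
    """Flatten the context outputs and append a constant bias feature."""
    return np.append(Y.ravel(), 1.0)


# 1) Run the slow pipeline many times to collect (context, weight delta) pairs.
contexts = [make_episode() for _ in range(200)]
F = np.stack([features(Y) for Y in contexts])
T = np.stack([slow_pipeline(W_base, X_ctx, Y).ravel() for Y in contexts])

# 2) Train the "programmer": here just a linear map fit by least squares.
P, *_ = np.linalg.lstsq(F, T, rcond=None)


def programmer_update(Y):
    """Predict the weight delta directly from the context, skipping training."""
    return (features(Y) @ P).reshape(D_IN, D_OUT)


def context_error(W, Y):
    """Mean squared error of weights W on the context examples."""
    return float(np.mean((X_ctx @ W - Y) ** 2))


# 3) On a new context, compare the predicted delta with the slow pipeline's.
Y_new = make_episode()
delta_fast = programmer_update(Y_new)
delta_slow = slow_pipeline(W_base, X_ctx, Y_new)
max_gap = float(np.abs(delta_fast - delta_slow).max())

err_before = context_error(W_base, Y_new)
err_after = context_error(W_base + delta_fast, Y_new)
print(f"max gap to slow pipeline: {max_gap:.2e}")
print(f"context error before: {err_before:.3f}, after: {err_after:.3f}")
```

The point of the sketch is only the data flow: the slow pipeline is run offline to produce training pairs, and afterwards a single forward pass of the programmer replaces it. For a real model the programmer would presumably need to be a much larger network and would only approximate the trained delta.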

  2. Unlocking the Power of In-Context Learning: A New Era for Symbolic-AutoML

    In a groundbreaking critique, Mehmet Süzen explores the potential of Chain-of-Thought (CoT) in In-Context Learning (ICL) as a transformative tool for symbolic-AutoML. This approach not only enhances d...

    news.lavx.hu/article/unlocking

    #news #tech #InContextLearning #SymbolicAutoML #ChainOfThought

  3. Through scaling #DeepNeuralNetworks we have found in two different domains, #ReinforcementLearning and #LanguageModels, that these models learn to learn (#MetaLearning).

    They spontaneously learn internal models with memory and learning capability which are able to exhibit #InContextLearning much faster and much more effectively than any of our standard #backpropagation based deep neural networks can.

    These rather alien #LearningModels embedded inside the deep learning models are emulated by #neuron layers, but aren't necessarily deep learning models themselves.

    I believe it is possible to extract these internal models which have learned to learn, out of the scaled up #DeepLearning #substrate they run on, and run them natively and directly on #hardware.

    This allows those much more efficient learning models to be used either as #LearningAgents themselves, or as a further substrate for further meta-learning.

    I have ongoing #embodiment #research with a related goal, focused specifically on extracting (or distilling) the models out of the meta-models, here:
    github.com/keskival/embodied-e

    It is of course an open research problem how to do this, but I have a lot of ideas!

    If you're inspired by this, or if you think the same, let's chat!
