#andrewbarto — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #andrewbarto, aggregated by home.social.

  1. #ACMPrize #2024ACMPrize #ACMTuringAward

    #AndrewBarto #RichardSutton

    » #ReinforcementLearning: An Introduction (1998) standard reference ... cited over 75,000 ...
    prominent example of #RL #AlphaGo victory over best human #Go players 2016 2017 ...
    recently has been the development of the chatbot #ChatGPT ... large language model #LLM trained in two phases ... employs a technique called reinforcement learning from human feedback #RLHF «

    aka cheap labor unnamed in papers

    awards.acm.org/about/2024-turi

    2/2