#2024acmprize — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #2024acmprize, aggregated by home.social.
-
#ACMPrize #2024ACMPrize #ACMTuringAward
» #ReinforcementLearning: An Introduction, 1998 ... standard reference ... cited over 75,000 ... prominent example of #RL ... #AlphaGo victory over best human #Go players, 2016 2017 .... recently has been the development of the chatbot #ChatGPT ... large language model #LLM trained in two phases ... employs a technique called reinforcement learning from human feedback #RLHF « aka cheap labor unnamed in papers
https://awards.acm.org/about/2024-turing
2/2