#humanfeedback — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #humanfeedback, aggregated by home.social.
-
Reinforcement Learning from Human Feedback (RLHF) in Notebooks
https://github.com/ash80/RLHF_in_notebooks
#HackerNews #ReinforcementLearning #HumanFeedback #RLHF #Notebooks #AIResearch
-
🎯 Think AI just "learns"? Think again.
Today's smartest models don't memorize — they listen to YOU.
📊 Discover 3 powerful ways human feedback (RLHF) is transforming AI into something far more intuitive.
👇 Don’t just use AI. Understand how you’re shaping it. 🔗 https://medium.com/@rogt.x1997/3-game-changing-ways-rlhf-is-rewiring-ai-behavior-5f082ce6ec01
#RLHF #AIbehavior #HumanFeedback #MachineLearning
-
One poorly delivered joke in 2019 became the catalyst for the most human breakthrough in AI: RLHF.
Now, machines aren’t just answering—they’re understanding us.
This isn’t the future. It’s happening now.
⬇️ See how empathy, feedback, and a little comedy changed everything.
#AIAlignment #RLHF #EthicalAI #HumanFeedback
👉
https://medium.com/@rogt.x1997/the-joke-that-taught-ai-empathy-inside-the-rlhf-breakthrough-174a56d91bf7 -
Good point made by @soumith on X:
"Open LLMs need to get organized and co-ordinated about sharing human feedback. It's the weakest link with Open LLMs right now. They don't have 100m+ people giving feedback like in the case of OpenAI/Anthropic/Bard."
#Opensource #AI #LLM #GenerativeAI #humanfeedback -
The secret to making #AIChatbots sound #smart and #spew less #toxic nonsense is to use a technique called reinforcement learning from #HumanFeedback, which uses input from people to improve the model’s answers. It relies on a small army of #human #data #annotators who evaluate whether a string of text makes sense and sounds fluent and natural. They decide whether a response should be kept in the AI model’s database or removed. https://www.technologyreview.com/2023/06/13/1074560/we-are-all-ais-free-data-workers
-
In the intro to his keynote on Reasoning with Realistically Imperfect Knowledge, Alexander Gray compares GPT-3 with RLHF to Shub-Niggurath, a mythical goddess from the Lovecraftian monster universe.
#eswc2023 #lovecraft #reinforcementlearning #humanfeedback #gpt #rlhf