#largemodel — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #largemodel, aggregated by home.social.
-
Analyzing And Editing Inner Mechanisms Of Backdoored Language Models
"We can successfully insert a weak backdoor mechanism in the benign model, even without also editing the embeddings of the trigger words."
"Our framework can reverse-engineer backdoor mechanisms in toy and large models for the first time, scale the strength of the backdoor mechanism ..."
https://arxiv.org/abs/2302.12461
#ai #llm #pcpablation #mlp #toymodel #largemodel #backdoor #backdooredlanguagemodel #chatgpt