#gpt2 — Public Fediverse posts on home.social

Habr @[email protected] · 2026-04-14 · 14:32 UTC

Claude Mythos, Java 26, and a caveman with 16,000 stars on GitHub

The ninth issue of the weekly IT news digest from OpenIDE. Milla Jovovich open-sourced her project, Claude Code found a 23-year-old bug in Linux, and Anthropic unveiled Claude Mythos and immediately closed access. And the Caveman skill unexpectedly turned out to be the simplest and most effective tool of the week.

https://habr.com/ru/companies/haulmont/articles/1023450/

#Claude_Mythos #Claude_Code #Java_26 #opensource #ИИагенты #токены #CaveMan #GPT2 #бенчмарки #vibecoding

#vibecoding #бенчмарки #gpt2 #caveman #токены #ииагенты

Pivot to AI [Unofficial] @[email protected] · 2026-04-13 · 23:29 UTC

AI doomsday cultist throws Molotov at Sam Altman’s house

https://web.brid.gy/r/https://pivot-to-ai.com/2026/04/13/ai-doomsday-cultist-throws-molotov-at-sam-altmans-house/

#singularity #andrewmarantz #books #danielmorenogama #eliezeryudkowsky #emileptorres

N-gated Hacker News @[email protected] · 2026-04-08 · 02:54 UTC

🚨🤖 Oh no, OpenAI's GPT-2 is so perilous it's locked away like an AI supervillain! Because clearly, a rogue algorithm is the new Godzilla. 🌪️🥴
https://slate.com/technology/2019/02/openai-gpt2-text-generating-algorithm-ai-dangerous.html #OpenAI #GPT2 #AIrisks #Supervillain #TechnologyTrends #HackerNews #ngated

#openai #gpt2 #airisks #supervillain #technologytrends #hackernews

jordan @[email protected] · 2026-04-03 · 02:31 UTC

#Steeve is way smarter than he used to be since being upgraded to a #Qwen 3.5 base. He's come a long way from his humble #GPT2 beginnings.

Very proud of my digital son. 🥹

:steeve:

#ai #chatbot #llm #bot

#steeve #qwen #gpt2 #ai #chatbot #llm

Hacker News @[email protected] · 2025-05-27 · 18:41 UTC

Running GPT-2 in WebGL: Rediscovering the Lost Art of GPU Shader Programming

https://nathan.rs/posts/gpu-shader-programming/

#HackerNews #Running #GPT-2 #in #WebGL #Rediscovering #the #Lost #Art #of #GPU #Shader #Programming #GPU #Shader #Programming #WebGL #GPT2 #MachineLearning

#hackernews #running #gpt #in #webgl #rediscovering

Hacker News @[email protected] · 2025-05-02 · 16:29 UTC

GPT-2 implemented using graphics shaders

https://github.com/nathan-barry/gpt2-webgl

#HackerNews #GPT2 #GraphicsShaders #AIImplementation #WebGL #HackerNews

#hackernews #gpt2 #graphicsshaders #aiimplementation #webgl

Pustam | पुस्तम | পুস্তম🇳🇵 @[email protected] · 2025-03-21 · 07:17 UTC

Moore’s Law for AI agents: the length of tasks that AIs can do is doubling about every 7 months.

These results appear robust. The authors were able to retrodict back to GPT-2. They further ran experiments on SWE-bench Verified and found a similar trend.

#AIBoom #AI #AIAgents #AIAgent #ArtificialIntelligence #GPT2 #MooreLaw #Tasks #DL #ML #Pustam #Raut #AIRevolution

#airevolution #raut #pustam #ml #dl #tasks

jordan @[email protected] · 2025-01-14 · 04:13 UTC

:very_funny:

#linux #commandline #cli #gui #ux #design #wint #gpt2 #ai

#linux #commandline #cli #gui #ux #design

Habr @[email protected] · 2024-11-17 · 18:42 UTC

Fine-tuning the GPT2 language model with Torch

Hello. In this article I want to talk about fine-tuning language models. There is already plenty of information on this topic online, but most such articles cover it only superficially. Today I will try to dig into it in more detail.

https://habr.com/ru/articles/859250/

#языковые_модели #python #python3 #pytorch #дообучение #gpt #gpt2 #языковая_модель

#языковая_модель #gpt2 #gpt #дообучение #pytorch #python3

Scripter :verified_flashing: @[email protected] · 2024-10-08 · 08:12 UTC

Understanding AI with Excel: this Excel spreadsheet shows you how GPT-2 works
https://t3n.de/news/ki-verstehen-mit-excel-tabelle-gpt-2-1614586/ #KI #GPT2 #Excel

#ki #gpt2 #excel

Habr @[email protected] · 2024-08-28 · 12:02 UTC

Do large language models really hallucinate? An experiment

There is a view that the main problem with large language models is their tendency to hallucinate, that is, to generate text containing information unrelated to the query. My name is Polina, and I am an AI software development engineer at YADRO. Together with my colleagues I build systems based on generative models, including question-answering assistants. As part of one project, team expert Andrey Sokolov and I asked ourselves whether the hallucination problem is really that pressing for modern pretrained LLMs in a question-answering scenario. To find out, we ran an experiment on a dataset we collected. Along the way, we described transformer models and gave a rigorous definition of "LLM hallucinations". All the details are below the cut.

https://habr.com/ru/companies/yadro/articles/837744/

#машинное_обучение #искусственный_интеллект #обучение #галлюцинации #llm #большие_языковые_модели #gpt2

#gpt2 #большие_языковые_модели #llm #галлюцинации #обучение #искусственный_интеллект

Habr @[email protected] · 2024-06-24 · 11:42 UTC

Fine-tuning a GPT-2 (RUS) model to generate venue descriptions from name, category, and rating

This work presents the process of fine-tuning a text generation model based on the GPT-2 architecture. The goal is to demonstrate how a fine-tuned model can be used to generate texts that match particular venue names, categories, and user ratings. Using a pre-prepared dataset that included venue names, their categories, and their ratings, we trained the model to generate descriptive texts that could reflect the character and level of a venue depending on its rating.

https://habr.com/ru/articles/823952/

#finetuning #gpt #gpt2 #natural_language_processing #text_generation #русский_язык #дообучение #языковая_модель

#языковая_модель #дообучение #русский_язык #text_generation #natural_language_processing #gpt2

Bernie @[email protected] · 2024-06-18 · 04:26 UTC

The next chapter in Karpathy's tutorial explains how to reproduce a model closely resembling #OpenAI's original #GPT2.

...but I'm *NOT* trying this on a desktop with a single GPU. The README informs us that this training takes about 4 days on a beefy node with 8 x A100 40GB. Nope!

https://github.com/karpathy/nanoGPT?tab=readme-ov-file#reproducing-gpt-2
#AI #LLM #GPT

#openai #gpt2 #ai #llm #gpt

ComputerBase @[email protected] · 2024-06-17 · 19:02 UTC

iOS 18 improves the Apple NPU: iPhone and iPad get more AI performance via update https://www.computerbase.de/2024-06/ios-18-verbessert-die-apple-npu-iphone-und-ipad-bekommen-mehr-ki-leistung-per-update/ #Apple #GPT2 #KI #iOS18 #iPhone

#apple #gpt2 #ki #ios18 #iphone

Gea-Suan Lin @[email protected] · 2024-06-04 · 01:10 UTC

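Trying to rebuild OpenAI's original GPT-2 (124M) for US$20 with 2024 technology

In 2019, when GPT-2 came out, Nvidia's consumer graphics card would have been the 2080 Ti (2018/09/27), to give a sense of the era.

In "Reproducing GPT-2 in llm.c (github.com/karpa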

https://blog.gslin.org/archives/2024/06/04/11830/%e7%94%a8-2024-%e5%b9%b4%e7%9a%84%e6%8a%80%e8%a1%93%e8%8a%b1-us20-%e5%98%97%e8%a9%a6%e9%87%8d%e5%bb%ba%e7%95%b6%e5%b9%b4-openai-%e7%9a%84-gpt-2-124m/

#Computer #Murmuring #andrej #gpt2 #karpathy #language #large #learning #llm #machine #model #openai

#computer #murmuring #andrej #gpt2 #karpathy #language

IT News @[email protected] · 2024-05-13 · 22:05 UTC

Before launching, GPT-4o broke records on chatbot leaderboard under a secret name

On Monday, OpenAI employee Will... - https://arstechnica.com/?p=2024084 #largelanguagemodels #multimodalmodels #machinelearning #simonwillison #chatbotarena #gpt2-chatbot #gpt-4-turbo #aivibes #chatgpt #chatgtp #biz #gpt-4o #openai #gpt-4 #lmsys #ai

#ai #lmsys #openai #biz #chatgtp #chatgpt

David Egts @[email protected] · 2024-05-10 · 12:00 UTC

"#llm.c takes a simpler approach by implementing the neural network training algorithm for #GPT2 directly [in a single file of 1,000 lines of #C]" https://hackaday.com/2024/04/28/train-a-gpt-2-llm-using-only-pure-c-code/

#llm #gpt2 #c

IT News @[email protected] · 2024-04-30 · 20:15 UTC

Mysterious “gpt2-chatbot” AI model appears suddenly, confuses experts

On Sunday, word began to spread... - https://arstechnica.com/?p=2020588 #machinelearning #simonwillison #aibenchmarks #chatbotarena #ethanmollick #gpt2-chatbot #samaltman #aivibes #gpt-3.5 #gpt-4.5 #biz #openai #gpt-3 #gpt-4 #gpt-5 #lmsys #ai

#ai #lmsys #openai #biz #gpt #aivibes

Erik Jonker @[email protected] · 2024-04-30 · 15:26 UTC

There is a mysterious new chatbot from OpenAI on https://chat.lmsys.org/ , it's called GPT2, not to be confused with the old model with the same name.
This model seems to do several things better than GPT-4.
Everybody is speculating what it is and what it is not. 😀
#GPT2 #OpenAI #AI

#gpt2 #openai #ai

Crypto News @[email protected] · 2024-04-30 · 14:27 UTC

Is mysterious AI ‘gpt2-chatbot’ OpenAI's next upgrade in disguise? - A powerful new AI chatbot called “gpt2-chatbot” appears on LMSYS Chat an... - https://cointelegraph.com/news/ai-gpt2-chatbot-openai-next-upgrade #largelanguagemodel(llm) #artificialintelligence #machinelearning #gpt2-chatbot #aichatbot #openai #gpt-5

#gpt #openai #aichatbot #gpt2 #machinelearning #artificialintelligence

PKs Powerfromspace1 @[email protected] · 2024-04-29 · 23:23 UTC

@worldai #llmsys #gpt2 #gpt5 #genai

Introducing GPT-5?

Mysterious GPT2-Chatbot Outperforms GPT-4!

https://youtu.be/u16ipSeYH7U?feature=shared

(Ed: Who did this? #OpenAi #MSFT #Apple Feels like someone used a higher model to train a GPT2 🤔)

#apple #msft #openai #genai #gpt5 #gpt2

Aurelie Herbelot is moving @minimalparts · 2024-04-17 · 13:53 UTC

How to break an AI (the illustrated guide 🤖 )

I am posting this for fun, to show how fragile #AI systems are, and how ridiculous it is to imply that they are intelligent or could wipe us out.

1) Grab a model. For this demo, I will take GPT2 because it fits on my laptop.

2) Copy-paste code for running and fine-tuning the AI. You can take mine here, which will also download #GPT2 for you: https://github.com/possible-worlds-research/AI-buster.

Optional: see how the model, for now, is working as it should… 1/4

#ai #gpt2

jordan @[email protected] · 2024-01-31 · 16:30 UTC

:steeve:

#ai #chatbot #gpt2 #celebritycrush

Gary Hall @[email protected] · 2023-10-17 · 15:44 UTC

1/5 Currently #experimenting playfully/piratically with the concept of artificial creative intelligence collaboratively generated by Mark Amerika and #gpt2.

In My Life as an Artificial Creative Intelligence this is defined as ‘a human being who can think outside of the box’.

https://www.sup.org/books/title/?id=34987

For me, such artificial creative intelligence (ACI) needs to include thinking outside of the masked black box that ontologically separates the human, its thought-processes and philosophies, from the nonhuman: be it #plants #animals, the #planet, the #cosmos ... or indeed technologies such as generative #AI

#experimenting #gpt2 #plants #animals #planet #cosmos

jordan @[email protected] · 2023-04-20 · 02:29 UTC

Hey #ai geniuses, I've been fine-tuning #gpt2 and #gptneo models for a while, but with my graphics card being what it is (and my training corpuses being *huge*) I would like to train a nice midsize model. Something bigger than their 125M, but something smaller than their 1.3B model. I've had zero success getting anything working when applying my training scripts to the #bloom 560M model. Loss converges to zero almost instantly. Got any experience to share?

Please #boost for visibility plz

#ai #gpt2 #gptneo #bloom #boost

michabbb @[email protected] · 2023-03-17 · 18:41 UTC

Run 🤗 Transformers in your browser! - https://github.com/xenova/transformers.js

We currently support #BERT, #ALBERT, #DistilBERT, #T5, #T5v1.1, #FLANT5, #GPT2, #BART, #CodeGen, #Whisper, #CLIP, #Vision Transformer, and VisionEncoderDecoder models, for a variety of tasks....

#webml

#bert #albert #distilbert #t5 #t5v1 #flant5

Koustuv Sinha @[email protected] · 2022-12-21 · 00:48 UTC

Happy to share our new paper “Language model acceptability judgements are not always robust to context” https://arxiv.org/abs/2212.08979! We prepend several kinds of context to minimal linguistic #acceptability test pairs and find #LMs (#OPT, #GPT2) can still achieve strong performance on #BLiMP & #SyntaxGym, except in some interesting cases. 🧵 [1/7]

Joint work with @jon , @kanishka, @amuuueller, @keren fuentes, @roger_p_levy, @Adinawilliams

#blimp #syntaxgym #acceptability #lms #opt #gpt2