#amldgenai — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #amldgenai, aggregated by home.social.
-
According to him, the main OpenAI ChatGPT advance was UX + fine-tuning.
And that's it for AMLD Conference Generative Learning!
FIN
#AMLD #AMLDGenAI
40/🧵
-
Flywheel of test-train-correct-retrain-... => Will RLHF scale (small tuning), or will it keep the flywheel rolling?
Question: experimentation in high-risk areas (healthcare).
Tay is getting mentioned! (Fun story: back in 2016 I initially thought @SwiftOnSecurity actually was Tay, because the impression was so good.)
More suggestions about human-in-the-loop (which makes me once again want to cite one of @pluralistic's recent articles).
39/🧵
-
Pretraining vs Fine-tuning?
OpenAI 2022 - InstructGPT (but this was already stated in the 2018 GPT-1 paper and done by others back in 2020-ish).
Switch to conversational implicature.
Foundation models fail at it, but conversationally fine-tuned ones do not. Larger size does not help, nor does more training data, nor multi-task training. (A classic implicature probe: "Can you come to my party on Friday?" - "I have to work." implies no.)
(Hypothesis: destructive pruning of neurons during fine-tuning is very powerful; maybe he should talk to Evelina Fedorenko about that.)
38/🧵
-
Edward Grefenstette, director of research at Google DeepMind.
Talk: 4 waves of computational revolution (funny that Google Search and Web 2.0 are not mentioned).
General formula: inference of a joint probability by optimizing a set of parameters (see the sketch after the arrows below).
=> General translators/autocompleters (T5)
=> Joint code + execution (aka the compiler)
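A minimal way to write that formula down, assuming the standard autoregressive factorization (my notation, not necessarily the speaker's slide):

```latex
% Fit parameters \theta by maximizing the joint (log-)likelihood of the data,
% factored token-by-token:
\theta^{\star} = \arg\max_{\theta} \sum_{x \in \mathcal{D}} \log p_\theta(x)
             = \arg\max_{\theta} \sum_{x \in \mathcal{D}} \sum_{t=1}^{|x|} \log p_\theta(x_t \mid x_{<t})
```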
So why is the LLM revolution happening now?
Nice Chinchilla excerpt (scaling-law sketch below).
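For reference, a sketch of the Chinchilla scaling law from Hoffmann et al. 2022, with loss L as a function of parameter count N and training tokens D (the constants are the paper's approximate fitted values):

```latex
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
\qquad E \approx 1.69,\ A \approx 406.4,\ B \approx 410.7,\ \alpha \approx 0.34,\ \beta \approx 0.28
```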
37/🧵
-
Good robot: go from a qualitative description, to metrics, to an improved robot description.
36/🧵
-
Generalization:
- LLMs being plugged in instead of GA algorithms over the "DNA" of machines (see the sketch after this list).
- LLMs for the fabrication of designed forms.
- Using LLMs to simulate the likely failure modes of physically simulated solutions (in my opinion this still does not solve the predefined-DNA-vocabulary problem in robotics).
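A hypothetical sketch of the first idea, with an LLM standing in for the GA mutation operator over a textual machine "DNA" (`llm`, `evaluate`, and the prompt are my stand-ins, not anything shown in the talk):

```python
import random

def llm_mutate(genome: str, llm) -> str:
    # Ask the LLM to propose a small variation of the design; `llm` is assumed
    # to be any callable mapping a prompt string to a completion string.
    prompt = f"Here is a machine design:\n{genome}\nPropose one small variation of it."
    return llm(prompt)

def evolve(population: list[str], evaluate, llm, generations: int = 10) -> str:
    # Classic (mu + lambda)-style loop: keep the fittest half, refill the
    # population with LLM-proposed variants of random survivors.
    for _ in range(generations):
        survivors = sorted(population, key=evaluate, reverse=True)[: len(population) // 2]
        children = [llm_mutate(random.choice(survivors), llm) for _ in survivors]
        population = survivors + children
    return max(population, key=evaluate)
```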
How to describe a good robot?
(Evolution's answer: reproduction, but that is not acceptable for human assistants.)
35/🧵
-
Next up - Josie Hughes, Prof at EPFL, robotics design.
=> Embodied intelligence (#ALIFE themes - yay!)
Problem: depends on the human factor, which needs to be multi-disciplinary. (I see this as an emergent space where LLMs are supposed to be a solution; I am afraid this will lead them to become a "Google university 2.0" instead...)
GPT-3 was used to support iteration (Nature Machine Intelligence 2023 - why am I not surprised... so not yay).
34/🧵
-
(I disagree - things such as generalization to rare cases matter.)
Nvidia hyperparameter kicking vs examples of Pareto curves + stacking recipes.
Sweeps of hyperparameters (a generic sketch below).
> MosaicML GitHub acceleration library link.
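A generic sketch of such a sweep (not MosaicML's actual tooling): grid over a couple of hyperparameters, recording (cost, quality) per run so Pareto curves can be drawn afterwards. `train_and_eval` is a hypothetical stand-in:

```python
from itertools import product

def sweep(train_and_eval):
    # train_and_eval(lr, batch_size) -> (cost_in_dollars, eval_quality), assumed.
    results = []
    for lr, bs in product([1e-4, 3e-4, 1e-3], [128, 256, 512]):
        cost, quality = train_and_eval(lr, bs)
        results.append({"lr": lr, "batch_size": bs, "cost": cost, "quality": quality})
    return results
```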
33/🧵
-
Goal: reduce the time and $ necessary to train networks.
- Same cost of inference
- Same hardware and software
- Have to optimize baselines
Recipe examples:
- Selective backprop: drop examples that have low loss. Does not work naively because of torch autograd, but selection with forward passes can work well (e.g. low-resolution losses); sketch below.
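A hedged PyTorch sketch of that recipe as I understood it (not MosaicML's actual implementation): score the batch with a cheap no-grad forward pass, then run the real forward/backward only on the hardest examples.

```python
import torch
import torch.nn.functional as F

def selective_backprop_step(model, optimizer, x, y, keep_frac=0.5):
    # Cheap scoring pass: per-example losses, no autograd graph built.
    with torch.no_grad():
        losses = F.cross_entropy(model(x), y, reduction="none")
    # Keep only the highest-loss fraction of the batch.
    k = max(1, int(keep_frac * losses.numel()))
    idx = losses.topk(k).indices
    # Full forward/backward on the selected subset only.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x[idx]), y[idx])
    loss.backward()
    optimizer.step()
    return loss.item()
```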
Problems:
- What is just as good?
- What is an acceptable loss of quality? => Pareto frontiers (sketch below)
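A minimal sketch of extracting that frontier from sweep results (dict format assumed from the sweep sketch above): a run stays on the frontier if no other run is both cheaper and at least as good.

```python
def dominates(a, b):
    # a Pareto-dominates b: no worse on both axes, strictly better on at least one.
    return (a["cost"] <= b["cost"] and a["quality"] >= b["quality"]
            and (a["cost"] < b["cost"] or a["quality"] > b["quality"]))

def pareto_frontier(runs):
    return sorted((r for r in runs if not any(dominates(o, r) for o in runs)),
                  key=lambda r: r["cost"])
```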
32/🧵