#phindinstant — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #phindinstant, aggregated by home.social.

michabbb @[email protected] · 2024-09-12 · 08:05 UTC

Introducing Phind-405B and faster, high-quality #AI answers for everyone
🚀 Phind-405B: New flagship #llm, based on Meta Llama 3.1 405B, designed for programming & technical tasks. #Phind405B
⚡ Supports up to 128K tokens of context, with a 32K context window at launch; 92% on HumanEval; great for web app design. #Programming #AIModel
💡 Trained on 256 H100 GPUs with FP8 mixed precision, which cut memory use by 40%. #DeepSpeed #FP8
⚡ Phind Instant Model: Super fast, 350 tokens/sec, based on Meta Llama 3.1 8B. #PhindInstant
🚀 Runs on NVIDIA TensorRT-LLM with flash decoding, fused CUDA kernels. #NVIDIA #GPUs
🔍 Faster Search: Prefetches results, saves up to 800ms latency, better embeddings. #FastSearch
👨‍💻 Goal: Help developers experiment faster, new features coming soon! #DevTools #Innovation
https://www.phind.com/blog/introducing-phind-405b-and-better-faster-searches

#ai #llm #phind405b #programming #aimodel #deepspeed