#fluidx3d — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #fluidx3d, aggregated by home.social.
-
#FluidX3D #CFD v3.7 brings faster Q-criterion isosurface rendering with #OpenCL local memory optimization! 🖖🤠
https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.7
Instead of 32 velocities for each #GPU thread, now an 8x8x8 workgroup loads & reuses 11x11x11 velocities in L1$, a 12x VRAM BW reduction.
Fascinating insight: Which thread loads which cell from VRAM to L1$, and which thread renders which grid cell within the workgroup, can be very different!
https://github.com/ProjectPhysX/FluidX3D/blob/master/src/kernel.cpp#L2827-L2956
PS: plugged an X-wing GIF into the #GitHub preview 🖖😜
-
Newest #IntelArc #GPU family member is here, the Panther Lake Arc B390... and it... purrs? 🖖 🥺 🐈⬛
My OpenCL-Benchmark on the B390 measures ~7.4 TFlops FP32 and ~120GB/s memory bandwidth. hw-smi also works with the B390.
#FluidX3D benchmarks here: https://github.com/ProjectPhysX/FluidX3D#single-gpucpu-benchmarks
And the #OpenCL infos:
- Arc B390: https://opencl.gpuinfo.org/displayreport.php?id=6718
- Core Ultra X7 358H: https://opencl.gpuinfo.org/displayreport.php?id=6717
-
#FluidX3D #CFD has reached ⭐ 5000 Stargazers on #GitHub! 🖖🥳
Grid refinement update is still in development; I haven't forgotten... ⬜◻️◽▫️
https://github.com/ProjectPhysX/FluidX3D
-
Finally Intel #GPU support on Linux too. Watch all the metrics go brrr in multi-GPU #FluidX3D #CFD workload! Will #opensource soon™️
Hardening against the myriad broken counters in all those buggy APIs was a slog. 🖖🫠
            | Windows  | #Linux    |
CPU / RAM   | ✅ WinAPI | ✅ /proc   |
#Nvidia GPU | ✅ NVML   | ✅ NVML    |
#Intel GPU  | ✅ IGCL   | ✅ SYSMAN  |
#AMD GPU    | ✅ ADLX   | ✅ AMDSMI  |
-
#FluidX3D #CFD v3.6 is out! This release accumulates a number of small improvements over the last months. Most notably, better interactive graphics support on #macOS with XQuartz. Have fun! 🖖😎🌊🍏
https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.6
-
Here's me demoing #Intel Arc Pro B60 #GPU workstations at #SC25 in St. Louis, running SolidWorks and #FluidX3D! 🖖😎
https://www.youtube.com/watch?v=Z8yxiyXTi7I
-
#SC25 in wonderful St. Louis was a blast! I showcased this FluidX3D-on-Arc-Pro workstation demo there, met so many friends, and had exciting conversations with all sorts of #HPC enthusiasts. And I finally saw the Cardinals - the bright red/brown birds that became the symbol of the city's famous baseball team. 🐦
This is Flynn's #Tron Light Cycle simulated in #FluidX3D #CFD, on 4x #Intel #ArcPro B60 #GPUs with 24GB VRAM each, at a massive 1.8 Billion grid cells resolution. 🖖🤠
https://www.youtube.com/watch?v=5kZ3MmNOLoE
-
8x AMD Instinct #MI355X (288GB @8TB/s) take back the lead over 8x Nvidia #B200 (180GB @8TB/s) in #FluidX3D #CFD, achieving 362k MLUPs/s (vs. 219k MLUPs/s). Thanks to Jon Stevens from Hot Aisle for running the benchmarks! 🖖😊
In single-GPU, both perform about the same, but in 8x #GPU config, MI355X is 65% faster. The difference comes from PCIe bandwidth - MI355X does 55GB/s, B200 only 14GB/s. #Nvidia leaves a lot of perf on the table by not exposing #NVLink P2P to #OpenCL.
-
My keynote talk at #IXPUG #ISC25 is finally online! 🖖🤠
GigaScale #FluidX3D #CFD Simulations from Communication and IO Perspective
https://www.youtube.com/watch?v=uAIkpcX5EFc&list=PLA-vfTt7YHI2HEFrpzPhhQ8PhiztKhHU8
-
Battle of the giants: Nvidia #Blackwell B200 takes the lead in FluidX3D CFD performance
#Nvidia #B200 just launched, and I'm one of the first people to benchmark 8x B200 via Shadeform, in a WhiteFiber server with 2x #Intel #Xeon6 6960P 72-core CPUs. 🖖😋
8x Nvidia B200 go head-to-head with 8x #AMD #MI300X in the #FluidX3D #CFD benchmark, winning overall (with FP16S storage) at 219300 MLUPs/s (~17TB/s combined VRAM bandwidth), but losing in FP32 & FP16C storage. 8x MI300X achieve 204924 MLUPs/s.
-
What an honor to start the #IWOCL conference with my keynote talk! Nowhere else you get to talk to so many #OpenCL and #SYCL experts in one room! I shared some updates on my #FluidX3D #CFD solver, how I optimized it at the smallest level of a single grid cell, to scale it up on the largest #Intel #Xeon6 #HPC systems that provide more memory capacity than any #GPU server. 🖖😃
-
Hot Aisle's 8x AMD #MI300X server is the fastest computer I've ever tested in #FluidX3D #CFD, achieving a peak #LBM performance of 205 GLUPs/s, and a combined VRAM bandwidth of 23 TB/s. 🖖🤯
The #RTX 5090 looks like a toy in comparison. MI300X beats even Nvidia's GH200 94GB. This marks a very fascinating inflection point in #GPGPU: #CUDA is not the performance leader anymore. 🖖😛
You need a cross-vendor language like #OpenCL to leverage its power.
FluidX3D on #GitHub: https://github.com/ProjectPhysX/FluidX3D
-
The 4x #Nvidia #H100 SXM5 server in the new Festus cluster at Uni Bayreuth is the fastest system I've ever tested in #FluidX3D #CFD, achieving 78 GLUPs/s #LBM performance at ~1650W #GPU power draw. 🖖😋🖥️🔥
https://github.com/ProjectPhysX/FluidX3D?tab=readme-ov-file#multi-gpu-benchmarks
https://www.hpc.uni-bayreuth.de/clusters/festus/#__tabbed_1_3
-
Oh look, #Nvidia makes CPUs now! And I got my hands on one! 🖖😋
Today I benchmarked #FluidX3D on Nvidia's #GH200, both #GPU and #CPU with #PoCL. Finally I can answer the question: How does that exotic 2-chip #HPC APU show up in #OpenCL?
--> It's 2 separate devices, a GPU with 94GB @ 4TB/s and a 72-core CPU with 480GB @ 384GB/s. The NVLink interconnect between the two is much faster than PCIe, achieving ~380GB/s host<->device bandwidth, only limited by poor misaligned VRAM BW on the GPU or RAM BW.
-
When I say #FluidX3D #CFD runs on every toaster, I mean it. I finally got it running on the AMD Athlon X2 QL-65 dual-core CPU of my very first computer, a Toshiba Satellite L500D I got in 2009. The CPU itself is from 2008, a year before #OpenCL even existed. Modern #PoCL makes it compatible. Does close to 3 MLUPs/s! 🔥🔥🔥
https://opencl.gpuinfo.org/listreports.php?devicename=cpu-x86-64-AMD+Athlon%28tm%29+X2+Dual-Core+QL-65&platform=
-
I've always wanted to do a helicopter! 🖖😎🚁
Here it is, a Bell 222 in #CFD. At a massive 10 billion cells. 71 terabytes of data visualized. Just for #SimulationFriday fun, because I can! Took 6.4 hours on 8x AMD Instinct #MI200 #GPUs.
https://youtu.be/BStzTRmLW7Q
#FluidX3D on GitHub: https://github.com/ProjectPhysX/FluidX3D