home.social

#oneapi — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #oneapi, aggregated by home.social.

  1. Even now, Thrust as a dependency is one of the main reasons why we have a #CUDA backend, a #HIP / #ROCm backend and a pure #CPU backend in #GPUSPH, but not a #SYCL or #OneAPI backend (which would allow us to extend hardware support to #Intel GPUs). <doi.org/10.1002/cpe.8313>

    This is also one of the reasons why we implemented our own #BLAS routines when we introduced the semi-implicit integrator. A side-effect of this choice is that it allowed us to develop the improved #BiCGSTAB that I've had the opportunity to mention before <doi.org/10.1016/j.jcp.2022.111>. Sometimes I do wonder if it would be appropriate to “excorporate” it into its own library for general use, since it's something that would benefit others. OTOH, this one was developed specifically for GPUSPH and it's tightly integrated with the rest of it (including its support for multi-GPU), and refactoring to turn it into a library like cuBLAS is

    a. too much effort
    b. probably not worth it.

    Again, following @eniko's original thread, it's really not that hard to roll your own, and probably less time consuming than trying to wrangle your way through an API that may or may not fit your needs.

    6/

  6. I'm getting the material ready for my upcoming #GPGPU course that starts in March. Even though I most probably won't get to it, I also checked my trivial #SYCL programs. Apparently the 2025.0 version of the #Intel #OneAPI #DPCPP runtime doesn't like any #OpenCL platform except Intel's own (I have two other platforms that support #SPIRV, so why aren't they showing up? From the documentation I can find online this should be sufficient, but apparently it's not …)

  11. Just how deep is #Nvidia's #CUDA moat really?
    Not as impenetrable as you might think, but still more than Intel or AMD would like
    It's not enough just to build a competitive part: you also have to have #software that can harness all those #FLOPS, something Nvidia has spent the better part of two decades building with its CUDA runtime, while competing frameworks for low-level #GPU #programming, like AMD's #ROCm or Intel's #OneAPI, are far less mature.
    theregister.com/2024/12/17/nvi #developers

  16. #BSI WID-SEC-2024-3422: [NEW] [medium] #Intel #oneAPI #Math #Kernel #Library: Vulnerability allows privilege escalation

    A local attacker can exploit a vulnerability in the Intel oneAPI Math Kernel Library to escalate their privileges.

    wid.cert-bund.de/portal/wid/se

  18. Howdy all - registrations are still open for the first oneAPI DevSummit hosted by the UXL Foundation! Learn about GPGPU programming, oneAPI and how companies are coalescing around #oneapi / #sycl
    linuxfoundation.regfox.com/one

    Registration will close at 5pm today. The DevSummit will start at 8pm PT or 8:30am IST. See you there!

  23. 📢 Introduction to #oneAPI, #SYCL2020 & #OpenMP offloading
    📆September 23-25, 2024

    In this 3-day online course, HLRS - High-Performance Computing Center Stuttgart provides an introduction to Intel Corporation's oneAPI implementation 🖥

    Read more & Register👉 hlrs.de/training/2024/intel-on

  29. Just one more day to submit your session for the UXL oneAPI DevSummit being held October 9th & 10th!

    Learn more: sessionize.com/uxldevsummit
    #SYCL #oneAPI #UXL

  34. #oneAPI / #SYCL 2020 / #DPC++ / #AdaptiveCpp is starting to look pretty nice. I’m not sure if there is a story for mobile devices and older devices. It seems that OpenCL 1.2 isn’t enough for SYCL 2020, but unsure. Also not sure how Vulkan fits in here. It’s also unclear how it scales to smaller kernels and data. The API looks like kernel/buffer/queue. Not sure if you can queue up many kernels or if they can be pipelined.

    How does the new IR that LLVM is using fit in here? So many questions.

  38. I should mention that this isn't just a matter of the dominant player intentionally boycotting standards that would make them lose the vendor lock-in advantage (hello #NVIDIA). All major vendors are guilty of this one way or another. For example, #AMD unjustifiably pulled (or maybe failed to add) #SPIR and #CPU support from their new #OpenCL implementation. #Intel's #oneAPI (even while still leveraging the OpenCL backend) effectively failed on any other OpenCL platform.

  41. .@Intel Advanced Matrix Extensions [AMX] Performance With Xeon Scalable #SapphireRapids

    -- The big #AI performance uplift and power efficiency benefits from #AMX w/ #oneAPI #oneDNN & #OpenVINO benchmarks

    phoronix.com/review/intel-xeon

    Original tweet : twitter.com/phoronix/status/16