home.social

#blas — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #blas, aggregated by home.social.

  1. #Copilot and I are about 30% away from creating a #Pascal version of #LAPACK using #BLAS. We are about two days away from achieving 80% of LAPACK. Then we will tweak it using some GPU acceleration to make its speed comparable to Python libraries like NumPy.

    It is important to note that one must be very disciplined in keeping clean documentation, a thorough and tight testing cycle, and a rigid workflow pattern; otherwise an AI will tend to skip tests, become sloppy, and lose focus.

    #AI #LLM

  5. #AI illiteracy is real. While still arguing with a bunch of AI haters, #Copilot and I just finished our #Pascal #BLAS Level 1-3 implementation plus eigenvalue, Cholesky, and sparse #matrix routines, so we will never need #python, #C, C#, #Rust, ... for our Small Language Project. We will expand our Pascal Numeric Library (PNL) v1.0 into something like #Numpy and #Pytorch, but with static arrays, deterministic data structures, no references, and no pointer arithmetic.

    #LLM #programming #computer

  9. While arguing with some AI haters, #Copilot and I created this pure #Pascal #BLAS (Level 1, 2, 3 core) implementation in less than one day. We encountered many serious problems, including workflow drift, getting stuck in a Delphi error loop, and overhauling our original design... But as long as you understand AI, keep good documentation, and maintain the core structure of the problem, you will be able to work with AI successfully. Don't hesitate to use more than one #AI at a time.

    #LLM

  13. What is #BLAS?

    BLAS (Basic Linear Algebra Subprograms) is a set of fast vector and matrix routines originally written in #Fortran.
    If you’re tired of dynamic types, hidden references, ownership rules, and endless “stream” abstractions, Free #Pascal + BLAS gives you old‑school, deterministic HPC #programming with none of the modern noise.

    #Copilot and I will be using Free Pascal and BLAS for our Small Language Model project #SLM. No more #C, #python, #Rust, or C#.

    #AI #LLM #computer
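
As a rough illustration of what the three BLAS levels mentioned in these posts cover, here is a minimal plain-Python sketch of one routine per level. The function names follow the conventional BLAS routine families (axpy, gemv, gemm), but the signatures are simplified assumptions for readability: real BLAS routines also take strides, transposition flags, and leading dimensions, and they update the output in place.

```python
# Illustrative only: one simplified routine from each BLAS level.
# These return new lists instead of updating the output argument in place.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(alpha, a, x, beta, y):
    """Level 2 (matrix-vector): return alpha*(A x) + beta*y."""
    return [alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
            for row, yi in zip(a, y)]


def gemm(alpha, a, b, beta, c):
    """Level 3 (matrix-matrix): return alpha*(A B) + beta*C."""
    cols_b = list(zip(*b))  # columns of B, so each dot product is row . column
    return [[alpha * sum(aik * bkj for aik, bkj in zip(row, col)) + beta * cij
             for col, cij in zip(cols_b, c_row)]
            for row, c_row in zip(a, c)]
```

Level 1 does O(n) work on O(n) data, Level 2 does O(n²) work on O(n²) data, and Level 3 does O(n³) work on O(n²) data, which is why optimized libraries gain the most on Level 3 routines.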

  17. Why do people use #python, a glue language, which is so slow? The only reason is the AI ecosystem.

    #Copilot and I just tested Free Pascal and BLAS for their speed without using #numpy or #pytorch. The result is amazing. It took less than a second to do a 1024×1024 #matrix multiplication.

    We will be using Free #Pascal and #BLAS to write our Small Language Model #SLM using #NNUE.

    #AI #LLM
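
To put the "less than a second" figure in context, the timing can be converted to a throughput number. The sketch below assumes the usual operation count for a dense matrix product (one multiply and one add per inner-loop step) and uses 1.0 s as a stand-in upper bound, since the post gives no exact timing, precision, or hardware details; the helper names are hypothetical.

```python
def gemm_flops(m, n, k):
    """Floating-point operations for C = A B with A (m x k) and B (k x n):
    one multiply and one add per inner-loop step."""
    return 2 * m * n * k


def gflops(m, n, k, seconds):
    """Sustained throughput in GFLOP/s for a product that took `seconds`."""
    return gemm_flops(m, n, k) / seconds / 1e9


n = 1024
print(gemm_flops(n, n, n))              # 2147483648
print(round(gflops(n, n, n, 1.0), 2))   # 2.15
```

So finishing in under a second corresponds to more than about 2.15 GFLOP/s. Reporting GFLOP/s rather than wall time makes results comparable across matrix sizes, libraries, and machines.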

  21. The plot thickens #BLAS #rstats #lapack
    (When one is about to rip through tens of millions of medical records, one must profile the tools if the project is to finish before one's retirement.)
    FlexiBLAS makes these benchmarks a breeze.

  28. I wonder if the #lapack that comes with #AOCL is being picked up by flexiblas in #rstats. The things I have to do for the love of electronic health records analytics #bigdata #blas

  35. Another post on #Quansight PBC blog: "BLAS/LAPACK #packaging"

    labs.quansight.org/blog/blas-l

    """
    #BLAS and #LAPACK are the standard libraries for linear algebra. The original implementation, often called Netlib LAPACK, developed since the 1980s, nowadays serves primarily as the origin of the standard interface, the reference implementation and a conformance test suite. The end users usually use optimized implementations of the same interfaces. The choice ranges from generically tuned libraries such as OpenBLAS and BLIS, through libraries focused on specific hardware such as Intel® oneMKL, Arm Performance Libraries or the Accelerate framework on macOS, to ATLAS that aims to automatically optimize for a specific system.

    The diversity of available libraries, developed in parallel with the standard interfaces, along with vendor-specific extensions and further downstream changes, adds quite a bit of complexity around using these libraries in software, and distributing such software afterwards. This problem entangles implementation authors, consumer software authors, build system maintainers and distribution maintainers. Software authors generally wish to distribute their packages built against a generically optimized BLAS/LAPACK implementation. Advanced users often wish to be able to use a different implementation, more suited to their particular needs. Distributions wish to be able to consistently build software against their system libraries, and ideally provide users the ability to switch between different implementations. Then, build systems need to provide the scaffolding for all of that.

    I have recently taken up the work to provide such a scaffolding for the Meson build system; to add support for BLAS and LAPACK dependencies to Meson. While working on it, I had to learn a lot about BLAS/LAPACK packaging: not only how the different implementations differ from one another, but also what is changed by their respective downstream packaging. In this blog post, I would like to organize and share what I have learned.
    """

    #CondaForge #Debian #Fedora #Gentoo

  40. I may have mentioned already that I'm working on moving #Gentoo from the half-broken eselect-ldso for #BLAS / #LAPACK to #FlexiBLAS. This also means a transition period lies ahead of us, during which both solutions will be supported.

    The upside is that the "after" state is ABI-compatible with the "before" state (or at least it should be; we are working with the authors to fix the last shortcomings). We replace libblas.so, liblapack.so and other libraries with symbolic links, so programs compiled before the change will simply start using FlexiBLAS.

    The downside is that going the other way is not as easy. Once the libraries have been replaced with symlinks, newly compiled programs will read the SONAME from the target library, and so will start linking directly against FlexiBLAS. Consequently, returning to the previous state will require recompiling them.

    To avoid that, instead of symbolic links we would have to use some kind of intermediary libraries that carry the "old" SONAME but use the FlexiBLAS functions. Unfortunately, nothing simple will work here: I would have to somehow "export" the symbols from FlexiBLAS, and ideally split them across the appropriate libraries so that `-Wl,--as-needed` doesn't drop anything. But how do I do that?

    Well, eselect-ldso creates some libraries, so maybe something can be reused. So I search the sources and can't find anything. Eventually it dawns on me that all the logic is added by Gentoo patches. And those patches are simply ugly. In OpenBLAS we build additional libraries, libblas.so and so on, which contain copies of objects from OpenBLAS and link against libopenblas to pull in the missing dependencies. They don't even link against one another, so each of them independently duplicates a fair amount of code. The patches for BLIS are even worse: there, libblas.so and libcblas.so are practically copies of libblis.so, with the individual "unneeded" symbols hidden by means of "visibility".

    Well, that was to be expected from a #GSoC project.
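
The one-way nature of the symlink switch described in this post can be sketched with a toy model. This is not a real linker: it only mimics the two steps involved, namely that the link editor records the SONAME of whatever file the development symlink resolves to, and that the runtime loader later looks that name up. The file names and SONAME strings below are illustrative assumptions, as are the helper functions.

```python
# Toy model of the behaviour described above; not a real linker or loader.
# File names and SONAME strings are assumptions for illustration.

libs = {
    "libblas.so.3": {"soname": "libblas.so.3"},            # old provider
    "libflexiblas.so.3": {"soname": "libflexiblas.so.3"},  # new provider
}


def link(dev_name, symlinks):
    """Return the dependency name recorded in a program linked via `dev_name`.

    The link editor follows the symlink and copies the SONAME of the target.
    """
    target = symlinks[dev_name]
    return libs[target]["soname"]


def load(needed, symlinks):
    """Return the file the runtime loader opens for a recorded dependency."""
    return symlinks.get(needed, needed)


# System state before and after replacing the libraries with symlinks.
before = {"libblas.so": "libblas.so.3"}
after = {"libblas.so": "libflexiblas.so.3", "libblas.so.3": "libflexiblas.so.3"}

old_program = link("libblas.so", before)  # built before the switch
new_program = link("libblas.so", after)   # built after the switch

print(load(old_program, after))   # old binary transparently uses FlexiBLAS
print(load(new_program, before))  # new binary still wants FlexiBLAS after a rollback
```

In this model the old binary keeps asking for the old name, which the symlink redirects, while the new binary has the FlexiBLAS name baked in, so only a rebuild can point it back at the previous library.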