home.social

#benchmarking — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #benchmarking, aggregated by home.social.

  1. 😎 One of the coolest things about the F5 Academy last week was getting hands on with `warp` (from the MinIO team):

    🔗 github.com/minio/warp

    ✨ Built for S3 benchmarking but also great for testing the impact of infra changes...and pairs well with Prometheus and Grafana for observability.

    🤭 I enjoyed it enough to reproduce the lab stack on `localhost` to continue experimenting tonight.

    #F5Academy #S3 #benchmarking

  2. Our Eurographics short paper “ConJEB: A Large Elastic Contact Jet Engine Bracket Quadratic Program Dataset” is now available online!

    Current QP benchmark datasets don't contain the large, sparse problems that occur in many graphics applications. ConJEB addresses this by creating analogous contact problems for every simulation in the SimJEB dataset.

    diglib.eg.org/handle/10.2312/e

    #Dataset #Simulation #QuadraticPrograms #Benchmarking #SimJEB #ConJEB

  3. Monday madness brings us a new SIGARCH blog and the call for tutorial & workshop proposals for IISWC. Check em out!

    "Beyond Qubits: A Systems View of Hybrid CV-DV Quantum Computing"
    by Yuan Liu, Zihan Chen, Shubdeep Mohapatra, Jim Furches, Zheng (Eddy) Zhang, Huiyang Zhou
    sigarch.org/beyond-qubits-a-sy

    The IISWC 2026 Tutorial & Workshop CFP is officially OPEN!
    📢 If you are working on simulation tools, evaluation methodologies, or emerging domains like AI/ML and Cloud, submit your session proposals.
    Sept 27, 2026 Boulder, Colorado Submit to: [email protected]
    iiswc.org/iiswc2026/cftw.html
    #IISWC2026 #ComputerArchitecture #Benchmarking #WorkloadCharacterization

  8. Ah yes, the riveting world of #C++ hashmaps—where time stands still and somehow still takes forever. 😴🔄 After 3 years, Martin bravely ventures back to prove that the art of #benchmarking is the best way to waste your time while pretending to be productive. ⚙️📉
    martin.ankerl.com/2022/08/27/h #hashmaps #productivity #timewasting #techhumor #HackerNews #ngated

  9. Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study
    We present a cross-architecture evaluation of production LLM inference on AMD Inst...

    #Computer #science #paper #AMD #Radeon #Instinct #MI325X #Benchmarking #LLM

  10. Thinkpad #X230 of 2013 (gift from friend),
    #apple #imac 18.2 of 2017 (350 euros, 64 GB RAM, 27" Retina display),
    #hetzner CX43 #vm (10 euros per month) online server,
    #Supermicro (SM in table on right) X11-WTR SYS-5019P-WTR #Xeon Silver 4110 × 16 (basic parts 500 euros 2nd hand)
    CPU comparisons using #hardinfo2 #benchmarking #homelab

  15. I built labeille to find CPython JIT crashes, but it's a "run real world test suites at scale" platform.

    It also works for:
    — Checking which packages pass their tests on a new CPython version
    — Testing free-threaded (no-GIL) CPython compatibility
    — Measuring coverage.py or memray overhead across hundreds of packages
    — Comparing CPython vs PyPy performance on real code

    The registry of 350+ packages with install/test commands is the core.

    #Python #CPython #PyPI #testing #benchmarking #labeille

  16. I've been working on a new Python tool: labeille. Its main purpose is to look for CPython JIT crashes by running real world test suites.

    github.com/devdanzin/labeille

    But it's grown a feature that might interest more people: benchmarking using PyPI packages.

    How does that work?

    labeille allows you to run test suites in 2 different configurations. Say, with coverage on and off, or memray on and off. Here's an example:

    gist.github.com/devdanzin/6352

    #Python #labeille #fuzzing #JIT #PyPI #benchmarking

  19. Companies Overpay 5-10x for LLMs Without Benchmarking Alternatives
    Companies are wasting billions on expensive large language models (LLMs) without benchmarking them against specific needs, often o...

    #AITrends #AI #cost #efficiency #large #language #models #LLM #benchmarking #overpayment #trap

  20. #throwback What started as a simple DBaaS comparison turned into a deep dive into PostgreSQL benchmarking
    🚀 Dirk Krautschick shares hard-earned lessons on tools, workloads, tuning, and real vs synthetic benchmarks. Avoid common pitfalls and benchmark smarter.

    ▶️ Watch now! youtube.com/watch?v=aB5dNcpBI4

    #PostgreSQL #PGDay #PPDD #Benchmarking #DatabasePerformance

  25. The @Cyberagentur is launching HEGEMON, a research competition unique in Europe for evaluating and adapting foundation models for safety-critical applications. Four teams are developing benchmarks and AI models for complex tasks in geoinformation.
    More: t1p.de/7ct97
    #Cyberagentur #HEGEMON #KI #FoundationModels #Cybersicherheit #Benchmarking

  26. The community is looking for the best benchmarking tool for LiteLLM AI gateways and models. Important criteria include TTFT, token output speed, accuracy, and testing under load. Do you know of any "plug and play" tools?

    #AI #Benchmarking #LiteLLM #LLM #Tools #ArtificialIntelligence #ĐánhGiáAI #CôngCụAI #HọcMáy

    reddit.com/r/LocalLLaMA/commen
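
For readers unfamiliar with two of the metrics named in the post above, TTFT (time to first token) and token output speed, here is a minimal Python sketch of how they are commonly derived from timestamps. The function name `ttft_and_rate` and the sample timestamps are hypothetical; nothing here comes from LiteLLM or from any particular benchmarking tool.

```python
from typing import Sequence, Tuple


def ttft_and_rate(request_sent: float, token_times: Sequence[float]) -> Tuple[float, float]:
    """Return (TTFT in seconds, output tokens per second during decoding).

    request_sent: timestamp at which the request was sent (hypothetical input).
    token_times:  arrival timestamp of each streamed output token, in order.
    """
    if not token_times:
        raise ValueError("no tokens received")
    # Time to first token: delay between sending the request and the first token.
    ttft = token_times[0] - request_sent
    # Output speed is measured over the decode phase only (after the first token),
    # so it is not skewed by queueing or prompt-processing time.
    decode_time = token_times[-1] - token_times[0]
    rate = (len(token_times) - 1) / decode_time if decode_time > 0 else float("nan")
    return ttft, rate


# Made-up timestamps: request at t=0.0 s, five tokens arriving 0.25 s apart.
ttft, rate = ttft_and_rate(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(f"TTFT: {ttft:.2f} s, output speed: {rate:.1f} tokens/s")
```

Keeping TTFT separate from the decode rate matters when testing under load, since queueing tends to inflate the former while leaving the latter largely unchanged.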

  27. [Translation] GDPval: Measuring the performance of AI models on real-world tasks

    Our mission is to ensure that artificial general intelligence (AGI) benefits all of humanity. As part of this mission, we strive to report as transparently as possible on how AI models are learning to help people in real life. That is why we are introducing GDPval, a new evaluation designed to track how well our models and those of other developers perform on tasks with economic value and practical significance. We named this metric GDPval because it is inspired by the concept of gross domestic product (GDP) as a key economic indicator, and the task set is based on typical roles in the industries that contribute most to GDP. People often speculate about AI's broad impact on society, but the clearest way to understand its potential is to look at what models can already do in practice. History shows that major technologies, from the internet to smartphones, took more than a decade to go from invention to widespread adoption. Evaluations like GDPval help ground conversations about the future of AI in facts rather than guesswork, and make it possible to track model progress over time.

    habr.com/ru/articles/962702/

    #ai #llm #openai #gpt #genai #benchmark #benchmarking #chatgpt #open_ai

  28. Oh joy, yet another benchmark suite promising to revolutionize #coding with a dazzling cocktail of #languages nobody actually uses 😴. I mean, really, #Oberon and Luon? Are we fast yet, or just fast asleep? 😂🚀
    github.com/rochus-keller/Are-w #benchmarking #revolution #Luon #fastasleep #HackerNews #ngated

  29. [Translation] An unexpected result: AI slows down experienced developers

    We conducted a randomized controlled trial (RCT) to assess how early-2025 AI tools affect the productivity of experienced open-source developers working in their own repositories. Unexpectedly, developers using AI tools took 19% longer to complete tasks than without them; that is, AI slowed them down. We view this result as a snapshot of the current level of AI capabilities in one applied setting. Because these systems continue to evolve rapidly, we plan to use a similar methodology in the future to track how far AI can speed up work in automating R&D [1]. Details are in the full version of the article.

    habr.com/ru/articles/936938/

    #ai #ai_agent #ai_tools #benchmark #benchmarking #development #opensource #developer #ии #ии_помощник