home.social

#snakemake — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #snakemake, aggregated by home.social.

  1. RE: fediscience.org/@snakemake/116

    This little bit of "performance improvements" lowered the number of file system access events considerably! #Snakemake triggers many such events to keep track of metadata. This is important, but may cause some delays due to file system overhead, particularly on parallel and/or network file systems. The feature to outsource parts of this to SQLite was implemented during the #SnakemakeHackathon2026 . I hope I can test the improvements next Monday!

    #HPC #ReproducibleComputing

  6. RE: fediscience.org/@snakemake/116

    Software provenance with #Snakemake: Using the reporter plugin for nanopublications, we can now get slightly improved nanopublications like this one: w3id.org/np/RAmgzfta63xx0wWc_z (press the little blue arrow on the right to see the full details). Automatically captured for this workflow: w3id.org/np/RAjHDlPDghZzc9ZvQ3 - again expressed as a nanopub declaration. 😉

    It now supports capturing the "classic" software declarations for #Conda and Snakemake wrappers.

    There is more work to do. Let's see when and if I get to it.

    #reproducibleComputing #softwareprovenance #nanopub

  7. The #SnakemakeHackathon2026 has ended; we are still preparing our preprint release. But our host has prepared a note on their homepage: go.tum.de/946236 🥳

    #Snakemake #ReproducibleComputing

  8. RE: fediscience.org/@snakemake/116

    This is a big step forward: the SLURM plugin for Snakemake now supports so-called job arrays. These are cluster jobs with roughly equal resource requirements in terms of memory and compute.

    The change in itself was big: the purpose of a workflow system is to make use of the vast resources of an HPC cluster. Hence, jobs are submitted to run concurrently. For a job array, however, we have to "wait" for all eligible jobs to be ready, and only then submit.

    To preserve the concurrent execution of other jobs that are ready to run, a thread pool has been introduced. In itself, I do not see job arrays as such a big feature: the LSF system profited much more from arrays than the rather lean SLURM implementation does.

    BUT: the new code base will ease further development towards pooling many shared-memory tasks (applications which do not support parallel execution across nodes, or are confined to one machine because they "only" support threading). Until then, there is more work to do.
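The batching logic described above can be sketched in a few lines of Python. This is purely illustrative: the function names, the grouping key, and the submit callbacks are assumptions made for the sketch, not the plugin's actual code.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_for_arrays(jobs):
    """Group jobs by (roughly) equal resource requirements.

    Jobs sharing a key could be submitted together as one job array;
    singletons are submitted individually.
    """
    groups = defaultdict(list)
    for job in jobs:
        groups[(job["mem_mb"], job["cpus"])].append(job)
    return dict(groups)


def submit_all(jobs, submit_array, submit_single):
    """Submit array candidates together while keeping other submissions concurrent."""
    groups = group_for_arrays(jobs)
    # A thread pool keeps individual submissions concurrent while
    # array candidates are collected and submitted as one unit.
    with ThreadPoolExecutor() as pool:
        futures = []
        for members in groups.values():
            if len(members) > 1:
                futures.append(pool.submit(submit_array, members))
            else:
                futures.append(pool.submit(submit_single, members[0]))
        return [f.result() for f in futures]
```

The grouping key here is deliberately naive; a real executor would also have to respect per-rule settings, partitions, and time limits.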

    #HPC #SLURM #Snakemake #SnakemakeHackathon2026 #ReproducibleComputing #OpenScience

  9. RE: fediscience.org/@snakemake/116

    What a week at the #SnakemakeHackathon2026 !

    What a wonderful week with wonderful people!

    We were pretty productive, and this #Snakemake release is just the peak of it. The list of features, bug fixes, performance improvements and additional documentation is so long that our little announcement robot cannot display it all. Even here on FediScience with its 1500-character limit!

    #ReproducibleComputing #OpenScience

  10. What do you see here? This is an example knowledge graph describing a #Snakemake analysis workflow. You see the workflow description, a linked data set and a linked report.

    All this work was done to boost support for #HPC users conducting their workflows on HPC systems (you can run Snakemake on other platforms, too).

    My to-do list:
    - an assertion template for workflows: ✅
    - another for reports: ✅ (simple datasets are already in the #nanopub verse)
    - a plugin to gather software metadata and publish as a nanopub ❌ (half done: #SnakemakeHackathon2026 )

    Kudos to @nanopub / @tkuhn and @johanneskoester - without them this pursuit would (have been) futile! And my feeling is that @fbartusch will play an important role in any further development ...

    #OpenScience #ReproducibleComputing

  11. Currently looking at tools to replace #snakemake for my usage (command runner + workflow manager).

    well, i can't say that i'm really surprised, but this snakemake dependency is **heavy**...

    and after trying the other options listed here (x-axis) a bit, i still think #gnumake fits my usage best given its size, i just need to survive the syntax... 😬

  16. I want to reach out: I have a pending release for the SLURM executor (github.com/snakemake/snakemake ). It implements better error feedback (in case of hardware failures and otherwise). It needs some thorough checking, and I cannot provoke many hardware failures myself. Is anyone working on an older cluster? 😉

    Feedback (as issues) would be appreciated. It would also be nice if you told me here that it is working.

    #Snakemake #HPC #SLURM #ReproducibleComputing

  17. I finished the basic #tutorial for #snakemake and it really fits my vibe right now, hehe
    Also learned a bit more about #conda along the way. I'll take that

    snakemake.readthedocs.io/en/st

  18. Anyone here managing their experiments/workflows with gnu #make ? Any tips ?

    I was a #snakemake user, but I switched to #makefile recently because of the increasing complexity/bloat of Snakemake, and I don't need the majority of its features... (plus, colleagues were not using/familiar with Snakemake)

    The make language is certainly less user-friendly than Snakemake's, but I'm still able to do what I need/want (just with more boilerplate).
    I had to write small Makefile functions to keep some of my sanity...
    (BTW: `.RECIPEPREFIX` lets you redefine the recipe prefix instead of the annoying tab! [1])
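For readers who have not seen the trick from [1]: a minimal Makefile sketch. The `.RECIPEPREFIX` variable is standard GNU Make (since 3.82); the rule and file names are a made-up example.

```makefile
# Redefine the recipe prefix from the default tab to '>'
.RECIPEPREFIX = >

results/out.txt: data/in.txt
> mkdir -p results
> sort $< > $@
```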

    From what I understood, GNU Make can be extended with #guile [2]; maybe that could help? (but it adds another dependency, though...)

    TL;DR: I just want a simple/easy/lightweight/expressive workflow manager... 😔

    [1] gnu.org/software/make/manual/h
    [2] gnu.org/software/make/manual/h

  19. The last blog post I wrote was about Life Science Support on HPC clusters. Honestly? It was more of a rant. Not a good blog post.

    So, someone suggested I delete it, which I did. It took me a long time to recover. Now I have re-written this blog post. I think it is better; I weighed every phrase. It still has some rant flavour.

    Here it is: blogs.fediscience.org/rupture-

    The upcoming articles will have more of an example character. But I still will not be able to update on a regular basis.

    #Bioinformatics #HPC #ReproducibleComputing #Snakemake #Nextflow

  20. @Dutch_Reproducibility_Network

    In fact, I am a #Snakemake co-maintainer and teacher. I was not aware of WorkflowHub - and that was an omission on my part. We actually support and favour this kind of registration: snakemake.readthedocs.io/en/st

    In my original post, I also neglected to mention the integration of WorkflowHub with #RO-Crate and, in turn, the integration of RO-Crates with nanopubs. I am actively working on better support for #nanopub and RO-Crates with @fbartusch. The question of how I teach that stands: the #HPC world (at least my bubble) is not really supportive of #ReproducibleComputing . All #OpenScience shenanigans are frowned upon. And PIs in my vicinity are still on this level: phdcomics.com/comics/archive.p - so, how do we educate the educators?

  21. RE: fediscience.org/@snakemake/115

    Now, this is huge!

    Thanks to a contribution from Cade Mirchandani (Santa Cruz, CA), whom I met at this year's #SnakemakeHackathon , users can now supply a partition profile. So, instead of wrangling #SLURM partition information into a workflow profile (indicated with --workflow-profile), we can now have a global file containing this information.

    I added a time conversion function, so that the SLURM time format is obeyed, too.
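A hedged sketch of such a conversion (the function name and the minutes-based input are assumptions for illustration; Snakemake's standard `runtime` resource is given in minutes, while SLURM's `--time` accepts, among other forms, `D-HH:MM:SS`):

```python
def minutes_to_slurm_time(minutes: int) -> str:
    """Convert a runtime in minutes to SLURM's D-HH:MM:SS time format."""
    days, rest = divmod(minutes, 24 * 60)
    hours, mins = divmod(rest, 60)
    return f"{days}-{hours:02d}:{mins:02d}:00"
```

For example, `minutes_to_slurm_time(90)` returns `"0-01:30:00"`.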

    There are several other development needs before we continue in this direction (e.g. parsing SLURM partition information directly). But the task of summing this up for non-users, e.g. administrators, is due too.

    In any case, I think this merits a new major version.

    #Snakemake #HPC #ReproducibleComputing

  22. This took a while. After the new version of the Snakemake paper (a rolling paper on F1000) came out, the DOI now is "working" 🥳 :

    doi.org/10.12688/f1000research

    From my point of view, it particularly describes working with various #HPC batch systems. And: development has not ceased. If you want to follow our announcement bot for updates: @snakemake

    #Snakemake #ReproducibleComputing #DataAnalysis #OpenScience #WorkflowManagement

  23. Just submitted a talk for FOSDEM (I was invited). They asked me to attach an icon image for the talk, so I drew one. The compute racks are difficult to identify as such, but this is as far as my aquarelle skills go.

    #Snakemake #ReproducibleComputing #HPC

  24. This spring, we had a wonderful time at CERN shaping the future of the Snakemake Workflow Management System during the Snakemake Hackathon. Next spring we will meet in Munich!

    If you want to take part in the Snakemake development, you can still register here: indico.cern.ch/e/snakemake-mee

    #Snakemake #HPC #Bioinformatics #physics #dataanalysis #ReproducibleComputing #OpenScience

  25. Where will I be in early March 2026?

    In Stuttgart! At the deRSE conference. I intend to submit a couple of work items dealing with my favourite workflow management system. And the call for contributions is open: mastodon.social/@de_rse/115270

    To give you an idea of what I have in mind, a few hashtags:

    #Snakemake #nanopub #fairdatamanagement #Fair #ReproducibleComputing

  26. Remember that I have been posting about the #SnakemakeHackathon2025 ?

    I never really finished that series. But now we have two late contributions, by Ward Deboutte and @johanneskoester : one describing the polishing of the multiple-extension handling of #Snakemake for named inputs (zenodo.org/records/17121446) and one stabilizing the JSON validator (zenodo.org/records/17121551).

    Cool!

    #ReproducibleComputing #OpenScience

  27. Interested in HPC compliant data analysis workflows?

    I am offering an NHR (German association for HPC resources) course on building and using Snakemake workflows on HPC clusters in Mainz, Germany! Two days: 9 & 10 December 2025, on-site.

    To find out more and perhaps enrol, visit the course page: indico.zdv.uni-mainz.de/event/

    #Snakemake #HPC #ReproducibleResearch #ReproducibleComputing

  28. The #isc25 is over and I have half-recovered from the weekend, too. Time to continue my thread summing up the #SnakemakeHackathon2025 !

    To me, an important contribution came from Michael Jahn of the Charpentier Lab: a complete re-design of the workflow catalogue. Have a look: snakemake.github.io/snakemake- - the findability of ready-to-use workflows has greatly improved! Also, the description of how to contribute is now easy to find.

    A detailed description has been published in the #researchequals collection researchequals.com/collections under doi.org/10.5281/zenodo.1557464

    #Snakemake #ReproducibleComputing #ReproducibleResearch #OpenScience

  29. Returning from the #isc25 , I will continue this thread with something applicable everywhere, not just on #HPC clusters:

    Workflow runs can crash, for a number of possible reasons. Snakemake offers a `--rerun-incomplete` flag (or `--ri` for short) which lets a user resume a workflow.

    This contribution from Filipe G. Vieira describes a small fix to stabilize the feature: not only are incomplete files removed after a crash, it is now ensured that their metadata are deleted too, before resuming: zenodo.org/records/15490098

    #Snakemake #SnakemakeHackathon2025 #ReproducibleComputing #OpenScience

  30. Today tooting from the #ISC25 - the International Supercomputing Conference. What better opportunity to brag about something I've done to facilitate using GPUs with Snakemake?

    Here is my contribution, simpler job configuration for GPU jobs:

    doi.org/10.5281/zenodo.1555179

    Not alone, though: without the valuable input of @dryak I would have overlooked something crucial.

    And when we talk about reproducible AI, my take is that we ought to consider workflow managers, too: something which records what you have done, with little effort.

    #SnakemakeHackathon2025 #Snakemake #ReproducibleComputing #OpenScience

  31. This morning, I am travelling to the #isc25 and hit a minor bug on #researchequals. Hence, no updates in the collection.

    But still a few to describe without adding the latest contributions:

    For instance, this one (zenodo.org/records/15490064) by Filipe G. Vieira: a helper function to extract checksums from files to compare with checksums Snakemake was already able to calculate. Really handy!

    #Snakemake #ReproducibleComputing #OpenScience

  32. Does `snakemake --config config=in.yml -n --export-cwl out.cwl` work? I don't get any output (Snakemake 9.1.3).

    #snakemake #cwl #workflow

  33. Time for a re-#introduction !

    I'm a #scicomm enthusiast and board member of #Fediscience. My background is in #Biophysics; I did a Postdoc in #GeneticEpidemiology, took an industry detour, and have now been working in #HPC for some years.

    Interested in #HPC, #bioinformatics, #OpenScience, #workflows (#snakemake), #RDM, #scientificsoftware and #sciencecommunication

    My blog can be found here: blogs.fediscience.org and my more political me can be found at @[email protected].

  34. I'm excited to announce a new episode of the #DeveloperStories Podcast! 🎉 This week we talk to a prominent leader in the biosciences community, Johannes Koester, creator of #Snakemake, #Bioconda, and several others, solving problems by building tools! 🐍

    Here are a few ways to listen!

    👉 Spotify: open.spotify.com/episode/2cTVZ
    👉 Apple Podcasts: podcasts.apple.com/us/podcast/
    👉 Home Page (with show notes)! rseng.github.io/devstories/202

  35. Time for an #introduction ! It's my 3rd or 4th day. Thanks for having me here. 😀

    My background: PhD in #Biophysics, PostDoc in #GeneticEpidemiology, industry detour, now working in #HPC for ~8y.

    Interested in #HPC, #bioinformatics, #OpenScience, #workflows (#snakemake), #RDM, #scientificsoftware and #sciencecommunication .

    Former science blogger (German), now looking for a new platform, since the current one (scienceblogs.de) is closing down.