home.social

#preprocessing — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #preprocessing, aggregated by home.social.

  1. Pipeline release! nf-core/sarek v3.8.1 - 3.8.1 - Laitaure!
    Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
    Please see the changelog: github.com/nf-core/sarek/relea

    #annotation #cancer #gatk4 #genomics #germline #preprocessing #somatic #targetpanels #variantcalling #wholeexomesequencing #wholegenomesequencing #nfcore #openscience #nextflow #bioinformatics

  2. Pipeline release! nf-core/sarek v3.8.0 - 3.8.0 - Sitojaure!
    Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
    Please see the changelog: github.com/nf-core/sarek/relea

    #annotation #cancer #gatk4 #genomics #germline #preprocessing #somatic #targetpanels #variantcalling #wholeexomesequencing #wholegenomesequencing #nfcore #openscience #nextflow #bioinformatics

  3. A compressor for data, or how I wrote my first custom transformer

    This article will be useful to data science specialists and to anyone who has ever run into outliers in their data or OOD (out-of-distribution) samples and is looking for ways to solve the problems they cause.

    habr.com/ru/articles/988736/

    #выбросы #анализ_данных #data_science #preprocessing #compression #outliner #custom_transformer #transformer #sklearn

  7. Pipeline release! nf-core/sarek v3.7.1 - 3.7.1 - Buollámtjåhkka!
    Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
    Please see the changelog: github.com/nf-core/sarek/relea

    #annotation #cancer #gatk4 #genomics #germline #preprocessing #somatic #targetpanels #variantcalling #wholeexomesequencing #wholegenomesequencing #nfcore #openscience #nextflow #bioinformatics

  8. Pipeline release! nf-core/sarek v3.7.0 - 3.7.0 - Saltoluokta!
    Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
    Please see the changelog: github.com/nf-core/sarek/relea

    #annotation #cancer #gatk4 #genomics #germline #preprocessing #somatic #targetpanels #variantcalling #wholeexomesequencing #wholegenomesequencing #nfcore #openscience #nextflow #bioinformatics

  9. #GESISMethodsHub #ResearchTools #OpenScience #SocialMedia #PreProcessing

    Your data needs some pre-processing before analysis? Methods Hub has you covered.

    Extract entities such as hashtags or emojis from social media posts to serve as indicators in further analysis:
    ➡️ doi.org/10.71627/extract_urls_

    Inspect your data for implicit biases before using it as training material in machine learning:
    ➡️ doi.org/10.71627/weat

    Or explore more at:
    ➡️ methodshub.gesis.org/

  10. This morning I finished another post on my tiny blog, this time about how I set up automatic image pre-processing in @eleventy to maintain a perfect Lighthouse score while allowing myself to be lazy about images: martingunnarsson.com/posts/ele

    #eleventy #11ty #web #webdev #webdevelopment #image #images #processing #preprocessing #performance #webperformance #lighthouse

  11. A new benchmark for data 📚
    Rather than test if a model is good
    This tests whether you can filter data
    360 languages

    They also share metrics for data redundancy if you want just those
    arxiv.org/abs/2311.06440
    github.com/toizzy/
    #data #preprocessing #dedup #enough2skim #NLP #NLProc

  12. Extremely noticeable #KNOWLEDGE GAPS of ChatGPT in the #history of #Holocaust-related art claims make it clearer than ever the urgency of understanding the data #pipelines that feed the #AI language model.

    What #filters are used in #OpenAI's data #preprocessing to EXCLUDE information? Who decides which information to exclude? What triggers exclusion?

    #ChatGPT fills gaps with plausible-sounding disinformation, which is a disaster

    #EHRI #YadVashem #memory #looted #histodon #tech #FAIR

  13. I decided to ditch the CodeKit app, as I made the mistake of looking at the developer's tweets and he... is not the kind of person I want to support.

    I'm trying out Prepros, which seems to work nicely. In fact, it went without a hitch, whereas it took me a while to get CodeKit working with SSL.

    #development #webdev #webdevelopment #sass #preprocessing

    prepros.io

  14. One Hot Encoding categorical data is an important part of pre-processing for machine and deep learning models.

    ...but are you using the best method to achieve it?

    towardsdatascience.com/the-bes


  15. Can the Continuous Wavelet Transform (CWT) improve the predictions of your deep / machine learning models?

    It can reduce the chance of over-fitting to noise or other anomalies in your raw data, resulting in simpler, lightweight models.

    A powerful preprocessing technique.

    medium.com/mlearning-ai/the-po
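
For readers unfamiliar with the one-hot encoding mentioned in post 14, here is a minimal sketch in plain Python. It is not the method from the linked article (that URL is truncated in this feed); the function name `one_hot_encode` and the sample data are assumptions made purely for illustration. In practice, library routines such as pandas `get_dummies` or scikit-learn's `OneHotEncoder` are the usual choice.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 list containing exactly one 1.

    Hypothetical helper for illustration only. Categories are sorted so
    the column order is deterministic.
    """
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # mark the column that matches this value
        rows.append(row)
    return categories, rows


# Made-up sample data, not taken from any post above.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```

Each row has one column per distinct category, so the encoding grows wide when a feature has many categories. That is one reason the choice of method can matter.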