#featureselection — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #featureselection, aggregated by home.social.

Chloé Azencott @[email protected] · 2026-05-21 · 07:10 UTC

Our paper (with Julie Cartier, Johanna Lagoas, Youmna Ayadi, Adeline Fermanian and @flomass) on the use of statistical knockoffs for the differential analysis of transcriptomics data just came out, very appropriately as it nicely illustrates my point:
https://academic.oup.com/bib/article/27/3/bbag148/8687371
Using simulated outcomes on real transcriptomics data, we've shown that KOs (and in particular, the KOPI approach) do retrieve important variables with better power than classical approaches (Wilcoxon, Lasso), while controlling FDR.
However, all methods perform poorly when the relationship between gene expressions and outcome is nonlinear.
On real outcomes, the method is overly conservative (having no discoveries is a surefire way of controlling your number of false discoveries), and we had to raise the false discovery rate threshold to 50% to select any gene at all.
#machineLearning #genomics #featureSelection #biomarkerDiscovery #transcriptomics

#machinelearning #genomics #featureselection #biomarkerdiscovery #transcriptomics
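
The conservativeness described in this post can be illustrated with the standard knockoff+ selection rule. The sketch below is a minimal illustration under that assumption. It is not the KOPI procedure from the paper, and the function name `knockoff_plus_select` and the example statistics `w` are made up for the example.

```python
def knockoff_plus_select(w, q):
    """Return the indices selected by the knockoff+ rule at target FDR q.

    w: knockoff statistics W_j (a large positive value is evidence for
       feature j; null features are equally likely to be positive or negative).
    """
    # Candidate thresholds are the nonzero magnitudes of the statistics.
    candidates = sorted({abs(v) for v in w if v != 0})
    for t in candidates:
        negatives = sum(1 for v in w if v <= -t)
        positives = sum(1 for v in w if v >= t)
        # Estimated false discovery proportion; the "+1" makes it conservative.
        fdp_hat = (1 + negatives) / max(1, positives)
        if fdp_hat <= q:
            return [j for j, v in enumerate(w) if v >= t]
    return []  # no threshold meets the target: zero discoveries


# Hypothetical statistics: three strong features, two weak ones.
w = [5.0, 4.0, 3.0, -0.5, 0.2]
print(knockoff_plus_select(w, q=0.1))  # → []
print(knockoff_plus_select(w, q=0.5))  # → [0, 1, 2, 4]
```

Because the estimate is never smaller than 1 / (number of positives), this rule needs at least 1/q features above the threshold before it can select anything, which is one way a low target FDR ends in no discoveries.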
Tiago F. R. Ribeiro @[email protected] · 2025-07-07 · 15:51 UTC

Feature Selection: A Simplified Guide
“Feature selection is a key step in ML: it reduces dimensionality and complexity by keeping only the most informative variables.
It improves accuracy, speeds up training, and enhances interpretability, in both supervised and unsupervised tasks.”
📎https://tiagoribeiro.vercel.app/blog_posts/4_feature_selection.html
#MachineLearning #FeatureSelection #AI

#machinelearning #featureselection #ai
Eric Maugendre about data @[email protected] · 2024-12-17 · 19:42 UTC

Enteric Fermentation in 2022
Livestock digestion emits too much methane:
* Too many bovines in India, Pakistan, Brazil, United States, China;
* Too many sheep and pigs in China.
(The bubble sizes depend on the amount of methane emitted in 2022.)
#GreenhouseForcing #methane #emissions #climateChange #climateBreakdown #climateCollapse #dataViz #bubbleChart #dataMining #plotly #featureEngineering #featureSelection #dataDon

#greenhouseforcing #methane #emissions #climatechange #climatebreakdown #climatecollapse
Eric Maugendre about data @[email protected] · 2024-12-11 · 05:13 UTC

Feature selection in Python, with a ready-to-use script: https://johfischer.com/2021/08/06/correlation-based-feature-selection-in-python-from-scratch/
#interpretability #featureSelection #python #probability #probabilities #bigData #classification #linearRegression #regression #Schusterbauer #inference #AIDev

#interpretability #featureselection #python #probability #probabilities #bigdata
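
For readers who want the gist of correlation-based feature selection before opening the link, here is a minimal sketch. It assumes Hall's merit heuristic with absolute Pearson correlations and a greedy forward search. It is not the linked script, and the names `cfs_merit` and `cfs_forward` are hypothetical.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """Hall's CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    # Mean absolute feature-target correlation.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean absolute pairwise feature-feature correlation (diagonal excluded).
    corr = np.abs(np.corrcoef(X[:, subset], rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(X, y):
    """Greedily add the feature that most improves the merit; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected
```

The merit rewards features that correlate with the target and penalizes features that correlate with each other, so a near-copy of an already selected feature tends to lower the score instead of raising it.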
IB Teguh TM @[email protected] · 2024-08-26 · 04:01 UTC

Discover effective feature selection strategies in machine learning. Learn how filter, wrapper, and embedded methods improve model accuracy and efficiency. #MachineLearning #FeatureSelection
https://teguhteja.id/feature-selection-strategies-machine-learning/

#machinelearning #featureselection
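
The three families named in this post can be compared side by side. The sketch below assumes scikit-learn is installed and uses synthetic data. The choice of an ANOVA F-test, recursive feature elimination, and an L1-penalized linear SVM as representatives is an assumption made for illustration, not something taken from the linked article.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Synthetic data: 20 features, of which 4 carry signal.
X, y = make_classification(n_samples=300, n_features=20, n_informative=4,
                           n_redundant=0, random_state=0)

# Filter: score each feature on its own, without fitting a predictive model.
filter_sel = SelectKBest(score_func=f_classif, k=4).fit(X, y)

# Wrapper: repeatedly refit a model and drop the weakest feature.
wrapper_sel = RFE(LogisticRegression(max_iter=1000), n_features_to_select=4).fit(X, y)

# Embedded: the L1 penalty zeroes out coefficients during training.
embedded_sel = SelectFromModel(LinearSVC(penalty="l1", dual=False, C=0.1)).fit(X, y)

for name, sel in [("filter", filter_sel), ("wrapper", wrapper_sel),
                  ("embedded", embedded_sel)]:
    print(name, sel.get_support(indices=True))
```

Filters are the cheapest because they never train a model, wrappers cost one model fit per elimination step, and embedded methods get the selection as a by-product of a single fit.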
katch wreck @[email protected] · 2023-04-11 · 04:42 UTC

`It is considered a non-linear approach as the mapping cannot be represented as a linear combination of the original variables as possible in techniques such as principal component analysis, which also makes it more difficult to use for classification applications`
https://en.wikipedia.org/wiki/Sammon_mapping
#machineLearning #clustering #classification #featureExtraction #featureEngineering #featureSelection #featureRanking #dimensionalityReduction #nonlinear

#machinelearning #clustering #classification #featureextraction #featureengineering #featureselection
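
To make the quoted point concrete: a Sammon mapping is found by minimizing a stress function over the embedded coordinates themselves, so there is no matrix that maps a new point as there is in PCA. The sketch below only evaluates that stress for a given embedding. It assumes Euclidean distances and no duplicate points, and the function names are hypothetical.

```python
import numpy as np


def pairwise_distances(Z):
    """Euclidean distance matrix between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(X, Y):
    """Sammon's stress between the original points X and their embedding Y."""
    iu = np.triu_indices(len(X), k=1)    # each pair i < j counted once
    d_star = pairwise_distances(X)[iu]   # distances in the original space
    d = pairwise_distances(Y)[iu]        # distances in the embedding
    # Assumes no duplicate rows in X, so every d_star is positive.
    return ((d_star - d) ** 2 / d_star).sum() / d_star.sum()
```

Dividing each squared error by the original distance gives small distances more weight, which is what makes the method favor local structure.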
AlexHenderson @alexhenderson · 2022-12-16 · 16:12 UTC

iCite: "ITERATIVE RE‐WEIGHTED COVARIATES SELECTION FOR ROBUST FEATURE SELECTION MODELLING IN THE PRESENCE OF OUTLIERS (IRCOVSEL)"
Journal of Chemometrics 2022
https://doi.org/10.1002/cem.3458.
#openaccess #chemometrics #featureselection

#chemometrics #featureselection #openaccess
Gregor Kasieczka @[email protected] · 2022-12-02 · 09:08 UTC

📢📢📢 New #Paper: '#FeatureSelection with Distance Correlation' (https://arxiv.org/abs/2212.00046) - a short #PaperSummary thread
We investigate how to automatically find a small number of features that - when put into a simple #NeuralNetwork - yield good performance (e.g. for classification)
Two possible uses:
- Explain the behavior of a #BlackBox classifier
- Build a light-weight classifier from scratch

#paper #featureselection #papersummary #neuralnetwork #blackbox
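
For context on the measure named in the title, here is a minimal sketch of the plain sample distance correlation between two one-dimensional variables. It only shows the dependence measure, not the paper's selection procedure, and the function names are hypothetical.

```python
import numpy as np


def _double_centered(v):
    """Pairwise distance matrix of a 1-D sample, double-centered."""
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between 1-D arrays x and y (0 to 1)."""
    a = _double_centered(np.asarray(x, dtype=float))
    b = _double_centered(np.asarray(y, dtype=float))
    dcov2 = (a * b).mean()                             # squared distance covariance
    dvar2 = np.sqrt((a * a).mean() * (b * b).mean())   # product of distance std devs
    return 0.0 if dvar2 == 0 else float(np.sqrt(max(dcov2, 0.0) / dvar2))


# A purely nonlinear relationship: Pearson correlation is near zero here,
# while the distance correlation is clearly positive.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2
print(np.corrcoef(x, y)[0, 1], distance_correlation(x, y))
```

Unlike Pearson correlation, the population distance correlation is zero only when the two variables are independent, which is why it can rank features whose link to the target is nonlinear.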