#featureselection — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #featureselection, aggregated by home.social.
-
Our paper (with Julie Cartier, Johanna Lagoas, Youmna Ayadi, Adeline Fermanian and @flomass) on the use of statistical knockoffs for the differential analysis of transcriptomics data just came out, very appropriately as it nicely illustrates my point:
https://academic.oup.com/bib/article/27/3/bbag148/8687371
Using simulated outcomes on real transcriptomics data, we've shown that KOs (and in particular, the KOPI approach) do retrieve important variables with better power than classical approaches (Wilcoxon, Lasso), while controlling FDR.
However, all methods perform poorly when the relationship between gene expressions and outcome is nonlinear.
On real outcomes, the method is overly conservative (having no discoveries is a surefire way of controlling your number of false discoveries), and we had to turn the false discovery rate threshold to 50% to select any gene at all.
#machineLearning #genomics #featureSelection #biomarkerDiscovery #transcriptomics
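The remark that having no discoveries trivially controls false discoveries can be illustrated with the standard knockoff+ thresholding rule. The sketch below is not the KOPI procedure from the paper; it assumes the knockoff statistics W_j are already computed, and the values in `w_example` are made up for illustration. The estimated false discovery proportion is never below 1 / (number of selected features), so a target level q needs at least 1/q selections to be reachable.

```python
def knockoff_plus_select(w, q):
    """Return indices selected by the knockoff+ filter at target FDR level q.

    w: list of knockoff statistics W_j (large positive = evidence for feature j).
    """
    # Candidate thresholds are the nonzero magnitudes of the statistics.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        # Estimated false discovery proportion, with the "+1" correction.
        fdp_hat = (1 + negatives) / max(1, positives)
        if fdp_hat <= q:
            return [j for j, x in enumerate(w) if x >= t]
    return []  # no threshold meets the target: zero discoveries


# Hypothetical statistics: four strong signals and some noise around zero.
w_example = [5.0, 4.0, 3.0, 2.5, -0.5, 0.3, -0.2, 0.1]
print(knockoff_plus_select(w_example, q=0.10))  # []
print(knockoff_plus_select(w_example, q=0.25))  # [0, 1, 2, 3]
print(knockoff_plus_select(w_example, q=0.50))  # [0, 1, 2, 3, 5, 7]
```

With only four clear signals, the smallest achievable estimate is 1/4, so a 10% target returns nothing and the threshold has to be loosened before any feature is selected.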
-
Feature Selection: A Simplified Guide
“Feature selection is a key step in ML: it reduces dimensionality and complexity by keeping only the most informative variables.
It improves accuracy, speeds up training, and enhances interpretability, in both supervised and unsupervised tasks.”
📎 https://tiagoribeiro.vercel.app/blog_posts/4_feature_selection.html
#MachineLearning #FeatureSelection #AI
-
Enteric Fermentation in 2022
Livestock digestion emits too much methane:
* Too many bovines in India, Pakistan, Brazil, United States, China;
* Too many sheep and pigs in China.
(The bubble sizes depend on the amount of methane emitted in 2022.)
#GreenhouseForcing #methane #emissions #climateChange #climateBreakdown #climateCollapse #dataViz #bubbleChart #dataMining #plotly #featureEngineering #featureSelection #dataDon
-
Feature Selection in Python; a script ready to use: https://johfischer.com/2021/08/06/correlation-based-feature-selection-in-python-from-scratch/
#interpretability #featureSelection #python #probability #probabilities #bigData #classification #linearRegression #regression #Schusterbauer #inference #AIDev
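For readers who want the idea before following the link, here is a minimal sketch of correlation-based feature selection (CFS) in plain Python. It is not taken from the linked script: it assumes Pearson correlation as the association measure and a greedy forward search, and the names `pearson`, `cfs_merit`, and `cfs_forward` are hypothetical. The merit of a subset of k features is k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-target correlation and r_ff the mean feature-feature correlation, so redundant features lower the score.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cfs_merit(features, target, subset):
    """CFS merit of a non-empty list of column indices."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[j], target)) for j in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(features, target, tol=1e-12):
    """Greedy forward search: add the best feature while the merit improves."""
    selected, best = [], 0.0
    remaining = list(range(len(features)))
    while remaining:
        j = max(remaining, key=lambda c: cfs_merit(features, target, selected + [c]))
        merit = cfs_merit(features, target, selected + [j])
        if merit <= best + tol:
            break  # no meaningful improvement: stop
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected


# Toy data (made up): the target is a + b, and the third column is a rescaled copy of a.
a = [1, -1, 1, -1]
b = [1, 1, -1, -1]
target = [2, 0, 0, -2]
columns = [a, b, [2 * v for v in a]]
print(cfs_forward(columns, target))  # keeps one copy of a plus b, drops the duplicate
```

The duplicate column is as correlated with the target as the original, but adding it raises r_ff and therefore lowers the merit, which is the behaviour a correlation-based filter is meant to have.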
-
Discover effective feature selection strategies in machine learning. Learn how filter, wrapper, and embedded methods improve model accuracy and efficiency. #MachineLearning #FeatureSelection
https://teguhteja.id/feature-selection-strategies-machine-learning/
-
`It is considered a non-linear approach as the mapping cannot be represented as a linear combination of the original variables as possible in techniques such as principal component analysis, which also makes it more difficult to use for classification applications`
https://en.wikipedia.org/wiki/Sammon_mapping
#machineLearning #clustering #classification #featureExtraction #featureEngineering #featureSelection #featureRanking #dimensionalityReduction #nonlinear
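To make the quoted point concrete: Sammon mapping chooses low-dimensional coordinates by minimizing a stress function over pairwise distances, rather than by applying a fixed linear projection. The sketch below only evaluates that stress, E = (1 / Σ d*ᵢⱼ) · Σ (d*ᵢⱼ − dᵢⱼ)² / d*ᵢⱼ over pairs i < j; it does not perform the iterative optimization, and the function name and toy points are assumptions for illustration. It requires Python 3.8+ for `math.dist` and distinct input points.

```python
from math import dist


def sammon_stress(high, low):
    """Sammon stress between original points `high` and their embedding `low`."""
    n = len(high)
    total, scale = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_star = dist(high[i], high[j])  # distance in the original space
            d = dist(low[i], low[j])         # distance in the embedding
            scale += d_star
            total += (d_star - d) ** 2 / d_star
    return total / scale


# A 3-4-5 right triangle in 2-D, squeezed onto a line (hypothetical embedding).
triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
on_a_line = [(0.0,), (3.0,), (-4.0,)]
print(sammon_stress(triangle, triangle))   # 0.0: distances are preserved exactly
print(sammon_stress(triangle, on_a_line))  # positive: one distance is distorted
```

Because the embedding is whatever minimizes this quantity for the given sample, there is no projection matrix to reuse on new points, which is one reason it is harder to plug into a classifier than PCA.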
-
iCite: "ITERATIVE RE‐WEIGHTED COVARIATES SELECTION FOR ROBUST FEATURE SELECTION MODELLING IN THE PRESENCE OF OUTLIERS (IRCOVSEL)"
Journal of Chemometrics 2022
https://doi.org/10.1002/cem.3458
#openaccess #chemometrics #featureselection
-
📢📢📢 New #Paper: '#FeatureSelection with Distance Correlation' (https://arxiv.org/abs/2212.00046) - a short #PaperSummary thread
We investigate how to automatically find a small number of features that - when put into a simple #NeuralNetwork - yield good performance (e.g. for classification)
Two possible uses:
- Explain the behavior of a #BlackBox classifier
- Build a light-weight classifier from scratch
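
As background for the thread, here is a minimal sketch of the sample distance correlation statistic itself, for two one-dimensional variables, in plain Python. It does not reproduce the paper's selection procedure; the function names and toy data are assumptions for illustration. Unlike Pearson correlation, distance correlation can detect dependence that is not linear.

```python
from math import sqrt


def _centered_distances(x):
    """Pairwise |x_i - x_j| matrix, double-centered by row, column and grand means."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row = [sum(r) / n for r in d]  # the matrix is symmetric: row means = column means
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D sequences, in [0, 1]."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for r in a for v in r) / n**2
    dvar_y = sum(v * v for r in b for v in r) / n**2
    denom = sqrt(dvar_x * dvar_y)
    return sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


# Toy data (made up): y_quad depends on x, yet its Pearson correlation with x is zero.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_quad = [v * v for v in x]
y_lin = [2 * v + 1 for v in x]
print(distance_correlation(x, y_lin))   # close to 1 for an exact linear relation
print(distance_correlation(x, y_quad))  # clearly above 0 despite zero Pearson correlation
```

Ranking candidate features by a dependence score of this kind is one simple way to shortlist inputs before training a small network on them.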