#crossvalidation — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #crossvalidation, aggregated by home.social.
-
Favorability Mapping For Hydrothermal Power Resource Assessments Of The Great Basin, USA
--
https://doi.org/10.1016/j.geothermics.2025.103450 <-- shared paper
--
#geothermal #resourceassessment #hydrothermal #favorability #GreatBasin #datadriven #Machinelearning #ML #AI #USA #power #energy #powerplant #renewableenergy #survey #spatialanalysis #GIS #spatial #mapping #montecarlo #model #modeling #crossvalidation #predication #WesternUSA #algorithm #training #geology #vulcanism #favorability #research #prospecting #resources
#USGS
-
Pipeline release! nf-core/drugresponseeval v1.1.0 - Drugresponseeval 1.1.0 - Humongous Zapdos!
Please see the changelog: https://github.com/nf-core/drugresponseeval/releases/tag/1.1.0
#celllines #crossvalidation #deeplearning #drugresponse #drugresponseprediction #drugs #fairprinciples #generalization #hyperparametertuning #machinelearning #randomizationtests #robustnessassessment #training #nfcore #openscience #nextflow #bioinformatics
-
Cross-validation on time series: how not to shuffle time
Hello, Habr! Today we look at what most often breaks even impressive-looking models when working with time series: incorrect cross-validation. We will go over why KFold does not work here, how easy it is to leak the future, which splitters are actually honest with respect to time, and how to validate features built from lags and aggregates.
https://habr.com/ru/companies/otus/articles/921604/
#временные_ряды #time_series #машинное_обучение #прогнозирование #кроссвалидация #crossvalidation
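A minimal sketch of the idea in the post above, not taken from the linked article: an expanding-window splitter in which every training index precedes every test index, with an optional gap so that lag features cannot reach into the test period. The function name `expanding_window_splits` and its parameters are assumptions made for illustration; scikit-learn's `TimeSeriesSplit` offers a similar scheme.

```python
def expanding_window_splits(n_samples, n_splits, gap=0):
    """Yield (train_idx, test_idx) pairs in which training always precedes testing.

    Hypothetical helper for illustration: the test blocks are equally sized and
    sit at the end of the series; `gap` samples are dropped between train and test.
    """
    if n_splits < 1 or gap < 0:
        raise ValueError("n_splits must be >= 1 and gap must be >= 0")
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(n_splits):
        test_start = n_samples - (n_splits - k) * test_size
        train_end = test_start - gap  # exclusive end of the training window
        if train_end <= 0:
            raise ValueError("gap leaves no training data for the first split")
        train_idx = list(range(train_end))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Ten time-ordered samples, three splits, one sample left out between train and test.
splits = list(expanding_window_splits(10, 3, gap=1))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Unlike a shuffled KFold, no split here lets the model train on observations that come after the ones it is tested on.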
-
Pipeline release! nf-core/drugresponseeval v1.0.0 - 1.0.0!
Please see the changelog: https://github.com/nf-core/drugresponseeval/releases/tag/1.0.0
#celllines #crossvalidation #deeplearning #drugresponse #drugresponseprediction #drugs #fairprinciples #generalization #hyperparametertuning #machinelearning #randomizationtests #robustnessassessment #training #nfcore #openscience #nextflow #bioinformatics
-
from the standpoint of model selection, parsimony often boils down to dimensionality reduction
#modelSelection #parsimony #OccamsRazor #dimensionalityReduction #degreesOfFreedom #complexity #informationTheory #biasVarianceTradeoff #overfitting #underfitting #optimization #parameterTuning #crossValidation #inverseProblems #inference #statisticalLearning #machineLearning #ML #dataScience #modeling #decisionTheory #fitting #regression #classification #residualError #costFunction #performanceLoss
-
Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
#MachineLearning #ModelEvaluation #CrossValidation #HyperparameterOptimization
-
⬆️
6) thankfully, Wager (2020) https://doi.org/10.1080/01621459.2020.1727235 shows that cross-validation is asymptotically consistent for model selection, so while what we're doing gives us poor estimates of generalization error and bad error bars, at least it's valid for model selection.
-
⬆️
5) Bates et al. (2023) https://doi.org/10.1080/01621459.2023.2197686 propose a nested cross-validation estimator of generalization error that's unbiased and has an unbiased mean squared error estimator. It's computationally quite intensive. I played a bit with it, and in high-dimensional setups (large p, small n) I got error bars that did have good coverage of the generalization error, but also covered most of the [0, 1] interval, which is less helpful.
⬇️
-
⬆️
4) in any case, error bars are wrong, because it's impossible to get an unbiased estimator of the variance of the K-fold cross-validation estimator from a single run of cross-validation, as shown by Bengio & Grandvalet (2004) https://dl.acm.org/doi/10.5555/1005332.1044695
⬇️
-
⬆️
3) cross-validation estimators are better estimators of *expected test error* (across all possible training sets) than of *generalization error* of a model.
This has been known for a while and even appears in The Elements of Statistical Learning, so I should have known about this much earlier. Bates et al. (2023) https://doi.org/10.1080/01621459.2023.2197686 show why this is the case for linear models.
⬇️
-
⬆️
2) (not a surprise, but worth remembering): cross-validation error bars can be very large when sample sizes are small (unsurprisingly, due to the \( \frac{1}{\sqrt{n}} \) factor).
This is discussed for example regarding microarray studies in Braga-Neto & Dougherty (2004) https://doi.org/10.1093/bioinformatics/btg419 and @GaelVaroquaux (2018) regarding brain image analysis https://doi.org/10.1016/j.neuroimage.2017.06.061
⬇️
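To make the \( \frac{1}{\sqrt{n}} \) factor in the post above concrete, here is a rough sketch that is not from the thread: a normal-approximation interval for an accuracy estimate, treating the n test predictions as independent Bernoulli trials. That independence is an assumption for illustration only, and the function name `accuracy_interval` is hypothetical. Real cross-validation folds share training data, so actual error bars behave worse than this.

```python
import math


def accuracy_interval(accuracy, n, z=1.96):
    """Normal-approximation interval for an accuracy measured on n samples.

    Assumes independent test predictions, which cross-validation does not
    strictly provide; this only shows how the width scales with n.
    """
    standard_error = math.sqrt(accuracy * (1.0 - accuracy) / n)
    low = max(0.0, accuracy - z * standard_error)
    high = min(1.0, accuracy + z * standard_error)
    return low, high


# Same observed accuracy, increasingly large sample sizes.
for n in (50, 200, 5000):
    low, high = accuracy_interval(0.8, n)
    print(f"n={n}: [{low:.3f}, {high:.3f}] width={high - low:.3f}")
```

Quadrupling the sample size only halves the width, which is why small studies end up with intervals too wide to separate competing models.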
-
⬆️
Reading the discussion of the paper by other statisticians is enlightening as to how the tone of scientific discourse has mercifully changed in 50 years.
Also, "The term 'assessment' is preferred to 'validation' which has a ring of excessive confidence about it."
⬇️
-
We were discussing cross-validation estimates of model performance recently with colleagues, and I dug a bit in the literature to better understand where we're at.
This is not my topic of expertise, but here are a few tidbits I'd like to share.
1) cross-validation has been the topic of much discussion for many decades. Stone (1974) https://www.jstor.org/stable/2984809 gives a good overview of what precedes.
⬇️
-
New on the blog: I explore the connection between Bayes factors and cross-validation and explain why I think it does not justify the use of Bayes factors in most cases. https://www.martinmodrak.cz/2024/03/23/cross-validation-a-fourth-way-to-compute-a-bayes-factor/
-
Enjoying the discussion of cross-validation methods for use of sensor data for air quality applications at the EPA air sensor QA workshop. It’s easy to overestimate how well you are doing with sensor data corrections or fusion applications unless a rigorous independent test approach is used #airquality #airpollution #crossvalidation #lowcostsensors @dwestervelt https://www.epa.gov/amtic/2023-air-sensors-quality-assurance-workshop
-
3/
#Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
https://archive.org/details/meaningofitallth0000feyn_d8d3/page/80/mode/2up?q=%22it+doesn%E2%80%99t+make+any+sense+to+calculate+after+the+event%22&view=theater
Special trending case: #CrossValidation (where data for selecting/tuning a model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other #MachineLearning math tricks where many dimensions/parameters are tuned by using much less data
Without a deep understanding, black-box tools lead astray
-
New paper "Cross-validatory model selection for Bayesian autoregressions with exogenous regressors" with Alex Cooper, @dan_p_simpson, Lauren Kennedy, and Catherine Forbes
One FAQ is "Can you use LOO or cross-validation in general for time series?" The short answer is "Yes", and I've had a longer answer in CV-FAQ https://avehtari.github.io/modelselection/CV-FAQ.html#9_Can_cross-validation_be_used_for_time_series
Now we have a better answer on what kind of cross-validation is good with time series!
-
We also looked at the influence of the average domain used for the input properties and we conducted a #CrossValidation to assess how the parameterisations perform on time steps and ice shelves they have not seen during #tuning.
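Testing on ice shelves not seen during tuning is commonly done with a leave-one-group-out scheme. The sketch below illustrates that general idea and is not the authors' actual procedure; the function name `leave_one_group_out` and the shelf labels are hypothetical. scikit-learn's `LeaveOneGroupOut` provides the same kind of split.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), holding out one group at a time."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical labels: each sample belongs to one ice shelf.
shelves = ["A", "A", "B", "B", "B", "C"]
folds = list(leave_one_group_out(shelves))
for held_out, train_idx, test_idx in folds:
    print(f"held out {held_out}: train={train_idx} test={test_idx}")
```

The same function can be called with time-step labels instead of shelf labels to hold out unseen time steps.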
-
I have two binary classifiers A and B, trained and tested through #crossvalidation on the same strongly imbalanced training set, in which positive-class samples make up 7% of the total.
The ROC-AUC of A and B is 0.950 and 0.949 respectively, while the area under the precision-recall curve is 0.716 and 0.717 respectively. Neither of these differences is statistically significant.
#datascience #machinelearning #artificialintelligence #statistics #classification
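For readers unfamiliar with the two metrics in the post above, here is a small pure-Python sketch of how each can be computed. It is an illustration on made-up labels and scores, not the poster's data or code. ROC-AUC is computed as the fraction of positive/negative pairs ranked correctly. The precision-recall curve is summarized by average precision, which is one common estimator; the post does not say which one was used. The helper names are hypothetical, and `average_precision` assumes there are no tied scores.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive (no tied scores assumed)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Made-up imbalanced example: 2 positives among 10 samples.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
print("ROC-AUC:", roc_auc(labels, scores))
print("Average precision:", average_precision(labels, scores))
```

With few positives, a single negative ranked above a positive barely moves ROC-AUC but noticeably lowers average precision, which helps explain how a ROC-AUC near 0.95 can coexist with a precision-recall area near 0.72.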