home.social

#probabilities — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #probabilities, aggregated by home.social.

  1. About metrics for measuring agreement between predictions and observations in regression on continuous data:
    Reasons to avoid R² and use RMSE instead: feat.engineering/03-Review_of_

    From Max Kuhn @topepo, Kjell Johnson (2019), "Feature Engineering and Selection: A Practical Approach for Predictive Models"

    #prediction #dataDev #modelEvaluation #regression #modelling #linearRegression #modeling #probability #probabilities #statistics #stats #gotcha
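
    A minimal sketch of the gotcha (made-up data; assumes scikit-learn is available): a model can post a high R² on a widely spread target while its RMSE - the typical error in the units of the target - stays large.

      import numpy as np
      from sklearn.metrics import mean_squared_error, r2_score

      rng = np.random.default_rng(0)
      y_true = rng.uniform(0, 1000, size=200)        # widely spread target
      y_pred = y_true + rng.normal(0, 50, size=200)  # typical error ~50 units

      # R² looks excellent because the target's variance dwarfs the error...
      print(f"R^2  = {r2_score(y_true, y_pred):.3f}")
      # ...but predictions are still off by ~50 units on average.
      print(f"RMSE = {mean_squared_error(y_true, y_pred) ** 0.5:.1f}")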

  2. Prediction markets coming up fast on the outside!

    Traders on Kalshi and Polymarket have executed more than $800M of contracts tied to the Super Bowl - so far.

    Compare that to the $1.8B Americans are expected to wager on the game through regulated sportsbooks. The $$ flowing to prediction markets is sucking up $$ that would likely otherwise have ended up at traditional gambling venues - ouch!

    The other interesting aspect: multi-person betting syndicates are now emerging as the way to leverage prediction-market betting at scale. bloomberg.com/news/articles/20 #PredictionMarkets #Kalshi #PolyMarket #SuperBowl #Betting #Gambling #SportsBooks #Odds #Wagers #Probabilities #Prediction

  3. We ♥️ #BSPS2025 - thanks to BSPS for a great conference!

    Highlighting just some of the great talks from CPC-CG members before we say goodbye for another year: Hale on #racism & #loneliness in UK Asian communities; Finney on #EVENS; Butterick on #kin number #probabilities; Li on #intergenerational proximity; Lyu on #parental support & #homeownership; Nur on #childlessness & #kinlessness 👏👏

    To find out more about CPC-CG research, read our latest newsletter: sway.cloud.microsoft/urKHaLPBn

  4. Logistic regression may be used for classification.

    To preserve the convexity of the loss function, logistic regression uses a log-loss cost function. This cost reaches its extremes as predictions approach the True and False labels: it vanishes when the prediction matches the label and grows without bound when the prediction is confidently wrong.

    The gradient of the logistic-regression loss turns out to have the same form as the gradient of the least-squares error (see the sketch below).

    More: baeldung.com/cs/gradient-desce

    #optimization #algebra #linearAlgebra #math #maths #mathematics #mathStodon #ML #dataScience #machineLearning #DeepLearning #neuralNetworks #NLP #modeling #modelling #models #dataDev #AIDev #regression #dataLearning #probabilities #logisticRegression #logLoss #sigmoid #classification #differentialCalculus #loss
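
    A minimal NumPy sketch of that last claim (invented data, for illustration only): with a sigmoid and log-loss, the gradient is X.T @ (sigmoid(X @ w) - y) / n, the same residual-times-features form as the least-squares gradient X.T @ (X @ w - y) / n.

      import numpy as np

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def logloss_gradient(w, X, y):
          # Gradient of the mean log-loss: the residual (p - y) projected
          # onto the features - the same shape of terms as least squares.
          p = sigmoid(X @ w)
          return X.T @ (p - y) / len(y)

      rng = np.random.default_rng(0)
      X = rng.normal(size=(100, 3))
      y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)

      w = np.zeros(3)
      for _ in range(500):  # plain gradient descent on the convex loss
          w -= 0.5 * logloss_gradient(w, X, y)
      print(w)  # points in the direction of the true weights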

  5. @data @datadon 🧵

    How to assess a statistical model?
    How to choose between variables?

    Pearson's #correlation is irrelevant if you suspect that the relationship is not a straight line.

    If monotonic relationship:
    "#Spearman’s rho is particularly useful for small samples where weak correlations are expected, as it can detect subtle monotonic trends." It is "widespread across disciplines where the measurement precision is not guaranteed".
    "#Kendall’s Tau-b is less affected [than Spearman’s rho] by outliers in the data, making it a robust option for datasets with extreme values."
    Ref: statisticseasily.com/kendall-t

    #normality #normalDistribution #modeling #dataDev #AIDev #ML #modelEvaluation #regression #modelling #dataLearning #featureEngineering #linearRegression #probability #probabilities #statistics #stats #correctionRatio #Pearson #bias #regressionRedress #distributions
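
    A minimal sketch of the comparison (made-up data; assumes SciPy): a monotonic but curved relationship with one extreme value, where the rank-based Spearman and Kendall coefficients still detect the trend that Pearson understates.

      import numpy as np
      from scipy.stats import pearsonr, spearmanr, kendalltau

      rng = np.random.default_rng(0)
      x = rng.uniform(0, 5, size=30)
      y = np.exp(x) + rng.normal(0, 1, size=30)  # monotonic, not a straight line
      y[np.argmax(x)] += 500.0                   # one extreme value

      print(f"Pearson r     = {pearsonr(x, y)[0]:.2f}")   # distorted by curvature + outlier
      print(f"Spearman rho  = {spearmanr(x, y)[0]:.2f}")  # rank-based
      print(f"Kendall tau-b = {kendalltau(x, y)[0]:.2f}") # rank-based, robust to extremes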

  21. "In real life, we weigh the anticipated consequences of the decisions that we are about to make. That approach is much more rational than limiting the percentage of making the error of one kind in an artificial (null hypothesis) setting or using a measure of evidence for each model as the weight."
    Longford (2005) stat.columbia.edu/~gelman/stuf

    #modeling #nullHypothesis #probability #probabilities #pValues #statistics #stats #statisticalLiteracy #bias #inference #modelling #regression #linearRegression
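
    A minimal sketch of what weighing anticipated consequences can look like (the probability and losses are invented placeholders, not numbers from Longford's paper): choose the action with the smallest expected loss instead of thresholding a p-value.

      # Decision by expected loss rather than by a significance threshold.
      p_effect = 0.30  # posterior probability that the effect is real (illustrative)

      actions = {
          # action      (loss if effect is real, loss if there is no effect)
          "act":        (0.0, 10.0),  # acting on a phantom effect costs 10
          "don't act":  (25.0, 0.0),  # missing a real effect costs 25
      }

      for action, (loss_real, loss_null) in actions.items():
          expected = p_effect * loss_real + (1 - p_effect) * loss_null
          print(f"{action:10s} expected loss = {expected:.1f}")
      # Here acting wins (7.0 < 7.5) even though the effect is more likely absent.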

  7. I'm teaching my first lecture at the new job today, about probabilistic logic programming, probabilistic inference, and (weighted) model counting.

    Some of the required reading is a paper (eccc.weizmann.ac.il/eccc-repor) that was written by a great mentor of mine, prof. dr. Fahiem Bacchus. He passed away just over 2 years ago, and I am honoured to keep his memory alive by teaching his ideas to a new generation of students. Hope to do him proud. 🌱

    Please send good vibes? 🥺

    #AcademicChatter #AcademicLife #AcademicMastodon #Teaching #Probability #ProbabilisticInference #Probabilities #Logic #LogicProgramming #PropositionalModelCounting #ProbabilisticLogicProgramming #ModelCounting #PropositionalLogic #WeightedModelCounting #DPLL #BayesianProbability #BayesNets #BayesianStatistics #BayesianInference #BayesianNetworks #KnowledgeCompilation #DecisionDiagrams #BinaryDecisionDiagrams
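
    For readers new to weighted model counting, a minimal brute-force sketch (a toy example, not from the linked paper): probabilistic inference reduces to summing the weights of a formula's satisfying assignments.

      from itertools import product

      # Weights play the role of marginal probabilities of each variable.
      weights = {"rain": 0.3, "sprinkler": 0.5}

      def grass_wet(a):
          # The propositional formula: wet iff it rains or the sprinkler runs.
          return a["rain"] or a["sprinkler"]

      def wmc(formula, weights):
          # Enumerate all truth assignments; sum the weight of each model.
          names, total = list(weights), 0.0
          for values in product([True, False], repeat=len(names)):
              a = dict(zip(names, values))
              if formula(a):
                  w = 1.0
                  for name, value in a.items():
                      w *= weights[name] if value else 1 - weights[name]
                  total += w
          return total

      # P(grass wet) = 1 - (1 - 0.3) * (1 - 0.5) = 0.65
      print(wmc(grass_wet, weights))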

  8. CW: What depression is 🧶

    "It's like… my head is filled with something black and gooey, it takes up all the space, all the energy.
    I try to force myself to do things, to see people.
    But it exhausts me so much, it drains me out…
    And it's often painful, I can see that people are angry with me."
    ~ Mirion Malle, in "That's how I disappear"

    #depression #quotations #safety #selfCare #self #people #disappointing #relationship #relationships #friendship #mentalHealth #predictions #perceptions #beliefs #representations #probabilities #hallucinations #bias #psychology #SilentSunday #fragility

  9. CW: What depression is 🧶

    Depression is when we don’t care for anything. We want to stay under the duvet. The duvet is comfortable. We would like to get out of it but we only have the motivation to stay warm.

    Depression can develop into a disinterest in the world. Smiles appear bland, uninteresting. Pleasure and displeasure become indistinguishable, leading to a progressive anaesthesia of emotions.

    #EstelleSays #depression #selfCare #self #people #disappointing #relationship #relationships #friendship #mentalHealth #predictions #perceptions #beliefs #representations #probabilities #hallucinations #bias #psychology #SilentSunday #fragility

  10. In 2016, the American Statistical Association #ASA made a formal statement that "a p-value, or statistical significance, does not measure the size of an effect or the importance of a result".

    It also stated that "p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone".

    #nullHypothesis #probabilities #probability #maths #mathematics #vectors #data #bigData #matrices #ML #distributions #stats #statistics
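
    A minimal simulation of that first point (made-up data; assumes SciPy): with a large enough sample, a practically negligible effect produces a tiny p-value, while a sizeable effect in a small sample may not reach significance - so the p-value is not a measure of effect size.

      import numpy as np
      from scipy.stats import ttest_ind

      rng = np.random.default_rng(0)

      # Negligible effect (0.02 sd), huge sample: the p-value comes out tiny.
      a = rng.normal(0.00, 1, size=200_000)
      b = rng.normal(0.02, 1, size=200_000)
      print(f"negligible effect, n=200000: p = {ttest_ind(a, b).pvalue:.2g}")

      # Moderate effect (0.5 sd), small sample: the p-value is typically large.
      c = rng.normal(0.0, 1, size=12)
      d = rng.normal(0.5, 1, size=12)
      print(f"moderate effect,   n=12:     p = {ttest_ind(c, d).pvalue:.2g}")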

  11. A perspective on #chatGPT (or Large Language Models #LLMs in general): #Hype or milestone?

    Rodney Brooks (spectrum.ieee.org/amp/gpt-4-ca) tells us that:

    What large language models are good at is saying what an answer should sound like, which is different from what an answer should be.

    For a nice in-depth technical analysis, see this blog post by Stephen Wolfram (himself!) on "What Is ChatGPT Doing … and Why Does It Work?". Worth reading - even for non-experts - as a non-trivial effort to make the whole process explainable. The different steps are:

    • #LLMs compute probabilities for the next word. To do this, they aggregate huge datasets of text to build a function that, given a sequence of words, computes for every word in the dictionary the probability that appending it would be statistically congruent with the words so far. Interestingly, this probability, conditioned on what has been observed so far, falls off as a power law, just like the global probability of words in the dictionary,

    • These #probabilities are computed by a function that leans on the dataset to generate the best approximation. Wolfram gives a detailed description of how to build such an approximation, starting from linear regression and moving on to non-linearities. This leads to deep learning methods and their potential as universal function approximators,

    • Crucial is how these #models are trainable, in particular by way of #backpropagation. This leads the author to describe the process, but also to point out some limitations of the trained model, especially, as you might have guessed, compared to potentially more powerful systems, like #cellularautomata of course...

    • This now brings us to #embeddings, the crucial ingredient to define "words" in these #LLMs models. To relate "alligator" to "crocodile" vs. a "vending machine," this technique computes distances between words based on their relative distance in the large dataset of text corpus, so that each word is assigned an address in a high-dimensional space, with the intuition that words that are syntactically closer should be closer in the embedding space. It is highly non-trivial to understand the geometry of high-dimensional spaces - especially when we try to relate it to our physical 3D space - but this technique has proven to give excellent results, I highly recommend the #cemantix puzzle to test your intuition about word embeddings: cemantle.certitudes.org

    • Finally, these different parts are glued together by a humongous #transformer network. A standard #NeuralNetwork could perform a computation to predict the probabilities for the next word, but the results would mostly give nonsensical answers... Something more is needed to make this work. Just as traditional Convolutional Neural Networks #CNNs hardwire the fact that operations applied to an image should be applied to nearby pixels first, transformers do not operate uniformly on the sequence of words (i.e., embeddings), but weight them differently to ultimately get a better approximation. It is clear that much of the mechanism is a bunch of heuristics selected based on their performance - but we can understand the mechanism as giving different weights to different tokens - specifically based on the position of each token and its importance in the meaning of the current sentence. Based on this calculation, the sequence is reweighted so that a probability is ultimately computed. When applied to a sequence of words where words are added progressively, this creates a kind of loop in which the past sequence is constantly re-processed to update the generation.

    • Can we do more and include syntax? Wolfram discusses the internals of #chatGPT, in particular how it was trained to "be a good bot" - and adds another possibility, which is to inject the knowledge that language is organized grammatically, asking whether #transformers are able to learn such rules. This points to certain limitations of the architecture and the potential of using graphs as a generalization of geometric rules. The post ends with a comparison of #LLMs, which just aim to sound right, with rule-based models, a debate reminiscent of the older days of AI...
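
    A minimal sketch of the first step above (toy vocabulary and logits, invented for illustration): the model emits one score per dictionary word, and a softmax turns those scores into next-word probabilities from which generation samples.

      import numpy as np

      # One logit per dictionary word, for a given preceding sequence.
      vocab  = ["cat", "sat", "mat", "quantum"]
      logits = np.array([2.0, 0.5, 1.0, -3.0])  # higher = more congruent

      def softmax(z):
          z = z - z.max()  # subtract the max to keep the exponentials stable
          e = np.exp(z)
          return e / e.sum()

      for word, p in zip(vocab, softmax(logits)):
          print(f"P(next = {word!r}) = {p:.3f}")
      # Sampling one word from this distribution, appending it, and
      # recomputing the logits is the generation loop described above.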

  12. @darioringach This preprint is an excellent read on the #adaptation of the #firing rate of primary #visual #cortex neurons - showing a #powerlaw scaling between the ratio of #probabilities and the ratio of observed firing rates, and also relative invariance of directional scatter (though I am puzzled by the use of cosine similarity instead of more standard distances between distributions...).

    An interesting result is that the log firing rate is linearly proportional to the negative surprise of a stimulus, which will certainly bring a smile to proponents of #probabilistic representations in the brain or #predictive processing...

    The study extends these observations to more ecological distributions, but - as far as I understand it - still samples individual orientations. One can predict that this would extend to showing textures with different levels of anisotropy (as defined by the tuple $(n_x, n_y)$ in the Ringach, 2002 paper, for example), creating a stimulus set $\{ s_i \}$ that better covers the space of visual features.

    #computationalneuroscience
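
    A minimal numerical sketch of the reported relationship (illustrative numbers, not data from the preprint): if the log firing rate is linear in the stimulus's log probability, then rates scale as a power law of the probability ratio.

      import numpy as np

      # If log r(s) = c + lam * log p(s) (log rate linear in the (negative)
      # surprise, since surprise = -log p), then
      # r(s1) / r(s2) = (p(s1) / p(s2)) ** lam.
      lam, c = 0.5, np.log(10.0)  # illustrative exponent and base rate

      def rate(p):
          return np.exp(c + lam * np.log(p))

      p_common, p_rare = 0.2, 0.02
      print(rate(p_common) / rate(p_rare))  # ratio of firing rates: 3.162...
      print((p_common / p_rare) ** lam)     # the same power-law ratio: 3.162...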