home.social

#experimental-design — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #experimental-design, aggregated by home.social.

  1. "Five skills. Each one is counter-cyclical (becomes more valuable as hype recedes), resistant to LLM automation (requires human judgment that pattern-matching can’t replicate), and directly tied to the business outcomes executives actually pay for."
    by Kaushik Rajan: towardsdatascience.com/the-ai-

    #DataScience #BayesianStatistics #BayesianStats #Bayesian #causalInference #experimentalDesign #SPC #statisticalProcessControl

  2. New blog post out!

    Power analysis – A flexible simulation approach using R
    nicolaromano.net/data-thoughts

    I go through why power matters, how to use Monte Carlo simulations to estimate it, and how this approach can be useful not only to define sample size, but also to refine experimental design.
    #rstats #statistics #biostats #datascience #experimentaldesign #poweranalysis
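The post's R code sits behind the link; as a minimal illustration of the same Monte Carlo idea (my own sketch in Python, not the blog's code), simulate many experiments at a given sample size and count how often a simple two-sample test detects the effect:

```python
import random
import statistics

def simulated_power(n_per_group, effect_size, n_sims=2000, seed=1):
    """Estimate statistical power by simulation: generate many fake
    experiments with a known effect and count how often a simple
    two-sample test flags a difference (|t| > 1.96, a normal-theory
    approximation to the 5% two-sided threshold)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        se = (statistics.variance(control) / n_per_group
              + statistics.variance(treated) / n_per_group) ** 0.5
        t_stat = (statistics.mean(treated) - statistics.mean(control)) / se
        if abs(t_stat) > 1.96:
            hits += 1
    return hits / n_sims

# Power grows with sample size for a fixed effect (here 0.8 SD):
for n in (10, 20, 40):
    print(n, simulated_power(n, 0.8))
```

The same loop can be inverted to refine a design: fix the power you want and vary the sample size, effect size, or test until the simulated power clears it.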

  3. OMG, I absolutely appreciate doing bioinformatics analysis under a PI who actually knows molecular biology and biochemistry, and can effectively question our collaborators on why they chose the tissues they did, and whether they are actually measuring what they think they are.

    #Academia #Bioinformatics #ExperimentalDesign

  4. My Road to Bayesian Stats

By 2015, I had heard of Bayesian Stats but didn’t bother to go deeper into it. After all, significance stars and p-values worked fine. I started to explore Bayesian Statistics when considering small sample sizes in biological experiments. How much can you say when you are comparing means of 6 or even 60 observations? This is the nature of work at the edge of knowledge. Not knowing what to expect is normal. Multiple possible routes to an observed result are normal. Not knowing how to pick among those routes is also normal. Yet our statistics fail to capture this reality and the associated uncertainties. There must be a way, I thought.

    Free Curve to the Point: Accompanying Sound of Geometric Curves (1925) print in high resolution by Wassily Kandinsky. Original from The MET Museum. Digitally enhanced by rawpixel.

    I started by searching for ways to overcome small sample sizes. There are minimum sample sizes recommended for t-tests. Thirty is an often quoted number with qualifiers. Bayesian stats does not have a minimum sample size. This had me intrigued. Surely, this can’t be a thing. But it is. Bayesian stats creates a mathematical model using your observations and then samples from that model to make comparisons. If you have any exposure to AI, you can think of this a bit like training an AI model. Of course the more data you have the better the model can be. But even with a little data we can make progress. 

How do you say: there is something happening and it’s interesting, but we are only x% sure? Frequentist stats has no way through. All I knew was to apply the t-test, and if there are “***” in the plot, I’m golden. That isn’t accurate though. Low p-values indicate the strength of evidence against the null hypothesis. Let’s take a minute to unpack that. The null hypothesis is that nothing is happening. If you have a control set and apply a treatment to the other set, the null hypothesis says that there is no difference between them. So a low p-value says that it is unlikely that the null hypothesis is true. But that does not imply that the alternative hypothesis is true. What’s worse, there is no way for us to say that the control and treatment show no difference: we can’t accept the null hypothesis using p-values either.

Guess what? Bayesian stats can do all those things. It can measure differences, accept and reject both null and alternative hypotheses, and even communicate how uncertain we are (more on this later). All without imposing distributional assumptions on our data.

It’s often overlooked, but frequentist analysis requires the data to have certain properties, like normality and equal variance. Biological processes have complex behavior and, unless observed, assuming normality and equal variance is perilous. The danger only grows with small sample sizes. Bayes, by contrast, does not force these distributional assumptions onto your data. Whatever shape the distribution is, so-called outliers and all, it all goes into the model. Small sample sets do produce weaker fits, but this is kept transparent.
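To make the "we are only x% sure" idea concrete, here is a toy illustration (my own, not the blog's model): with a normal likelihood, a known sigma, and flat priors, the posterior of each group mean is available in closed form, so the probability that the treatment mean exceeds the control mean falls straight out of the normal CDF.

```python
import math
import statistics

def prob_treated_greater(control, treated, sigma=1.0):
    """Posterior probability that the treated mean exceeds the control
    mean, assuming a normal likelihood with known sigma and flat priors.
    Under those assumptions each group mean has posterior
    Normal(sample mean, sigma^2 / n), so their difference is normal too."""
    diff = statistics.mean(treated) - statistics.mean(control)
    sd = sigma * math.sqrt(1 / len(control) + 1 / len(treated))
    # P(mu_treated - mu_control > 0) from the normal CDF
    return 0.5 * (1 + math.erf(diff / (sd * math.sqrt(2))))

# Hypothetical measurements, n = 6 per group:
control = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2]
treated = [0.5, 0.9, 0.3, 0.7, 0.6, 1.0]
print(f"P(treatment > control) = {prob_treated_greater(control, treated):.2f}")
```

Real Bayesian workflows drop the known-sigma and flat-prior shortcuts and sample the posterior instead (e.g. with Stan or PyMC), but the output has the same flavor: a direct probability statement rather than a star rating.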

Transparency is one of the key strengths of Bayesian stats. It requires you to work a little harder on two fronts, though. First, you have to think about your data generating process (DGP): how the data points you observe came to be. As we said, the process is often unknown. We have, at best, some guesses of how it could happen. Thankfully, we have a nice way to represent this. DAGs, directed acyclic graphs, are a fancy name for a simple diagram showing what affects what. Most of the time we are trying to discover the DAG, i.e. the pathway of a biological outcome. Even if you don’t do Bayesian stats, using DAGs to lay out your thoughts is a great practice. In Bayesian stats, DAGs can be used to test whether your model fits the data we observe. If the DAG captures the data generating process, the fit is good; if it doesn’t, it isn’t.

The other hard bit is doing the analysis and communicating the results. Bayesian stats forces you to be verbose about the assumptions in your model. This part is almost magicked away in t-tests. Frequentist stats also assumes a model that your data is supposed to follow, but it all happens so quickly that there isn’t even a second to think about it. You put in your data, click t-test and woosh! You see stars. In Bayesian stats, stating the assumptions you make in your model (using DAGs and hypotheses about the DGP) communicates to the world what you think this phenomenon is and why it occurs.

    Discovering causality is the whole reason for doing science. Knowing the causality allows us to intervene in the forms of treatments and drugs. But if my tools don’t allow me to be transparent and worse if they block people from correcting me, why bother?

    Richard McElreath says it best:

    There is no method for making causal models other than science. There is no method to science other than honest anarchy.

    #AI #BayesianStatistics #BiologicalDataAnalysis #Business #CausalInference #DAGs #DataGeneratingProcess #dataScience #ExperimentalDesign #FrequentistVsBayesian #Leadership #machineLearning #philosophy #science #ScientificMethod #SmallSampleSize #StatisticalModeling #StatisticalPhilosophy #statistics #TransparentScience #UncertaintyQuantification

  5. How to more efficiently study complex treatment interactions | MIT News

    MIT researchers have developed a new theoretical framework for studying the mechanisms of treatment interactions. Their approach allows…
    #NewsBeep #News #US #USA #UnitedStates #UnitedStatesOfAmerica #Genetics #Biomedicine #CarolineUhler #DivyaShyamal #Experimentaldesign #JiaqiZhang #Science
    newsbeep.com/us/13944/

When designing a scientific experiment, a key factor is the sample size needed for the results of the experiment to be meaningful.

    How many cells do I need to measure? How many people do I interview? How many patients do I try my new drug on?

    This is of great importance especially for quantitative studies, where we use statistics to determine whether a treatment or condition has an effect. Indeed, when we test a drug on a (small) number of patients, we do so in the hope our results can generalise to any patient because it would be impossible to test it on everyone.

The solution is to perform a "power analysis", a calculation that tells us whether, given our experimental design, the statistical test we are using is able to detect an effect of a certain magnitude, if that effect is really there. In other words, this is something that tells us whether the experiment we're planning to do could give us meaningful results.

    But, as I said, in order to do a power analysis we need to decide what size of effect we would like to see. So... do scientists actually do that?

    We explored this question in the context of the chronic variable stress literature.

    We found that only a few studies give a clear justification for the sample size used, and in those that do, only a very small fraction used a biologically meaningful effect size as part of the sample size calculation. We discuss challenges around identifying a biologically meaningful effect size and ways to overcome them.

    Read more here!
    physoc.onlinelibrary.wiley.com

    #experiments #ExperimentalDesign #effectsize #statistics #stress #research #article #power #biology
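As a toy illustration of how a biologically meaningful effect size feeds into a sample-size calculation (a standard normal-approximation formula, my own sketch rather than anything from the study above):

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group n for a two-sample comparison of means, from the
    standard normal approximation:
        n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2
    where d is the standardised (biologically meaningful) effect size."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Halving the effect size you care about roughly quadruples the n you need:
for d in (1.0, 0.5, 0.25):
    print(d, sample_size_per_group(d))
```

The quadratic dependence on effect size is exactly why the choice of a meaningful effect size matters: it dominates the sample-size answer far more than the conventional alpha and power settings do.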

  7. The second part of our exploration of chronic variable #stress studies is out!

    biorxiv.org/content/10.1101/20

    Here we look at studies employing chronic variable stress in rodents and explore how sample size was chosen. Of the 385 studies that we analysed, only one reported calculating sample size based on a biologically meaningful effect size and only 25% mention sample size at all.

    A companion article where we analyse the relationship between protocols and reported effect size can be found here
    biorxiv.org/content/10.1101/20

    #ResearchEthics #ThreeRs #ExperimentalDesign #StatisticalPower

  8. @theluddite
    I don't have a reading list, sorry. However, I have been listening to several on my design team who are leveraging the language of "hypothesis" when discussing design options. I hear that as another layer of abstraction away from real people using technologies in authentic situations.
    They seem to use that language as an excuse for not talking with and observing actual humans.
    #ExperimentalDesign

  9. heh, from my Experimental Design textbook:
    "Batches of raw material, people, and time are also common nuisance sources of variability in an experiment."

    Yes, people can indeed be a nuisance.

    #DataScience #ExperimentalDesign

  10. This is an excellent argument by Jennifer Listgarten about why Large Language Models #LLM like #chatGPT are not a silver bullet for scientific discovery. I am also motivated to study #ExperimentalDesign using #MachineLearning for the reasons Jennifer argues in this paper.

    We need better data in #science.

    nature.com/articles/s41587-023

  11. Statistical #PowerAnalysis currently dominates #ExperimentalDesign. In this Essay, @itchyshin &co argue that we should move away from the current focus on power analysis and instead encourage smaller scale studies & collaborative projects #PLOSBiology plos.io/48Gk8Og

  12. Haven't blogged in a long long time.

    But I just about found time tonight to write a tiny bit more about PCI Registered Reports (@pcirr) and a recent peer-review experience that made me realise the community need for PCI Registered Reports:
    rossmounce.co.uk/2023/11/19/ku

    #OpenScience #Metaresearch #PeerReview #ExperimentalDesign #RiskOfBias

I'm teaching a course on Experimental Design in #Biomedical #Research, and I like to start with the discussion of a short paper that pretends to be a scientific article but isn't (it draws conclusions with no actual data). The thing is, that's a bit too obvious, so students tend to pick up on the deception. Can anyone direct me to a less obvious yet still false scientific paper? I'd appreciate it. #science #ExperimentalDesign #experiment

We have a research assistant position to work on a study surveying how experimental study design and analysis are taught at the undergraduate level in the UK (position based at the University of Edinburgh, Scotland). The post is for four months full-time or the equivalent part-time (e.g. eight months at 50%).

    If you are interested or know someone who is, you can find more details here:

    elxw.fa.em3.oraclecloud.com/hc

    #job #research #experimentalDesign #dataAnalysis #survey #statistics

This paper models "discovery" as a setting where you are interested in finding things with high label values, have a big set of things to label, and can only label a few of them, once each — similar to scientific discovery. The setting is more general than #bandits, and uses #ExperimentalDesign type ideas. The paper considers the ratio of expected instant regret to information gain as a rule for selecting the next item to label.
    arxiv.org/abs/2205.14829
    #MachineLearning #ml4science

  16. Anyone know of software that will help optimize multi-variate experiments? Something sample-efficient since these are manually generated test results.

I found github.com/yunshengtian/AutoOE which looks great, but I'm having a hard time getting it working at the moment

    #software #optimization #multivariate #designofexperiments #ExperimentalDesign
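Not an answer on specific software, but as a stdlib-only starting point for sample-efficient manual experiments: a Latin hypercube design spreads a handful of runs evenly across a multivariate space (factor names and ranges below are hypothetical).

```python
import random

def latin_hypercube(n_runs, bounds, seed=0):
    """Space-filling design: each factor's range is split into n_runs
    equal bins and each bin is used exactly once per factor, so even a
    small number of manual runs covers the space evenly.
    bounds: {factor_name: (low, high)}"""
    rng = random.Random(seed)
    columns = {}
    for factor, (low, high) in bounds.items():
        step = (high - low) / n_runs
        # one random point per bin, then shuffle bin order per factor
        levels = [low + (i + rng.random()) * step for i in range(n_runs)]
        rng.shuffle(levels)
        columns[factor] = levels
    # transpose into one settings dict per experimental run
    return [{f: columns[f][i] for f in bounds} for i in range(n_runs)]

for run in latin_hypercube(5, {"temp": (20.0, 80.0), "pH": (5.0, 8.0)}):
    print(run)
```

For sequential, sample-efficient optimization on top of such a design, Bayesian-optimization tools (e.g. Ax or scikit-optimize) are the usual next step.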

  17. Another classic #BMJ Christmas edition article (really useful for teaching #experimentalDesign and #statistics )
    bmj.com/content/379/bmj-2022-0
    Direct Uptake of Nutrition and Caffeine Study (DUNCS): biscuit based comparative study

  18. Neuroscientists need strong training in experimental design.

Another highlight of #SFN2022 was Mary Harrington's presentation on why and how to integrate experimental design into neuroscience training. Her slides (with a sample syllabus) are here: osf.io/phzje

    And her textbook, The Design of Experiments in Neuroscience, is a great starting point (tinyurl.com/expdesignneuro)

    Mary's not on Mastodon (yet) -- but soon?

    #Neuroscience #OpenScience #ExperimentalDesign

  19. CW: Postdoc opportunity

    The New Zealand Institute of Language, Brain and Behaviour is seeking a Post-Doctoral Fellow to join the team of researchers working on a project funded by the Royal Society of New Zealand Marsden Fund "Do patterns of covariation in speech carry social meaning".

    #SpeechPerception #Sociolinguistics #Phonetics #ExperimentalDesign

    jobs.canterbury.ac.nz/jobdetai

  20. The evolution of emotion: Charles Darwin's little-known psychology experiment (2010)

    Darwin conducted one of the first studies on how people recognize emotion in faces, according to new archival research by Peter Snyder, a neuroscientist at Brown University. Snyder's findings rely on biographical documents never before published; they now appear in the May issue of the Journal of the History of the Neurosciences. ...

    blogs.scientificamerican.com/o

    #Evolution #psychology #emotions #CharlesDarwin #ExperimentalDesign