home.social

#mgcv — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #mgcv, aggregated by home.social.

  1. #Day28 | Uncertainties – Modeling | #30DayChartChallenge | Barro Colorado Island — Tree Species Richness Estimation. Built with #RStats using #ggplot2, #patchwork, #MASS, #mgcv, #scales, #vegan, #gridExtra and #grid.

  6. The hottest ticket in R will be @gavinsimpson's live stream on What's New in Generalized Additive Models in R

    2026-03-06 (17:00–19:00 CET) at youtube.com/live/A9U8e1KdlU4?f

    • what GAMs are and how they work
    • recent {mgcv} updates (incl. Hierarchical GAMs)
    • new features in {gratia}
    • deeper inference with {marginaleffects}

    Post questions at github.com/gavinsimpson/gratia

    #RStats #mgcv #gratia #statistics #GAMs

  7. 📈 Yes you can do that in mgcv update

    big thanks to Zachary Susswein for spotting that my code was out of date in my neighbourhood cross-validation examples: calgary.converged.yt/articles/ calgary.converged.yt/articles/

    They are now up-to-date, as is the helper package mgcvUtils: github.com/dill/mgcvUtils

    #mgcvchat #mgcv

  8. Anyone got anything on using #mgcv with #mrf and #sf objects in #rstats? The package seems to want its own format for polygon regions and (can) compute its own adjacency list etc. But I haz sf objects...

  9. #quarto #rstats friends who use github actions to publish articles:

    it's currently taking github actions ~30 mins to publish my little #mgcv help site (calgary.converged.yt/). This seems to be because it's installing a lot of R packages from source.

    What's the current state-of-the-art to get these things to render quickly? (And using minimal power.)

    (I'd like to not use github but I would also like to encourage PRs etc from folks without a huge overhead from them, so let's stick to github-based solutions for now.)

  10. new (out for a while but sitting in my browser from before Christmas) paper in Biometrika from Benjamin Säfken, Thomas Kneib and Simon Wood on smoothing parameter degrees of freedom

    Green OA @ Edinburgh pure.ed.ac.uk/ws/portalfiles/p

    #mgcvchat #mgcv

  11. #mgcv mini-lifehack:

    (assuming you have multithreading enabled) you can get a rough idea of what's happening when fitting a big model by looking at your CPU usage. If only 1 core is being used, the model is still "building" (assembling the design/penalty matrices). Once it switches to all cores, you're actually fitting the model. If that first construction phase takes a long time (as it can with a very big model), the fit itself will probably take a very, very long time. So buckle up.

    #mgcvchat

  12. spending some more time thinking about neighbourhood cross-validation in #mgcv (see original post here: calgary.converged.yt/articles/), but for time series.

    Pretty nice to be able to get back to a yearly trend here without needing to specify an autoregressive structure. We just need to specify a cross-validation scheme and the autocorrelation is "dealt with" during fitting.

    Full post on this soon. #mgcvchat #rstats

  13. Ok, a more *specific* question: When using tensor product interaction terms with `ti()`, do the knots have to match? E.g. do I have to do ti(x, k = 10) + ti(y, k = 20) + ti(x, y, k = c(10, 20))? Or can the knots in the interaction term be whatever? Would I want them to be different for some reason?

  14. A unifying modelling approach for hierarchical distributed lag models, by Theo Economou et al:

    doi.org/10.48550/arXiv.2407.13

    code: zenodo.org/records/10458640

    #rstats #mgcv

  15. Preprint from Simon Wood on the new cross-validation smoothness estimation in #mgcv: arxiv.org/abs/2404.16490. It's a neat performant + data-efficient way to estimate GAMs based on complex CV splits (like spatial/temporal/phylo ones).

    See ?NCV in latest {mgcv} for examples (cran.r-universe.dev/mgcv/doc/m)

    I might write a helper to convert {rsample}/{spatialsample} objects into mgcv's funny CV indexing structure.

    #rstats #ml #tidymodels #mgcvchat @MikeMahoney218 @gavinsimpson @ericJpedersen @millerdl

  16. @cameronpat I have wondered about this too! Especially since GAMs seem like a natural progression from "ordinary" linear models. Is it the choosing of bases or interpretation of coefficients that's a turn off? But those aren't decisions specific to #biostatistics. Perhaps it's just a lack of awareness? I've found #mgcv in #rstats super easy to use.

  17. #RStats issues I'm struggling with that seem impossible to Google: Building a {brms} model within the {tidymodels} framework using {bayesian}.

    The formula is inherently too complex (including splines and random effects) for the typical tidymodels workflow that involves recipes &c., so it must be added in at a later step. Three things:

    1. Complex {brms} multivariate formulas seem to not be possible using {tidymodels}. E.g., literally multivariate or including phi after my formula via brms::bf(). It simply errors. :( This may just need some tweaking of {bayesian}'s scripts or waiting for an update since it's still fairly young.

    2. Using {mgcv} random effect syntax like s(cat1, cat2, bs = "re") doesn't seem to get picked up as random effects in the model...I think? And I have never figured out if this is creating hierarchical random effects or not -- or if multilevel random effects just aren't possible in this syntax(?).

    3. Using {lme4} random effect like (1 | cat1 / cat2) to ensure the hierarchy is preserved *does* retain random effects I can pull out of the model later using `ranef`, but for some absurd reason I cannot run this model through cross-validation or a myriad of other steps later because it seems to force-create a complex web of interacting factor levels that don't exist. E.g., if my random effects are '(1 | realm / biome)', this eventually fails because it'll look for tundra biome types in Africa for some absurd reason.*

    Noticed this while trying to solve *separate* issues within broom.mixed:::tidy.brmsfit() -- that it seems to delete the names of all the fixed effects and return them as 'NULL' character strings (???), and its reliance on 'ranef' means it doesn't find the random effects using {mgcv} syntax.

    That's my rambling mess of an essay for the day. Not sure how many of these are real issues or me simply not understanding how these packages differ or wot.

    #brms #mgcv #tidymodels

    * Almost wondering if this might even be a separate {tidymodels} issue right now. Every recipe no matter what seems to factor every single character column regardless of how the recipe is built. Hmmmm.

  18. Absolutely gaga over this new preprint by Nick Clark and the @weecology group. So many methodological threads - long-term ecological monitoring, an open data system, careful semi-parametric models, simulation-based inference and forecasting rigor - combine into predicting complex multispecies dynamics while learning about their relationships + drivers

    ecoevorxiv.org/repository/view, code at github.com/nicholasjclark/port

    Thread from Nick at: twitter.com/nj_clark/status/16

    #ecology #forecasting #EFI #mgcv #rstats

  19. #rstats friends 🎉💻🐈

    two bots of potential interest

    @mgcv_updates tells you about what's new in #mgcv

    and

    @rverbsr is a silly bot that toots "verb that noun" phrases where the verbs are functions in R base and the nouns are R types

    enjoy!

  20. OK, a first convening of team #gams #mgcv here: @ericJpedersen @gavinsimpson @millerdl.

    If I want to fit a spline but constrain it to going through certain points (e.g., the start and end of an epicurve should be zero), what's the best way? I'm thinking of adding points to the data at the ends of the range with very high weights. Not sure what the consequences of that would be. #rstats