home.social

#statsmodels — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #statsmodels, aggregated by home.social.

  1. So I moved into industry 1.5 months ago, which has meant a proper switch from R :rstats: to Python :python: (I love both). Here are a few observations for statistics-related stuff in this switch (mainly GLMs, statistical inference, contrasts)

    - #polars is really great, I love LazyFrames & streaming millions of rows of parquet files, categorical data, missing data.
    - I don't really like using #statsmodels, the interface is clunky and the formula API is unfinished

    1/n

    #DataScience #Statistics

  6. Following a recent discussion with colleagues on whether and when to use #LinearMixedModels (#LMM), I wrote a blog post comparing LMM to other approaches using simulated data. I thought it might also be useful for others working with hierarchical data structures in #neuroscience and beyond.

    🌍 fabriziomusacchio.com/blog/202

    #Python #Statistics #DataScience #MixedModels #Statsmodels #ANOVA #ANCOVA #GLMM #regression

  11. I have been running some #GLM models recently using the #Python library #statsmodels, and I am thoroughly delighted by the convenience, ease-of-use and flexibility of the library. Kudos to the devs!

  12. Thank you to #Kone & Mai and Tor Nessling Foundations for supporting this work. A quantitative work like this would not be possible without a robust suite of FOSS tools. My thanks to the maintainers of #QGIS, #pandas, #geopandas, #duckdb, #dask, #statsmodels, #jupyter and many more!

  13. How a bag of packages helped an analyst solve a business problem, or keep calm and import statsmodels

    Hi everyone! My name is Sabina, and I lead the data research team at VkusVill. We help the business make decisions that are guided, among other things, by data. Today I'll describe one such case. The article will be useful for analysts who want to stop worrying and start using linear regression from the Python library statsmodels.

    habr.com/ru/companies/vkusvill

    #data_science #python #statsmodels #linear_regression #линейная_регрессия

  17. How to detect and fix multicollinearity with Statsmodels in Python

    Hi, Habr! Multicollinearity arises when, in a multiple regression model, one of the independent variables can be linearly predicted from the other independent variables with a high degree of accuracy. As a result, the estimated regression coefficients become unstable, and their values can change substantially depending on which other variables are included in or excluded from the model. High multicollinearity can cause the coefficients to change considerably in response to minor changes in the data or the model specification. This complicates interpretation of the coefficients, since they can vary significantly from one analysis to the next. When variables are strongly correlated, the standard errors of the coefficient estimates increase. This inflates p-values, which can lead to the mistaken conclusion that variables have no significant effect on the dependent variable when in fact they do. In this article we look at how to detect and fix multicollinearity with Statsmodels in Python.

    habr.com/ru/companies/otus/art

    #data_science #ML #python #statsmodels

  18. Inferential statistics: confidence intervals, margins of error, sample size, and hypothesis testing

    One of the most common tasks in modern analytics is forming judgments about a large population (for example, the millions of users of an app) based on data from only a small part of it: a sample. Can you draw conclusions about the million-strong audience of a major mobile app from usage data on just 100 users? Or should you collect data on 1,000 users? The answer is intuitively simple: the more data available, the more accurate the projected results for the whole population will be. What probability of error can we tolerate in the analysis: 5% or 1%? Do two samples belong to the same population, or is there a meaningful, significant difference between them, so that they come from different populations? The accuracy of the estimate and the probability of error in answering these and other questions can be calculated quite concretely and adjusted to the needs of the product and the business when planning and preparing an experiment. We take a closer look at how experiment parameters and statistical tests affect the results of the analysis and the conclusions about the whole population, and to do so we simulate a thousand A/A, A/B, and A/B/C/D tests.

    habr.com/ru/articles/807051/

    #математика #математическая_статистика #анализ_данных #статистический_анализ #ab_тесты #statsmodels #scipy #python #matplotlib #проверка_гипотез
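
The two ideas in the post above (margin of error versus sample size, and simulating many A/A tests) can be sketched as follows. This is not the article's code: the metric distribution (normal, mean 10, standard deviation 2), the sample sizes, the known-sigma margin-of-error formula, and the helper name `margin_of_error` are all assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats


def margin_of_error(sigma: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a z-based confidence interval for a mean (known sigma)."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    return z * sigma / np.sqrt(n)


# Ten times more users shrinks the margin of error by a factor of sqrt(10).
moe_100 = margin_of_error(sigma=2.0, n=100)
moe_1000 = margin_of_error(sigma=2.0, n=1000)
print(f"n=100: ±{moe_100:.3f}, n=1000: ±{moe_1000:.3f}")

# Simulate 1000 A/A tests: both groups come from the same population, so
# every "significant" result is a false positive.
rng = np.random.default_rng(42)
n_tests, n_users, alpha = 1000, 100, 0.05
false_positives = 0
for _ in range(n_tests):
    group_a = rng.normal(loc=10.0, scale=2.0, size=n_users)
    group_b = rng.normal(loc=10.0, scale=2.0, size=n_users)
    _, p_value = stats.ttest_ind(group_a, group_b)
    if p_value < alpha:
        false_positives += 1

fp_rate = false_positives / n_tests
print(f"share of A/A tests flagged as significant: {fp_rate:.3f}")
```

With a significance level of 5%, the share of flagged A/A tests should land near 0.05; changing `alpha` to 0.01 is a quick way to see the trade-off the post asks about.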

  19. #Python in #Excel (in Beta) #Microsoft 🤝 #Anaconda

    Default imported libraries:
    #matplotlib
    #numpy
    #pandas
    #seaborn
    #statsmodels

    only for Windows, needs internet access, code executed on MS servers without network or file access

    see aka.ms/python-in-excel-getting & anaconda.com/excel

  20. Python and Data news of the week, episode 76 🐍⚙️

    In short: an ultra-quick Sunday-afternoon edition: new versions of numba, statsmodels, and geopandas, why you shouldn't use legends in matplotlib, and first tests with StarCoder.

    buttondown.email/astrojuanlu/a

    Support the newsletter by subscribing by email 📬

    #noticieropythonydatos #python #pydata #numba #statsmodels #geopandas #matplotlib #starcoder #polars

    Follow @numba and @geopandas on Mastodon!

  25. I just did an #introduction a few days ago, but I've moved servers, so let's try one more time, for the cheap seats in the back!

    I'm currently a data analyst/product #DataScientist working with free-to-play #VideoGames, and living in #Halifax, #NovaScotia, #Canada. I've done a lot of work on #Analytics design, with a focus on ensuring player telemetry events are sensibly cross-referenceable, and looking for relationships between engagement with different game features and business outcomes.

    Business teams in freemium games love looking for magic buttons.

    I primarily use #SQL, #Pandas, #Statsmodels, and #SKLearn on #Databricks (#Python), and #JuliaLang (DataFrames.jl, GLM.jl, Gadfly.jl, Makie.jl, etc.) for smaller, locally run projects. My interests lie in expanding the library of ML models I have in my back pocket for performing inference-based knowledge generation. I'm not super keen on automating products with quasi-black-boxes for the sake of revenue optimization. If I'm not personally learning something new about people through my work, I don't usually see the value in it.

    I did my BSc in #Physics and my MSc in #Astronomy, and, though I had dreams of progressing further down that pipeline, life kind of got in the way. Between the two degrees, I worked at the #Edmonton #Planetarium for four years as a presenter/operator (shout out to the #ZeidlerDome at #TWoSE!).

    I'm a life-long #Trekie, thanks to my mother. I grew up with #TNG and #DS9, and watched the first 5 seasons of #VOY before leaving home for university. Currently very bullish on #SNW and #LDS.

    I'm also a lifelong #Baseball fan (#BlueJays and #Expos), and actively play rec #Softball.

    About a year ago, I purchased my first lens-swappable digital #camera, and have been figuring out #Photography ever since. Most of my posts have focused on sharing my pictures, though I've recently decided to start a dedicated account for that on a #PixelFed server, for the sake of searchability.

    My wife is currently studying political sociology, and I find her work fascinating. She's not currently on the Fediverse, but maybe one of these days.

    This has really lost all sense of narrative flow, hasn't it? Oops!
