#text-analysis — Public Fediverse posts on home.social

IRRJ @[email protected] · 2026-06-16 · 13:09 UTC

Published at #IRRJ: "Much Ado about Accessibility: An Exploration of Online Content Accessibility from an Autism-Informed Perspective" by Hrishita Chakrabarti and Maria Soledad Pera. #TextAnalysis, #Autism, #WebAccessibility, #SearchEngines, #LLMs

https://doi.org/10.54195/irrj.25297

#irrj #textanalysis #autism #webaccessibility #searchengines #llms

Alison @[email protected] · 2026-05-02 · 13:52 UTC

Poking around with sentiment analysis on the public domain copy of Pride and Prejudice by Jane Austen.

I extracted the speech, did a strict attribution, and ran sentiment analysis for different speakers based off chunks sampled from the text.

Elizabeth is neutral at 28% confidence and Jane is joyful at 57% confidence. Darcy is sad at 94% confidence and Mrs Bennet is joyful at 95% confidence.

Those aren't the emotions I get from reading the text. Again, I'm learning more about the sentiment analysis than the text.
https://www.kaggle.com/code/alisonhawke/pride-and-prejudice

#DataScience #Python #Literature #TextAnalysis #SentimentAnalysis

#datascience #python #literature #textanalysis #sentimentanalysis
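
A minimal Python sketch of the extract-and-attribute step the post describes. This is not the author's notebook code: the regular expressions, the "said/replied/cried" rule, and the names `attribute_speech` and `sample_chunks` are assumptions made for illustration. The sentiment model is deliberately left out, since the post does not name the library used.

```python
import random
import re
from collections import defaultdict

# A quoted span in either straight or curly double quotes.
QUOTE = re.compile(r'["“]([^"“”]+)["”]')
# Hypothetical "strict" rule: the quote must be followed directly by
# said/replied/cried plus a name (optionally with Mr/Mrs/Miss).
TAG = re.compile(
    r'\s*,?\s*(?:said|replied|cried)\s+((?:(?:Mr|Mrs|Miss)\.?\s+)?[A-Z][a-z]+)'
)

def attribute_speech(text):
    """Return {speaker: [quote, ...]} for quotes with an explicit speech tag."""
    by_speaker = defaultdict(list)
    for m in QUOTE.finditer(text):
        tag = TAG.match(text, m.end())
        if tag:  # strict attribution: untagged quotes are skipped, not guessed
            by_speaker[tag.group(1)].append(m.group(1).strip())
    return dict(by_speaker)

def sample_chunks(quotes, k, seed=0):
    """Draw up to k quotes for one speaker, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(quotes, min(k, len(quotes)))
```

A rule this strict drops most dialogue in a novel, because many lines carry no explicit tag. That may skew which lines reach the sentiment model for each speaker.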

Alison @[email protected] · 2026-04-22 · 11:42 UTC

Spent some time doing data analysis on the Project Gutenberg text of Pride and Prejudice.

I pulled out all the speech, and the library I used rated it "emotionally neutral" in sentiment. That is interesting because when you read it, the speech is absolutely not that. There's a lot in the subtleties of the speech that makes it very pointed.

The confidence on the emotional rating was 57%, which seems low to me. Doing analysis on a book I'm familiar with and recently read is telling me as much about the means of evaluating the text as the text itself.
#DataScience #TextAnalysis #SentimentAnalysis

#datascience #textanalysis #sentimentanalysis

Hacker News @[email protected] · 2026-04-19 · 05:04 UTC

Metatextual Literacy

https://www.jenn.site/metatextual-literacy/

#HackerNews #metatextualliteracy #digitalliteracy #criticalreading #mediaawareness #textanalysis

#hackernews #metatextualliteracy #digitalliteracy #criticalreading #mediaawareness #textanalysis

DiSC_uibk @[email protected] · 2026-02-09 · 11:54 UTC

Why do politicians always talk about "middle class," "immigrants," or "families"?

New research funded by @fwf and @dfg_public, led by Dr. Lena Maria Huber (https://lenamariahuber.eu/, MZES, University of Mannheim) and Dr. Hauke Licht (University of Innsbruck), explores how politicians talk about social groups in campaign platforms and parliamentary speeches across 8 Western European countries.

🔗https://haukelicht.github.io/projects/gaepd/

#PoliticalCommunication #ComputationalSocialScience #Democracy #TextAnalysis

#politicalcommunication #computationalsocialscience #democracy #textanalysis

Nick Byrd, Ph.D. @[email protected] · 2026-01-21 · 12:28 UTC

Can #AI reasoning models infer people's underlying reasons in unstructured chat data from group decisions?

Across multiple prompting steps, #GTP5 usually did NOT select the same underlying reason as a human rater: https://doi.org/10.48550/arXiv.2601.05582

#AI #cogSci #textAnalysis #psychometrics

#ai #gtp5 #cogsci #textanalysis #psychometrics

Hacker News @[email protected] · 2026-01-20 · 21:02 UTC

Fast Concordance: Instant concordance on a corpus of >1,200 books

https://iafisher.com/concordance/

#HackerNews #FastConcordance #InstantConcordance #Books #Corpus #TextAnalysis #LiteratureTech

#hackernews #fastconcordance #instantconcordance #books #corpus #textanalysis

Dr Pen @[email protected] · 2025-12-15 · 07:28 UTC

I've been digging around for open-source text analysis apps and found AntConc via a Reddit thread. The app looks very good in early quick testing. I'm looking at term frequency across relevant papers, plus some 'concordance' context, but AntConc will do a lot more. Together with Taguette you have all you need for a lot of analysis.

I'm running the portable version on Windows, but Mac and Linux also work.
https://www.laurenceanthony.net/software/antconc/

#AntConc #textanalysis #research #academia #academicchatter #linguistics

#antconc #textanalysis #research #academia #academicchatter #linguistics
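
For readers unfamiliar with the term, a concordance lists every hit of a search word with its surrounding context (keyword in context, or KWIC). The sketch below shows the idea in plain Python. It is not how AntConc works internally, and the function name `kwic` and its `width` parameter are made up for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one line per hit, with `width` characters of context on each side."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

for line in kwic("The corpus is small. A corpus can be large.", "corpus", width=12):
    print(line)
```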

Dr Pen @[email protected] · 2025-12-08 · 14:28 UTC

Recs for text analysis tools with no or only minimal GenAI: Taguette, QDA Miner, what else? Bulk document (around 50 papers) common-word analysis is what I'm mainly looking for, as well as individual document labelling. Open source, free, Windows 10.
#QualitativeData #textanalysis #software #research #academia #academicchatter #opensource

#qualitativedata #textanalysis #software #research #academia #academicchatter
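
The bulk common-word analysis asked for here can be roughed out with Python's standard library alone, with no generative AI involved. This is a sketch under assumptions: the papers are already converted to plain text, and the stopword list and the names `tokenize` and `common_words` are illustrative placeholders.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "for", "on"}

def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def common_words(docs, top_n=10):
    """docs maps a document name to its text.

    Returns (term, total count, number of documents containing it) tuples
    for the top_n most frequent terms across all documents.
    """
    total = Counter()
    doc_freq = Counter()
    for text in docs.values():
        tokens = tokenize(text)
        total.update(tokens)
        doc_freq.update(set(tokens))  # count each term once per document
    return [(term, n, doc_freq[term]) for term, n in total.most_common(top_n)]
```

Reporting the document count alongside the total helps separate terms used across many papers from terms that one paper repeats heavily.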

JCLS @[email protected] · 2025-11-21 · 15:37 UTC

New article in #JCLS 4(1)! 🎉
@dudarjulia & @christof introduce a method for evaluating measures of #distinctiveness ( #keyness ) using synthetically generated, fully controlled text data.
#CLS #TextAnalysis #Evaluation #NLP #NLG #LiteraryComputing #CCLS25
https://jcls.io/issue/118/info/

#jcls #distinctiveness #keyness #cls #textanalysis #evaluation

Markus Eisele @[email protected] · 2025-08-09 · 06:18 UTC

Charting Twain: Building a Character Interaction Graph with Quarkus, OpenNLP, and a local Ollama Model. Uncover hidden dynamics in Huckleberry Finn using Java, sentiment analysis, and modern NLP.
https://myfear.substack.com/p/text-analytics-quarkus-opennlp-huckleberry-finn
#Java #Quarkus #OpenLNP #TextAnalysis

#java #quarkus #openlnp #textanalysis

N-gated Hacker News @[email protected] · 2025-07-14 · 22:11 UTC

Ah, the groundbreaking revelation that #LLMs don't handle more words as well as they handle fewer. 🤯 Who knew that feeding a massive text blob would confuse a glorified autocomplete? 😂 Next week: water is wet! 🌊
https://research.trychroma.com/context-rot #textanalysis #AIhumor #technews #revelations #HackerNews #ngated

#llms #textanalysis #aihumor #technews #revelations #hackernews

Nick Byrd, Ph.D. @[email protected] · 2025-04-16 · 10:29 UTC

Wow! #QualiService could be a great resource!

It wasn't obvious to me how to find the transcripts for these doctor-patient interaction data from 4 countries, but if such transcripts are accessible, that's GREAT!

https://www.qualiservice.org/en/qsearch.html?q=diagnosis

#medicine #openData #cogSci #TextAnalysis

#qualiservice #medicine #opendata #cogsci #textanalysis

Mario Angst @[email protected] · 2025-03-26 · 12:11 UTC

🇪🇺 Want to analyze text from the EU public consultations? EU public consultations are a way in which the EU invites the broader public to publicly comment on upcoming legislation.

📦 :python: I just published a first version of a Python package {eu-consultations} to scrape and extract text from the EU website:
https://github.com/marioangst/eu_consultations

- download consultation data as displayed on the EU's frontend into a validated form
- download associated files (this is the hard part about analysing this data - lots of feedback is in .docx and .pdf files)
- extract text from the files using docling and attach to feedback

You get all data in validated form and possibly stored in huge (sorry for that) JSON files ;).

This package is part of an analysis project on feedback the EU has received via the public consultation process on digital policy we plan to present later this year, but I thought let's make some of the tools we use open source way earlier already.

#python #textanalysis #policyanalysis #CompSocSci

#python #textanalysis #policyanalysis #compsocsci

Nick Byrd, Ph.D. @[email protected] · 2024-09-25 · 13:13 UTC

Like we found in “Your Health vs. My Liberty” (https://doi.org/10.1016/j.cognition.2021.104649) Yael Rozenblum et al. found that compliance with #publicHealth guidance correlated with indicators of the perceived threat of a viral pandemic.

Also, relying on #misinformation correlated with reliance on simple (vs. complex) #reasoning.

The free paper: https://doi.org/10.1002/tea.21975

#medicine #health #education #psychology #epistemology #logic #textAnalysis

#publichealth #misinformation #reasoning #medicine #health #education

Daniela Schneider @[email protected] · 2024-09-09 · 15:09 UTC

Have you ever wanted to use a #LLM as one step in a workflow?

We integrated #GPT into the open-source analysis platform #useGalaxy, where you can link GPT to several thousand other tools, add more attachments for analysis and make your research reproducible.

https://galaxyproject.org/news/2024-09-02-chat-gpt/

In our example, we uploaded an audio file and used #Whisper to convert it into text, cut out the moderation, and prompted chatGPT to translate it into German.

#DH #textanalysis #tools
@galaxyfreiburg

#llm #gpt #usegalaxy #whisper #dh #textanalysis

Fabio Giglietto @[email protected] · 2024-08-21 · 11:24 UTC

📚🇮🇹 New working paper: "Evaluating Embedding Models for Clustering Italian Political News"

This study compares embedding models for unsupervised clustering of Italian political news shared on Facebook before the 2018 and 2022 elections, aiming to advance NLP methods for political text analysis in non-English languages.

Paper: https://osf.io/preprints/osf/2j9ed

Code & data: https://github.com/fabiogiglietto/Semantic-Clustering-Italian-News

Feedback welcome!

#NLP #PoliticalScience #TextAnalysis #MachineLearning

#nlp #politicalscience #textanalysis #machinelearning

Paul Houle @[email protected] · 2024-08-17 · 20:47 UTC

🎯 Potential terrorists can be identified from social media posts, new research shows

https://phys.org/news/2024-08-potential-terrorists-social-media.html

#media #socialmedia #privacy #textanalysis #ai

Elias Dabbas :verified: @[email protected] · 2024-08-15 · 14:14 UTC

Word co-occurrence matrix/heatmap

How to compute and visualize the correlation between terms that occur together in a list of documents*

*documents: keywords, page titles, product names/descriptions, social media posts, etc.

https://bit.ly/3Z4tiTx

#DataVisualization #textanalysis #DataScience #Python

#datavisualization #textanalysis #datascience #python
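
One plausible way to build such a matrix, sketched in plain Python. The linked article's own method is not shown in the post, so this is an assumption: it counts how many documents contain both words of a pair, which is a raw co-occurrence count rather than a correlation coefficient. The function names are illustrative, and the heatmap step is omitted because it needs a plotting library.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs):
    """Count, for each unordered word pair, how many documents contain both words."""
    pairs = Counter()
    for doc in docs:
        words = sorted(set(doc.lower().split()))  # sorted so each pair has one key
        pairs.update(combinations(words, 2))
    return pairs

def as_matrix(pairs):
    """Expand pair counts into a symmetric matrix (list of rows) plus its vocabulary."""
    vocab = sorted({w for pair in pairs for w in pair})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (a, b), n in pairs.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return vocab, matrix

vocab, matrix = as_matrix(cooccurrence(["red shoes", "red running shoes", "blue shoes"]))
```

The resulting `matrix` can be passed to any heatmap function, with `vocab` supplying the row and column labels.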

Harald Klinke @[email protected] · 2024-07-17 · 10:50 UTC

The Digital Humanities Team at the University of Vienna and the Ottoman Nature in Travelogues (ONiT) project are hosting a #hackathon focused on analyzing texts, images, and multimodal sources.

Thursday, November 14, 9:00 CET to Friday, November 15, 15:00 CET
https://dh.univie.ac.at/hackathon/
#DigitalHumanities #ComputationalHumanities #TextAnalysis #ImageAnalysis

#digitalhumanities #hackathon #computationalhumanities #textanalysis #imageanalysis

Marshall A. Taylor @[email protected] · 2024-07-08 · 18:48 UTC

It was also a methodologically fun paper, combining digitized archival text, Census & survey data, NLP, and panel models.

Email or dm me for a copy! #sociology #textanalysis #rstats

3/3

#sociology #textanalysis #rstats

Paul Houle @[email protected] · 2024-06-27 · 22:56 UTC

😓 An NLP-Based System for Detecting Depression Levels through User Comments on Twitter (X)

https://www.mdpi.com/2227-7390/12/13/1926

#mentalhealth #depression #nlp #ai #socialmedia #textanalysis #privacy

#mentalhealth #depression #nlp #ai #socialmedia #textanalysis

Axel Pichler @[email protected] · 2024-06-04 · 07:25 UTC

📣 Attention Linguistics & Digital Humanities students! 🎓📚
Join @janispagel and me for the »Prompting, Evaluation, Interpretation: An Introduction to LLMs in Text Analysis« course at the upcoming Deep Learning for Language Analysis Summer School in Cologne: http://ml-school.uni-koeln.de! 📝🔍
🗓️ Don't miss out – registration is open until June 16th! 🙌
#LLMs #TextAnalysis #NLP #AI #Linguistics #DigitalHumanities #CRETA

#llms #textanalysis #nlp #ai #linguistics #digitalhumanities

R User Group @Harvard :rstats: @[email protected] · 2024-05-29 · 17:36 UTC

Want to learn more about how to use regular expressions in R?

Come join us to learn how to use regular expressions to parse and clean text data on Thursday, June 6th, 5-6pm Eastern Time!

Find the Zoom registration details on our website:

https://rug-at-hdsi.org/upcoming_events/2024-05-06-regex-sarah-hirsch.html

#rstats #DataScience #regex #TextAnalysis

#rstats #datascience #regex #textanalysis

alissonmasoares @[email protected] · 2024-05-24 · 15:21 UTC

Bias estimation in word embeddings using a Bayesian approach instead of WEAT or MAC. A new paper in Computational Linguistics.

#ComputationalSocialSciences #textanalysis #NLP

#computationalsocialsciences #textanalysis #nlp

🪬🍄🌈🎮💻🚲🥓🎃💀🏴🛻♓🧿 @[email protected] · 2024-05-06 · 14:39 UTC

How would you go about creating a filter that blocks posts about things that people hate?

I've thought I could build a text classifier, but it could be hard to train since I'd need to guess whether or not the author hates the thing they are posting about.

I wouldn't want it to become a filter for all current events news, but I suspect that's what it would become.

#fediverse #mastodon #machineLearning #tfidf #classification #socialMedia #classifier #textAnalysis #programming #tech #technology

#machinelearning #tfidf #classification #socialmedia #classifier #textanalysis
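
A rough starting point for the classifier idea, using the TF-IDF weighting the post tags. This is a sketch under stated assumptions, not a working filter: the labels ("hate" and "ok") and the example posts are invented, the function names are illustrative, and the labeling problem the post raises (guessing whether the author hates the topic) is left unsolved. It assigns a new post the label of its most similar labeled example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(tokenize(d)))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

def vectorize(text, idf):
    """Sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def predict(text, examples, idf):
    """Label of the most similar training example (1-nearest-neighbour)."""
    vec = vectorize(text, idf)
    return max(examples, key=lambda ex: cosine(vec, vectorize(ex[0], idf)))[1]

# Hypothetical hand-labeled examples.
examples = [("I hate this awful commute", "hate"), ("Lovely walk in the park today", "ok")]
idf = fit_idf([text for text, _ in examples])
label = predict("what an awful commute again", examples, idf)
```

Because the vectors only capture vocabulary, a model like this learns topics rather than attitudes, which is consistent with the post's worry that it would end up filtering current-events news in general.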

Paul Houle @[email protected] · 2024-03-24 · 11:00 UTC

🤖 Generator-Guided Crowd Reaction Assessment

(... I was really fascinated with this paper because my YOShInOn RSS reader has a module like this which can predict the popularity of a story on HN and if people will have a big discussion about it; it is super-easy to gather data for this kind of model)

https://arxiv.org/abs/2403.09702

#cs #research #ai #ml #textanalysis