home.social

#textanalysis — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #textanalysis, aggregated by home.social.

  1. Poking around with sentiment analysis on the public domain copy of Pride and Prejudice by Jane Austen.

    I extracted the speech, did a strict attribution, and ran sentiment analysis for different speakers based on chunks sampled from the text.

    Elizabeth is neutral with 28% confidence and Jane is joyful with 57% confidence. Darcy is sad with 94% confidence and Mrs Bennet is joyful with 95% confidence.

    Those aren't the emotions I get from reading the text. Again, I'm learning more about the sentiment analysis than the text.
    kaggle.com/code/alisonhawke/pr

    #DataScience #Python #Literature #TextAnalysis #SentimentAnalysis

  2. Spent some time doing data analysis on the Project Gutenberg text of Pride and Prejudice.

    I pulled out all the speech, and the library I used rated it as "emotionally neutral" in sentiment. That is interesting, because when you read it, the speech is absolutely not neutral. A lot of subtlety in the speech makes it very pointed.

    The confidence on the emotional rating was 57%, which seems low to me. Doing analysis on a book I'm familiar with and recently read is telling me as much about the means of evaluating the text as the text itself.
    #DataScience #TextAnalysis #SentimentAnalysis

  3. Why do politicians always talk about "middle class," "immigrants," or "families"?

    New research funded by @fwf and @dfg_public led by Dr. Lena Maria Huber (lenamariahuber.eu/, MZES, University of Mannheim) and Dr. Hauke Licht (University of Innsbruck), explores how politicians talk about social groups in campaign platforms and parliamentary speeches across 8 Western European countries.

    🔗haukelicht.github.io/projects/

    #PoliticalCommunication #ComputationalSocialScience #Democracy #TextAnalysis

  8. Can #AI reasoning models infer people's underlying reasons in unstructured chat data from group decisions?

    Across multiple prompting steps, #GPT5 usually did NOT select the same underlying reason as a human rater: doi.org/10.48550/arXiv.2601.05

    #AI #cogSci #textAnalysis #psychometrics

  9. I've been digging around for open-source text analysis apps and found AntConc via a Reddit thread. From early quick testing, this app looks very good. I'm looking at term frequency across relevant papers, plus some 'concordance' context, but AntConc will do a lot more. Together with Taguette, you have all you need for a lot of analysis.

    I'm running the portable version on Windows, but Mac and Linux also work.
    laurenceanthony.net/software/a

    #AntConc #textanalysis #research #academia #academicchatter #linguistics

  10. Recs for text analysis tools with no or only minimal GenAI: Taguette, QDA Miner, what else? Common word analysis across a bulk set of documents (around 50 papers) is what I'm mainly looking for, as well as individual document labelling. Open source, free, Windows 10.
    #QualitativeData #textanalysis #software #research #academia #academicchatter #opensource
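
For the kind of bulk common word analysis asked about here, a minimal sketch using only the Python standard library might look like the following. It is not taken from any of the tools named in these posts; the stop-word list and the function names `tokenize` and `common_words` are hypothetical, and the sketch assumes the papers have already been converted to plain-text strings.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on"}

def tokenize(text):
    """Lowercase the text and return its word tokens, minus stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def common_words(documents, top_n=10):
    """Return the top_n most frequent terms and a per-term document count.

    term_counts counts every occurrence of a term across all documents;
    doc_counts counts in how many documents each term appears at least once.
    """
    term_counts = Counter()
    doc_counts = Counter()
    for text in documents:
        tokens = tokenize(text)
        term_counts.update(tokens)
        doc_counts.update(set(tokens))  # each document counts a term once
    return term_counts.most_common(top_n), doc_counts

if __name__ == "__main__":
    papers = [
        "The analysis of text data",
        "Text analysis in research",
        "Research on data",
    ]
    top, doc_counts = common_words(papers)
    for term, count in top:
        print(term, count, "in", doc_counts[term], "documents")
```

Reporting the document count next to the raw count helps separate terms that are common across the whole set from terms that one long paper repeats many times.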

  11. Charting Twain: Building a Character Interaction Graph with Quarkus, OpenNLP, and a local Ollama Model. Uncover hidden dynamics in Huckleberry Finn using Java, sentiment analysis, and modern NLP.
    myfear.substack.com/p/text-ana
    #Java #Quarkus #OpenNLP #TextAnalysis

  12. Ah, the groundbreaking revelation that #LLMs don't handle more words as well as they handle fewer. 🤯 Who knew that feeding a massive text blob would confuse a glorified autocomplete? 😂 Next week: water is wet! 🌊
    research.trychroma.com/context #textanalysis #AIhumor #technews #revelations #HackerNews #ngated

  13. Wow! #QualiService could be a great resource!

    It wasn't obvious to me how to find the transcripts for these doctor-patient interaction data from 4 countries, but if such transcripts are accessible, that's GREAT!

    qualiservice.org/en/qsearch.ht

    #medicine #openData #cogSci #TextAnalysis

  14. 🇪🇺 Want to analyze text from the EU public consultations? EU public consultations are a way in which the EU invites the broader public to publicly comment on upcoming legislation.

    📦 I just published a first version of a Python package {eu-consultations} to scrape and extract text from the EU website:
    github.com/marioangst/eu_consu

    - download consultation data as displayed on the EU's frontend into a validated form
    - download associated files (this is the hard part about analysing this data - lots of feedback is in .docx and .pdf files)
    - extract text from the files using docling and attach to feedback

    You get all data in validated form and possibly stored in huge (sorry for that) JSON files ;).

    This package is part of an analysis project on feedback the EU has received via the public consultation process on digital policy we plan to present later this year, but I thought let's make some of the tools we use open source way earlier already.

    #python #textanalysis #policyanalysis #CompSocSci

  15. As we found in “Your Health vs. My Liberty” (doi.org/10.1016/j.cognition.20), Yael Rozenblum et al. found that compliance with #publicHealth guidance correlated with indicators of the perceived threat of a viral pandemic.

    Also, relying on #misinformation correlated with reliance on simple (vs. complex) #reasoning.

    The free paper: doi.org/10.1002/tea.21975

    #medicine #health #education #psychology #epistemology #logic #textAnalysis

  16. Have you ever wanted to use a #LLM as one step in a workflow?

    We integrated #GPT into the open-source analysis platform #useGalaxy, where you can link GPT to several thousand other tools, add more attachments for analysis and make your research reproducible.

    galaxyproject.org/news/2024-09

    In our example, we uploaded an audio file, used #Whisper to convert it into text, cut out the moderation, and prompted ChatGPT to translate it into German.

    #DH #textanalysis #tools
    @galaxyfreiburg

  17. 📚🇮🇹 New working paper: "Evaluating Embedding Models for Clustering Italian Political News"

    This study compares embedding models for unsupervised clustering of Italian political news shared on Facebook before the 2018 and 2022 elections, aiming to advance NLP methods for political text analysis in non-English languages.

    Paper: osf.io/preprints/osf/2j9ed

    Code & data: github.com/fabiogiglietto/Sema

    Feedback welcome!

    #NLP #PoliticalScience #TextAnalysis #MachineLearning

  18. Word co-occurrence matrix/heatmap

    How to compute and visualize the correlation between terms that occur together in a list of documents*

    *documents: keywords, page titles, product names/descriptions, social media posts, etc.

    bit.ly/3Z4tiTx

    #DataVisualization #textanalysis #DataScience #Python
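
One way to sketch the counting step behind such a matrix, using only the Python standard library, is shown below. This is not the linked article's method: it counts how many documents contain both terms of a pair rather than computing a correlation, it splits on whitespace only, and the function names `cooccurrence` and `to_matrix` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(documents):
    """Count, for each unordered pair of terms, how many documents contain both."""
    pair_counts = Counter()
    for text in documents:
        # Deduplicate and sort so each pair has one canonical (a, b) order.
        terms = sorted(set(text.lower().split()))
        pair_counts.update(combinations(terms, 2))
    return pair_counts

def to_matrix(pair_counts):
    """Expand pair counts into a symmetric matrix plus its row/column labels."""
    vocab = sorted({term for pair in pair_counts for term in pair})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (a, b), n in pair_counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n  # mirror across the diagonal
    return vocab, matrix

if __name__ == "__main__":
    titles = ["red shoes", "red dress", "blue shoes red"]
    vocab, matrix = to_matrix(cooccurrence(titles))
    print(vocab)
    for label, row in zip(vocab, matrix):
        print(label, row)
```

The resulting list of lists can be passed to a plotting library as heatmap input, with `vocab` supplying the axis labels.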

  19. The Digital Humanities Team at the University of Vienna and the Ottoman Nature in Travelogues (ONiT) project are hosting a #hackathon focused on analyzing texts, images, and multimodal sources.

    Thursday, November 14, 9:00 CET to Friday, November 15, 15:00 CET
    dh.univie.ac.at/hackathon/
    #DigitalHumanities #ComputationalHumanities #TextAnalysis #ImageAnalysis

  20. It was also a methodologically fun paper, combining digitized archival text, Census & survey data, NLP, and panel models.

    Email or dm me for a copy! #sociology #textanalysis #rstats

    3/3

  21. 📣 Attention Linguistics & Digital Humanities students! 🎓📚
    Join @janispagel and me for the »Prompting, Evaluation, Interpretation: An Introduction to LLMs in Text Analysis« course at the upcoming Deep Learning for Language Analysis Summer School in Cologne: ml-school.uni-koeln.de! 📝🔍
    🗓️ Don't miss out – registration is open until June 16th! 🙌
    #LLMs #TextAnalysis #NLP #AI #Linguistics #DigitalHumanities #CRETA

  22. Want to learn more about how to use regular expressions in R?

    Come join us to learn how to use regular expressions to parse and clean text data on Thursday, June 6th, 5-6pm Eastern Time!

    Find the Zoom registration details on our website:

    rug-at-hdsi.org/upcoming_event

  23. Bias estimation in word embeddings using a Bayesian approach instead of WEAT or MAC. A new paper in Computational Linguistics.

  24. How would you go about creating a filter that blocks posts about things that people hate?

    I've thought I could build a text classifier, but it could be hard to train since I'd need to guess whether or not the author hates the thing they are posting about.

    I wouldn't want it to become a filter for all current events news, but I suspect that's what it would become.

    #fediverse #mastodon #machineLearning #tfidf #classification #socialMedia #classifier #textAnalysis #programming #tech #technology

  25. JOB: Research Associate (m/f/d) in Digital Humanities
    At the Faculty of Arts and Humanities of RWTH Aachen University (full-time position, pay grade TV-L 13), in the field of Digital Humanities. The position is initially limited to three years.

    #DigitalHumanities #TextAnalysis #QuantitativeMethods #LiteraryCorpora

    accels.rwth-aachen.de/cms/ACCE

  30. Automated scoring of reflective thinking in accounting students' writing "positively related to data analytics assignment grades [but] #emotionalIntelligence (EI) was not found to moderate th[is] relationship" (N = 86).

    Images of pages from the thesis are attached: udallas-ir.tdl.org/handle/20.5

    #CriticalThinking #Emotion #EmotionalIntelligence #EQ #NaturalLanguageProcessing #NLP #TextAnalysis #DevelopmentalPsychology #DevPsych #Teaching #Education #DataAnalysis