home.social

#emnlp2023 — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #emnlp2023, aggregated by home.social.

  1. RT by @wikiresearch: The video of the presentation w/ @rnav_arora of our #EMNLP2023 paper on Transparent Stance Detection in Multilingual Wikipedia Editor Discussions predicting Wikipedia policies for content moderation is now online at
    youtu.be/UUuC6Q1SIoM?t=2190 twitter.com/frimelle/status/17

  2. RT by @wikiresearch: Excited to start the new year by presenting our #EMNLP2023 paper on Transparent Stance Detection in Multilingual Wikipedia Editor Discussions w/ @rnav_arora @IAugenstein at the @Wikimedia Research Showcase!
    Online, 17.01., 17:30 UTC

    mediawiki.org/wiki/Wikimedia_R @wikiresearch twitter.com/frimelle/status/17

  3. A paper on the topic by Max Glockner (UKP Lab), @ievaraminta Staliūnaitė (University of Cambridge), James Thorne (KAIST AI), Gisela Vallejo (University of Melbourne), Andreas Vlachos (University of Cambridge) and Iryna Gurevych was accepted to TACL and has just been presented at #EMNLP2023.

    📄 arxiv.org/abs/2104.00640

    ➡️ sigmoid.social/@UKPLab/1115613

  4. At #EMNLP2023, our colleague Jonathan Tonglet presented his master's thesis, conducted at KU Leuven. Find out more about »SEER: A Knapsack approach to Exemplar Selection for In-Context HybridQA« in this thread 🧵:

    ➡️ sigmoid.social/@UKPLab/1113743

  5. Many models produce outputs that are hard for an end user to verify.

    🏆 Our new #emnlp2023 paper won an outstanding paper award for showing that a secondary quality estimation model can help users decide when to rely on the model output.

    We ran a controlled experiment showing that a calibrated quality estimation model can make physicians twice as accurate at deciding when to rely on a translation model's output.

    Paper: arxiv.org/pdf/2310.16924v1.pdf

  6. A group photo from the poster presentation of »AmbiFC: Fact-Checking Ambiguous Claims with Evidence«, co-authored by our colleague Max Glockner, @ievaraminta, James Thorne, Gisela Vallejo, Andreas Vlachos and Iryna Gurevych. #EMNLP2023

  7. A successful #EMNLPMeeting has come to an end! A group photo of our colleagues Yongxin Huang, Jonathan Tonglet, Aniket Pramanick, Sukannya Purkayastha, Dominic Petrak and Max Glockner, who represented the UKP Lab in Singapore! #EMNLP2023

  8. You can find our paper here:
    📃 arxiv.org/abs/2311.00408
    and our code here:
    💻 github.com/UKPLab/AdaSent

    Check out the work of our authors Yongxin Huang, Kexin Wang, Sourav Dutta, Raj Nath Patel, Goran Glavaš and Iryna Gurevych! (6/🧵) #EMNLP2023 #AdaSent #NLProc

  9. What makes the difference 🧐?

    We attribute the effectiveness of the sentence encoding adapter to the consistency between the pre-training and DAPT objectives of the base PLM. If the base PLM is domain-adapted with a different loss, the adapter is no longer compatible, as reflected in a performance drop. (5/🧵) #EMNLP2023

  10. AdaSent decouples DAPT and SEPT by storing the sentence encoding abilities into an adapter, which is trained only once in the general domain and plugged into various DAPT-ed PLMs. It can match or surpass the performance of DAPT→SEPT, with more efficient training. (4/🧵) #EMNLP2023
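The decoupling described in the post above can be sketched roughly as follows: a small residual bottleneck adapter, trained once on general-domain sentence encoding, is plugged unchanged on top of whichever domain-adapted backbone is at hand. The dimensions and weights here are toy values for illustration, not the paper's actual implementation.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def bottleneck_adapter(h, W_down, W_up):
    """Residual bottleneck adapter: h + W_up @ relu(W_down @ h).
    Trained once on general-domain sentence-encoding data, then
    plugged into any DAPT-ed backbone without retraining."""
    return [hi + di for hi, di in zip(h, matvec(W_up, relu(matvec(W_down, h))))]

# Toy dimensions: hidden size 4, bottleneck size 2 (illustrative only).
h = [0.5, 1.0, 0.25, 0.0]           # a representation from a DAPT-ed PLM
W_down = [[0.1] * 4, [0.2] * 4]     # 2x4 down-projection
W_up = [[0.3, 0.0], [0.0, 0.3], [0.1, 0.1], [0.0, 0.0]]  # 4x2 up-projection
out = bottleneck_adapter(h, W_down, W_up)
```

Because the adapter is a residual add-on with its own small parameter set, the same weights can be reused across differently domain-adapted backbones, which is the efficiency argument made in the thread.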

  11. Domain-adapted sentence embeddings can be created by applying general-domain SEPT on top of a domain-adapted base PLM (DAPT→SEPT). But this requires the same SEPT procedure to be done on each DAPT-ed PLM for every domain, resulting in computational inefficiency. (3/🧵) #EMNLP2023

  12. In our #EMNLP2023 paper we demonstrate AdaSent's effectiveness in extensive experiments on 17 different few-shot sentence classification datasets! It matches or surpasses the performance of full SEPT on DAPT-ed PLM (DAPT→SEPT) while substantially reducing training costs. (2/🧵)

  13. Need a lightweight solution for few-shot domain-specific sentence classification?

    We propose #AdaSent!
    🚀 Up to 7.2 acc. gain in 8-shot classification with 10K unlabeled data
    🪶 Small backbone with 82M parameters
    🧩 Reusable general sentence adapter across domains
    (1/🧵) #EMNLP2023

  14. Which factors shape #NLProc research over time? This was the topic of the talk by our colleague Aniket Pramanick at #EMNLP2023!

    Learn more about the paper by him, Yufang Hou, Saif M. Mohammad & Iryna Gurevych here: 📑 arxiv.org/abs/2305.12920

  15. If you are around at #EMNLP2023, look out for our colleague Sukannya Purkayastha, who today presented our paper on the use of Jiu-Jitsu argumentation in #PeerReview, authored by her, Anne Lauscher (Universität Hamburg) and Iryna Gurevych.

    📑 arxiv.org/abs/2311.03998

  16. Check out the full paper on arXiv and the code on GitLab – we look forward to your thoughts and feedback! (9/9) #NLProc #eRisk #EMNLP2023

    Paper 📄 arxiv.org/abs/2211.07624
    Code ⌨️ gitlab.irlab.org/anxo.pvila/se

  17. We also illustrate how our semantic retrieval pipeline provides interpretability of the symptom estimation, highlighting the most relevant sentences. (8/🧵) #EMNLP2023 #NLProc

  18. Our approaches achieve good performance in two Reddit benchmark collections (DCHR metric). (7/🧵) #EMNLP2023 #NLProc

  19. With this aim, we introduce two data selection strategies to detect representative sentences, both unsupervised & semi-supervised.

    For the latter, we propose an annotation schema to obtain relevant training samples. (6/🧵) #EMNLP2023

  20. We build symptom-classifiers following the BDI-II 📋 using a semantic retrieval pipeline to predict every symptom decision.

    Our pipeline searches for semantic similarities over an index of representative sentences for each symptom. (5/🧵) #EMNLP2023
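The retrieval step described above can be sketched as a nearest-neighbour search over per-symptom representative sentences. The toy 3-d "embeddings", symptom names, and threshold below are illustrative assumptions; the actual pipeline embeds sentences with an #SBERT model.

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def predict_symptom(user_vec, symptom_index, threshold=0.5):
    """Return the best-matching symptom and its evidence sentence
    if any representative sentence is similar enough to the user's post."""
    best = max(
        ((sym, sent, cosine(user_vec, vec))
         for sym, entries in symptom_index.items()
         for sent, vec in entries),
        key=lambda t: t[2],
    )
    sym, sent, score = best
    return (sym, sent) if score >= threshold else None

# Toy 3-d "embeddings"; a real pipeline would use an SBERT encoder.
index = {
    "sadness":      [("I feel sad most of the time", [0.9, 0.1, 0.0])],
    "sleep issues": [("I can hardly sleep at night", [0.0, 0.9, 0.2])],
}
print(predict_symptom([0.8, 0.2, 0.0], index))
```

Returning the matched sentence alongside the symptom is what gives the pipeline the interpretability highlighted later in the thread: the evidence for each symptom decision is a concrete retrieved sentence.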

  21. Our goal❗: Estimate depression severity levels for social media users based on their posts. We automatically relate their responses to the BDI-II and calculate their BDI-score related to four depression levels. (4/🧵) #EMNLP2023

  22. We propose an approach which incorporates clinical questionnaires to detect the presence of symptom markers. We adhere to the BDI-II questionnaire 📋, which includes 21 clinically validated symptoms with four alternative responses. (3/🧵) #EMNLP2023 #NLProc

  23. 🧠 Discover how using a method that associates users’ writing semantics with fine-grained depression symptoms can be helpful in the diagnosis of the disease!

    Our data-efficient method is compatible with any kind of #SBERT model! (2/🧵) #EMNLP2023 #NLProc

  24. Can we estimate depression severity from social media texts without much training data?

    ✅ Yes! Semantic Similarity Models come to the rescue!

    Check out our latest #EMNLP2023 paper by Anxo Pérez (CITIC), Neha Warikoo, Kexin Wang (UKP Lab), Javier Parapar (CITIC) and Iryna Gurevych – (1/🧵)

    📝 arxiv.org/abs/2211.07624

  25. As a nice surprise, this paper ended up getting an “honourable mention” at #CONLL at #EMNLP2023. Great work by first authors @briemadu and Pelin!

    arxiv.org/abs/2310.18229

  26. From my friends Jonas Groschwitz and Meaghan Fowlie and their colleagues:
    Check out GrAPES, the Granular AMR Parsing Evaluation Suite at EMNLP and arxiv.org/abs/2312.03480, code at github.com/jgroschwitz/GrAPES! Explore parsers' capabilities, and improve yours, with 36 categories from linguistic to practical. #EMNLP2023

    (Original post: nitter.net/megothedoonch/statu)

  27. You can find our #EMNLP2023 paper here:
    ➡️ arxiv.org/abs/2305.12920

    Check out the work of our authors Aniket Pramanick, Yufang Hou (IBM Research Europe), Saif M. Mohammad (National Research Council Canada) and Iryna Gurevych for more info. See you at #EMNLP2023 this week in Singapore! (7/7)

  28. Datasets such as “Penn Treebank” play a pivotal role, leaving their mark on tasks like “Language Modeling,” “POS Tagging,” and “Semantic Parsing,” whereas tasks like “Speech Recognition” and “Summarization” witness the evolution of datasets over time.

    Stay tuned for the deep dive into #NLProc Research! (6/🧵) #EMNLP2023

  29. Two works from @dieuwke on the stability or consistency of outputs/metrics, or, if you prefer (and really, I do), reliability.

    Datasets for compositional generalization do not agree with each other. That means different models are good at different things, but also that the metrics don't measure what we thought...

    @Adinawilliams
    @_dieuwke_

    #emnlp #EMNLP2023

  30. Findings from #WMT23
    Our Chat4 friend is in the winning group across tasks
    Most submissions still use from-scratch training
    Fewer constrained (low-resource) submissions than before
    More test suite submissions!
    Low-resource results TBD (tech issue)
    #EMNLP2023 #WMT #neuralEmpty #LLMs

  31. 🧮 MathDial is based on #GSM8k and annotated with ground-truth solutions, student guesses, and plenty of annotations from teachers about the student solution, confusion, quality of dialog, and many more. (3/🧵) #EMNLP2023

  32. In their #EMNLP2023 paper, the authors use a capacity constraint to control the size in tokens of the prompt and diversity constraints to favor the selection of exemplars – sharing the same reasoning properties as the test problem.

    They propose #SEER, a method to automatically generate a Knapsack program for HybridQA problems. It achieves superior performance to exemplar selection baselines on the FinQA and TAT-QA datasets (3/🧵) #EMNLP2023
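The capacity constraint described above is, at its core, a knapsack: maximize the total value of selected exemplars subject to a token budget. The sketch below uses a classic 0/1 knapsack dynamic program with hypothetical exemplar names, token counts, and relevance scores; the actual SEER program is an integer linear program that additionally encodes diversity constraints.

```python
def select_exemplars(exemplars, token_budget):
    """0/1 knapsack over token costs: pick the exemplar subset with
    maximum total relevance score whose prompt length fits the budget.
    `exemplars` is a list of (name, tokens, score) triples."""
    # best[c] = (total score, chosen names) achievable with capacity c
    best = [(0.0, [])] * (token_budget + 1)
    for name, tokens, score in exemplars:
        for c in range(token_budget, tokens - 1, -1):
            cand_score = best[c - tokens][0] + score
            if cand_score > best[c][0]:
                best[c] = (cand_score, best[c - tokens][1] + [name])
    return best[token_budget][1]

# Hypothetical exemplars: (name, prompt length in tokens, relevance score).
pool = [("ex1", 60, 3.0), ("ex2", 40, 2.5), ("ex3", 50, 2.0), ("ex4", 30, 1.0)]
print(select_exemplars(pool, token_budget=100))  # ex1 + ex2 fill the budget best
```

Shrinking the budget changes the selection automatically, which is the appeal of framing exemplar selection as constrained optimization rather than fixed top-k retrieval.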

  33. Can we combine integer linear programming with exemplar selection to improve In-Context Learning?

    Yes! All you need is to optimize your Knapsack 🎒

    The paper by Jonathan Tonglet, Manon Reusens, Philipp Borchert and Bart Baesens on #SEER was just accepted to #EMNLP2023 – learn more in this thread (1/🧵).

    📰 arxiv.org/abs/2310.06675v2

  34. Proud to announce that our paper "Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis" has been accepted to Findings of #EMNLP2023.
    This is joint work with Matthieu Zimmer, Gerasimos Lampouras, Derrick Goh Xin Deik, and Ignacio Iacobacci.

    Code Synthesis, the generation of programming-language code from a natural-language description, is a challenging problem for #LLMs.
    Various Reinforcement Learning methods have been proposed to improve the performance of pretrained models.
    One #RL approach is to use functional tests (Unit Tests) as the reward signal; however, this requires data consisting of (i) NL problem prompts and (ii) varied unit tests for each problem to assess functional correctness, which is often unavailable. Some datasets such as #HumanEval and #MBPP exist; however, these are limited in size and contain (relatively) simple problems.
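The unit-test reward signal described above can be sketched as the fraction of tests a generated program passes. The harness, function name `solve`, sample, and tests below are illustrative assumptions, not the paper's implementation, which would also need sandboxed execution and timeouts.

```python
def unit_test_reward(program_src, tests):
    """Reward for RL on code synthesis: fraction of unit tests passed.
    `tests` is a list of (args, expected) pairs for a function `solve`.
    Broken or unparseable generations receive zero reward."""
    namespace = {}
    try:
        exec(program_src, namespace)   # define the generated function
        solve = namespace["solve"]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if solve(*args) == expected:
                passed += 1
        except Exception:
            pass                       # a crashing test simply earns no credit
    return passed / len(tests)

# A hypothetical model sample and its unit tests.
sample = "def solve(a, b):\n    return a + b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((2, 2), 5)]
print(unit_test_reward(sample, tests))  # 2 of 3 tests pass
```

A fractional (rather than all-or-nothing) reward gives the policy gradient a denser learning signal, since partially correct programs are distinguished from completely broken ones.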

    We show how to programmatically derive new training data for functional test-based Code Synthesis RL, generating and converting automatic tests from a strongly typed language (Java) to a weakly typed language (Python). This allows us to generate arbitrary amounts of test-annotated data.

    We then introduce a straightforward yet effective practical REINFORCE-based Actor-Critic RL approach that makes use of unit-test-annotated data to tune a function-level Code Synthesis LM.
    Crucially, we find that keeping the Critic in sync with the Policy yields better results than pretraining and freezing the Critic.
    Use of our augmentation data further improves model performance.

    Preprint available at arxiv.org/abs/2310.13669; code and model will be made available.

    #MachineLearning #AI #ML #ReinforcementLearning #LLM #PLM #CodeSynthesis #Huawei

  36. Our ERC Award-winning #InterText team invites the Natural Language Processing community to explore low-resource cross-domain discourse analysis of peer reviews in our new #PragTag2023 Shared Task! Registration for the EMNLP-2023 Argument Mining Workshop is open: codalab.lisn.upsaclay.fr/compe

    #ScholarlyCommunication #metascience #emnlp2023 #NLProc #ArgumentMining

  37. 🔮 #BlackboxNLP will be back in 2023 at #EMNLP2023! ❄ We will keep updates posted on our website: blackboxnlp.github.io

    While you wait, also check out our YouTube channel: youtube.com/@blackboxnlp