Award Abstract # 2142739
CAREER: Language Technologies Against the Language of Social Discrimination

NSF Org: IIS (Div Of Information & Intelligent Systems)
Recipient: UNIVERSITY OF WASHINGTON
Initial Amendment Date: March 7, 2022
Latest Amendment Date: July 21, 2023
Award Number: 2142739
Award Instrument: Continuing Grant
Program Manager: Tatiana Korelsky
tkorelsk@nsf.gov
(703)292-0000
IIS: Div Of Information & Intelligent Systems
CSE: Direct For Computer & Info Scie & Enginr
Start Date: September 1, 2022
End Date: August 31, 2027 (Estimated)
Total Intended Award Amount: $550,436.00
Total Awarded Amount to Date: $240,808.00
Funds Obligated to Date: FY 2022 = $132,317.00; FY 2023 = $108,491.00
History of Investigator:
  • Yulia Tsvetkov (Principal Investigator)
    yuliats@cs.washington.edu
Recipient Sponsored Research Office: University of Washington
4333 Brooklyn Ave NE, Seattle, WA 98195-1016, US
(206)543-4043
Sponsor Congressional District: 07
Primary Place of Performance: University of Washington
4333 Brooklyn Ave NE, Seattle, WA 98195-0001, US
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): HD1WMN6945W6
Parent UEI:
NSF Program(s): Robust Intelligence
Primary Program Source: 010V2122DB R&RA ARP Act DEFC V
01002324DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 102Z, 1045, 7495
Program Element Code(s): 749500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2).

The exponential growth of online social platforms provides an unprecedented source of equal opportunities for accessing expert and crowd wisdom, and for finding education, employment, and friendships. One key root cause that can deeply impede these experiences is exposure to implicit social bias. The risk is high, since biases are pernicious and pervasive, and it is well established that language is a primary means through which stereotypes and prejudice are communicated and perpetuated. This project develops language technologies to detect and intervene in the language of social discrimination (sexist, racist, and homophobic microaggressions, condescension, objectification, dehumanizing metaphors, and the like), which can be unconscious and unintentional but causes prolonged personal and professional harms. The program opens up new research opportunities with implications for natural language processing, machine learning, data science, and computational social science. It develops new Web-scale algorithms to automatically detect implicit and disguised toxicity, as well as hate speech and abusive language, online. Technologically, it develops new methods to surface and demote spurious patterns in deep-learning models, and new techniques to interpret these models, thereby opening new avenues to reliable and interpretable machine learning. Successful completion of the program will lay the groundwork for a paradigm shift in how civility in cyberspace is monitored, shielding vulnerable populations from discrimination and aggression and reducing the mental load of platform moderators. The project can therefore benefit and empower a vast number of individuals (members of groups disadvantaged by gender, race, age, sexual orientation, or ethnicity) who use social media or AI technologies built upon user-generated content. Finally, the educational curriculum developed by this program will equip future technologists with theoretical and practical tools for building ethical AI, and will substantially promote diversity, equity, and inclusion in STEM education, helping to foster a new, more diverse generation of researchers entering AI.
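The abstract does not specify how spurious patterns are surfaced and demoted. The sketch below illustrates one common instance-reweighting realization of the idea: train a deliberately weak "bias-only" model restricted to a suspected shortcut feature, then downweight the training examples that the shortcut already explains so the main model cannot rely on it. The toy corpus, the choice of "engineers" as the spurious feature, and the scikit-learn models are all illustrative assumptions, not the project's actual method.

```python
"""Minimal sketch of demoting a spurious pattern via instance reweighting.

Assumptions (not from the award abstract): a toy toxicity corpus, the word
"engineers" as the suspected shortcut, and scikit-learn linear models.
"""
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy corpus in which "engineers" spuriously co-occurs with the toxic class.
texts = [
    "engineers are terrible at communicating",   # toxic
    "engineers always think they know best",     # toxic
    "the engineers shipped the fix overnight",   # benign
    "what a thoughtful and kind reply",          # benign
    "that comment was rude and dismissive",      # toxic
    "thanks for the kind words",                 # benign
]
y = np.array([1, 1, 0, 0, 1, 0])

# Bias-only model: sees ONLY the suspected spurious feature.
bias_vec = CountVectorizer(vocabulary=["engineers"])
Xb = bias_vec.transform(texts)
bias_model = LogisticRegression().fit(Xb, y)

# Probability the bias-only model assigns to each example's gold label.
p_gold = bias_model.predict_proba(Xb)[np.arange(len(y)), y]

# Demotion: examples the shortcut already explains get small weights.
weights = 1.0 - p_gold

# Main model trained on full features, with the shortcut demoted.
full_vec = CountVectorizer()
Xf = full_vec.fit_transform(texts)
main_model = LogisticRegression().fit(Xf, y, sample_weight=weights)
print("per-example weights:", np.round(weights, 2))
```

In practice the bias-only model in this family of methods can be any deliberately weak learner (a shallow classifier, a model trained on partial input), and the reweighting can be replaced by ensembling; the downweighting step shown here is just the simplest variant.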

The overarching goal of this CAREER project is to develop lightly supervised, interpretable machine learning approaches, grounded in social psychology and causal reasoning, to detect implicit social bias in written discourse and narrative text. More specifically, the first phase of the project develops algorithms and models for identifying and explaining gendered microaggressions in short social media comments, first in an unsupervised fashion and then with active learning, given limited supervision from trained annotators. It provides transformative solutions for making existing overparameterized, black-box neural networks more robust and more interpretable. Since microaggressions are often implicit, it also develops approaches to generate explanations of the microaggression detector's decisions. In the second phase, the project addresses the challenging task of detecting biased framing of members of the LGBTQ community in narrative domains of digital media, and develops data-analytic tools by operationalizing well-established social psychology theories across languages. The expected outcomes of this five-year program include new datasets, algorithms, and models that provide people-centered text analytics and that pinpoint and explain potentially biased framings across languages, data domains, and social contexts.
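As a rough illustration of the lightly supervised setup described above (a seed of labels, then active learning with limited annotation), the sketch below runs uncertainty-sampling active learning over a tiny hypothetical comment pool. The TF-IDF features, the logistic-regression detector, and the `oracle` function standing in for a trained human annotator are illustrative assumptions, not the project's actual models or acquisition strategy.

```python
"""Illustrative active-learning loop for a binary microaggression detector.

Assumptions (not from the award abstract): TF-IDF features, logistic
regression, uncertainty sampling, and a keyword-based stand-in annotator.
"""
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical pool of unlabeled comments; real data would be social media.
pool = [
    "great job on the talk",
    "you're so articulate for an engineer",
    "thanks for the detailed review",
    "she was surprisingly logical",
    "let's sync up tomorrow",
    "he explained it to her very slowly",
]

def oracle(text: str) -> int:
    """Stand-in for a trained human annotator (1 = microaggression)."""
    cues = ("surprisingly", "for an", "slowly")
    return int(any(c in text for c in cues))

vec = TfidfVectorizer()
X = vec.fit_transform(pool)

# Seed with two labeled examples; the rest of the pool starts unlabeled.
labeled = {0: oracle(pool[0]), 1: oracle(pool[1])}

for _ in range(3):  # three rounds of annotation queries
    idx = list(labeled)
    clf = LogisticRegression().fit(X[idx], [labeled[i] for i in idx])
    unlabeled = [i for i in range(len(pool)) if i not in labeled]
    if not unlabeled:
        break
    # Uncertainty sampling: query the comment whose predicted probability
    # is closest to 0.5, i.e. where the detector is least confident.
    probs = clf.predict_proba(X[unlabeled])[:, 1]
    query = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
    labeled[query] = oracle(pool[query])

print(f"labels collected: {len(labeled)} of {len(pool)}")
```

The point of the loop is the budget: only the examples the current detector is least sure about are sent to annotators, which is how limited supervision can be spent efficiently on implicit, hard-to-spot cases.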

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing 1 - 10 of 12)

He, Tianxing and Zhang, Jingyu and Wang, Tianle and Kumar, Sachin and Cho, Kyunghyun and Glass, James and Tsvetkov, Yulia. "On the Blind Spots of Model-Based Evaluation Metrics for Text Generation." ACL: Annual Meeting of the Association for Computational Linguistics, 2023.
Han, Xiaochuang and Kumar, Sachin and Tsvetkov, Yulia. "SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control." ACL: Annual Meeting of the Association for Computational Linguistics, 2023.
Feng, Shangbin and Tan, Zhaoxuan and Zhang, Wenqian and Lei, Zhenyu and Tsvetkov, Yulia. "KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding." ACL: Annual Meeting of the Association for Computational Linguistics, 2023.
Feng, Shangbin and Park, Chan Young and Liu, Yuhan and Tsvetkov, Yulia. "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models." ACL: Annual Meeting of the Association for Computational Linguistics, 2023.
Field, Anjalie and Coston, Amanda and Gandhi, Nupoor and Chouldechova, Alexandra and Putnam-Hornstein, Emily and Steier, David and Tsvetkov, Yulia. "Examining risks of racial biases in NLP tools for child protective services." FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023. https://doi.org/10.1145/3593013.3594094
Kumar, Sachin and Balachandran, Vidhisha and Njoo, Lucille and Anastasopoulos, Antonios and Tsvetkov, Yulia. "Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey." EACL: Conference of the European Chapter of the Association for Computational Linguistics, 2023.
Kumar, Sachin and Paria, Biswajit and Tsvetkov, Yulia. "Constrained Sampling from Language Models via Langevin Dynamics in Embedding Spaces." EMNLP: Conference on Empirical Methods in Natural Language Processing, 2022.
Kumar, Sachin and Paria, Biswajit and Tsvetkov, Yulia. "Gradient-based Constrained Sampling from Language Models." EMNLP: Conference on Empirical Methods in Natural Language Processing, 2022.
Park, Chan Young and Mendelsohn, Julia and Field, Anjalie and Tsvetkov, Yulia. "Challenges and Opportunities in Information Manipulation Detection: An Examination of Wartime Russian Media." EMNLP: Conference on Empirical Methods in Natural Language Processing, 2022.
Lin, Inna and Njoo, Lucille and Field, Anjalie and Sharma, Ashish and Reinecke, Katharina and Althoff, Tim and Tsvetkov, Yulia. "Gendered Mental Health Stigma in Masked Language Models." EMNLP: Conference on Empirical Methods in Natural Language Processing, 2022.
Field, Anjalie and Park, Chan Young and Theophilo, Antonio and Watson-Daniels, Jamelle and Tsvetkov, Yulia. "An analysis of emotions and the prominence of positivity in #BlackLivesMatter tweets." Proceedings of the National Academy of Sciences, v.119, 2022. https://doi.org/10.1073/pnas.2205767119
