URMC CTSI: Exploring E-Cigarette Perception among Spanish and English-Speaking Users on Social Media

Analyzing Multilingual Perceptions of E-Cigarettes on Social Media

Team Members: Chen-Jui Chen, Anik De, Wonha Shin, Yadi Zhang, Chen Zhang

The goal of the UR Clinical & Translational Science Institute (CTSI) project was to analyze public perceptions of e-cigarettes in English- and Spanish-language social media posts in order to inform culturally appropriate public health interventions. Students built a natural language processing (NLP) pipeline for the classification tasks: cleaning, preprocessing, and labeling the data, then fine-tuning transformer-based machine learning models (e.g., BERTweet and Twitter Twin BERT Large). They then applied Sentence Transformers and K-means clustering for topic modeling to analyze thematic content. The team’s results revealed nuanced differences between linguistic groups: English posts focused on public health and regulatory concerns, while Spanish posts highlighted the cultural and lifestyle aspects of vaping. Further attitude analysis revealed significant discrepancies between traditional sentiment metrics for English and Spanish speakers. Translation evaluations revealed potential information loss but also highlighted cultural nuances that could affect interpretation of the project’s findings. The team’s work underscores the need for culturally sensitive health messaging and policy approaches, and the value of robust multilingual analyses in informing public health strategies.
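The topic-modeling step described above (embed each post, then group the embeddings with K-means) can be illustrated with a minimal sketch. This is not the team's code. The `embed` function is a toy hashed bag-of-words stand-in so that the example runs without downloading a model; a real pipeline would more likely obtain vectors from the Sentence Transformers library (`SentenceTransformer(...).encode`) and cluster them with a library implementation such as scikit-learn's `KMeans`. The posts, the number of clusters, and all function names are hypothetical.

```python
import math
import random
import zlib


def embed(text, dim=64):
    """Toy stand-in for a sentence embedding (assumption, not the team's model):
    hashed bag-of-words counts, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash()
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iters=50, seed=0):
    """Plain Lloyd's algorithm: assign each vector to its nearest centroid,
    recompute centroids as cluster means, stop when assignments no longer change."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up example posts, for illustration only.
posts = [
    "new vaping regulation announced by health officials",
    "health officials warn about vaping regulation gaps",
    "trying a mango flavor with friends at the party",
    "friends at the party loved the new mango flavor",
]
embeddings = [embed(p) for p in posts]
labels = kmeans(embeddings, k=2)

for label, post in zip(labels, posts):
    print(label, post)
```

Each cluster can then be read as a candidate topic by inspecting the posts assigned to it. With the toy embedding the grouping reflects shared words only; semantic grouping depends on using a real sentence-embedding model.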