Undergraduate and graduate student presentations from the School of Information Technology, 2021 Online University Research Symposium, Illinois State University
Truong (Jack) Luu
Social media has created an unprecedented way for individuals to share their concerns, fears, optimism, and happiness in ways that were not even conceivable some 20 years ago. The extensive data available from social media platforms such as Twitter make them an invaluable resource for opinion mining and sentiment analysis. Since December 2019, the coronavirus pandemic has had devastating consequences all over the planet, sparing no country. The health, social, and economic tolls associated with the pandemic have generated intense emotions and spread fear among people of all ages, genders, and races. During these difficult times, many have shared their feelings and opinions on many aspects of their lives via Twitter. In this project we use machine learning to measure subjectivity polarity in COVID-19-related tweets, labelling each tweet as positive, negative, or neutral depending upon the vocabulary it contains. Our work focused on a detailed study of the distribution of opinions among the primary U.S. states. We also tested the relationship between the sentiment scores and the cases of COVID-19 in the United States, establishing a link between the sentiment scores, the reported cases, and the death toll. The findings may assist with implementing legislation related to COVID-19, act as a reference for scientific work, and inform and educate the public on critical pandemic-related issues.
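To make the labelling step concrete, here is a minimal sketch of vocabulary-based polarity labelling. It is not the project's actual model: the abstract does not name the classifier or lexicon used, so the word lists, the `polarity_score` and `label_tweet` functions, and the zero threshold are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mini-lexicons for illustration only; a real study would rely on
# a much larger vocabulary or a trained model.
POSITIVE_WORDS = {"hope", "recover", "safe", "grateful", "relief"}
NEGATIVE_WORDS = {"fear", "death", "sick", "lockdown", "crisis"}


def polarity_score(tweet: str) -> float:
    """Return a score in [-1, 1]: (positive hits - negative hits) / total hits."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    pos = sum(token in POSITIVE_WORDS for token in tokens)
    neg = sum(token in NEGATIVE_WORDS for token in tokens)
    total = pos + neg
    # A tweet with no lexicon words carries no polarity signal.
    return 0.0 if total == 0 else (pos - neg) / total


def label_tweet(tweet: str) -> str:
    """Map the polarity score to one of the three labels named in the abstract."""
    score = polarity_score(tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["Grateful and safe at home",
                 "Fear of death during lockdown",
                 "Cases reported today"]:
        print(text, "->", label_tweet(text))
```

Per-tweet scores of this kind could then be averaged by state or by day before being compared with reported case counts.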
Preston Nowlin, RJ Benefiel, and Evan Hazzard
The purpose of our research project is to explore and analyze an anonymized data set of 2,300 School of Information Technology students who attended ISU between 1996 and 2016 and to present a visualization of student retention predictors. Visualizing the various factors influencing student retention required knowledge of both computer programming and data analysis. We used Microsoft Power BI for data visualization and the Python programming language to explore potential predictors of Information Technology (IT) student retention. Power BI was also used to create a dashboard that helped us visualize demographic attributes. We then conducted predictive analytics (i.e., multiple linear regression and logistic regression) using Python and found that logistic regression is most suitable for our student retention data. Exploring these factors through various data science techniques helped us better understand the relationships between student retention and other factors. Insights from our data analyses and retention strategies are provided.
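As an illustration of why logistic regression fits a retained/not-retained outcome, the sketch below fits a logistic model by plain gradient descent using only the standard library. The abstract does not list the predictors or the Python library used, so the two features (first-term GPA and credit hours divided by 10), the toy records, and the function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Estimated probability that a student with features x is retained."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            error = predict_proba(weights, bias, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += error * xij
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical toy records: [first-term GPA, credit hours / 10]; 1 = retained.
X = [[3.6, 1.5], [3.2, 1.4], [2.9, 1.2], [3.8, 1.6],
     [1.9, 0.9], [2.1, 0.6], [1.5, 0.8], [2.4, 0.7]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic(X, y)

if __name__ == "__main__":
    print("P(retained | GPA 3.9, 16 credits):", predict_proba(weights, bias, [3.9, 1.6]))
    print("P(retained | GPA 1.4, 5 credits):", predict_proba(weights, bias, [1.4, 0.5]))
```

Unlike multiple linear regression, the model's output is always a probability between 0 and 1, which matches a binary retention outcome.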
Worker behaviors are complex and influenced by various factors when worker-vehicle collisions happen on construction job sites. The proposed research targets the safety challenges that construction management faces when industrial trucks operate around workers. To answer the research question of how to identify the most influential safety hazards and patterns in worker-vehicle coordination, this research first reviews and compares multiple data-mining algorithms for pattern analysis, selects the Latent Dirichlet Allocation (LDA) approach, and designs the corresponding analysis system. It then investigates the patterns of collision accidents in the Occupational Safety and Health Administration (OSHA) database, with the aim of understanding safety hazards and violations in worker-vehicle collisions based on the unstructured OSHA data. The meanings uncovered in the collection of documents through the proposed LDA and statistical analysis can support future implementations of automated construction. This research also models the topics through text classification and suggests that uneven ground and objects under construction are the primary obstacles when workers and trucks move around the sites and should be managed to improve safety.
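The topic-modeling idea can be sketched with a small LDA implementation. The abstract does not describe how the model was fitted, so the collapsed Gibbs sampler below is only one common way to estimate LDA, and the tokenized accident narratives, hyperparameters, and function names are hypothetical rather than taken from the OSHA data.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Estimate LDA topic-word distributions with a collapsed Gibbs sampler."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # topic counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                           # tokens assigned to each topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(n_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[word]] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed probability of each word under each topic.
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return vocab, phi


def top_words(vocab, phi, n=3):
    """Return the n most probable words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


# Hypothetical, already tokenized accident narratives (not real OSHA records).
DOCS = [
    ["truck", "reversing", "struck", "worker", "blind", "spot"],
    ["worker", "struck", "truck", "reversing", "alarm"],
    ["forklift", "tipped", "uneven", "ground", "load"],
    ["uneven", "ground", "forklift", "load", "shifted"],
]

if __name__ == "__main__":
    vocab, phi = lda_gibbs(DOCS, n_topics=2)
    for k, words in enumerate(top_words(vocab, phi)):
        print("topic", k, words)
```

Each topic is reported as its most probable words, which an analyst would then read and interpret as a hazard pattern.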