Undergraduate and graduate student presentations from the School of Information Technology, 2021 Online University Research Symposium, Illinois State University
-
A New Visualization Platform For Analyzing Covid-19 Data And Extracting Critical Information
Jim Brokaw
Visualization is integral to investigating the information hidden in collected healthcare data. Innovative data visualization can give users intuitive feedback for decision making. For COVID-19 data collected from the Centers for Disease Control and Prevention (CDC), effective design and visualization are especially critical for interpreting the pandemic's transformation pattern. In this project, the raw data were first collected and fed into a data processing program written in R, a statistical programming language, which generated datasets that can be further visualized and analyzed. Next, we developed several new web-based algorithms that dynamically visualize and analyze the output of the R program using D3, a JavaScript library for data visualization. Our algorithms create interactive features that can be used to generate innovative data displays in a webpage. With these dynamic and interactive capabilities, our visualization platform can be used to study how COVID-19 spreads among different age groups and genders, yielding insight that can help medical professionals and the healthcare industry take suitable preventive measures.
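The processing step described above, which turns raw case records into a dataset that a D3 page can load, might look roughly like the following. This is a minimal sketch in Python rather than the R program the project used; the field names `age_group` and `sex`, the function `count_by_group`, and the sample records are all hypothetical and do not come from the CDC data.

```python
import json
from collections import Counter

# Hypothetical records for illustration only; not real CDC data.
records = [
    {"age_group": "18-29", "sex": "Female"},
    {"age_group": "18-29", "sex": "Female"},
    {"age_group": "18-29", "sex": "Male"},
    {"age_group": "65+", "sex": "Male"},
]

def count_by_group(rows):
    """Count cases per (age_group, sex) pair and return a list of flat records."""
    counts = Counter((r["age_group"], r["sex"]) for r in rows)
    return [
        {"age_group": age, "sex": sex, "cases": n}
        for (age, sex), n in sorted(counts.items())
    ]

summary = count_by_group(records)
# A JSON array of flat objects is a shape that D3 can bind to chart elements.
payload = json.dumps(summary)
```

A flat list of records is used here because it maps directly onto one visual mark per record; the project's actual output format is not stated in the abstract.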
-
Using Machine Learning To Measure Sentiment During The Covid-19 Pandemic
Truong (Jack) Luu
Social media has created an unprecedented way for individuals to share their concerns, fears, optimism, and happiness in ways that were not conceivable some 20 years ago. The extensive data available from social media platforms such as Twitter make them an invaluable resource for opinion mining and sentiment analysis. Starting in December 2019, the coronavirus pandemic has had devastating consequences all over the planet, sparing no country. The health, social, and economic tolls associated with the pandemic have generated intense emotions and spread fear among people of all ages, genders, and races. During these difficult times, many have shared their feelings and opinions on many aspects of their lives via Twitter. In this project, we used machine learning to measure the subjectivity and polarity of COVID-19-related tweets, labelling each tweet as positive, negative, or neutral depending on the vocabulary it contains. Our work focused on a detailed study of the distribution of opinions across the primary U.S. states. We also tested the relationship between sentiment scores and COVID-19 cases in the United States, establishing a link between the sentiment scores, the reported cases, and the death toll. The findings may assist with implementing COVID-19-related legislation, serve as a reference for scientific work, and inform and educate the public on critical pandemic-related issues.
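The idea of labelling a tweet from the vocabulary it contains can be illustrated with a simple lexicon-based scorer. This is a minimal sketch, not the machine-learning model the project used; the word lists, the function names, and the zero threshold are assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon; the project's actual vocabulary is not given.
POSITIVE = {"hope", "recover", "safe", "grateful", "good"}
NEGATIVE = {"fear", "death", "sick", "lockdown", "bad"}

def polarity_score(text: str) -> float:
    """Return (positive - negative) / matched words, in [-1, 1]; 0.0 if no match."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0

def label(text: str, threshold: float = 0.0) -> str:
    """Map a polarity score to one of the three labels named in the abstract."""
    score = polarity_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

For example, `label("Grateful and safe at home")` yields `"positive"`, while a tweet with no lexicon words, or with equal positive and negative matches, is labelled `"neutral"`.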
-
Exploring Information Technology Student Retention
Preston Nowlin, RJ Benefiel, and Evan Hazzard
The purpose of our research project is to explore and analyze an anonymized data set of 2,300 School of Information Technology students who attended ISU between 1996 and 2016 and to present a visualization of student retention predictors. Visualizing the various factors that influence student retention required knowledge of both computer programming and data analysis. We used Microsoft Power BI for data visualization and applied the Python programming language to explore potential predictors of Information Technology (IT) student retention. Moreover, Power BI was used to create a dashboard that helped us visualize demographic attributes. We later conducted predictive analytics (i.e., multiple linear regression and logistic regression) using Python. We found that logistic regression is the most suitable for our student retention data. Exploring these factors through various data science techniques helped us better understand the relationships between student retention and other factors. Insights from our data analyses and retention strategies are provided.
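Logistic regression suits a retention outcome because the target is binary (retained or not) and the model outputs a probability between 0 and 1. The following is a minimal sketch using only the standard library, fitted by batch gradient descent; the single predictor `gpa`, the synthetic values, and the function names are hypothetical and are not drawn from the ISU data set or from the project's code.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=20000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w: float, b: float, x: float) -> float:
    """Predicted probability of retention for predictor value x."""
    return sigmoid(w * x + b)

# Synthetic illustration: a hypothetical GPA predictor and a retained flag (1 = retained).
gpa = [1.5, 2.0, 2.2, 2.8, 3.0, 3.4, 3.6, 3.9]
retained = [0, 0, 1, 0, 1, 1, 1, 1]

w, b = fit_logistic(gpa, retained)
```

With these synthetic values the fitted slope is positive, so `predict(w, b, 3.9)` is higher than `predict(w, b, 1.5)`. A linear regression on the same 0/1 target could produce predictions outside the 0 to 1 range, which is one common reason to prefer the logistic model for this kind of outcome.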
-
Vehicle-Collision Warning System And Deep Learning Approach
Tianyuan Shi
Worker behaviors are complicated and influenced by various factors when worker-vehicle collisions happen on construction job sites. The proposed research targets the safety challenges of construction management when industrial trucks operate around workers. To answer the research question of how to identify the most influential safety hazards and patterns of worker-vehicle coordination, this research first reviews and compares multiple data-mining algorithms for pattern analysis, selects the Latent Dirichlet Allocation (LDA) approach, and designs the corresponding analysis system. It then investigates the patterns of collision accidents in the Occupational Safety and Health Administration (OSHA) database, with the aim of understanding safety hazards and violations in worker-vehicle collisions from the unstructured OSHA data. The meanings that emerge from the collection of documents through the proposed LDA and statistical analysis can support future implementations of automated construction. This research also models the topics through text classification and suggests that uneven ground and objects under construction are the primary obstacles when workers and trucks move around the sites, and that they should be managed to improve safety.
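LDA treats each document as a mixture of topics and each topic as a distribution over words. One standard way to fit it is collapsed Gibbs sampling, sketched below with the standard library only. This is an illustrative sketch, not the analysis system described in the abstract; the function names, the hyperparameters `alpha` and `beta`, and the toy tokenized narratives are assumptions and are not taken from the OSHA database.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Returns (topic_word, doc_topic): per-topic word counts and
    per-document topic counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word, doc_topic

def top_words(topic_word, n=3):
    """List up to n highest-count words per topic, skipping zero counts."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]

# Invented toy narratives for illustration; not real OSHA records.
docs = [
    ["truck", "reversing", "worker", "struck"],
    ["worker", "struck", "truck", "blind", "spot"],
    ["uneven", "ground", "truck", "tipped"],
    ["ground", "uneven", "worker", "fell"],
]
topic_word, doc_topic = lda_gibbs(docs, n_topics=2)
```

Calling `top_words(topic_word)` lists the highest-count words in each topic, which is how topics are usually inspected and named. On a corpus this small the topics are not meaningful; the sketch only shows the mechanics.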