Research article

Public opinion evaluation on social media platforms: a case study of High Speed 2 (HS2) rail infrastructure project

Authors
  • Ruiqiu Yao (Civil, Environmental and Geomatic Engineering, University College London, London, UK)
  • Andrew Gillen (Department of Civil and Environmental Engineering, Northeastern University, Boston, MA, USA)

This is version 2 of this article; the published version can be found at: https://doi.org/10.14324/111.444/ucloe.000063

Abstract

Public opinion evaluation is becoming increasingly significant in infrastructure project assessment. The inefficiencies of conventional evaluation approaches can be improved with social media analysis. Posts about infrastructure projects on social media provide a large amount of data for assessing public opinion. This study proposed a hybrid model which combines pre-trained RoBERTa and gated recurrent units for sentiment analysis. We selected the United Kingdom railway project, High Speed 2 (HS2), as the case study. The sentiment analysis showed the proposed hybrid model has good performance in classifying social media sentiment. Furthermore, the study applies latent Dirichlet allocation topic modelling to identify key themes within the tweet corpus, providing deeper insights into the prominent topics surrounding the HS2 project. The findings from this case study serve as the basis for a comprehensive public opinion evaluation framework driven by social media data. This framework offers policymakers a valuable tool to effectively assess and analyse public sentiment.

Keywords: public opinion evaluation, civil infrastructure projects, machine learning, sentiment analysis, topic modelling

Rights: © 2023 The Authors.

Published on 07 Sep 2023
Peer Reviewed

 Open peer review from Kwadwo Agyapon-Ntra

Review

Review information

DOI: 10.14293/S2199-1006.1.SOR-ENG.AUFIJ9.v1.RZTSUZ
License: This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.

ScienceOpen disciplines: Engineering, Civil engineering
Keywords: Public opinion evaluation, Sentiment analysis, Transport, Policy and law, Environmental policy and practice, Machine learning, Civil infrastructure projects, Topic modelling, Sustainability

Review text

Summary:
In this study, the authors successfully conduct an analysis of public opinion on a public infrastructure project (the High Speed 2 railway project in the United Kingdom) using data from Twitter and machine learning algorithms. The approach is sound, but the methodology would benefit from adopting state-of-the-art (SOTA) models, benchmarking against very basic models, and handling potentially imbalanced datasets. Details are provided below:

  1. Deep learning techniques, especially those that employ transformer architectures, are the current SOTA. While methods like Naive Bayes, SVMs and LDA are still very useful, it would be prudent to compare them with the results from transformer-based deep learning architectures. Neural networks in the transformer family, fine-tuned for specific tasks like classification, have proven to be a very promising research direction in recent years, and some models like twitter-roberta-base-sentiment can be used out of the box. Since these deep learning architectures transform text into numerical embeddings that preserve semantic context to a degree, they reduce the amount of pre-processing that has to be done on tweets (like stemming and stop-word removal).
  2. Another good tool to consider for establishing baselines is the VADER sentiment analysis model, which was developed specifically for social media use cases. In the worst case it can serve as a reasonable baseline, since it requires no training.
  3. Steps should be taken to address dataset imbalance. If any such steps were taken, they were not stated. Imbalance can cause issues for a classifier, such as overfitting to a label with an overwhelmingly higher representation. The F1 score is a good metric for catching this, but it might be better to train on a balanced dataset.
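
As an illustration of point 3 (not taken from the manuscript or from this review), the following minimal Python sketch shows why accuracy can hide class imbalance while the macro-averaged F1 score exposes it, and how a training set might be balanced by random oversampling. The label names, the 90/10 split and the helper names `accuracy`, `macro_f1` and `oversample` are all hypothetical; only the standard library is used.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


def oversample(texts, labels, seed=0):
    """Randomly duplicate minority-class items until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((text, label) for text in items + extra)
    rng.shuffle(balanced)
    return [t for t, _ in balanced], [l for _, l in balanced]


# Hypothetical imbalanced data: 90 negative tweets, 10 positive tweets.
y_true = ["negative"] * 90 + ["positive"] * 10
# A degenerate classifier that always predicts the majority label.
y_pred = ["negative"] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.2f}, macro F1 = {f1:.2f}")

# Balance a (hypothetical) training set before fitting a classifier.
texts = [f"tweet {i}" for i in range(100)]
balanced_texts, balanced_labels = oversample(texts, y_true)
print(Counter(balanced_labels))
```

In this toy case the always-negative classifier reaches an accuracy of 0.90 but a macro F1 of roughly 0.47, because the F1 score for the positive class is zero. If oversampling is used, it would normally be applied to the training split only, so that the evaluation data keep their original class distribution.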


Note:
This review refers to round 1 of peer review.

 Open peer review from Chrisina Jayne

Review

Review information

DOI: 10.14293/S2199-1006.1.SOR-ENG.ADVGU2.v1.RXVTZG
License: This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.

ScienceOpen disciplines: Engineering, Civil engineering
Keywords: Public opinion evaluation, Sentiment analysis, Transport, Policy and law, Environmental policy and practice, Machine learning, Civil infrastructure projects, Topic modelling, Sustainability

Review text

The paper investigates sentiment analysis and topic modelling using machine learning based on Twitter data. It considers a specific topic related to the UK railway project, High Speed 2. The paper compares Multinomial Naïve Bayes and Support Vector Machine for sentiment analysis of tweets. Topic modelling was conducted with Latent Dirichlet Allocation (LDA) using publicly available scripts. Experiments, discussion, and results are presented. The paper is written well, and sufficient background is included. The references are appropriate but some more recent ones could have been included. The paper provides insights into the feasibility of using social media data for public opinion evaluation of civil infrastructure projects. The study's contribution lies in presenting a public opinion evaluation framework with a machine learning algorithm and comparing the accuracy of two classifiers.



Note:
This review refers to round 1 of peer review.

 Open peer review from Guanlan Zhang

Review

Review information

DOI: 10.14293/S2199-1006.1.SOR-ENG.AEG3YW.v1.RPMISX
License: This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.

ScienceOpen disciplines: Engineering, Civil engineering
Keywords: Public opinion evaluation, Sentiment analysis, Transport, Policy and law, Environmental policy and practice, Machine learning, Civil infrastructure projects, Topic modelling, Sustainability

Review text

Summary: In this work, the authors propose a learning-based framework to evaluate public opinion through social media platforms. Most of the content is presented clearly, with appropriate use of language. The methods effectively solve the problem, and the experimental results are acceptable.

Comments:
  1. Some symbols in the math formulas are incorrect. In Eq. (1), the x_1 and p(xn|y) are not in the correct form.
  2. The state-of-the-art methods are not clearly stated, and the proposed method is not compared against the SOTA. What is the significance of this work over previous work?
  3. In section 3.2.2, on what machine was the model trained, and how long did training take?
  4. The training results in section 3.2.2 are not very good. Have you considered using deep neural networks to solve the problem?
  5. How are the accuracy of the sentiment analysis and of the topic modelling evaluated?
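
For reference on comment 1: the manuscript's Eq. (1) is not reproduced on this page, so the following is only an assumption about what it intends. If it is the Naive Bayes decision rule (the round-1 manuscript is described as using a Multinomial Naive Bayes classifier), the conventional notation, with features x_1, ..., x_n and class label y, would be along these lines:

```latex
\[
  p(y \mid x_1, \dots, x_n) \;\propto\; p(y) \prod_{i=1}^{n} p(x_i \mid y),
  \qquad
  \hat{y} = \arg\max_{y} \; p(y) \prod_{i=1}^{n} p(x_i \mid y)
\]
```

Here the subscripted x_i and the conditional p(x_i \mid y) correspond to the symbols that the comment flags as malformed.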



Note:
This review refers to round 1 of peer review.