Fake-news detection system using machine-learning algorithms for Arabic-language content

M Alazab, A Awajan, A Alazab, A Alhyari, R Saadeh
Journal of Theoretical and Applied Information Technology, 2022 (jatit.org)
Abstract
Over the past decade, social media has become a dominant source of news and information. This has led to an increase in the number of groups and individuals spreading news through social media with no direct quality control or censorship of the content being distributed. A fake breaking-news headline can spread rapidly to millions of people and cause tremendous local and global problems. Because checking all information posted on social media is almost impossible, researchers are now concentrating on combating fake news on the Internet and social media to mitigate the enormous damage the spread of such news can cause to individuals, communities, and nations. Detecting fake news and stopping it before it spreads requires a reliable, rapid, and automated system based on artificial intelligence. Hence, in this study, an Arabic fake-news detection system that uses machine-learning algorithms is proposed. An in-house Arabic dataset containing 206,080 tweets was collected using an API search on Twitter. The system uses term frequency-inverse document frequency (TF-IDF) to extract features from the dataset and analysis of variance (ANOVA) to select a subset of those features. Nine machine-learning classifiers were trained: naïve Bayes, K-nearest-neighbours, support vector machine, random forest (RF), J48, logistic regression, random committee (RC), J-Rip, and simple logistics. The experimental results indicated that the highest accuracy (97.3%) was obtained using RF and RC, with training times of 4403 s and 0.367 s, respectively.
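The feature pipeline named in the abstract (TF-IDF extraction followed by ANOVA-based selection) can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation. The function names (`tfidf`, `anova_f`, `select_top_k`), the whitespace tokenizer, the unsmoothed IDF formula, and the four English toy documents are all assumptions made for illustration; the study itself used Arabic tweets.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return (vocab, matrix); matrix[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    tokenized = [d.split() for d in docs]  # assumption: whitespace tokenization
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n / df[t]) for t in vocab}  # assumption: unsmoothed IDF
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        matrix.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, matrix

def anova_f(column, labels):
    """One-way ANOVA F statistic of one feature column across the label groups.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for x, y in zip(column, labels):
        groups.setdefault(y, []).append(x)
    n, k = len(column), len(groups)
    grand = sum(column) / n
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    within = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)
    if within == 0:
        # No within-class variation: the feature separates the classes perfectly
        # if the class means differ, and is uninformative otherwise.
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))

def select_top_k(vocab, matrix, labels, k):
    """Keep the k terms with the highest ANOVA F score."""
    scores = [anova_f([row[j] for row in matrix], labels) for j in range(len(vocab))]
    ranked = sorted(range(len(vocab)), key=lambda j: scores[j], reverse=True)
    return [vocab[j] for j in ranked[:k]]

# Hypothetical toy corpus: label 1 = fake, label 0 = real.
docs = [
    "shocking secret cure revealed",
    "ministry announces new budget",
    "shocking miracle cure banned",
    "ministry publishes budget report",
]
labels = [1, 0, 1, 0]

vocab, matrix = tfidf(docs)
selected = select_top_k(vocab, matrix, labels, k=4)
print(selected)
```

In this toy corpus the selected terms are those that appear in every document of one class and in none of the other. The reduced feature set would then be passed to a classifier such as random forest; that step is omitted here.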