Document representation and feature combination for deceptive spam review detection
Abstract
Deceptive spam reviews of products or services mislead customers in their decision making. Existing approaches to detecting deceptive spam reviews focus on feature design. Hand-crafted features can capture some linguistic phenomena, but they can hardly reveal the latent semantic meaning of a review. We present a neural network based model that learns the representation of reviews. The model applies hard attention when composing sentence representations into a document representation. Specifically, we compute an importance weight for each sentence and incorporate these weights into the composition of the document representation. In the mixed-domain detection experiment, a comparison with other neural network based methods verifies the effectiveness of our model. Because feature selection is very important in this task, we also combine features to enhance performance, reaching an F1 value of 86.1%, which outperforms the state-of-the-art method. In the cross-domain detection experiment, our method shows better robustness.
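The weighted composition step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the importance weights are computed or how they enter the composition, so the code below makes two assumptions: the weights are already given as non-negative numbers, and the document representation is a normalized weighted sum of the sentence vectors. The function name `compose_document` and the toy vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def compose_document(
    sentence_vectors: Sequence[Sequence[float]],
    importance: Sequence[float],
) -> List[float]:
    """Combine sentence vectors into one document vector.

    Assumption (not from the paper): the document vector is the sum of
    sentence vectors, each scaled by its importance weight after the
    weights have been normalized to sum to 1.
    """
    if not sentence_vectors or len(sentence_vectors) != len(importance):
        raise ValueError("need one importance weight per sentence vector")
    total = sum(importance)
    if total <= 0:
        raise ValueError("importance weights must have a positive sum")

    # Normalize so that the weights form a distribution over sentences.
    weights = [w / total for w in importance]

    dim = len(sentence_vectors[0])
    document = [0.0] * dim
    for vector, weight in zip(sentence_vectors, weights):
        if len(vector) != dim:
            raise ValueError("all sentence vectors must share one dimension")
        for i, value in enumerate(vector):
            document[i] += weight * value
    return document


# Toy example: three 2-dimensional sentence vectors; the first sentence
# is treated as twice as important as each of the other two.
sentences = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
importance = [2.0, 1.0, 1.0]
doc_vector = compose_document(sentences, importance)
print(doc_vector)
```

In a trained model, the sentence vectors and the weights would both be produced by the network; here they are fixed by hand only to show how a more important sentence contributes more to the document representation.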
Elsevier