Intrinsic plagiarism detection using character trigram distance scores

M Kestemont, K Luyckx, W Daelemans - Proceedings of the PAN, 2011 - academia.edu
In this paper, we describe a novel approach to intrinsic plagiarism detection. Each
suspicious document is divided into a series of consecutive, potentially overlapping
'windows' of equal size. These are represented by vectors containing the relative
frequencies of a predetermined set of high-frequency character trigrams. Subsequently, a
distance matrix is set up in which each of the document's windows is compared to each
other window. The distance measure used is a symmetric adaptation of the normalized …
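The pipeline described in this abstract (windowing, trigram frequency vectors, pairwise distance matrix) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. All function names, the window size, step, and number of trigrams are hypothetical choices. The abstract is cut off before it names the distance measure, so the sketch uses a plain Manhattan distance as a clearly labeled stand-in.

```python
from collections import Counter


def char_trigrams(text):
    """Return all overlapping character trigrams of a string."""
    return [text[i:i + 3] for i in range(len(text) - 2)]


def split_windows(text, size, step):
    """Cut text into consecutive equal-size windows; they overlap when step < size."""
    return [text[i:i + size] for i in range(0, len(text) - size + 1, step)]


def top_trigrams(text, k):
    """Pick the k most frequent trigrams of the document as the fixed feature set."""
    return [t for t, _ in Counter(char_trigrams(text)).most_common(k)]


def profile(window, vocab):
    """Relative frequency of each vocabulary trigram within one window."""
    counts = Counter(char_trigrams(window))
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in vocab]


def placeholder_distance(p, q):
    """Stand-in measure (Manhattan distance), NOT the measure used in the paper,
    which is truncated in the abstract above."""
    return sum(abs(a - b) for a, b in zip(p, q))


def distance_matrix(text, size=1000, step=500, k=100):
    """Compare every window with every other window (assumed parameter values)."""
    vocab = top_trigrams(text, k)
    profiles = [profile(w, vocab) for w in split_windows(text, size, step)]
    return [[placeholder_distance(p, q) for q in profiles] for p in profiles]
```

Rows of the resulting matrix whose distances to most other windows are unusually large would be the natural candidates for stylistically deviating passages, though how the paper turns distances into plagiarism decisions is not stated in this excerpt.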

[Citation] Intrinsic Plagiarism Detection Using Character Trigram Distance Scores - Notebook for PAN at CLEF 2011.

M Kestemont, K Luyckx, W Daelemans - CLEF (Notebook Papers/Labs/Workshop), 2011