CROSS-LANGUAGE PLAGIARISM OF ARABIC-ENGLISH
DOCUMENTS USING LINEAR LOGISTIC REGRESSION
First author:
ZAID ALAA
Other authors:
SABRINA TIUN
MOHAMMEDHASAN ABDULAMEER
Journal:
Journal of Theoretical and Applied Information Technology
Publication date:
10 January 2016
Research summary:
ABSTRACT
Cross-Language Plagiarism Detection (CLPD) is used to automatically identify and extract plagiarism
among documents in different languages. The main challenge of cross-language plagiarism detection is the
difference between the languages of the texts: the original source has to be analysed and translated so that
plagiarism can be detected automatically by comparing the suspected text with the original text. This paper
proposes an Arabic-English cross-language plagiarism detection method that automatically detects the semantic
relatedness between the words of two suspected target files. The proposed method consists of four phases.
The first phase is pre-processing, the second involves key phrase extraction and translation, the third
phase uses plagiarism detection techniques and the fourth phase is the classification process, which uses
Linear Logistic Regression (LLR). The method is evaluated using precision and recall
measurements on a dataset consisting of Wikipedia articles. The experimental results achieved 96%
precision, 85% recall and 90.16% F-measure. The results show that the LLR algorithm can be used
effectively to detect Arabic-English cross-language plagiarism.
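
As a small worked illustration of how the three reported figures relate, the sketch below computes the F-measure as the harmonic mean of precision and recall. It is not the authors' evaluation code: the function names `precision_recall` and `f_measure` are hypothetical, and the abstract does not state which F-measure variant was used, so the balanced F1 form (beta = 1) is an assumption.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive and
    false-negative counts (hypothetical helper, not from the paper)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the balanced F1 score; this choice is an assumption,
    since the abstract only says "F-measure".
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded figures reported in the abstract.
reported_precision = 0.96
reported_recall = 0.85

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure from the rounded figures: {f1:.4f}")
```

With the rounded inputs this gives approximately 0.9017, which agrees with the reported 90.16% up to rounding of the precision and recall values.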