JUCS - Journal of Universal Computer Science 8(11): 1016-1038, doi: 10.3217/jucs-008-11-1016
Finding Plagiarisms among a Set of Programs with JPlag
Lutz Prechelt‡, Guido Malpohl§, Michael Philippsen§
‡ Universitaet Karlsruhe, Karlsruhe, Germany
§ University Karlsruhe, Karlsruhe, Germany
Open Access
Abstract
JPlag is a web service that finds pairs of similar programs among a given set of programs. It has been used successfully in practice for detecting plagiarisms among student Java program submissions. Support for the languages C, C++ and Scheme is also available. We describe JPlag's architecture and its comparison algorithm, which is based on a known one called Greedy String Tiling. The contribution of this paper is threefold. First, an evaluation of JPlag's performance on several rather different sets of Java programs shows that JPlag is very hard to deceive. More than 90 percent of the 77 plagiarisms within our various benchmark program sets are reliably detected, and a majority of the others at least raise suspicion. The run time is just a few seconds for submissions of 100 programs of several hundred lines each. Second, a parameter study shows that the approach is fairly robust with respect to its configuration parameters. Third, we study the kinds of attempts used for disguising plagiarisms, their frequency, and their success.
Keywords
plagiarism, similarity, search, token, string tiling