An Author Verification Approach Based on Differential Features

posted Jul 13, 2015, 5:20 AM by Eric Medvet   [ updated Oct 20, 2016, 6:54 AM ]
  • Uncovering Plagiarism, Authorship and Social Software Misuse at the 6th Conference and Labs of the Evaluation Forum (PAN-CLEF), 2015, Toulouse (France)
  • Alberto Bartoli, Alex Dagri, Andrea De Lorenzo, Eric Medvet, Fabiano Tarlao
We describe the approach that we submitted to the 2015 PAN competition for the author identification task. The task consists of determining whether an unknown document was written by the same author as a given set of documents, all of which are known to share a single author.
We propose a machine learning approach based on a number of different features that characterize documents from widely different points of view. We construct non-overlapping groups of homogeneous features, train one random forest regressor per feature group, and combine the outputs of all regressors by taking their arithmetic mean. We train a separate set of regressors for each language.
Our approach ranked first in the final ranking for the Spanish language.
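The combination scheme described in the abstract (one regressor per non-overlapping feature group, outputs averaged, one ensemble per language) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class name `GroupAverageEnsemble`, the helper `train_per_language`, the column-index representation of feature groups, and the `NearestNeighbourRegressor` stand-in are all assumptions made for the example. The regressor is supplied through a factory so that any object with `fit` and `predict` methods can be plugged in.

```python
from statistics import fmean


class GroupAverageEnsemble:
    """One regressor per feature group; predictions combined by arithmetic mean."""

    def __init__(self, groups, make_regressor):
        # groups: dict mapping a (hypothetical) group name to a list of column indices.
        seen = set()
        for cols in groups.values():
            if seen & set(cols):
                raise ValueError("feature groups must not overlap")
            seen |= set(cols)
        self.groups = groups
        self.make_regressor = make_regressor  # zero-argument factory
        self.models = {}

    @staticmethod
    def _select(X, cols):
        # Keep only the columns that belong to one feature group.
        return [[row[c] for c in cols] for row in X]

    def fit(self, X, y):
        # Train an independent regressor on each group's columns.
        for name, cols in self.groups.items():
            model = self.make_regressor()
            model.fit(self._select(X, cols), y)
            self.models[name] = model
        return self

    def predict(self, X):
        # Collect each group's predictions, then average them per sample.
        per_group = [
            list(self.models[name].predict(self._select(X, cols)))
            for name, cols in self.groups.items()
        ]
        return [fmean(scores) for scores in zip(*per_group)]


class NearestNeighbourRegressor:
    """Stand-in regressor used only to keep the sketch dependency-free."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        return [
            self.y[min(range(len(self.X)), key=lambda i: dist(self.X[i], row))]
            for row in X
        ]


def train_per_language(data, groups, make_regressor):
    # data: dict mapping a language code to an (X, y) pair; one ensemble each.
    return {
        lang: GroupAverageEnsemble(groups, make_regressor).fit(X, y)
        for lang, (X, y) in data.items()
    }
```

To use random forests as in the abstract, a factory such as `lambda: RandomForestRegressor()` from scikit-learn could be passed as `make_regressor`, assuming that library is installed; the stand-in regressor above exists only so the sketch runs without external packages.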