Role Term-Based Semantic Similarity Technique for Idea Plagiarism Detection
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2018, Vol 9, Issue 8
Abstract
Most text mining systems are based on statistical analysis of term frequency. Statistical analysis of the frequency of a term (a word or phrase) captures the importance of that term within a document, but the techniques proposed so far still need to improve in their ability to detect plagiarized passages, especially in capturing the importance of a term within a sentence. Two terms can have the same frequency in their documents, yet one term contributes more to the meaning of its sentences than the other. In this paper, we aim to discriminate between the terms that are important and those that are unimportant to the meaning of a sentence, and to use that distinction for idea plagiarism detection. This paper introduces an idea plagiarism detection technique based on the semantic meaning frequency of the important terms in the sentences. The suggested method analyses and compares text based on the semantic role allocated to each term inside the sentence. Semantic Role Labeling (SRL) offers significant advantages when generating the arguments of each sentence semantically. Experiments were conducted on the CS11 dataset, and the results show that the proposed technique outperforms recent peer plagiarism detection methods in terms of Recall, Precision and F-measure.
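The contrast the abstract draws between plain term frequency and role-aware term importance can be illustrated with a minimal Python sketch. It is not the authors' implementation: the role labels are supplied by hand rather than produced by an SRL system, the `ROLE_WEIGHTS` values are assumed purely for illustration, and cosine similarity is a stand-in comparison measure that the abstract does not name.

```python
from collections import Counter
from math import sqrt

# Assumed weights per semantic role (hypothetical values, not from the paper):
# predicates and core arguments count more than adjunct modifiers.
ROLE_WEIGHTS = {"V": 1.0, "ARG0": 0.8, "ARG1": 0.8, "ARGM": 0.3}


def frequency_vector(labeled_sentences):
    """Plain term frequency: every occurrence counts 1, roles are ignored."""
    return Counter(term.lower() for sent in labeled_sentences for term, _ in sent)


def role_weighted_vector(labeled_sentences):
    """Each occurrence contributes the weight of the role it plays."""
    vec = Counter()
    for sent in labeled_sentences:
        for term, role in sent:
            vec[term.lower()] += ROLE_WEIGHTS.get(role, 0.1)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hand-labeled (term, role) pairs; a real pipeline would obtain these from SRL.
source = [[("researcher", "ARG0"), ("copied", "V"), ("idea", "ARG1"), ("yesterday", "ARGM")]]
suspect = [[("student", "ARG0"), ("copied", "V"), ("idea", "ARG1"), ("quietly", "ARGM")]]

plain = cosine(frequency_vector(source), frequency_vector(suspect))
weighted = cosine(role_weighted_vector(source), role_weighted_vector(suspect))
print(f"plain frequency: {plain:.3f}, role-weighted: {weighted:.3f}")
```

Every term in both sentences occurs exactly once, so plain frequency treats them all alike. Under the assumed weights, the shared predicate and core argument ("copied", "idea") dominate the comparison, and the role-weighted score comes out higher than the plain one.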
Authors and Affiliations
Ahmed Hamza Osman, Hani Moaiteq AlJahdali