JUCS - Journal of Universal Computer Science 21(11): 1454-1469, doi: 10.3217/jucs-021-11-1454
PSO-Based Feature Selection for Arabic Text Summarization
Ahmed M. Al-Zahrani, Hassan Mathkour, Hassan Abdalla
‡ King Saud University, Riyadh, Saudi Arabia
Open Access
Abstract
Feature-based approaches play an important role in extractive summarization and are widely applied. In this paper, we use particle swarm optimization (PSO) to evaluate the effectiveness of different state-of-the-art features used to summarize Arabic text. The PSO is trained on the Essex Arabic Summaries Corpus to determine the best particle, which represents the most appropriate single feature or combination of features among eight informative and structural features regularly used by Arab summarizers. In each PSO iteration, the sentences of the input text are scored and ranked according to the selected features and their weights, and the top-ranking sentences are extracted to form an output summary. The output summary is then compared with a reference summary, with cosine similarity serving as the fitness function. The experimental results indicate that Arabs summarize texts simply, focusing on the first sentence of each paragraph.
Keywords
feature selection, Arabic text summarization, natural language processing, particle swarm optimization
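The scoring and fitness step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two features (a first-sentence-of-paragraph flag and a normalized length), the whitespace tokenization, the English toy sentences, and all function names are assumptions made for illustration. A particle is treated here as a weight vector over the features, and its fitness is the cosine similarity between the extracted summary and a reference summary.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarize(sentences, features, weights, k):
    """Score each sentence as a weighted sum of its feature values and
    return the k best sentences in their original document order."""
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in features]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def fitness(weights, sentences, features, reference, k):
    """Fitness of one particle (a weight vector): cosine similarity between
    the extracted summary and the reference summary."""
    summary = summarize(sentences, features, weights, k)
    summary_bag = Counter(" ".join(summary).split())  # naive whitespace tokens
    reference_bag = Counter(reference.split())
    return cosine_similarity(summary_bag, reference_bag)


# Toy data (hypothetical). Feature columns:
# (is_first_sentence_of_paragraph, normalized_length)
sentences = [
    "the cat sat",
    "it was a long and rainy day outside",
    "dogs bark loudly",
]
features = [(1.0, 0.3), (0.0, 1.0), (1.0, 0.3)]
reference = "the cat sat dogs bark loudly"

position_only = (1.0, 0.0)  # particle that uses only the position feature
length_only = (0.0, 1.0)    # particle that uses only the length feature

print(fitness(position_only, sentences, features, reference, k=2))
print(fitness(length_only, sentences, features, reference, k=2))
```

In a full PSO run, a function like `fitness` would be evaluated for every particle in every iteration, and the velocity and position updates would move the swarm toward the weight vectors that score highest. On this toy input the position-only particle reproduces the reference exactly, so it scores higher than the length-only particle.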