Stemming and root-based approaches to the retrieval of Arabic documents on the Web
Journal Title: Webology - Year 2006, Vol 3, Issue 1
Abstract
Using information retrieval systems to gain access to documents in languages other than English is becoming an increasingly significant problem. Rules, theories, algorithms, and retrieval methods designed and developed for English and other morphologically similar languages may or may not apply in the linguistic environments of other languages. The problem is particularly acute in languages whose morphological rules differ radically from those of English. This paper compares the effects of stemming and root retrieval on information retrieval in Arabic through an exploratory study of the handling of Arabic words by an English-language search engine (ELSE). Search experiments, using 2000 Arabic documents and 40 Arabic search terms (nouns), were conducted in a Web search engine developed for English (AltaVista) and in an Arabic search engine (al-Idrisi) to compare the performance of stemming and root retrieval and to investigate the possibility of adapting AltaVista for use with Arabic text. The results of the experiments show that stemming yields more effective retrieval than root retrieval, and that it is possible to adapt an ELSE for use with Arabic without the need to develop root-retrieval features.
Authors and Affiliations
Haidar Moukdad