DEVELOPMENT OF AN INTELLIGENT WEB BASED DYNAMIC NEWS AGGREGATOR INTEGRATING INFOSPIDER AND INCREMENTAL WEB CRAWLING TECHNOLOGY

  • O. E. Aru Department of Computer Engineering, Michael Okpara University of Agriculture, Umudike Umuahia, Abia State-Nigeria
  • C. N. Ubochi Department of Computer Engineering, Michael Okpara University of Agriculture, Umudike Umuahia, Abia State-Nigeria
  • C. Ihekweaba Department of Computer Engineering, Michael Okpara University of Agriculture, Umudike Umuahia, Abia State-Nigeria
Keywords: news aggregator, web crawler, URL, frontier, seed URL, crawling

Abstract

The World Wide Web is a rapidly growing and changing information source, and it is gradually replacing the traditional ways in which users obtain news and information. Traditionally, individuals got their news from print media such as newspapers and magazines. The advent of the internet has made this much easier by making digitized news accessible from anywhere in the world, through either news websites or dedicated applications. However, the rates at which the Web grows and changes make the task of finding relevant and recent information harder. Users still have to memorize different URLs and visit numerous websites just to stay informed about a specific type of news. It is therefore imperative to develop an intelligent web-based dynamic news aggregator that provides a digital platform on which individuals can easily find news on a particular topic in real time. The aggregator crawls the web, searches for news agencies, and returns news of interest to the user. To address the shortcomings of existing news aggregators, this work integrates the intelligent web-based dynamic news aggregator with InfoSpider web crawling technology. This is achieved by applying a stochastic selector and incremental web crawling technology that crawls all of the seed URLs. The system was implemented in the PHP scripting language, with code written to access the PHP-crawler, using Aptana Studio as the Integrated Development Environment (IDE). Bootstrap 3 and jQuery were used to provide a set of style sheets and JavaScript libraries that simplify client-side scripting. The application was deployed and tested using the Apache web server and a personal computer.
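
The two crawling ideas named in the abstract, a stochastic selector over the frontier and incremental re-crawling, can be illustrated with a minimal sketch. It is written in Python purely for illustration (the system described here was implemented in PHP) and is not the authors' code. The function names, the `beta` parameter, the exponential weighting of link scores, and the use of a content hash to detect changed pages are all assumptions, since the abstract does not specify these details.

```python
import hashlib
import math
import random


def stochastic_select(frontier, beta=2.0, rng=random):
    """Pick one URL from the frontier at random.

    `frontier` maps URL -> estimated relevance score (hypothetical).
    Selection probability is proportional to exp(beta * score), so
    higher-scored links are favoured but low-scored ones can still be
    explored. This weighting is an assumption for illustration.
    """
    urls = list(frontier)
    weights = [math.exp(beta * frontier[u]) for u in urls]
    return rng.choices(urls, weights=weights, k=1)[0]


def has_changed(url, content, seen_hashes):
    """Incremental-crawl check (assumed approach: content hashing).

    Returns True if the page is new or differs from the copy recorded
    on the previous visit, and records the new hash in `seen_hashes`.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    changed = seen_hashes.get(url) != digest
    seen_hashes[url] = digest
    return changed
```

In a crawl loop under these assumptions, `stochastic_select` would choose the next URL to fetch, and `has_changed` would decide whether the fetched page needs to be re-indexed or can be skipped as unchanged.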

Published
2021-04-26
How to Cite
Aru, O., Ubochi, C., & Ihekweaba, C. (2021). DEVELOPMENT OF AN INTELLIGENT WEB BASED DYNAMIC NEWS AGGREGATOR INTEGRATING INFOSPIDER AND INCREMENTAL WEB CRAWLING TECHNOLOGY. LAUTECH Journal of Engineering and Technology, 15(1), 11-22. Retrieved from https://laujet.com/index.php/laujet/article/view/398