Efficient Ranked Multi-Keyword Search using Machine Learning Algorithms

EasyChair Preprint 2429, 5 pages. Date: January 20, 2020

Abstract

The growing number of documents in the search indexes of information retrieval systems makes document ranking a crucial problem. At the current scale of the problem, machine learning has become the most efficient way to optimize the ranking function. Keyword search is an effective method for retrieving information from such collections. The aim of keyword search is to find a set of answers covering all or part of the queried keywords. A challenge in keyword search systems is to rank answers according to their relevance to the query. This relevance depends on the textual content and the structural compactness of the answers.

Classification assigns text documents to a set of predefined categories based on their words, phrases, and word combinations. Data classification has many applications, such as mail routing, email filtering, content classification, news monitoring, and narrowcasting. Keywords are extracted from documents in order to classify them. Keywords are the subset of words that carries the most important information about the content of a document, and keyword extraction is the process of identifying those words.

In the proposed system, keywords are extracted from documents using TF-IDF and the Naïve Bayes algorithm. The TF-IDF algorithm is used to select the candidate words, and the words with the highest similarity are taken as keywords. The experiments were conducted with the Naïve Bayes algorithm, and its performance is analyzed from a machine learning perspective.

Keyphrases: Classification, Naïve Bayes Algorithm, Ranking, TF-IDF algorithm, keyword-based search, machine learning
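
The two techniques named in the abstract, TF-IDF candidate selection and Naïve Bayes classification, can be illustrated with a minimal sketch. The abstract does not specify how the paper combines them, so this is an assumption-laden illustration rather than the authors' implementation: the function names (`tokenize`, `tfidf_candidates`, `train_naive_bayes`, `predict`), the whitespace tokenizer, the add-one smoothing, and the toy documents and labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplifying assumption: lowercase, split on whitespace, keep alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_candidates(docs, top_k=3):
    """Return the top_k terms with the highest TF-IDF score for each document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


def train_naive_bayes(docs, labels):
    """Collect the counts needed for a multinomial Naive Bayes classifier."""
    class_docs = Counter(labels)          # documents per class
    term_counts = defaultdict(Counter)    # term occurrences per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        total_terms = sum(term_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for term in tokenize(text):
            if term in vocab:  # terms never seen in training are ignored
                likelihood = (term_counts[label][term] + 1) / (total_terms + len(vocab))
                score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for illustration only.
docs = [
    "cheap pills buy cheap pills now",
    "meeting agenda for project review",
    "buy discount pills online",
    "project deadline and meeting notes",
]
labels = ["spam", "work", "spam", "work"]

model = train_naive_bayes(docs, labels)
print(tfidf_candidates(docs, top_k=2))
print(predict(model, "buy cheap pills"))
```

In this sketch the TF-IDF step only ranks candidate words per document, and the classifier is trained on the full token counts. A system like the one described might instead restrict the classifier's features to the selected keywords, but the abstract leaves that choice open.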