Heuristic Approach Towards COVID-19: Big Data Analytics and Classification with Natural Language Processing

EasyChair Preprint 3754
12 pages • Date: July 6, 2020

Abstract

Data plays an important role in modern life. With newer advances in technology and cheaper internet access, data usage has grown many times over, generating huge volumes of unstructured data. Big data simply means a very large amount of data, and such unstructured data is difficult to handle with the available database technologies. We used HDFS for effective data management. Big data analytics can make many sensitive cases easier to handle, and freely accessible databases should help advance COVID-19 research. The genetic information related to the coronavirus grows every day, so we work on machine learning models that can classify gene sequence classes faster. We use libraries such as matplotlib to construct detailed graphs of the data. In this paper we take three different sequences and classify them using the natural language processing techniques of the sklearn library, and we also test the approach using logistic regression. The aim is to classify the gene classes present in a complete sequence so that mutations can be detected quickly.

Keyphrases: Artificial Intelligence, Big Data, Big Data Analytic, COVID-19, Classification, Natural Language Processing, Novel coronavirus, Sklearn, deep learning, gene sequence, k mer formation, language processing, logistic regression, machine learning, matplotlib
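
The k-mer formation step named in the keyphrases can be illustrated with a minimal sketch. It is not the authors' code: the function names, the default k = 6, and the toy sequence are all assumptions made for illustration. The idea is to split a gene sequence into overlapping substrings of length k, so that the sequence can be treated like a sentence of words and counted as a bag of words.

```python
from collections import Counter


def kmers(sequence, k=6):
    """Return all overlapping k-mers (substrings of length k) of a sequence.

    The default k=6 is an assumption; the abstract does not state a value.
    """
    seq = sequence.lower()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_sentence(sequence, k=6):
    """Join the k-mers with spaces so the sequence reads like a sentence."""
    return " ".join(kmers(sequence, k))


def kmer_counts(sequence, k=6):
    """Bag-of-words view: how often each k-mer occurs in the sequence."""
    return Counter(kmers(sequence, k))


# Toy sequence made up for illustration (not real viral data).
toy = "ATGCATGCA"
print(kmers(toy, k=4))
print(kmer_sentence(toy, k=4))
print(kmer_counts(toy, k=4).most_common(2))
```

Sentences built this way could then be passed to a text vectorizer such as sklearn's CountVectorizer and a classifier such as LogisticRegression, which is presumably the kind of pipeline the abstract refers to.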