One-Hot Encoding and Bag-of-Words Methods in Processing the Uzbek Language Corpus Texts

EasyChair Preprint 11048
6 pages • Date: October 9, 2023

Abstract

Computers are designed to process information in numerical form, but data is not always numerical. This article describes how to process data in the form of characters, words, and text, and how two methods for teaching a computer to process natural language, one-hot encoding and bag-of-words (BoW), apply to the Uzbek language. How do Alexa, Google Home, and many other "smart" assistants understand and respond to our speech today? The article presents approaches to processing Uzbek language corpus texts with these two methods, within the field of artificial intelligence known as natural language processing.

Keyphrases: Uzbek language corpus, one-hot encoding, text processing, Bag of words
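The two methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two Uzbek sentences are illustrative samples rather than text from the corpus the paper uses, and the helper names (`tokenize`, `build_vocabulary`, `one_hot`, `bag_of_words`) and the simple whitespace tokenizer are assumptions made for this example.

```python
from collections import Counter

PUNCTUATION = ".,!?;:\"()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation.
    Apostrophes are kept because they are part of Uzbek Latin spelling (o', g')."""
    tokens = (tok.strip(PUNCTUATION) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def build_vocabulary(documents):
    """Map each distinct word to an index, in alphabetical order."""
    words = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {word: idx for idx, word in enumerate(words)}

def one_hot(word, vocab):
    """Vector of vocabulary length with a single 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec

def bag_of_words(text, vocab):
    """Vector of word counts; word order in the text is discarded."""
    counts = Counter(tokenize(text))
    # dicts preserve insertion order, so this follows the index order
    return [counts.get(word, 0) for word in vocab]

# Illustrative sample sentences (an assumption, not the paper's corpus):
# "I read a book." / "I go to school."
corpus = ["Men kitob o'qiyman.", "Men maktabga boraman."]
vocab = build_vocabulary(corpus)

print(vocab)
print(one_hot("kitob", vocab))
print(bag_of_words("Men kitob o'qiyman, men kitob", vocab))
```

One-hot encoding represents a single word as a position in the vocabulary, while bag-of-words represents a whole text as per-word counts over that same vocabulary. Both vectors grow with vocabulary size, which matters for a morphologically rich language such as Uzbek, where each inflected form counts as a separate word unless stemming or lemmatization is applied first.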