An Analysis of Natural Language Text Relating to Thai Criminal Law

EasyChair Preprint 3732
6 pages • Date: July 3, 2020

Abstract

This paper analyses the enforcement of Thai criminal law under Sections 288 and 289 of Chapter 1 (Offenses Causing Death), Title 10 (Offenses Affecting Life and Body) of the Thai Criminal Code. The first part uses criminal law domain knowledge and Supreme Court judgments as the initial domain information; the result is a set of rules that humans can understand. The second part uses a training data set drawn from final judgments to train deep learning models. Because the training set is severely imbalanced, the Synthetic Minority Over-sampling TEchnique (SMOTE) [1] is used to address the imbalance. Models are trained on the training set using unidirectional Long Short-Term Memory (LSTM) [2] networks and bidirectional Long Short-Term Memory (BiLSTM) [3] networks, both of which are types of Recurrent Neural Network (RNN) [2]. Word embeddings for the dataset can be learned while training the deep neural network. The BiLSTM achieves a higher average F1 score than the LSTM. Pre-trained word embeddings are then used, which raises the average F1 score further. Finally, the models are applied to online crime news: Soft Voting averages the class probabilities of the models and selects the class with the highest average probability, which is then passed as input to the rules. The predictions of our method agree with a lawyer's opinion in 76% of the test cases.

Keyphrases: BiLSTM, Criminal Law, Decision Tree, Deepcut, LSTM, Pre-trained word embeddings, SMOTE, Thai Supreme Court, soft voting, word embedding, word2vec
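The Soft Voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `soft_vote`, the label names, and the probability values are assumptions made up for illustration, and the sketch assumes each model outputs one probability per class for a given news article.

```python
def soft_vote(model_probs):
    """Average per-class probabilities across models and pick the top class.

    model_probs: list of probability lists, one list per model,
    all with the same class order.
    Returns (index of the winning class, list of averaged probabilities).
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    # Average the probability assigned to each class over all models.
    avg = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    # The class with the highest average probability wins the vote.
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


# Hypothetical labels and model outputs for a single news article.
LABELS = ["section_288", "section_289"]
lstm_probs = [0.5, 0.5]      # assumed output of an LSTM model
bilstm_probs = [0.25, 0.75]  # assumed output of a BiLSTM model

best, avg = soft_vote([lstm_probs, bilstm_probs])
print(LABELS[best], avg)
```

In this sketch the winning label would then be handed to the rule-based stage; how the rules consume it is not specified in the abstract.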