SENTIMENT ANALYSIS OF COSMETIC PRODUCT REVIEWS THROUGH A COMPARISON OF FEATURE SELECTION
Sentiment analysis is the computational study of people's opinions, attitudes, and emotions toward an entity, which may be an individual, an event, or a topic. Such topics are commonly drawn from review datasets, one of which is product reviews. Reading reviews based on other consumers' experiences reveals the quality of a product. This matters for cosmetics, because the products on the market today are highly diverse in both type and brand, yet not all of them offer quality that meets consumers' needs, and consumers should take note of this. The number of consumers writing their opinions, reviews, and experiences online has recently been increasing. Classifying cosmetic product reviews into positive and negative classes is therefore an effective way to determine other consumers' response to a product quickly and accurately. One of the most widely used classification techniques is the Support Vector Machine (SVM). SVM has the advantage of identifying the separating hyperplane that maximizes the margin between two classes. However, SVM is weak at selecting suitable parameters or features. The selection of features and the setting of SVM parameters significantly affect classification accuracy. Feature selection can also be used to remove attributes that are less relevant to the dataset. To improve on previous research, this study combines the Support Vector Machine algorithm with feature selection and compares two feature selection methods, Particle Swarm Optimization and Genetic Algorithm, in order to improve the classification accuracy of the Support Vector Machine. The study classifies the text of cosmetic product reviews as positive or negative.
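The idea of wrapping a classifier in a feature selection search can be illustrated with a minimal sketch of binary Particle Swarm Optimization. Everything here is an assumption made for illustration and not the study's implementation: the function names `centroid_accuracy` and `binary_pso_select` are hypothetical, the swarm settings (`w`, `c1`, `c2`, swarm size, iteration count) are arbitrary, the data are synthetic, and a nearest-centroid classifier stands in for the SVM to keep the example dependency-free.

```python
import math
import random


def centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the
    features where mask is True. Stand-in for the SVM (assumption)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            centroid = [sum(row[j] for row in rows) / len(rows) for j in cols]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(cols, centroid))
        if min(dist, key=dist.get) == y[i]:
            correct += 1
    return correct / len(X)


def binary_pso_select(X, y, n_particles=8, n_iters=15, seed=0):
    """Search for a feature mask that maximizes centroid_accuracy."""
    rng = random.Random(seed)
    n_features = len(X[0])
    positions = [[rng.random() < 0.5 for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]
    best_fit = [centroid_accuracy(X, y, p) for p in positions]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    global_pos, global_fit = best_pos[g][:], best_fit[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and attraction weights (assumed values)
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                velocities[i][j] = (
                    w * velocities[i][j]
                    + c1 * rng.random() * (best_pos[i][j] - positions[i][j])
                    + c2 * rng.random() * (global_pos[j] - positions[i][j])
                )
                # Sigmoid transfer: velocity becomes the probability of keeping feature j.
                prob = 1.0 / (1.0 + math.exp(-velocities[i][j]))
                positions[i][j] = rng.random() < prob
            fit = centroid_accuracy(X, y, positions[i])
            if fit > best_fit[i]:
                best_fit[i], best_pos[i] = fit, positions[i][:]
                if fit > global_fit:
                    global_fit, global_pos = fit, positions[i][:]
    return global_pos, global_fit


# Synthetic data: feature 0 separates the classes, features 1-4 are noise.
data_rng = random.Random(42)
y = [i % 2 for i in range(20)]
X = [[(2.0 if label else -2.0) + data_rng.gauss(0, 0.5)]
     + [data_rng.gauss(0, 3.0) for _ in range(4)] for label in y]
mask, fit = binary_pso_select(X, y)
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("leave-one-out accuracy:", fit)
```

A Genetic Algorithm variant would search the same space of boolean masks with the same fitness function, replacing the velocity update with selection, crossover, and mutation.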
Performance is measured as the accuracy of the Support Vector Machine before and after adding the feature selection method. Evaluation uses 10-fold cross-validation, and accuracy is measured with the confusion matrix and the ROC curve. Combining the Support Vector Machine with Particle Swarm Optimization for feature selection gives the best results, with an average accuracy of 97.00% and an average AUC of 0.988, while the Genetic Algorithm achieves an average accuracy of 94.00% and an average AUC of 0.984. In conclusion, the Support Vector Machine shows its largest accuracy improvement when combined with Particle Swarm Optimization feature selection, with accuracy rising from 89.00% to 97.00%.
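The evaluation measures named above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the study's tooling: the helper names (`confusion_matrix`, `accuracy`, `auc`, `k_fold_indices`) are hypothetical, the labels and scores are made-up toy values, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive review scores higher than a randomly chosen negative one) instead of by tracing the ROC curve.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = positive review."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    for fold in range(k):
        test_idx = indices[fold::k]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx


# Toy example: six reviews with classifier scores, thresholded at 0.5.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print("confusion matrix (tp, fp, fn, tn):", (tp, fp, fn, tn))
print("accuracy:", accuracy(tp, fp, fn, tn))
print("AUC:", auc(y_true, scores))
```

In a 10-fold setup, the classifier would be trained on each `train_idx` and scored on the matching `test_idx`, and the per-fold accuracy and AUC values would then be averaged.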