Sentiment Analysis of Social Networking Data Using Categorized Dictionary (Article)

Corresponding author: Sharma, Aastha

Authors: Singh, Akansha; Singh, Krishna Kant; Dhull, Anuradha

Journal of Information Technology Management, Autumn 2020, Number 45. Ranking: international (Ministry of Science/ISC). 16 pages, from 105 to 120.

Keywords: sentiment analysis; big data; Hadoop; HDFS; Map-Reduce; Facepager

Abstract:

Sentiment analysis is the process of analyzing a person's perception of or belief about a particular subject. However, finding the correct opinion or interest in multi-faceted sentiment data is a tedious task. This paper proposes a method that improves sentiment accuracy by using a categorized dictionary for sentiment classification and analysis. The categorized dictionary consists of separate dictionaries for different categories, which makes the comparisons category-specific. Each dictionary contains the words that express positive and negative sentiment for its category. The Mapper and Reducer algorithms use the dictionary to classify sentiments, and the resulting classification is used to calculate the sentiment accuracy. The data is collected from a social networking site and pre-processed. Because the amount of data is enormous, the reliable open-source framework Hadoop is used for the implementation. Hadoop hosts various software utilities for inspecting and processing any type of big data. The comparative analysis presented in this paper supports the value of the proposed method.
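As an illustration of the categorized dictionary idea, the following is a minimal Python sketch, not the authors' implementation. The category names, the word lists, and the names `CATEGORIZED_DICTIONARY` and `count_sentiment_words` are all assumptions made for this example; the paper's actual dictionaries are not reproduced here.

```python
# Hypothetical categorized dictionary: category -> polarity -> set of words.
# The categories and word lists are illustrative, not taken from the paper.
CATEGORIZED_DICTIONARY = {
    "movies": {
        "positive": {"gripping", "brilliant", "entertaining"},
        "negative": {"boring", "predictable", "overlong"},
    },
    "phones": {
        "positive": {"fast", "durable", "sharp"},
        "negative": {"laggy", "fragile", "overheating"},
    },
}


def count_sentiment_words(text, category):
    """Count positive and negative words using only one category's dictionary."""
    lexicon = CATEGORIZED_DICTIONARY[category]
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    positive = sum(1 for token in tokens if token in lexicon["positive"])
    negative = sum(1 for token in tokens if token in lexicon["negative"])
    return positive, negative
```

Because the lookup is restricted to one category, a word such as "fast" counts as positive for the hypothetical "phones" category but is ignored for "movies", which is the kind of category-specific comparison the abstract describes.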

Machine-generated summary:

Tayal & Yadav (2016) proposed an approach for faster retrieval of sentiment analysis results in which Hadoop is used to store and process the large data set. The proposed method covers the steps from data extraction through pre-processing to retrieving the numbers of positive and negative words, which are then used to calculate the sentiment accuracy and so give companies better feedback on their products. The data is processed by the Mapper and Reducer phases of Apache Hadoop to obtain the numbers of positive and negative words. In the Reducer phase, the file produced by the Mapper is compared with the categorized dictionary selected according to the category identified for the input data in the Mapper phase. This dictionary contains the positive and negative words related to the sentiments of the category to which the data belongs. The calculated mean accuracy was better because the comparisons of positive and negative words were specific to the categorized dictionary provided.
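The Mapper and Reducer phases described above can be sketched in plain Python as a small in-memory simulation. This is a sketch under stated assumptions rather than the paper's Hadoop code: the names `LEXICONS`, `mapper`, `reducer`, and `run_job` are hypothetical, the word lists are invented, and the category is assumed to arrive with each record instead of being identified inside the Mapper. The sentiment accuracy formula is not given in this summary, so the sketch stops at the positive and negative word counts.

```python
import re
from collections import defaultdict

# Hypothetical per-category lexicons; the real dictionaries are not shown here.
LEXICONS = {
    "phones": {"positive": {"fast", "sharp"}, "negative": {"laggy", "fragile"}},
    "movies": {"positive": {"brilliant"}, "negative": {"boring", "predictable"}},
}


def mapper(category, post):
    """Pre-process one post and emit (category, word) pairs."""
    for word in re.findall(r"[a-z']+", post.lower()):
        yield category, word


def reducer(category, words):
    """Compare the words with the dictionary of their category and count matches."""
    lexicon = LEXICONS[category]
    positive = sum(1 for word in words if word in lexicon["positive"])
    negative = sum(1 for word in words if word in lexicon["negative"])
    return positive, negative


def run_job(records):
    """Simulate the shuffle step: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for category, post in records:
        for key, word in mapper(category, post):
            grouped[key].append(word)
    return {key: reducer(key, words) for key, words in grouped.items()}


if __name__ == "__main__":
    sample = [
        ("phones", "Fast and sharp, but laggy!"),
        ("movies", "Boring, predictable... yet brilliant."),
    ]
    print(run_job(sample))
```

On a real cluster the grouping step would be performed by Hadoop itself; the dictionary lookup is kept in the reducer here only to mirror the description above.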