﻿<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD with MathML3 v1.2 20190208//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd">
<article
    xmlns:mml="http://www.w3.org/1998/Math/MathML"
    xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="review-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">JAIBD</journal-id>
      <journal-title-group>
        <journal-title>Journal of Artificial Intelligence and Big Data</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2771-2389</issn>
      <publisher>
        <publisher-name>Science Publications</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.31586/jaibd.2016.1293</article-id>
      <article-id pub-id-type="publisher-id">JAIBD-1293</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Review Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>
          Advanced Natural Language Processing (NLP) Techniques for Text-Data Based Sentiment Analysis on Social Media
        </article-title>
      </title-group>
      <contrib-group>
<contrib contrib-type="author">
<name>
<surname>Chippagiri</surname>
<given-names>Srinivas</given-names>
</name>
<xref rid="af1" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kumar</surname>
<given-names>Savan</given-names>
</name>
<xref rid="af1" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sheng</surname>
<given-names>Olivia R Liu</given-names>
</name>
<xref rid="af1" ref-type="aff">1</xref>
</contrib>
      </contrib-group>
<aff id="af1"><label>1</label> Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, UT, 84112, USA</aff>
      <pub-date pub-type="epub">
        <day>21</day>
        <month>12</month>
        <year>2016</year>
      </pub-date>
      <volume>1</volume>
      <issue>1</issue>
      <history>
        <date date-type="received">
          <day>26</day>
          <month>07</month>
          <year>2016</year>
        </date>
        <date date-type="rev-recd">
          <day>19</day>
          <month>10</month>
          <year>2016</year>
        </date>
        <date date-type="accepted">
          <day>12</day>
          <month>11</month>
          <year>2016</year>
        </date>
        <date date-type="pub">
          <day>21</day>
          <month>12</month>
          <year>2016</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>&#xa9; Copyright 2016 by authors and Trend Research Publishing Inc. </copyright-statement>
        <copyright-year>2016</copyright-year>
        <license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
          <license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p>
        </license>
      </permissions>
      <abstract>
        Sentiment analysis is a crucial aspect of natural language processing (NLP), essential for discovering the emotional undertones within text data and, hence, capturing public sentiment on a variety of issues. This study suggests a deep learning technique for sentiment categorization on a Twitter dataset based on Long Short-Term Memory (LSTM) networks. Preprocessing is done comprehensively, feature extraction uses a bag-of-words method, and the data is split 80-20 into training and testing sets. The experimental findings demonstrate that the LSTM model outperforms conventional models such as SVM and Na&#x000ef;ve Bayes, with an F1-score of 99.46%, accuracy of 99.13%, precision of 99.45%, and recall of 99.25%. Additionally, AUC-ROC and PR curves validate the model&#x02019;s effectiveness. Although it performs well, the model consumes heavy computational resources and requires longer training time. In summary, the results show that deep learning performs well in sentiment analysis and can be applied to social media monitoring, customer feedback evaluation, market sentiment analysis, and similar tasks.
      </abstract>
      <kwd-group>
        <kwd>Social Media</kwd>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Natural Language Processing (NLP)</kwd>
        <kwd>Twitter Data</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Text Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec1">
<title>Introduction</title><p>In this digital age, social media has become a global communication system, providing avenues for people to express emotions, share opinions, and interact in real time. Emotions play a major role in shaping online discussions, consumer behavior, brand perception, and public sentiment. For businesses, policymakers, and researchers, analyzing and interpreting these emotions is essential to understanding public sentiment and emerging trends. As social media grows exponentially, large volumes of text-based content are generated every day [
<xref ref-type="bibr" rid="R1">1</xref>]. User reviews, comments, and discussions are all part of this data, which constitutes a rich source of information for determining brand preference, perception, and societal attitudes. However, due to the unstructured nature and large scale of such text data, manual analysis is highly impractical, and automated techniques for extracting meaningful patterns and sentiments become necessary [
<xref ref-type="bibr" rid="R2">2</xref>].</p>
<p>As a major domain of NLP, sentiment analysis refers to classifying and interpreting the sentiments conveyed in text [
<xref ref-type="bibr" rid="R3">3</xref>]. Traditional sentiment analysis techniques rely on basic word-level methods to decide whether the emotion of a piece of writing is neutral, negative, or positive [
<xref ref-type="bibr" rid="R4">4</xref>]. However, such methods often fail to handle context, sarcasm, and variations in language, which hinders their effectiveness in real-world settings. To mitigate these challenges, the sentiment analysis of social media text has been enhanced through advanced NLP techniques [
<xref ref-type="bibr" rid="R5">5</xref>].</p>
<p>Sentiment analysis built on DL models, transformer architectures, and contextual embeddings boosts classification accuracy. Modern methods that combine ML with sentiment analysis allow massive volumes of social media data to be processed, enabling more effective identification of sentiment polarity and deeper insight into public opinion [
<xref ref-type="bibr" rid="R6">6</xref>]. Building on these techniques, contextual embedding, sentiment lexicon adaptation, and domain-specific sentiment analysis help businesses and researchers extract sensible results, optimize decision-making, and improve user engagement strategies.</p>
<title>1.1. Aim and Contribution </title><p>Sentiment analysis is a very important tool for understanding public perception of a topic, and social media platforms, especially Twitter, have become highly influential channels through which people express their opinions. Tweet text is unstructured, which makes sentiment classification challenging and calls for advanced NLP and DL. Traditional ML methods are highly sensitive to context, whereas DL models handle sequential data well. The contribution of this study is to boost classification accuracy, facilitating decision-making for businesses, researchers, and policymakers. The main contributions are:</p>
<p>Creates sentiment classification models using natural language processing and the Twitter dataset.</p>
<p>Implements advanced text preprocessing techniques, including filtering, tokenization, and stop-word removal, to enhance data quality for sentiment analysis.</p>
<p>Utilizes the bag-of-words method to identify key attributes, improving sentiment classification accuracy.</p>
<p>Employs LSTM networks to effectively capture contextual dependencies in textual data.</p>
<p>Uses an evaluation based on a confusion matrix that takes into account F1-score, recall, accuracy, and precision.</p>
<title>1.2. Structure of the paper</title><p>The study is structured as follows: Relevant work for text-data-based sentiment analysis on social media is presented in Section II. The approach, including data collection, preprocessing, and feature extraction techniques, is described in depth in Section III. The experimental findings and performance assessment are shown in Section IV. Section V concludes the study and summarizes key findings.</p>
<title>1.3. Literature Review </title><p>This section reviews research articles on sentiment analysis in social media using advanced ML algorithms and natural language processing.</p>
<p>Kanakaraj and Guddeti (2015) examine social sentiment toward specific news stories from Twitter postings. The mood of the mined text data is ascertained by applying ensemble classification, which combines the capabilities of several individual classifiers to address a particular classification problem. Ensemble classifiers outperform standard ML classifiers by 3&#x02013;5%, according to experiments [
<xref ref-type="bibr" rid="R5">5</xref>].</p>
<p>Chirawichitchai (2014) suggested Thai text-based emotion classification, comparing many popular word-weighting schemes utilizing ML techniques and term weighting. Boolean weighting with an SVM performed well in the experiments. With an accuracy of 77.86%, the SVM approach with Information Gain feature selection worked best. Furthermore, the experimental results show that the Thai emotion classification framework is enhanced by feature-weighting strategies [
<xref ref-type="bibr" rid="R7">7</xref>].</p>
<p>Hogenboom et al. (2014) populate the target sentiment lexicon by analyzing the sentiment of seed words in a semantic lexicon for the target language. When sentiment analysis is expanded from English to Dutch, this yields a significant performance boost of around 29% over the baseline in terms of accuracy and macro-level F1 on their data, achieved by mapping sentiment across languages using relationships across semantic lexicons. Sentiment propagation in language-specific semantic lexicons can exceed the baseline by up to 47%, depending on the seed set of sentiment-carrying words [
<xref ref-type="bibr" rid="R8">8</xref>].</p>
<p>Anjaria and Guddeti (2014) used supervised machine learning methods to classify Twitter data using a feature extraction model that combined unigram and bigram features, as well as an ANN. The case study included the US presidential election of 2012 and the Indian Karnataka state assembly election of 2013. Experimental results show that SVM is the best classifier, achieving up to 88% accuracy for the 2012 US elections and 68% accuracy for the 2013 Indian state assembly elections [
<xref ref-type="bibr" rid="R9">9</xref>].</p>
<p>Volkova, Wilson, and Yarowsky (2013) focus on finding gender differences in subjective language use in English, Spanish, and Russian Twitter data. They also investigate cross-cultural variations in the usage of hashtags and emoticons by male and female users. Their findings establish the statistical significance of the relative F-measure improvement over the gender-independent baseline: 2.5% and 5% for English, 1% and 1.5% for Russian, and 2% and 0.5% for Spanish, according to the polarity and subjectivity studies [
<xref ref-type="bibr" rid="R10">10</xref>].</p>
<p>Table 1 provides a comparative analysis of different previous reviews on sentiment analysis based on the datasets, key findings, limitations, and future work.</p>
<table-wrap id="tab1">
<label>Table 1</label>
<caption>
<p><b> Summary of Sentiment Classification Techniques in Social Media Using Machine Learning</b></p>
</caption>

<table>
<thead>
<tr>
<th align="center"><bold>Paper</bold></th>
<th align="center"><bold>Method</bold></th>
<th align="center"><bold>Dataset</bold></th>
<th align="center"><bold>Key Findings</bold></th>
<th align="center"><bold>Limitations &#x00026; Future Work</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">Kanakaraj and  Guddeti (2015)</td>
<td align="center">NLP techniques,  Word Sense Disambiguation, Ensemble classification</td>
<td align="center">Twitter posts on  news events</td>
<td align="center">Ensemble  classification improves accuracy by 3-5% over traditional ML classifiers</td>
<td align="center">Future work could  explore deep learning models for further accuracy enhancement</td>
</tr>
<tr>
<td align="center">Chirawichitchai  (2014)</td>
<td align="center">Term weighting,  SVM, Information Gain feature selection</td>
<td align="center">Thai text dataset</td>
<td align="center">Boolean weighting  with SVM achieves the highest accuracy (77.86%)</td>
<td align="center">Future work can  focus on expanding emotion classification for multilingual settings</td>
</tr>
<tr>
<td align="center">Hogenboom et al.  (2014)</td>
<td align="center">Spreading  sentiment lexicon and cross-linguistic sentiment mapping</td>
<td align="center">English and Dutch  language datasets</td>
<td align="center">Sentiment  propagation improves accuracy by up to 47%</td>
<td align="center">Further research  can investigate additional languages and domain-specific sentiment lexicons</td>
</tr>
<tr>
<td align="center">Anjaria and  Guddeti (2014)</td>
<td align="center">Supervised ML (SVM, Na&#x000ef;ve Bayes, ANN), Unigram &#x00026; Bigram features, Influence Factor</td>
<td align="center">Twitter  statistics (Karnataka State Assembly Elections 2013, US Presidential  Elections 2012)</td>
<td align="center">SVM achieved  highest accuracy (88% for US Elections, 68% for Indian Elections)</td>
<td align="center">Future work can  incorporate deep learning models and social influence factors for better  prediction</td>
</tr>
<tr>
<td align="center">Volkova, Wilson,  and Yarowsky (2013)</td>
<td align="center">Understanding how  gender differs in the classification of sentiment, polarity, and subjectivity</td>
<td align="center">English, Spanish,  and Russian Twitter data</td>
<td align="center">Gender-based  language differences improve polarity classification (2.5-5% improvement in  F-measure)</td>
<td align="center">Future studies  can explore additional cultural and linguistic variations for sentiment  analysis</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec><sec id="sec2">
<title>Methodology</title><p>The methodology for sentiment analysis using NLP and deep learning involves multiple stages, beginning with the Twitter dataset, which undergoes comprehensive preprocessing, including filtering, tokenization, and stop-word removal. Feature extraction is then applied using the bag-of-words method to identify the most relevant attributes for sentiment classification. The preprocessed dataset is then split into training and testing groups in an 80-20 ratio. For text categorization, DL models such as LSTM networks are used. The effectiveness of the trained models on the testing subset is then evaluated using performance metrics such as accuracy, precision, recall, and F1-score, producing the final findings. The overall workflow of the methodology is displayed in Figure <xref ref-type="fig" rid="fig1">1</xref>.</p>
<fig id="fig1">
<label>Figure 1</label>
<caption>
<p>Flowchart for sentiment analysis</p>
</caption>
<graphic xlink:href="1293.fig.001" />
</fig><p>The flowchart's subsequent phases are briefly described below:</p>
<title>2.1. Data collection</title><p>The Twitter dataset consists of 73,000 tweets, out of which 12,000 are labeled as &#x0201c;irrelevant.&#x0201d; These include tweets in foreign languages, containing only URLs, or with unreadable Unicode characters, which are excluded from analysis. This leaves 61,000 tweets with sentiments categorized as positive, negative, or neutral, covering topics related to brands, public opinion, and social discussions. The visualization of data insights is given below: </p>
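<p>As a minimal sketch of this exclusion step (the record fields "text" and "sentiment" are assumed names for illustration, not the dataset's actual schema), irrelevant tweets can be filtered out before analysis:</p>

```python
# Hypothetical records standing in for the labeled Twitter data
tweets = [
    {"text": "I love this phone!", "sentiment": "Positive"},
    {"text": "http://t.co/xyz", "sentiment": "Irrelevant"},  # URL-only tweet
    {"text": "Service was awful", "sentiment": "Negative"},
    {"text": "Not bad, not great", "sentiment": "Neutral"},
]

# Exclude tweets labeled "Irrelevant" (foreign-language, URL-only, unreadable)
usable = [t for t in tweets if t["sentiment"] != "Irrelevant"]
print(len(usable))  # 3
```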
<fig id="fig2">
<label>Figure 2</label>
<caption>
<p>Count plot for Data distribution<b> </b></p>
</caption>
<graphic xlink:href="1293.fig.002" />
</fig><p>The bar chart in Figure <xref ref-type="fig" rid="fig2">2</xref> displays the distribution of 73,000 tweets into four sentiment classes: Negative (22,000), Positive (21,000), Neutral (18,000), and Irrelevant (12,000). The Negative class is the most frequent, while Irrelevant tweets are excluded from analysis. The dataset's slight imbalance may require data preprocessing and balancing techniques for optimal sentiment classification.</p>
<fig id="fig3">
<label>Figure 3</label>
<caption>
<p>Top 10 sources of tweet count</p>
</caption>
<graphic xlink:href="1293.fig.003" />
</fig><p>The bar graph in Figure <xref ref-type="fig" rid="fig3">3</xref> shows the top 10 sources of tweets, with "Twitter for iPhone" leading, followed by "Twitter for Android" and "Twitter Web App." Mobile devices dominate tweet generation, while platforms like TweetDeck, Hootsuite, and Instagram contribute minimally. Third-party tools play a minor role, emphasizing users' preference for official Twitter applications.</p>
<title>2.2. Data preprocessing</title><p>Pre-processing the data lowers the computational complexity and produces text classifications of greater quality. The following stages are typical of a pre-processing procedure:</p>
<p><bold>Filtering:</bold> This stage involves removing URL links, special Twitter terms (like "RT," which stands for "ReTweet"), Twitter user names (like "@Ron," with the @ sign next to a user name), and emoticons [
<xref ref-type="bibr" rid="R11">11</xref>].</p>
<p><bold>Tokenization:</bold> Tokenize or segment text by dividing it into word containers using punctuation and spaces.</p>
<title>2.3. Stop-words removal</title><p>A group of terms known as "stop words"&#x02014;such as a, the, I, am, and so on&#x02014;are commonly employed in everyday speech. These words have no bearing on the text's sentiment or meaning; hence, they are not important for the study [
<xref ref-type="bibr" rid="R12">12</xref>]. Because removing stop words eliminates low-level information, the text can concentrate more on the key information.</p>
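<p>The filtering, tokenization, and stop-word removal steps above can be sketched in Python (the regex patterns and the small stop-word list are illustrative assumptions, not the study's exact configuration):</p>

```python
import re

# A tiny illustrative stop-word list; a real run would use a fuller set
STOP_WORDS = {"a", "the", "i", "am", "is", "was", "to", "and", "rt"}

def preprocess(tweet):
    """Filter Twitter artifacts, tokenize, and remove stop words."""
    tweet = re.sub(r"http\S+", "", tweet)   # strip URL links
    tweet = re.sub(r"@\w+", "", tweet)      # strip @user mentions
    tweet = re.sub(r"[^\w\s]", " ", tweet)  # strip punctuation and emoticons
    tokens = tweet.lower().split()          # tokenize on whitespace
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("RT @Ron: I am loving the new phone! http://t.co/abc :)"))
# ['loving', 'new', 'phone']
```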
<title>2.4. Feature extraction with Bag-of-words</title><p>The feature extraction procedure is based on the BoW technique, in which the text is represented as a bag of words. The frequency with which each word occurs acts as a feature for training the classifier. Additionally, redundant and sparse data are eliminated from the original raw data to minimize overfitting on the training set and to speed up algorithm execution on the reduced set of features.</p>
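<p>A minimal bag-of-words sketch over toy preprocessed tweets (the documents here are hypothetical, chosen only to show how word frequencies become feature vectors):</p>

```python
from collections import Counter

# Toy preprocessed tweets (illustrative only)
docs = [
    ["love", "new", "phone", "battery", "great"],
    ["battery", "terrible", "never", "buying"],
    ["love", "great", "service"],
]

# Build the vocabulary, then represent each document as a word-frequency vector
vocab = sorted({w for doc in docs for w in doc})

def bow_vector(doc):
    counts = Counter(doc)
    return [counts[w] for w in vocab]

vectors = [bow_vector(doc) for doc in docs]
print(vocab)
print(vectors[0])  # [1, 0, 1, 1, 0, 1, 1, 0, 0]
```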
<title>2.5. Data splitting</title><p>The dataset was divided into a training set and a test set. Eighty percent of the data was placed in the training set and twenty percent in the testing set to guarantee successful model validation.</p>
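<p>The 80-20 split can be sketched with a shuffle and slice (the stand-in records and the fixed seed are illustrative assumptions):</p>

```python
import random

# Stand-in dataset: 100 (text, label) pairs (illustrative only)
data = [(f"tweet {i}", i % 3) for i in range(100)]

# Shuffle, then take 80% for training and 20% for testing
random.seed(42)  # fixed seed so the split is reproducible
random.shuffle(data)
cut = int(0.8 * len(data))
train, test = data[:cut], data[cut:]
print(len(train), len(test))  # 80 20
```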
<title>2.6. Classification with Long short-term memory (LSTM) </title><p>The LSTM model is one type of recurrent neural network [
<xref ref-type="bibr" rid="R13">13</xref>]. To give a typical RNN more precise control over memory, LSTMs include additional factors. These variables determine the importance of the present input in forming the new memory, the significance of the prior memories in forming the new memory, and the memory's key components in producing the output. The LSTM mathematical equations are shown below (1 to 6).</p>

<disp-formula id="FD1"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><msub><mrow><mi>i</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>=</mo><mi mathvariant="normal"> </mi><mi>σ</mi><mfenced separators="|"><mrow><msub><mrow><mi>W</mi></mrow><mrow><mi>i</mi></mrow></msub><msub><mrow><mi>x</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>U</mi></mrow><mrow><mi>i</mi></mrow></msub><msub><mrow><mi>h</mi></mrow><mrow><mi>t</mi><mo>-</mo><mn>1</mn></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>b</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow></mfenced></mrow></semantics></math></div><div class="l"><label>(1)</label></div></div></disp-formula>
<disp-formula id="FD2"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><msub><mrow><mi>f</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>=</mo><mi mathvariant="normal"> </mi><mi>σ</mi><mfenced separators="|"><mrow><msub><mrow><mi>W</mi></mrow><mrow><mi>f</mi></mrow></msub><msub><mrow><mi>x</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>U</mi></mrow><mrow><mi>f</mi></mrow></msub><msub><mrow><mi>h</mi></mrow><mrow><mi>t</mi><mo>-</mo><mn>1</mn></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>b</mi></mrow><mrow><mi>f</mi></mrow></msub></mrow></mfenced></mrow></semantics></math></div><div class="l"><label>(2)</label></div></div></disp-formula>
<disp-formula id="FD3"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><msub><mrow><mi>o</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>=</mo><mi mathvariant="normal"> </mi><mi>σ</mi><mfenced separators="|"><mrow><msub><mrow><mi>W</mi></mrow><mrow><mi>o</mi></mrow></msub><msub><mrow><mi>x</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>U</mi></mrow><mrow><mi>o</mi></mrow></msub><msub><mrow><mi>h</mi></mrow><mrow><mi>t</mi><mo>-</mo><mn>1</mn></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>b</mi></mrow><mrow><mi>o</mi></mrow></msub></mrow></mfenced></mrow></semantics></math></div><div class="l"><label>(3)</label></div></div></disp-formula>
<disp-formula id="FD4"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><msub><mrow><mi>g</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>=</mo><mrow><mrow><mi mathvariant="normal">tanh</mi></mrow><mo>⁡</mo><mrow><mfenced separators="|"><mrow><msub><mrow><mi>W</mi></mrow><mrow><mi>g</mi></mrow></msub><msub><mrow><mi>x</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>U</mi></mrow><mrow><mi>g</mi></mrow></msub><msub><mrow><mi>h</mi></mrow><mrow><mi>t</mi><mo>-</mo><mn>1</mn></mrow></msub><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>b</mi></mrow><mrow><mi>g</mi></mrow></msub></mrow></mfenced></mrow></mrow></mrow></semantics></math></div><div class="l"><label>(4)</label></div></div></disp-formula>
<disp-formula id="FD5"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><msub><mrow><mi>c</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>=</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>f</mi></mrow><mrow><mi>t</mi></mrow></msub><mi mathvariant="normal"> </mi><mrow><mo stretchy="false">⨀</mo><mrow><msub><mrow><mi>c</mi></mrow><mrow><mi>t</mi><mo>-</mo><mn>1</mn></mrow></msub></mrow></mrow><mo>+</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>i</mi></mrow><mrow><mi>t</mi></mrow></msub><mi mathvariant="normal"> </mi><mrow><mo stretchy="false">⨀</mo><mrow><msub><mrow><mi>g</mi></mrow><mrow><mi>t</mi></mrow></msub></mrow></mrow></mrow></semantics></math></div><div class="l"><label>(5)</label></div></div></disp-formula>
<disp-formula id="FD6"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><msub><mrow><mi>h</mi></mrow><mrow><mi>t</mi></mrow></msub><mo>=</mo><mi mathvariant="normal"> </mi><msub><mrow><mi>o</mi></mrow><mrow><mi>t</mi></mrow></msub><mrow><mo stretchy="false">⨀</mo><mrow><mi>t</mi><mi>a</mi><mi>n</mi><mi>h</mi><mfenced separators="|"><mrow><msub><mrow><mi>c</mi></mrow><mrow><mi>t</mi></mrow></msub></mrow></mfenced></mrow></mrow></mrow></semantics></math></div><div class="l"><label>(6)</label></div></div></disp-formula><p>The logistic sigmoid function is represented by &#x1d70e; in the equations above, whereas element-wise multiplication is represented by &#x02299;. The LSTM unit has a memory cell <math><semantics><mrow><msub><mrow><mi>c</mi></mrow><mrow><mi>t</mi></mrow></msub><mi mathvariant="normal"> </mi></mrow></semantics></math>at each time step &#x1d461;, a hidden unit <math><semantics><mrow><msub><mrow><mi>h</mi></mrow><mrow><mi>t</mi></mrow></msub></mrow></semantics></math> , an input gate <math><semantics><mrow><msub><mrow><mi>i</mi></mrow><mrow><mi>t</mi></mrow></msub></mrow></semantics></math>, a forget gate <math><semantics><mrow><msub><mrow><mi>f</mi></mrow><mrow><mi>t</mi></mrow></msub></mrow></semantics></math>, and an output gate <math><semantics><mrow><msub><mrow><mi>o</mi></mrow><mrow><mi>t</mi></mrow></msub></mrow></semantics></math>. The term b stands for the additive bias, while W and U are the learned parameters. Intuitively, the output gate regulates the amount of internal memory state that is exposed, the forget gate regulates the amount of memory cell erasure, and the input gate regulates the amount of each unit's updating.</p>
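<p>Equations (1) to (6) can be sketched directly as a single LSTM step in NumPy (a minimal illustration with randomly initialized parameters, not a trained model):</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step following equations (1)-(6); W, U, b hold per-gate parameters."""
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # input gate, eq. (1)
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # forget gate, eq. (2)
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # output gate, eq. (3)
    g_t = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])  # candidate memory, eq. (4)
    c_t = f_t * c_prev + i_t * g_t                          # memory cell update, eq. (5)
    h_t = o_t * np.tanh(c_t)                                # hidden state, eq. (6)
    return h_t, c_t

# Randomly initialized (untrained) parameters: 4-dim input, 3-dim hidden state
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
W = {k: rng.standard_normal((d_h, d_in)) for k in "ifog"}
U = {k: rng.standard_normal((d_h, d_h)) for k in "ifog"}
b = {k: np.zeros(d_h) for k in "ifog"}

h, c = lstm_step(rng.standard_normal(d_in), np.zeros(d_h), np.zeros(d_h), W, U, b)
print(h.shape, c.shape)  # (3,) (3,)
```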
<title>2.7. Performance metrics </title><p>The four standard information retrieval assessment criteria listed below are utilized for the subsequent analysis stage. The confusion matrix is used in this study to gauge the model's effectiveness. It takes into account the following parameters:</p>
<p><bold>True Positive (TP):</bold> the number of records correctly identified as containing an undesirable event. </p>
<p><bold>False Positive (FP):</bold> the number of records incorrectly labeled as containing an undesirable event.</p>
<p><bold>True Negative (TN):</bold> the number of records correctly classified as usual. </p>
<p><bold>False Negative (FN):</bold> the number of records incorrectly classified as usual.</p>
<p><bold>Accuracy: </bold>The ratio of accurately predicted values to all test cases is known as accuracy. It is computed according to Equation (7). </p>

<disp-formula id="FD7"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><mi>A</mi><mi>c</mi><mi>c</mi><mi>u</mi><mi>r</mi><mi>a</mi><mi>c</mi><mi>y</mi><mo>=</mo><mfrac><mrow><mi mathvariant="normal">T</mi><mi mathvariant="normal">P</mi><mo>+</mo><mi mathvariant="normal">T</mi><mi mathvariant="normal">N</mi></mrow><mrow><mi mathvariant="normal">T</mi><mi mathvariant="normal">P</mi><mo>+</mo><mi mathvariant="normal">F</mi><mi mathvariant="normal">P</mi><mo>+</mo><mi mathvariant="normal">T</mi><mi mathvariant="normal">N</mi><mo>+</mo><mi mathvariant="normal">F</mi><mi mathvariant="normal">N</mi></mrow></mfrac></mrow></semantics></math></div><div class="l"><label>(7)</label></div></div></disp-formula><p><bold>Precision:</bold> Equation (8) provides precision, which is the number of true positives among all positively assigned documents:</p>

<disp-formula id="FD8"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><mi>P</mi><mi>r</mi><mi>e</mi><mi>c</mi><mi>i</mi><mi>s</mi><mi>i</mi><mi>o</mi><mi>n</mi><mo>=</mo><mfrac><mrow><mi mathvariant="normal">T</mi><mi mathvariant="normal">P</mi></mrow><mrow><mi mathvariant="normal">T</mi><mi mathvariant="normal">P</mi><mo>+</mo><mi mathvariant="normal">F</mi><mi mathvariant="normal">P</mi></mrow></mfrac></mrow></semantics></math></div><div class="l"><label>(8)</label></div></div></disp-formula><p><bold>Recall:</bold> Recall is determined by equation (9) and is the number of true positives among the real positive documents: </p>

<disp-formula id="FD9"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><mi>R</mi><mi>e</mi><mi>c</mi><mi>a</mi><mi>l</mi><mi>l</mi><mo>=</mo><mfrac><mrow><mi mathvariant="normal">T</mi><mi mathvariant="normal">P</mi></mrow><mrow><mi>T</mi><mi>P</mi><mo>+</mo><mi>F</mi><mi>N</mi></mrow></mfrac></mrow></semantics></math></div><div class="l"><label>(9)</label></div></div></disp-formula><p><bold>F1-score:</bold> The F1-score, the harmonic mean of precision and recall, is calculated using equation (10).</p>

<disp-formula id="FD10"><div class="html-disp-formula-info"><div class="f"><math display="inline"><semantics><mrow><mi>F</mi><mn>1</mn><mo>-</mo><mi>S</mi><mi>c</mi><mi>o</mi><mi>r</mi><mi>e</mi><mo>=</mo><mfrac><mrow><mn>2</mn><mo>(</mo><mi>P</mi><mi>r</mi><mi>e</mi><mi>c</mi><mi>i</mi><mi>s</mi><mi>i</mi><mi>o</mi><mi>n</mi><mi mathvariant="normal">*</mi><mi>R</mi><mi>e</mi><mi>c</mi><mi>a</mi><mi>l</mi><mi>l</mi><mo>)</mo></mrow><mrow><mi mathvariant="normal">P</mi><mi mathvariant="normal">r</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">c</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">s</mi><mi mathvariant="normal">i</mi><mi mathvariant="normal">o</mi><mi mathvariant="normal">n</mi><mo>+</mo><mi mathvariant="normal">R</mi><mi mathvariant="normal">e</mi><mi mathvariant="normal">c</mi><mi mathvariant="normal">a</mi><mi mathvariant="normal">l</mi><mi mathvariant="normal">l</mi></mrow></mfrac></mrow></semantics></math></div><div class="l"><label>(10)</label></div></div></disp-formula><p><bold>ROC (Receiver Operating Characteristic):</bold> The performance of a binary classification model may be assessed graphically using a ROC curve. It compares the True Positive Rate (TPR), often referred to as Sensitivity, against the False Positive Rate (FPR) at various classification thresholds. </p>
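<p>The metrics of equations (7) to (10) can be computed directly from confusion-matrix counts (the counts below are hypothetical, chosen only to illustrate the formulas):</p>

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 per equations (7)-(10)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)          # eq. (7)
    precision = tp / (tp + fp)                          # eq. (8)
    recall = tp / (tp + fn)                             # eq. (9)
    f1 = 2 * (precision * recall) / (precision + recall)  # eq. (10)
    return accuracy, precision, recall, f1

# Hypothetical counts, for illustration only
acc, prec, rec, f1 = confusion_metrics(tp=90, fp=5, tn=85, fn=10)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```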
</sec><sec id="sec3">
<title>Result Analysis and Discussion</title><p>This study was conducted and assessed in an experimental environment on a PC running Windows 10 Professional (64-bit) with an Intel Core i5-8250U CPU running at 1.8 GHz and 12 GB of RAM. The LSTM model for text-based sentiment analysis on social media data was implemented using Python 3 and deep natural language processing methods for classification and processing. The proposed model was trained on the Twitter dataset, and its performance is shown below.</p>
<table-wrap id="tab2">
<label>Table 2</label>
<caption>
<p><b> LSTM Model Performance for Text-based Sentiment Analysis on a Twitter Dataset</b></p>
</caption>

<table>
<thead>
<tr>
<th align="center"><bold>Evaluation measures</bold></th>
<th align="center"><bold>Long Short-Term Memory (LSTM) (%)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">Accuracy</td>
<td align="center">99.13</td>
</tr>
<tr>
<td align="center">Precision</td>
<td align="center">99.45</td>
</tr>
<tr>
<td align="center">Recall</td>
<td align="center">99.25</td>
</tr>
<tr>
<td align="center">F1-score</td>
<td align="center">99.46</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig4">
<label>Figure 4</label>
<caption>
<p>Bar Graph for Performance of LSTM Model</p>
</caption>
<graphic xlink:href="1293.fig.004" />
</fig><p>The results of the LSTM model's performance in text-based sentiment analysis on social media data are displayed in Table <xref ref-type="table" rid="tab2">2</xref> and Figure <xref ref-type="fig" rid="fig4">4</xref>. With 99.13% accuracy, the model performs very well, correctly classifying sentiments. It achieves a precision of 99.45% for positive sentiment prediction and a recall of 99.25% for actual sentiment detection. The F1-score of 99.46% further confirms a good balance between precision and recall. This result reaffirms the usefulness of advanced NLP techniques in processing complex social media texts despite linguistic variance and context.</p>
<fig id="fig5">
<label>Figure 5</label>
<caption>
<p>AUC-ROC Curve for LSTM model</p>
</caption>
<graphic xlink:href="1293.fig.005" />
</fig><p>The LSTM model's AUC-ROC curve, seen in Figure <xref ref-type="fig" rid="fig5">5</xref>, indicates strong classification performance. The curve lies almost entirely in the upper-left corner, indicating a high TPR at a low FPR. This steep rise shows that the classification capability is excellent, i.e., misclassification is negligible. This is reflected in the model's high AUC score, which implies it is effective for sentiment analysis.</p>
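<p>The AUC summarized by the curve has a convenient probabilistic reading: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A small sketch, using made-up labels and scores rather than the study's data:</p>

```python
def auc_score(labels, scores):
    """AUC = probability a random positive outranks a random negative
    (ties count as half), equivalent to the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count pairwise wins of positives over negatives.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc_score(labels, scores))  # 8 of 9 positive/negative pairs are ranked correctly
```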
<fig id="fig6">
<label>Figure 6</label>
<caption>
<p>Confusion matrix for LSTM model</p>
</caption>
<graphic xlink:href="1293.fig.006" />
</fig><p>The LSTM model's high predictive capability is seen in Figure <xref ref-type="fig" rid="fig6">6</xref>. The model correctly classifies 7,396 positives and 7,474 negatives with very few misclassifications: 51 false negatives and 79 false positives. The results show high accuracy and balanced performance, indicating that the model classifies sentiment classes with few errors.</p>
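<p>From the confusion-matrix counts reported above, the accuracy figure can be reproduced directly; precision and recall follow the same pattern from the same counts:</p>

```python
# Counts from the confusion matrix in Figure 6
tp, tn, fn, fp = 7396, 7474, 51, 79

# Accuracy: correct predictions over all predictions
accuracy = (tp + tn) / (tp + tn + fn + fp)
print(f"{accuracy:.2%}")  # matches the reported 99.13%
```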
<fig id="fig7">
<label>Figure 7</label>
<caption>
<p>PR Curve of LSTM model on Twitter data</p>
</caption>
<graphic xlink:href="1293.fig.007" />
</fig><p>The PR curve in Figure <xref ref-type="fig" rid="fig7">7</xref> demonstrates how well the LSTM model classifies the Twitter dataset. The curve maintains high precision (above 0.98) at high recall (above 0.98), meaning there are few false positives and false negatives. The steep drop occurring only close to a recall of 1.0 shows that the model balances precision and recall well, validating its resilience for sentiment analysis.</p>
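<p>A PR curve of this kind is traced by sweeping a decision threshold down through the model's scores and recording precision and recall at each step. An illustrative sketch with hypothetical labels and scores (not the study's data):</p>

```python
def pr_points(labels, scores):
    """Precision and recall at each score threshold, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    total_pos = sum(labels)
    points = []
    for i in order:                      # lower the threshold one example at a time
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_pos))  # (precision, recall)
    return points

labels = [1, 1, 0, 1, 0]
scores = [0.95, 0.85, 0.6, 0.55, 0.2]
for prec, rec in pr_points(labels, scores):
    print(f"precision={prec:.2f} recall={rec:.2f}")
```

As the threshold falls, recall rises monotonically while precision drops whenever a negative example is admitted, producing the characteristic downward steps of the curve.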
<title>3.1. Comparative analysis</title><p>In this section, the ML and DL models LSTM, NB, and SVM are compared for sentiment analysis, and deep learning's suitability for handling contextual relationships is emphasized over that of traditional models. Table <xref ref-type="table" rid="tab3">3</xref> compares several sentiment analysis models on the Twitter dataset and demonstrates that LSTM attains a 99.13% accuracy rate. The traditional ML models achieve notably lower accuracies: 81.30% for NB and 70.33% for SVM. These results demonstrate that deep learning techniques such as LSTM, which can capture the contextual information of the text, are highly effective for sentiment analysis tasks. </p>
<table-wrap id="tab3">
<label>Table 3</label>
<caption>
<p><b> Various models Performance comparison on the Twitter dataset for sentiment analysis</b></p>
</caption>

<table>
<thead>
<tr>
<th align="center"><bold>Models</bold></th>
<th align="center"><bold>Accuracy (%)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">LSTM (Long Short-Term Memory)</td>
<td align="center">99.13</td>
</tr>
<tr>
<td align="center">NB (Na&#x000EF;ve Bayes) [14]</td>
<td align="center">81.30</td>
</tr>
<tr>
<td align="center">SVM (Support Vector Machine) [15]</td>
<td align="center">70.33</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The proposed LSTM model detects complex patterns effectively, reaching 99.13% accuracy and outperforming classic machine learning approaches. Because it preserves the order of words throughout the text, it delivers a high accuracy rate with minimal misclassification. The model is useful for social media research and for gathering client feedback in real-world applications, since it performs well on large datasets. </p>
</sec><sec id="sec4">
<title>Conclusion and Future Work</title><p>Sentiment analysis is an automated method for identifying and comprehending the emotions expressed in a text. Over the last ten years, SA's prevalence among NLP users has increased. SA is now essential for businesses to obtain customer information and shape their marketing strategies, due to the pervasive usage of social media and online platforms. The study demonstrates the effectiveness of the LSTM model for sentiment analysis, achieving an accuracy of 99.13% and surpassing traditional ML models such as Na&#x000EF;ve Bayes (81.30%) and SVM (70.33%). The model excels at capturing contextual dependencies in text, ensuring minimal misclassification, as evident from its high precision (99.45%) and recall (99.25%) scores. The AUC-ROC and PR curves further confirm its robustness in sentiment classification. However, despite its superior performance, the LSTM model has limitations, including high computational costs, long training times, and potential inefficiency for real-time sentiment analysis. Additionally, it may struggle with highly imbalanced datasets or sarcasm detection. Future work should focus on optimizing computational efficiency, integrating hybrid deep learning models, and exploring transformer-based architectures like BERT for improved contextual understanding.</p>
</sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      
<ref id="R1">
<label>[1]</label>
<mixed-citation publication-type="other">Y. Zhao, K. Niu, Z. He, J. Lin, and X. Wang, "Text sentiment analysis algorithm optimization and platform development in social network," in Proceedings - 6th International Symposium on Computational Intelligence and Design, ISCID 2013, 2013. doi: 10.1109/ISCID.2013.108.
</mixed-citation>
</ref>
<ref id="R2">
<label>[2]</label>
<mixed-citation publication-type="other">M. C. Ganiz, M. Tutkan, and S. Akyokus, "A novel classifier based on meaning for text classification," in INISTA 2015 - 2015 International Symposium on Innovations in Intelligent Systems and Applications, Proceedings, 2015. doi: 10.1109/INISTA.2015.7276788.
</mixed-citation>
</ref>
<ref id="R3">
<label>[3]</label>
<mixed-citation publication-type="other">L. Keri and R. T. Watson, "The impact of natural language processing based textual analysis of social media interactions on decision making," ECIS 2013 - Proc. 21st Eur. Conf. Inf. Syst., 2013.
</mixed-citation>
</ref>
<ref id="R4">
<label>[4]</label>
<mixed-citation publication-type="other">G. Paltoglou, "Sentiment analysis in social media," in Online collective action: Dynamics of the crowd in social media, Springer, 2014, pp. 3-17.
</mixed-citation>
</ref>
<ref id="R5">
<label>[5]</label>
<mixed-citation publication-type="other">M. Kanakaraj and R. M. R. Guddeti, "Performance analysis of Ensemble methods on Twitter sentiment analysis using NLP techniques," in Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing, IEEE ICSC 2015, 2015. doi: 10.1109/ICOSC.2015.7050801.
</mixed-citation>
</ref>
<ref id="R6">
<label>[6]</label>
<mixed-citation publication-type="other">M. Moh, A. Gajjala, S. C. R. Gangireddy, and T.-S. Moh, "On Multi-tier Sentiment Analysis Using Supervised Machine Learning," in 2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), IEEE, Dec. 2015, pp. 341-344. doi: 10.1109/WI-IAT.2015.154.
</mixed-citation>
</ref>
<ref id="R7">
<label>[7]</label>
<mixed-citation publication-type="other">N. Chirawichitchai, "Emotion classification of Thai text-based using term weighting and machine learning techniques," in 2014 11th Int. Joint Conf. on Computer Science and Software Engineering: "Human Factors in Computer Science and Software Engineering" - e-Science and High Performance Computing: eHPC, JCSSE 2014, 2014. doi: 10.1109/JCSSE.2014.6841848.
</mixed-citation>
</ref>
<ref id="R8">
<label>[8]</label>
<mixed-citation publication-type="other">A. Hogenboom, B. Heerschop, F. Frasincar, U. Kaymak, and F. De Jong, "Multi-lingual support for lexicon-based sentiment analysis guided by semantics," Decis. Support Syst., 2014, doi: 10.1016/j.dss.2014.03.004.
</mixed-citation>
</ref>
<ref id="R9">
<label>[9]</label>
<mixed-citation publication-type="other">M. Anjaria and R. M. R. Guddeti, "A novel sentiment analysis of social networks using supervised learning," Soc. Netw. Anal. Min., vol. 4, no. 1, p. 181, 2014, doi: 10.1007/s13278-014-0181-9.
</mixed-citation>
</ref>
<ref id="R10">
<label>[10]</label>
<mixed-citation publication-type="other">S. Volkova, T. Wilson, and D. Yarowsky, "Exploring demographic language variations to improve multilingual sentiment analysis in social media," EMNLP 2013 - 2013 Conf. Empir. Methods Nat. Lang. Process. Proc. Conf., no. October, pp. 1815-1827, 2013.
</mixed-citation>
</ref>
<ref id="R11">
<label>[11]</label>
<mixed-citation publication-type="other">M. Venugopalan and D. Gupta, "Exploring sentiment analysis on Twitter data," in 2015 Eighth International Conference on Contemporary Computing (IC3), 2015, pp. 241-247. doi: 10.1109/IC3.2015.7346686.
</mixed-citation>
</ref>
<ref id="R12">
<label>[12]</label>
<mixed-citation publication-type="other">B. Alhadidi and M. Wedyan, "Hybrid Stop-Word Removal Technique for Arabic Language.," Egypt. Comput. Sci. J., 2008.
</mixed-citation>
</ref>
<ref id="R13">
<label>[13]</label>
<mixed-citation publication-type="other">J. N. Schrading, Analyzing domestic abuse using natural language processing on social media data. Rochester Institute of Technology, 2015.
</mixed-citation>
</ref>
<ref id="R14">
<label>[14]</label>
<mixed-citation publication-type="other">A. Go, R. Bhayani, and L. Huang, "Twitter Sentiment Classification using Distant Supervision," Processing, 2009.
</mixed-citation>
</ref>
<ref id="R15">
<label>[15]</label>
<mixed-citation publication-type="other">R. Soni and K. J. Mathai, "Improved Twitter Sentiment Prediction through Cluster-then-Predict Model," vol. 4, no. 4, pp. 559-563, 2015.
</mixed-citation>
</ref>
    </ref-list>
  </back>
</article>