Yet Another Data Science Blog: Word Cloud of Negative Sentiment Summaries

Wednesday, December 25, 2024

Word Cloud of Negative Sentiment Summaries

This post walks through a small piece of exploratory data analysis (EDA): visualizing negative-sentiment text. The data is a collection of summaries, each labeled with a sentiment polarity score. We keep only the summaries with negative sentiment (polarity < 0) and generate a word cloud showing the most frequent words in them.

### Code Explanation:

1. **Importing Required Libraries**:

from mlAASentimentAnalysis import data

import matplotlib.pyplot as plt

from wordcloud import WordCloud, STOPWORDS

- `mlAASentimentAnalysis` is a custom module (presumably containing a dataset `data`).

- `matplotlib.pyplot` is used to plot the word cloud.

- `WordCloud` is used to generate a visual representation of frequent words in the dataset, and `STOPWORDS` is a built-in set of common words (like "the", "and") to exclude from the word cloud.

2. **Setting Stopwords**:

stopwords = set(STOPWORDS)

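- `STOPWORDS` is already a set, so `set(STOPWORDS)` simply makes a copy of it. The copy can be extended with extra words without modifying the library's shared set, and it is later passed to `WordCloud` so that common, uninformative words (like "a", "the") are left out of the cloud.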
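Generic stopword lists often miss words that are frequent in a particular dataset but say nothing about sentiment. The sketch below shows one way to extend a copied stopword set. `BASE_STOPWORDS`, `build_stopwords`, and the extra words are all illustrative assumptions, not part of the original code or of the `wordcloud` API.

```python
# Stand-in for wordcloud.STOPWORDS so this sketch runs without the library;
# in the post's code, STOPWORDS itself could be passed as `base`.
BASE_STOPWORDS = frozenset({"the", "and", "a", "of"})


def build_stopwords(extra_words, base=BASE_STOPWORDS):
    """Return a new set: the base stopwords plus the lowercased extra words."""
    return set(base) | {word.lower() for word in extra_words}


# "Product" and "item" are hypothetical domain words, not taken from the post's data.
stopwords = build_stopwords(["Product", "item"])
print(sorted(stopwords))
```

Because the function builds a new set, the base set is left unchanged and can be reused for other plots.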

3. **Filtering Negative Sentences**:

data_negative = data[data['polarity'] < 0]

- Here, the dataset `data` is filtered to include only rows where the `polarity` is less than 0 (indicating negative sentiment). The filtered data is stored in `data_negative`.

4. **Concatenating Negative Sentences**:

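total_negative = ' '.join(data_negative['Summary'])

- The negative summaries are concatenated into a single space-separated string, `total_negative`, because `WordCloud.generate()` expects one block of text. Note that `' '.join()` raises a `TypeError` if any summary is missing (NaN), so missing values should be dropped or filled before joining.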
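The filtering and concatenation steps can be tried on a toy DataFrame. This is a minimal sketch: the rows below are invented stand-ins for the post's `data` (only the column names `Summary` and `polarity` come from the post), and the `dropna()` call is a defensive assumption in case some summaries are missing.

```python
import pandas as pd

# Hypothetical stand-in for the post's `data`; the rows are made up for illustration.
data = pd.DataFrame({
    "Summary": ["Terrible battery life", "Works great", None, "Stopped working quickly"],
    "polarity": [-0.8, 0.6, -0.3, -0.5],
})

# Boolean mask: keep only the rows with negative polarity.
data_negative = data[data["polarity"] < 0]

# Drop missing summaries before joining, since str.join only accepts strings.
total_negative = " ".join(data_negative["Summary"].dropna().astype(str))
print(total_negative)
```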

5. **Data Cleaning**:

import re

total_negative = re.sub('[^a-zA-Z]', ' ', total_negative)

total_negative = re.sub(' +', ' ', total_negative)

- The first `re.sub()` replaces every non-alphabetic character (digits, punctuation, other symbols) with a space. Apostrophes are affected too, so a contraction like "don't" becomes "don t".

- The second `re.sub()` collapses each run of consecutive spaces into a single space.
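To see what the two substitutions do, here is a small sketch applied to a made-up sample string. The helper name `clean_text` and the final `strip()` are additions for illustration; the two regular expressions are the ones used in the post.

```python
import re


def clean_text(text):
    """Replace non-letters with spaces, then collapse runs of spaces."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", text)  # non-letters become spaces
    collapsed = re.sub(r" +", " ", letters_only)    # runs of spaces become one space
    return collapsed.strip()                        # extra step: trim both ends


# Made-up sample text, not from the post's dataset.
sample = "Don't buy!!  Broke after 2 days... 0/10"
print(clean_text(sample))  # Don t buy Broke after days
```

The output shows the side effect on contractions: "Don't" is split into "Don" and "t", so single-letter fragments may show up in the cloud unless they are filtered out.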

6. **Generating the Word Cloud**:

wordcloud = WordCloud(width=1000, height=500, stopwords=stopwords).generate(total_negative)

- A `WordCloud` object is created with a 1000x500 pixel canvas, and the `stopwords` set is passed so that common words are excluded from the cloud. The `generate()` method then processes the text, counts word frequencies, and lays out the word cloud.

7. **Plotting the Word Cloud**:

plt.figure(figsize=(15, 5))

plt.imshow(wordcloud)

plt.axis('off')

plt.show()

- The figure size is set to 15x5 inches.

- `plt.imshow(wordcloud)` displays the word cloud.

- `plt.axis('off')` removes the axes for a cleaner visualization.

- `plt.show()` renders the plot.

### Plot Explanation:

The word cloud shows the most frequent words in the summaries that have a negative sentiment (polarity < 0). The size of each word reflects how often it occurs in that text: larger words appear more often, smaller words less often.
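The frequencies behind the picture can also be inspected as plain numbers. The sketch below counts words with `collections.Counter`; it is an approximation of the idea rather than the `wordcloud` library's own processing (which, for example, also counts common two-word phrases by default). The stopword set, the helper `top_words`, and the sample text are illustrative assumptions.

```python
import re
from collections import Counter

# Small stand-in stopword set (assumption); the post uses wordcloud's STOPWORDS.
STOPWORDS_SAMPLE = {"the", "and", "a", "of", "it", "is", "was", "after"}


def top_words(text, stopwords, n=2):
    """Return the n most frequent non-stopword tokens with their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(token for token in tokens if token not in stopwords)
    return counts.most_common(n)


# Made-up sample text, not from the post's dataset.
sample = "The battery was bad and the screen broke after a week bad battery bad"
print(top_words(sample, STOPWORDS_SAMPLE))  # [('bad', 3), ('battery', 2)]
```

A table like this is a useful companion to the plot, since exact counts are hard to read off word sizes.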

#### Key Observations:

- Words that are frequently used in negative summaries will dominate the word cloud.

- Common words that are irrelevant to sentiment analysis (like "the", "and", "of") are excluded due to the stopwords filtering.

### Solution:

The solution involves two main steps:

1. **Data Filtering**: By isolating the negative sentences using the `polarity < 0` condition, we focus only on the negative sentiment text.

2. **Text Visualization**: The word cloud gives a quick visual summary of the most common words associated with negative sentiment in the dataset. This makes it easier to spot recurring themes or specific words that appear frequently in negative summaries.

Overall, this approach helps in gaining insights into the language or phrases that are commonly used in negative contexts in the dataset.
