MovieSent Dual Approach Sentiment Analysis

Machine Learning / AI

Project Details

Project Information

Project Title: MovieSent Dual Approach Sentiment Analysis

Category: Machine Learning / AI

Semester: Spring 2025

Course: CS619

Complexity: Complex

Supervisor Details

Name: Muhammad Bilal
Email ID: bilal.saleem@vu.edu.pk
Skype ID: bilalsaleem101

Project Description

Project Domain / Category

Machine Learning / Natural Language Processing (NLP)

Abstract / Introduction

Imagine a bustling film festival where organizers and critics are eager to gauge audience sentiment on the latest releases. With thousands of reviews pouring in from social media and dedicated review sites, manually analyzing each comment becomes impractical. MovieSent – Dual Approach Sentiment Analysis steps in as a powerful tool to automatically assess the sentiment behind movie reviews.

By combining a traditional Logistic Regression model with an LSTM network, the system classifies each review as positive, negative, or neutral. This dual approach allows the two models to be compared directly and helps film studios, critics, and streaming platforms base decisions on current sentiment trends. Through an interactive web interface, users submit a review and immediately see the predicted sentiment.

Functional Requirements:

The functional requirements of this project are given below:

1. Data Collection

Requirement: Load a dataset containing at least 5,000 movie reviews. Details:

· Use the provided dataset from this Google Drive link: 5000 Movie Reviews Dataset.

· Ensure the dataset includes sentiment labels (e.g., Positive, Negative, or Neutral).

2. Data Preparation

Requirement: Clean and verify the dataset. Details:

· Manually review and confirm sentiment labels.

· Save the data in CSV or JSON format.

· Include any additional metadata if available (e.g., review date, movie genre).
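The label check and save step could look roughly like the sketch below. It assumes pandas and two hypothetical column names, `review` and `sentiment`; the provided dataset may use different names, and the allowed label set is taken from the examples in requirement 1.

```python
import io

import pandas as pd

# Assumed label set, based on the example labels in requirement 1.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def prepare_reviews(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize label spelling and flag rows that need manual review."""
    out = df.copy()
    out["sentiment"] = out["sentiment"].astype(str).str.strip().str.lower()
    out["needs_review"] = ~out["sentiment"].isin(ALLOWED_LABELS)
    return out


# Tiny in-memory stand-in for the real dataset (hypothetical rows).
raw = pd.DataFrame({
    "review": ["Loved it", "Terrible plot", "It was fine", "Great cast"],
    "sentiment": ["Positive", " negative", "Neutral", "good??"],
})
prepared = prepare_reviews(raw)
flagged = prepared[prepared["needs_review"]]

# Save the verified rows as CSV (a text buffer here; a file path in practice).
buffer = io.StringIO()
prepared[~prepared["needs_review"]].to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Rows in `flagged` are the ones a person would inspect by hand before the data is saved for training.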

3. Data Pre-Processing

Requirement: Normalize and preprocess the raw text data. Details:

· Remove HTML tags, punctuation, and special characters.

· Convert text to lowercase.

· Tokenize reviews into words.

· Remove stopwords and perform stemming/lemmatization.

· Handle missing values and remove duplicate entries.
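The pre-processing steps above can be sketched with the Python standard library alone. The stopword set here is a tiny illustrative placeholder, and stemming/lemmatization is left out; a real project would more likely use the NLTK or spaCy resources named in the Tools section.

```python
import html
import re

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to"}

TAG_RE = re.compile(r"<[^>]+>")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def preprocess(review: str) -> list[str]:
    """Strip HTML, lowercase, remove non-letters, tokenize, drop stopwords."""
    text = TAG_RE.sub(" ", review)      # remove HTML tags
    text = html.unescape(text)          # decode entities such as &amp;
    text = text.lower()
    text = NON_ALPHA_RE.sub(" ", text)  # drop punctuation, digits, specials
    tokens = text.split()               # whitespace tokenization
    return [tok for tok in tokens if tok not in STOPWORDS]


def drop_missing_and_duplicates(reviews):
    """Skip missing or empty reviews and keep only the first copy of each."""
    seen = set()
    kept = []
    for review in reviews:
        if review is None or not review.strip():
            continue
        if review in seen:
            continue
        seen.add(review)
        kept.append(review)
    return kept


tokens = preprocess("<br/>This movie was GREAT!!! Loved the cast &amp; plot.")
unique = drop_missing_and_duplicates(["Good", None, "Good", "  ", "Bad"])
```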

4. Feature Extraction

Requirement: Convert text into numerical representations for model input. Details:

· For Logistic Regression: Apply TF-IDF vectorization with n-grams (unigrams, bigrams, and trigrams).

· For LSTM: Tokenize the text, pad the sequences to a fixed length, and feed them to an Embedding layer, optionally initialized with pre-trained embeddings such as GloVe.

5. Train & Test Data Splitting

Requirement: Partition the dataset into training and testing sets. Details:

· Use a 70/30 split, ensuring stratified sampling to preserve class distribution.
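A stratified 70/30 split can be sketched with scikit-learn's `train_test_split`. The reviews and labels below are synthetic placeholders, and the fixed `random_state` is an assumption added only to make the split reproducible.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Synthetic placeholder data with an uneven class distribution.
reviews = [f"review {i}" for i in range(100)]
labels = ["positive"] * 50 + ["negative"] * 30 + ["neutral"] * 20

X_train, X_test, y_train, y_test = train_test_split(
    reviews,
    labels,
    test_size=0.30,     # 70/30 split
    stratify=labels,    # keep class proportions equal in both parts
    random_state=42,    # assumed seed for reproducibility
)

train_counts = Counter(y_train)
test_counts = Counter(y_test)
```

With stratification, both parts keep the 50/30/20 class ratio of the full dataset instead of drifting by chance.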

6. Model Development – Logistic Regression

Requirement: Build a classical sentiment analysis model. Details:

· Train a Logistic Regression classifier using scikit-learn on TF-IDF features.

· Evaluate its performance using accuracy, precision, recall, and F1-score.
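As a rough illustration of this requirement, the sketch below chains TF-IDF features and a Logistic Regression classifier in a scikit-learn pipeline and computes the four metrics. The handful of training sentences is made up for the example, so the scores say nothing about real performance; `max_iter=1000` and macro averaging are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.pipeline import make_pipeline

# Made-up toy data; the real project would use the 5,000-review dataset.
train_texts = [
    "great acting and a wonderful story",
    "awful pacing and a boring plot",
    "wonderful cast, great fun",
    "boring, awful, and far too long",
    "a great and wonderful film",
    "an awful and boring film",
]
train_labels = ["positive", "negative"] * 3
test_texts = ["great story", "boring plot"]
test_labels = ["positive", "negative"]

# Vectorizer and classifier are fitted together, so the same TF-IDF
# vocabulary is reused automatically at prediction time.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)
predictions = model.predict(test_texts)

accuracy = accuracy_score(test_labels, predictions)
precision, recall, f1, _ = precision_recall_fscore_support(
    test_labels, predictions, average="macro", zero_division=0
)
```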

7. Model Development – LSTM

Requirement: Build a deep learning sentiment analysis model using an LSTM network. Details:

· Construct an LSTM-based network that includes an Embedding layer, one or more LSTM layers, and dropout for regularization.

· Compile the model with an appropriate loss function (binary cross-entropy for two classes, categorical cross-entropy for three or more) and an optimizer (e.g., Adam).

· Train the model on tokenized and padded data.
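One possible shape for such a network, written with the Keras API bundled in TensorFlow, is sketched below. The vocabulary size, sequence length, layer widths, and dropout rate are all assumed values chosen to keep the example small, and the training data is random noise used only to show that the pieces fit together.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 1000  # assumed vocabulary size
MAX_LEN = 20       # assumed fixed sequence length after padding
NUM_CLASSES = 3    # positive, negative, neutral


def build_lstm_model():
    """Embedding -> LSTM -> Dropout -> softmax classifier."""
    model = keras.Sequential([
        keras.Input(shape=(MAX_LEN,)),
        layers.Embedding(input_dim=VOCAB_SIZE, output_dim=32),
        layers.LSTM(32),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model


model = build_lstm_model()

# Random stand-in data: integer token ids and one-hot labels.
X_dummy = np.random.randint(1, VOCAB_SIZE, size=(8, MAX_LEN))
y_dummy = keras.utils.to_categorical(
    np.random.randint(0, NUM_CLASSES, size=8), num_classes=NUM_CLASSES
)
model.fit(X_dummy, y_dummy, epochs=1, batch_size=4, verbose=0)
probabilities = model.predict(X_dummy, verbose=0)
```

Each row of `probabilities` holds one score per class; the predicted label is the index of the largest score.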

8. Performance Evaluation

Requirement: Evaluate and compare both models. Details:

· Generate confusion matrices for each model.

· Compute evaluation metrics (accuracy, precision, recall, F1-score) and analyze results to determine which approach performs better.
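The comparison step might be organized as in the following sketch, which builds a confusion matrix and macro-averaged metrics for each model and then picks the one with the higher F1-score. The prediction lists are invented for illustration, and selecting by macro F1 is an assumption; the project may weigh the metrics differently.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)

LABELS = ["negative", "neutral", "positive"]


def evaluate(y_true, y_pred):
    """Return the confusion matrix and summary metrics for one model."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=LABELS, average="macro", zero_division=0
    )
    return {
        # Rows are true labels, columns are predicted labels, in LABELS order.
        "confusion_matrix": confusion_matrix(y_true, y_pred, labels=LABELS),
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented predictions standing in for the two trained models.
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
lr_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
lstm_pred = ["positive", "positive", "negative", "neutral", "neutral", "neutral"]

results = {
    "logistic_regression": evaluate(y_true, lr_pred),
    "lstm": evaluate(y_true, lstm_pred),
}
best_model = max(results, key=lambda name: results[name]["f1"])
```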

9. Web Interface Integration

Requirement: Develop a web application to showcase real-time sentiment analysis. Details:

· Create a backend using Flask (or Django) to serve both models via RESTful API endpoints.

· Build a responsive front-end using HTML/CSS and optionally JavaScript/Bootstrap.

· Allow users to input movie reviews and select the model (Logistic Regression or LSTM) to get predictions.

· Implement error handling and provide clear instructions.
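A minimal Flask backend for this requirement could look like the sketch below. The `/predict` route, the JSON field names `review` and `model`, and the model keys `logreg` and `lstm` are all hypothetical choices, and the two predictor functions are placeholders standing in for the trained models.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_logreg(text: str) -> str:
    """Placeholder for the trained Logistic Regression pipeline."""
    return "positive" if "good" in text.lower() else "neutral"


def predict_lstm(text: str) -> str:
    """Placeholder for the trained LSTM model."""
    return "positive" if "good" in text.lower() else "neutral"


MODELS = {"logreg": predict_logreg, "lstm": predict_lstm}


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        payload = {}
    review = str(payload.get("review", "")).strip()
    model_name = payload.get("model", "logreg")

    # Basic error handling with clear messages for the front-end.
    if not review:
        return jsonify(error="Field 'review' must be a non-empty string."), 400
    if model_name not in MODELS:
        return jsonify(error=f"Unknown model; choose one of {sorted(MODELS)}."), 400

    return jsonify(model=model_name, sentiment=MODELS[model_name](review))


if __name__ == "__main__":
    app.run(debug=True)
```

A front-end form would send the review text and the selected model name to this endpoint and display the returned `sentiment` value.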

Tools:

· Programming Language:

o Python: Primary language for data processing, model development, and backend services.

· Development Environments / IDEs:

o Anaconda: Python distribution platform to manage environments and dependencies.

o Jupyter Notebook: For interactive coding, exploratory data analysis, and prototyping.

o Visual Studio Code (or PyCharm): For writing, debugging, and managing the project code.

· Libraries & Frameworks:

o Data Processing: Pandas, NumPy, NLTK, spaCy.

o Machine Learning & Deep Learning: scikit-learn, TensorFlow/Keras.

o Web Development: Flask (or Django) for building the backend API; HTML/CSS and optionally JavaScript/Bootstrap for the front-end.

· Other Tools:

o Joblib: For model serialization (saving and loading models).

o Git: For version control and collaboration (optional).
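The Joblib entry above can be illustrated with a short save-and-load round trip for a scikit-learn pipeline. The file name and the toy training data are assumptions. Keras models are usually saved with their own `model.save` method rather than Joblib, so this sketch covers only the Logistic Regression side.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy pipeline standing in for the trained Logistic Regression model.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipeline.fit(
    ["a great film", "an awful film", "great fun", "awful pacing"],
    ["positive", "negative", "positive", "negative"],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "logreg_pipeline.joblib")  # assumed name
    joblib.dump(pipeline, path)    # serialize vectorizer and classifier
    restored = joblib.load(path)   # load them back, e.g. in the web backend

sample = ["a great story"]
same_prediction = list(pipeline.predict(sample)) == list(restored.predict(sample))
```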

Languages

Python, HTML, CSS, JavaScript

Tools

Anaconda, Jupyter Notebook, Visual Studio Code, PyCharm, Pandas, NumPy, NLTK, spaCy, scikit-learn, TensorFlow, Keras, Flask, Django, Bootstrap, Joblib, Git

Project Schedules

Title | Start Date | End Date
SRS Document | Friday, 2 May 2025, 12:00 AM | Thursday, 22 May 2025, 12:00 AM
Design Document | Friday, 23 May 2025, 12:00 AM | Tuesday, 29 July 2025, 12:00 AM
Prototype Phase | Wednesday, 30 July 2025, 12:00 AM | Friday, 12 September 2025, 12:00 AM
Final Deliverable | Saturday, 13 September 2025, 12:00 AM | Monday, 3 November 2025, 12:00 AM
