Online conversational media serve as a means for individuals to engage, collaborate, and exchange ideas; however, they also facilitate the spread of hateful and offensive comments, which can significantly harm emotional and mental health. The rapid growth of online communication makes it impractical to manually identify and filter hateful content, so there is a pressing need for methods that remove toxic and abusive comments and keep social media platforms safe. This study applies an LSTM, a character-level CNN, a word-level CNN, and a hybrid model (LSTM + CNN) to classify comments and identify the different toxicity classes through a comparative analysis of these models. The models take as input comments extracted from online platforms, including both toxic and non-toxic examples. The results can inform the development of a web interface that identifies toxic and hateful comments within a given sentence or phrase and categorizes them into their respective toxicity classes.
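To make the architecture concrete, the following is a minimal sketch of the hybrid model (LSTM + CNN) as a multi-label classifier in Keras; the vocabulary size, sequence length, layer widths, and class count are illustrative assumptions rather than the study's exact configuration.

```python
# Minimal sketch of a hybrid LSTM + CNN toxicity classifier.
# All hyperparameters below are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed vocabulary size
MAX_LEN = 200        # assumed maximum comment length (tokens)
NUM_CLASSES = 6      # e.g. toxic, severe_toxic, obscene, threat, insult, identity_hate

inputs = layers.Input(shape=(MAX_LEN,))
x = layers.Embedding(VOCAB_SIZE, 128)(inputs)
x = layers.Conv1D(64, kernel_size=5, activation="relu")(x)    # word-level convolutional features
x = layers.MaxPooling1D(pool_size=4)(x)
x = layers.LSTM(64)(x)                                        # sequential context over pooled features
outputs = layers.Dense(NUM_CLASSES, activation="sigmoid")(x)  # independent probability per toxicity class

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

The sigmoid output layer, rather than a softmax, is what allows a single comment to belong to several toxicity classes at once.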
Key innovations include:
• Multi-label classification addressing overlapping toxicity categories.
• Bias mitigation through adversarial debiasing and balanced dataset sampling.
• Real-time processing via a Django backend and React.js dashboard (see the endpoint sketch after this list).
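As a rough illustration of the real-time processing path, here is a hypothetical Django view that scores an incoming comment; the `score_comment` helper, payload shape, and endpoint are assumptions for the sketch, not the project's actual code.

```python
# Hypothetical Django view for real-time comment scoring.
import json

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt

from .inference import score_comment  # assumed helper wrapping the trained model


@csrf_exempt
def classify(request):
    # Expects a JSON body such as {"comment": "..."} from the dashboard.
    payload = json.loads(request.body)
    text = payload.get("comment", "")
    probabilities = score_comment(text)  # dict: toxicity class -> probability
    return JsonResponse({"comment": text, "scores": probabilities})
```

The React.js dashboard would call such an endpoint and render the per-class probabilities for moderators.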
Experimental results demonstrate BERT’s superiority (95.4% F1-score) over LSTM (91.9%),
attributed to its contextual embedding capabilities. Challenges in sarcasm detection and
multilingual support are discussed, alongside proposed solutions. This system has direct
applications in social media moderation, online gaming, and forum management, significantly
reducing reliance on manual review.
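For reference, one plausible way to compute the reported F1-scores in a multi-label setting is micro-averaged F1 over binarized per-class predictions; the sketch below shows this with scikit-learn, where the toy data, averaging choice, and 0.5 decision threshold are all assumptions.

```python
# Sketch of multi-label F1 evaluation with scikit-learn (toy data).
import numpy as np
from sklearn.metrics import f1_score

# Ground-truth labels and predicted probabilities for 3 comments, 6 classes.
y_true = np.array([[1, 0, 1, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0],
                   [1, 1, 1, 0, 1, 0]])
y_prob = np.array([[0.92, 0.10, 0.85, 0.05, 0.30, 0.02],
                   [0.08, 0.03, 0.11, 0.01, 0.09, 0.01],
                   [0.97, 0.76, 0.88, 0.12, 0.64, 0.07]])

y_pred = (y_prob >= 0.5).astype(int)  # assumed 0.5 decision threshold
print("micro F1:", f1_score(y_true, y_pred, average="micro"))
```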