Parse, Compute Pairwaise Similarity Matrices, Train and Test using KNN Classification Algorithm
-
Updated
Feb 5, 2018 - Jupyter Notebook
Parse, Compute Pairwaise Similarity Matrices, Train and Test using KNN Classification Algorithm
Naive Bayes classifier and boolean retrieval done on the 20Newsgroups dataset that has been written from scratch. Extremely lightweight and produces decent results. Also currently working on classification using word embeddings.
This repository contains code for our project work as part of the E0-334 Deep Learning for Natural Language Processing course at IISc, Bengaluru. We had proposed a graph-based model for text classification.
Experimentation with novelty detection
Implemented Naive Bayes text classifier for the 20newsgroups dataset
Classified human and machine generated text using 1) a single score threshold classifier and 2) a neural network classifier approach, based on perplexities and probability scores generated from n-grams. Best results are 77% for the single score classifier and 80% for the ANN classifier.
Assignment 2 – Dimensionality reduction and text classification: converted news text into a machine readable representation, reduced the dimensions of the text representation and trained classifiers to decide which of 20 news groups a sample belongs to.
Project work as part of the E0-334 Deep Learning for Natural Language Processing course at IISc, Bengaluru. We had proposed a graph-based model for text classification.
This repository contains notebooks which explores the tsne algorithm by applying it on various datasets
Cluster labelling was done by using the power of wikipedia search
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.
This project offers advanced techniques in text preprocessing, word embeddings, and text classification. Explore methods like Word2Vec and GloVe, and master Multinomial Naive Bayes for accurate predictions. Dive into the world of text clustering and conquer challenges like unbalanced data.
In this project we will generate the sentences using ngrams
Text classification using Multinomial NB on 20_newsgroups dataset.
Kmeans and SOM clustering for 20newsgroup
NLP Topic Modeling Techniques (LDA, LSA & BERTopic)
Add a description, image, and links to the 20newsgroup topic page so that developers can more easily learn about it.
To associate your repository with the 20newsgroup topic, visit your repo's landing page and select "manage topics."