tfidf-vectorizer

Here are 357 public repositories matching this topic...

MailGuard is an intelligent spam detection tool that classifies emails as spam or ham using a Multinomial Naive Bayes model. Built with Streamlit, it leverages natural language processing techniques for text cleaning and feature extraction.

  • Updated Jul 14, 2024
  • Python

This project focuses on building a classifier to distinguish between spam and ham emails using Logistic Regression. Key steps include data preprocessing, feature extraction with TF-IDF vectorization, and model evaluation with accuracy metrics and a confusion matrix.

  • Updated Jul 13, 2024
  • Jupyter Notebook
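
The TF-IDF feature extraction step mentioned in the description above can be illustrated with a minimal pure-Python sketch. It is not taken from the repository: the `tokenize` and `tfidf_vectors` helpers and the sample emails are hypothetical, and the weighting assumes a smoothed IDF, ln((1 + n) / (1 + df)) + 1, followed by L2 normalization, which is one common convention.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical preprocessing: lowercase and split on whitespace.
    # Real pipelines usually also strip punctuation and stop words.
    return text.lower().split()


def tfidf_vectors(documents):
    """Return (vocabulary, rows); each row is an L2-normalized TF-IDF vector."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(documents)

    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency (an assumed convention).
    idf = {
        term: math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
        for term in vocabulary
    }

    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = [counts[term] * idf[term] for term in vocabulary]
        norm = math.sqrt(sum(w * w for w in weights))
        rows.append([w / norm if norm else 0.0 for w in weights])
    return vocabulary, rows


# Made-up sample emails, for illustration only.
emails = [
    "win a free prize now",
    "meeting agenda for monday",
    "free prize waiting claim now",
]
vocabulary, rows = tfidf_vectors(emails)
```

Each row of `rows` can then be passed as a feature vector to a classifier such as logistic regression. Terms that appear in fewer documents (for example "win") receive a higher weight than terms shared across documents (for example "free").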

MagicXML

Magic-XML is a modern web application for quickly and conveniently converting data from XML files into CSV format. It uses FastAPI for high-performance request processing, and applies machine learning algorithms and natural language processing for efficient analysis

  • Updated Jul 11, 2024
  • Python

This repository explores the correlation between news headlines' textual embeddings and their political orientation. Using clustering and transformer-based embeddings, the goal is to classify news sources based on headline content. Key features include clustering visualizations, BERT embeddings, and comparisons between K-Means, Spectral, and DBSCAN

  • Updated Jul 10, 2024
  • Jupyter Notebook

This document classification solution aims to significantly reduce manual effort in HRM. It should achieve higher accuracy and a higher level of automation with minimal human intervention.

  • Updated Jul 3, 2024
  • Jupyter Notebook
