A summarization website that can generate summaries from either YouTube videos or PDF files.
Focus - Understanding contextual retrievability.
Fine-tuned Longformer for Summarization of Machine Learning Articles
This project was developed for a Kaggle competition focused on detecting Personally Identifiable Information (PII) in student writing. The primary objective was to build a robust model capable of identifying PII with high recall. The DeBERTa v3 transformer model was chosen for this task after comparing its performance with other transformer models.
Factuality check of the SemRep Predications
Longformer Encoder-Decoder model for the legal domain, trained for the long-document abstractive summarization task.
An attempt at building a model and pipeline for retrieving Italian legal documents from a user prompt.
Kaggle NLP competition - Top 2% solution (36/2060)
This project applies the Longformer model to sentiment analysis on the IMDB movie review dataset. The Longformer model, introduced in "Longformer: The Long-Document Transformer," handles long documents with sliding-window and global attention mechanisms. The implementation uses PyTorch and follows the paper's architecture.
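The sliding-window plus global attention pattern mentioned above can be sketched as a boolean mask in plain NumPy (window size and the choice of a single global token at position 0 are illustrative, not taken from the repository):

```python
import numpy as np

def longformer_attention_mask(seq_len, window, global_idx=()):
    """Boolean mask: entry (i, j) is True where query i may attend to key j.

    Local attention: each token attends within a sliding window of
    +/- window // 2 positions. Global attention: tokens in global_idx
    attend everywhere, and every token attends to them (symmetric).
    """
    half = window // 2
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    mask = np.abs(i - j) <= half          # banded sliding-window part
    for g in global_idx:                  # global tokens, e.g. [CLS]
        mask[g, :] = True                 # global token sees all keys
        mask[:, g] = True                 # all queries see the global token
    return mask

mask = longformer_attention_mask(seq_len=8, window=2, global_idx=(0,))
```

This makes attention cost scale linearly with sequence length (each row has at most `window + 1` local entries plus the global columns) instead of quadratically as in full self-attention.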
Training and inference code for the claim veracity checker built on Longformer-4096 tuned to PUBHEALTH
This GitHub repository implements a novel approach for detecting Initial Public Offering (IPO) underpricing using pre-trained Transformers. The models, extended to handle large S-1 filings, leverage both textual information and financial indicators, outperforming traditional machine learning methods.
Industrial Text Scoring using Multimodal Deep Natural Language Processing 🚀 | Code for IEA AIE 2022 paper
A hyperpartisan news article classification system using BERT-based techniques. The goal was to leverage state-of-the-art transformer models such as BERT, RoBERTa, and Longformer to accurately classify news articles as hyperpartisan or non-hyperpartisan.
Project as part of COMP34812: Natural Language Understanding
Using transformers for text classification.
A WebApp to summarize research papers using HuggingFace Transformers.
Convert pretrained RoBERTa models to various long-document transformer models.
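A common recipe for this kind of conversion (not necessarily the one this repository uses) is to tile the checkpoint's learned position embeddings so a 512-position RoBERTa model covers a longer context. A minimal NumPy sketch, with hypothetical shapes:

```python
import numpy as np

def extend_position_embeddings(pos_emb, new_max_pos):
    """Tile a learned (old_max, hidden) position-embedding matrix so it
    covers new_max_pos positions, e.g. 512 -> 4096. The copied weights
    give the long model a sensible initialization for fine-tuning."""
    old_max, hidden = pos_emb.shape
    reps = -(-new_max_pos // old_max)            # ceil(new_max_pos / old_max)
    return np.tile(pos_emb, (reps, 1))[:new_max_pos]

short = np.random.randn(512, 768)                # RoBERTa-base-like shape
long_emb = extend_position_embeddings(short, 4096)
```

After copying the tiled embeddings into the long model, the self-attention layers are typically swapped for sliding-window attention and the whole model is fine-tuned briefly so the repeated position embeddings adapt.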
[13th Tobigs Conference] YoYAK - Yes or Yes, Attention with gap-sentences for Korean long sequences