chinese-nlp

Here are 184 public repositories matching this topic...

FerdinandZhong / punctuator

A small seq2seq punctuator tool based on DistilBERT

nlp deep-learning pytorch seq2seq chinese-nlp punctuation bert bert-ner

Updated Jul 13, 2024
Python

lyogavin / Anima

33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU

open-source chinese-nlp llama lora instruction-set finetune open-source-models open-models llm generative-ai instruct-gpt qlora chinese-llm

Updated Jul 11, 2024
Jupyter Notebook

yaoxiaoyuan / mimix

Mimix: A Text Generation Tool and Pretrained Chinese Models

Updated Jul 8, 2024
Python

esbatmop / MNBVC

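MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.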

nlp chinese chinese-nlp corpus-data chinese-simplified nlp-machine-learning chinese-language

Updated Jul 8, 2024

SUFE-AIFLM-Lab / StatChat

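StatChat is a digital intelligent learning assistant dedicated to knowledge question answering in statistics and related applied fields (finance, economics, business analytics, data science, etc.)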

education finance data-science statistics economics chinese-nlp business-analytics llm

Updated Jul 1, 2024

HIT-SCIR / ltp

Language Technology Platform

nlp machine-learning natural-language-processing chinese-nlp

Updated Jul 1, 2024
Python

CodingDogzxg / MicroblogCrawler

Crawler for the Weibo hot list (trending topics)

nlp crawler spider corpus selenium python3 weibo chinese-nlp microblog corpus-linguistics corpus-data nlp-machine-learning weibo-spider lingustics sarcasm-detection pytorch-nlp weibo-crawler microblog-crawler

Updated Jun 29, 2024
Python

hscspring / pnlp

NLP pre-/post-processing toolkit.

nlp concurrency text-extraction chinese-nlp text-processing preprocessing normalization text-cleaning nlp-preprocess nlp-enhancer text-length

Updated Jun 29, 2024
Python

Isaac-JL-Chen / rouge_chinese

Python ROUGE Score Implementation for Chinese Language Task (official rouge score)

nlp algorithm metrics summarization chinese-nlp rouge rouge-metric rouge-l

Updated Jun 27, 2024
Python

niuwz / Mini-Chinese-Phi3

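A small-parameter LLM based on the Phi3 model architecture, trained from scratch on common Chinese corpora. Covers tokenizer training, model pretraining, instruction fine-tuning, and direct preference optimization.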

chinese-nlp from-scratch large-language-models

Updated Jun 23, 2024
Python

ECNU-ICALK / EduChat

An open-source educational chat model from ICALK, East China Normal University. An open-source Chinese-English educational dialogue LLM (general-purpose base model, GPU deployment, data cleaning). With tribute to: LLaMA, MOSS, BELLE, Ziya, vLLM

education chinese-nlp llama data-cleaning moss open-models belle llm

Updated Jun 2, 2024
Python

aplmikex / deduplication_mnbvc

Text deduplication

nlp chinese chinese-nlp corpus-data chinese-simplified nlp-machine-learning chinese-language

Updated May 23, 2024
Python

brightmart / nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

nlp news wiki text-classification word2vec corpus dataset question-answering chinese chinese-nlp language-model bert chinese-corpus pretrain chinese-dataset

Updated May 23, 2024

edward-martyr / yitizi-rs

Get all variants (yitizi, 異體字) of a Chinese character (Sinograph)!

chinese-nlp chinese-characters rust-crate

Updated May 16, 2024
Rust

rime / rime-cantonese

Rime Cantonese input schema | Cantonese romanization input scheme

input-method linguistics chinese chinese-nlp rime rime-schema cantonese jyutping cantonese-language chinese-language cantonese-dictionary

Updated May 15, 2024
Python

dongrixinyu / jiojio

A convenient Chinese word segmentation tool

python crf chinese-nlp chinese-word-segmentation partofspeech-tagger wordsegmentation

Updated Apr 26, 2024
Python

CVI-SZU / Linly

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets

nlp chatbot chinese chinese-nlp llama language-model bert zero-shot-learning gpt-3 chatgpt

Updated Apr 14, 2024
Python

open-chinese / chinese-word-structure

Studies the structure of all Chinese characters, aiming to provide a complete solution to character-structure problems in NLP.

chinese chinese-nlp chinese-characters chinese-structure

Updated Apr 7, 2024

LianjiaTech / BELLE

BELLE: Be Everyone's Large Language model Engine (open-source Chinese dialogue LLM)

bloom chinese-nlp llama lora instruction-set open-models instruct-gpt gpt-q gpt-evaluation instruct-finetune

Updated Mar 15, 2024
HTML

amutu / zhparser

zhparser is a PostgreSQL extension for full-text search of Chinese-language text

extension postgresql chinese chinese-nlp chinese-text-segmentation scws zhparser

Updated Feb 13, 2024
C
