Word2Vec word vectors trained on a North Korean corpus / 조선어 (북한어) 단어 임베딩 [North Korean-language word embeddings]

digitalprk/north_korean_embeddings


Pre-trained word vectors for the North Korean language

  • Trained on a 200-million-word corpus of North Korean text: news articles, magazines, literature, and political essays.
  • Vocabulary size: 64,979

Download:

https://datank2.s3.ap-southeast-1.amazonaws.com/nk200sg7.bin

Usage:

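from gensim.models import KeyedVectors
# binary=True assumes the .bin download is in word2vec binary format; use binary=False if it turns out to be plain text.
model = KeyedVectors.load_word2vec_format('nk200sg7.bin', binary=True)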
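
Before loading the full model, it can help to check that the download is intact and that the `binary` flag in the loading call matches the file. The sketch below is not part of the original repository. It assumes the file follows the standard word2vec binary layout (an ASCII header line with the vocabulary size and vector dimension, then for each word its bytes, a space, and the vector as little-endian 32-bit floats). The helper name `peek_word2vec_binary` is hypothetical.

```python
import struct


def peek_word2vec_binary(path):
    """Read the header and the first entry of a word2vec binary file.

    Assumes the standard word2vec binary layout; this is an illustrative
    helper, not an API of gensim or of this repository.
    Returns (vocab_size, vector_size, first_word, first_vector).
    """
    with open(path, "rb") as f:
        # Header line: "<vocab_size> <vector_size>\n" in ASCII.
        vocab_size, vector_size = (int(x) for x in f.readline().split())

        # The first word is every byte up to the next space.
        word = bytearray()
        while True:
            ch = f.read(1)
            if ch in (b" ", b""):
                break
            if ch != b"\n":  # some writers put a newline between entries
                word.extend(ch)

        # The vector follows as vector_size little-endian float32 values.
        raw = f.read(4 * vector_size)
        if len(raw) != 4 * vector_size:
            raise ValueError("file ends before the first vector is complete")
        vector = struct.unpack(f"<{vector_size}f", raw)

    return vocab_size, vector_size, word.decode("utf-8", errors="replace"), vector


# Example call (requires the downloaded file to be present):
# vocab_size, vector_size, word, vector = peek_word2vec_binary("nk200sg7.bin")
```

If the file is in binary format and matches the description above, the first returned value should equal the stated vocabulary size. An unreadable first word or implausible vector values would suggest the file is in text format or the download is incomplete.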
