Incorrect tagging by a trained model for Tibetan #13549
Unanswered
ykyogoku asked this question in Help: Model Advice
Replies: 1 comment
-
I just ran the spacy debug data command and found many misaligned tokens in both the training and validation datasets. Could this be related to the incorrect tagging?
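To make the term concrete, here is a minimal sketch of what "misaligned" could mean in this situation. It is plain Python, not spaCy's own alignment code, and the function names and the example tokenizations are hypothetical. It assumes that both tokenizations cover the same text and that a gold token counts as aligned only if some predicted token has exactly the same character boundaries. If the genitive is glued to the preceding noun by the tokenizer, its gold ADP tag has no matching predicted token to attach to, which may be one reason the tag is learned poorly.

```python
def char_spans(tokens):
    """Return (start, end) character offsets of each token in the concatenated text."""
    spans, pos = [], 0
    for tok in tokens:
        spans.append((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def misaligned_gold_tokens(gold_tokens, predicted_tokens):
    """List gold tokens whose boundaries match no predicted token (hypothetical helper)."""
    if "".join(gold_tokens) != "".join(predicted_tokens):
        raise ValueError("the two tokenizations cover different text")
    predicted = set(char_spans(predicted_tokens))
    return [
        tok
        for tok, span in zip(gold_tokens, char_spans(gold_tokens))
        if span not in predicted
    ]


# Assumed gold segmentation: noun, genitive, noun.
gold = ["བོད་", "ཀྱི་", "སྐད"]
# Assumed tokenizer output that keeps the genitive attached to the noun.
predicted = [gold[0] + gold[1], gold[2]]

print(misaligned_gold_tokens(gold, predicted))
```

With these assumed inputs, the noun and the genitive are reported as misaligned, while the last token is not. Running the same check with the tokenizer actually used in the pipeline would show whether the genitive is affected in the real data.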
-
I tried to train a tagger for Tibetan, but the results are not satisfactory. What is particularly striking is that the genitive, which is consistently tagged as ADP in the training dataset, is wrongly tagged as NOUN, AUX, etc. by the trained model. I assume the training (train.spacy: 10.3 MB) and validation (dev.spacy: 2.7 MB) datasets are large enough, so I suspect that the cause of the incorrect tagging lies in the configuration. The following is the configuration file, which has not been processed by spacy init fill-config.
And the following is one of the logs.
I tried training the model with different learning rates (0.001, 0.005, 0.0005), but none of them improved the results.
Could you tell me what I can change to improve the tagging?