No terms on Top-30 Most Relevant Terms for Topic X #82

cpollack736 · 2018-03-27T03:02:23Z

When I run LDAvis as described in the tutorial, the y-axis of the Top-30 Most Salient Terms chart, which is supposed to list the terms, shows only a number followed by a comma for each entry (3, 1, etc.). What could cause this?

Thanks!

kshirley · 2018-03-27T03:07:38Z

You might simply have numbers (followed by commas) in your vocabulary. Did you use regular expressions, or something like that, to clean your data and tokenize your corpus? If numbers appear in your raw data, and you didn't remove them, then they will appear as words in the topic model. (For some analyses, this is quite interesting, as numbers often co-occur as a topic!) But if the top-30 terms for every topic are all numbers, then something is wacky.
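As a rough illustration of the cleaning step mentioned above, here is a minimal sketch of a tokenizer that keeps only alphabetic tokens, so strings like "3," never reach the vocabulary. It is written in Python purely for illustration (LDAvis itself is an R package, and the same idea carries over to a regular expression in R). The `tokenize` helper, the pattern, and the sample documents are all hypothetical, not part of LDAvis or its tutorial.

```python
import re

# Hypothetical pattern: runs of letters, optionally with one internal apostrophe
# (e.g. "don't"). Digits and punctuation are never matched, so "3," is dropped.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens only."""
    return TOKEN_RE.findall(text.lower())

# Made-up sample documents, for illustration only.
docs = ["The movie was great, 3 stars!", "Item 1, item 2, and a plot twist."]
tokens = [tokenize(d) for d in docs]

# The vocabulary is the sorted set of distinct tokens across all documents.
vocab = sorted({t for doc in tokens for t in doc})
print(vocab)
```

Whether to drop numbers entirely is a judgment call: as noted above, they can form an interesting topic of their own in some corpora.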

cpollack736 · 2018-03-27T03:39:14Z

Thank you for your fast reply! All of my top 30 terms for each topic are numbers followed by a comma. I'm now wondering whether it has something to do with the vocabulary, because when I ran the "Movie Review" data, it displayed normally (words on the axis instead of numbers). However, the numbers still don't make sense for my data, since I'm mainly looking at text with very few numbers. I'll keep investigating!
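One way to investigate the vocabulary is to measure how much of it looks numeric before it is handed to the visualization (in LDAvis, the `vocab` argument of `createJSON`). The sketch below is in Python for illustration only; the `numeric_share` helper and both sample vocabularies are hypothetical. A share close to 1 would suggest the vocabulary holds indices or counts rather than the words themselves.

```python
import re

# Matches entries made up only of digits and common punctuation, e.g. "3," or "12".
NUMERIC_RE = re.compile(r"[\d.,;:]+")

def numeric_share(vocab):
    """Return the fraction of vocabulary entries that look like numbers."""
    if not vocab:
        return 0.0
    hits = [term for term in vocab if NUMERIC_RE.fullmatch(term)]
    return len(hits) / len(vocab)

# Made-up vocabularies, for illustration only.
suspect_vocab = ["3,", "1,", "12,", "7,"]
healthy_vocab = ["film", "plot", "actor", "2018"]

print(numeric_share(suspect_vocab))
print(numeric_share(healthy_vocab))
```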
