Where are the docs, TopDocs? #91

Open
Alvant opened this issue Apr 24, 2023 · 1 comment
Labels
enhancement (New feature or request), wontfix (This will not be worked on)

Comments

Alvant commented Apr 24, 2023

What is the matter

It seems that TopDocumentsViewer assigns each document to exactly one topic, even if other topics are also well represented in that document. Those other topics will not list the document in the resulting view.

How to reproduce

  1. Make a small dataset of, let's say, three documents.
  2. Create a topic model of, let's say, 50 topics. Fit it on the dataset.
  3. Create a TopDocumentsViewer. Get a view of the model's topics' documents.

Result

Some topics end up with no documents at all, even though their probabilities in the Theta matrix are high for some documents.

[Screenshot: Screenshot_2023-04-24_20-13-02]

where the view_model function is:

[Image: the view_model function]
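To illustrate the effect without the library, here is a minimal sketch in plain Python. It is not TopicNet's code: `hard_assign` and the toy `theta` values are made up for illustration, and it assumes the viewer behaves like a hard (argmax) assignment of each document to a single topic.

```python
def hard_assign(theta):
    """Assign every document to its single most probable topic.

    theta[t][d] is p(topic t | document d); each column sums to 1.
    Returns a dict {topic index: [document indices]}.
    """
    num_topics = len(theta)
    num_docs = len(theta[0])
    view = {t: [] for t in range(num_topics)}
    for d in range(num_docs):
        # Only the argmax topic receives the document.
        best_topic = max(range(num_topics), key=lambda t: theta[t][d])
        view[best_topic].append(d)
    return view


# Hypothetical Theta: 4 topics x 3 documents.
theta = [
    [0.50, 0.10, 0.05],
    [0.45, 0.10, 0.05],  # topic 1 is strong in document 0 ...
    [0.03, 0.70, 0.10],
    [0.02, 0.10, 0.80],
]

view = hard_assign(theta)
print(view)  # ... but topic 1 gets no documents
```

With a hard assignment, the number of non-empty topics can never exceed the number of documents, so with three documents and 50 topics at least 47 topics would be empty.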

Expected result

There should be a way to control which documents count as "top documents". For example, if several topics have high probabilities in a particular document, it should be possible to put this document into the top-document lists of all those topics.
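One possible shape of such a control, as a rough sketch only: pick the documents for each topic by a probability threshold instead of a single-topic assignment. The function `top_documents` and its parameters `min_probability` and `max_docs` are hypothetical names, not part of the existing viewer.

```python
def top_documents(theta, min_probability=0.3, max_docs=None):
    """Select top documents per topic by thresholding Theta.

    theta[t][d] is p(topic t | document d).
    A document may appear under several topics.
    Returns a dict {topic index: [document indices]},
    sorted by decreasing probability.
    """
    view = {}
    for t, row in enumerate(theta):
        candidates = [(p, d) for d, p in enumerate(row) if p >= min_probability]
        # Highest probability first; ties broken by document index.
        candidates.sort(key=lambda pair: (-pair[0], pair[1]))
        docs = [d for _, d in candidates]
        view[t] = docs if max_docs is None else docs[:max_docs]
    return view


# Hypothetical Theta: 4 topics x 3 documents.
theta = [
    [0.50, 0.10, 0.05],
    [0.45, 0.10, 0.05],
    [0.03, 0.70, 0.10],
    [0.02, 0.10, 0.80],
]

view = top_documents(theta, min_probability=0.3)
print(view)  # document 0 is listed under both topic 0 and topic 1
```

The threshold is a design choice: a lower value fills more topics but makes the lists less specific, while `max_docs` keeps the lists short for large collections.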

Alvant commented Jul 14, 2024

Seems like predict_cluster_by_precomputed_distances is responsible for this behaviour.

Alvant added the enhancement and wontfix labels on Jul 14, 2024.