synonym-extractor: design problem with overlapping synonym sets #139

Open
Sandy4321 opened this issue May 28, 2023 · 0 comments


As you mention at
https://github.com/vi3k6i5/synonym-extractor
under "Why":
Say you have a corpus where similar words appear frequently.

eg: Last weekend I was in NY.
I am traveling to new york next weekend.
If you train a word2vec model on this or do any sort of NLP it will treat NY and new york as 2 different words.

Instead if you create a synonym dictionary like:

eg: NY=>new york
new york=>new york
Then you can extract NY and new york as the same text.
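
To make the quoted idea concrete, here is a minimal sketch in plain Python. It does not use synonym-extractor's own API; the `SYNONYMS` dictionary and the `normalize` helper are hypothetical names chosen for illustration, and the regex-based replacement is only one assumed way to apply such a mapping.

```python
import re

# Hypothetical synonym dictionary: each variant maps to one canonical form.
SYNONYMS = {"ny": "new york", "new york": "new york"}

def normalize(text, synonyms=SYNONYMS):
    """Replace every known variant in text with its canonical form."""
    # Try longer variants first so multi-word variants win over shorter ones.
    variants = sorted(synonyms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    # Look up the matched text (lowercased) to find its canonical form.
    return pattern.sub(lambda m: synonyms[m.group(1).lower()], text)

print(normalize("Last weekend I was in NY."))
print(normalize("I am traveling to new york next weekend."))
```

This works as long as every variant has exactly one canonical form, which is the assumption questioned below.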


There is a design problem with overlapping synonym sets.
For example, suppose we have:
a0 synonyms -> a2, a6, a7
a6 synonyms -> a0, a7, b23

If a7 appears in the text, what should it be replaced with:
a0
or
a6?
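
The ambiguity in the example above can be detected mechanically. The sketch below is plain Python and not part of synonym-extractor; `SYNONYM_SETS`, `find_ambiguous`, and `merge_overlapping` are hypothetical names. The merge step is just one possible resolution: it assumes synonymy is transitive (if a7 is a synonym of both a0 and a6, all of them belong to one group), which may not hold for real vocabularies, and it arbitrarily picks the smallest term as the representative.

```python
from collections import defaultdict

# The synonym sets from the example above (canonical term -> its synonyms).
SYNONYM_SETS = {
    "a0": {"a2", "a6", "a7"},
    "a6": {"a0", "a7", "b23"},
}

def find_ambiguous(synonym_sets):
    """Return variants that are claimed by more than one canonical term."""
    owners = defaultdict(set)
    for canonical, variants in synonym_sets.items():
        for variant in variants:
            owners[variant].add(canonical)
    return {v: sorted(c) for v, c in owners.items() if len(c) > 1}

def merge_overlapping(synonym_sets):
    """Merge sets that share any term; map each term to one representative."""
    groups = []
    for canonical, variants in synonym_sets.items():
        current = {canonical} | set(variants)
        remaining = []
        for group in groups:
            if group & current:
                current |= group  # overlapping group: absorb it
            else:
                remaining.append(group)
        remaining.append(current)
        groups = remaining
    # Assumption: the smallest term in each group is its representative.
    return {term: min(group) for group in groups for term in group}

print(find_ambiguous(SYNONYM_SETS))
print(merge_overlapping(SYNONYM_SETS)["a7"])
```

With this approach a7 is reported as ambiguous, and after merging every term in the two sets maps to a single representative. Whether merging is acceptable, or whether the dictionary should instead reject overlapping sets, is a design decision for the maintainers.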
