Provide replacement info to replace_keyword API #107

Open
gabyx opened this issue Apr 23, 2020 · 1 comment

gabyx commented Apr 23, 2020

It's desirable to know what was replaced and where, and also whether any replacement happened at all.

thakur-nandan (Collaborator) commented Apr 28, 2020

Hello @gabyx,

I can't think of a direct solution to your problem, but here is a workaround that should address it:

  1. While adding your keywords, build a dictionary that maps each replacement (the clean name) to the keyword it replaces.
  2. Instead of calling replace_keywords directly, first call extract_keywords to check whether any of the keywords you want to replace occur in the sentence. This tells you whether any replacement would happen at all.
  3. Pass span_info=True to extract_keywords to also get the start and end position of each match.

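>>> from flashtext import KeywordProcessor
>>> keyword_processor = KeywordProcessor()
>>> # Maps each replacement (clean name) back to the keyword it replaces.
>>> keyword_dict = {'New York' : 'Big Apple', 'Trump' : 'Bay Area'}
>>> # add_keyword(keyword, clean_name): 'Big Apple' is matched, 'New York' is returned.
>>> keyword_processor.add_keyword('Big Apple', 'New York')
>>> keyword_processor.add_keyword('Bay Area', 'Trump')
>>> # span_info=True yields (clean_name, start, end) tuples; an empty list means no match.
>>> keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.', span_info=True)
>>> [(keyword_dict.get(name), start, end) for name, start, end in keywords_found]
[('Big Apple', 7, 16), ('Bay Area', 21, 29)]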
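
If you also need the replaced text itself, the spans can be used to rebuild the sentence by hand. Below is a minimal sketch in plain Python. The helper name replace_with_info is hypothetical and not part of flashtext, and it assumes the matches are non-overlapping (clean_name, start, end) tuples sorted by start offset, which is the shape of the span_info=True output in the example above.

```python
def replace_with_info(text, matches):
    """Rebuild `text` from (replacement, start, end) matches and report each change.

    Hypothetical helper, not part of flashtext. `matches` is assumed to be
    non-overlapping and sorted by start offset.
    """
    pieces = []   # alternating untouched text and replacements
    report = []   # (original text, replacement, start, end) per match
    cursor = 0
    for replacement, start, end in matches:
        pieces.append(text[cursor:start])   # text between the previous match and this one
        pieces.append(replacement)
        report.append((text[start:end], replacement, start, end))
        cursor = end
    pieces.append(text[cursor:])            # trailing text after the last match
    return "".join(pieces), report


text = 'I love big Apple and Bay Area.'
# Spans from the example above, paired with the clean names passed to add_keyword.
matches = [('New York', 7, 16), ('Trump', 21, 29)]
new_text, report = replace_with_info(text, matches)
print(new_text)
print(report)
```

An empty report means nothing was replaced. The offsets refer to positions in the original sentence, not in the rewritten one.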

Kind Regards,
Nandan Thakur
