On Windows, prompts cannot be in Chinese due to pathlib read_text #380
It's ok
@microsoft-github-policy-service agree
Thanks for the contribution @liseri! Should
This is a common bug on Chinese-locale Windows systems. It has several causes, which can be summarized as follows: GBK is widely used there, and the underlying Win32 API defaults to the encoding set by a system variable. That variable can be changed to another encoding, but doing so may crash some programs, because not every program goes through this Win32 API at runtime. The issue is usually resolved either by upstream frameworks specifying an encoding parameter or by reading the file as binary. In this case, binary reading should be used instead of text-stream reading.
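To illustrate the binary-reading suggestion, here is a minimal sketch, not the code from this PR. The helper name `read_prompt_bytes` and the file name `prompt.txt` are hypothetical; the only assumption carried over from the thread is that the prompts file is stored as UTF-8.

```python
import tempfile
from pathlib import Path


def read_prompt_bytes(path: Path) -> str:
    """Hypothetical helper: read raw bytes, then decode explicitly as UTF-8.

    Because the bytes are decoded by hand, the operating system's default
    text encoding (for example GBK) never takes part in the read.
    """
    return path.read_bytes().decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    # "prompt.txt" is an illustrative file name, not a real graphrag path.
    prompt_file = Path(tmp) / "prompt.txt"
    prompt_file.write_bytes("你好, world".encode("utf-8"))

    text = read_prompt_bytes(prompt_file)
```

Either approach ends up decoding the same bytes as UTF-8; binary reading just makes the decoding step explicit at the call site.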
Setting the encoding parameter to utf-8 is sufficient. The prompts file is generated by graphrag itself in UTF-8, so making read_text read it as UTF-8 guarantees correctness. If the encoding parameter is not set, read_text may fall back to the operating system's default character encoding (such as GBK on Windows systems).
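A small sketch of the difference, under stated assumptions: the prompt text and the file name `entity_extraction.txt` are made up for illustration, and the GBK default is simulated by passing `encoding="gbk"` explicitly so the example behaves the same on any platform.

```python
import tempfile
from pathlib import Path

# Illustrative prompt text; not taken from graphrag.
PROMPT = "提取实体: {input_text}"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name standing in for a generated prompts file.
    prompt_file = Path(tmp) / "entity_extraction.txt"
    prompt_file.write_text(PROMPT, encoding="utf-8")

    # Explicit encoding: the result does not depend on the OS default.
    loaded = prompt_file.read_text(encoding="utf-8")

    # Simulate a system whose default encoding is GBK. Decoding UTF-8
    # bytes as GBK either raises or silently yields garbled text.
    try:
        misread = prompt_file.read_text(encoding="gbk")
    except UnicodeDecodeError:
        misread = None
```

With `encoding="utf-8"` the round trip returns the original prompt, while the simulated GBK read never does.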
@liseri what are your thoughts on @glide-the's comment? We could replace this with binary reading.
I think that solution is better. I just chose the simplest approach, as I didn't want to implement something too complicated. As long as the issue gets resolved, I'm open to any solution. I'll go ahead and close this pull request later.
Description
On Windows, prompts cannot contain Chinese, because pathlib read_text defaults to GBK when reading the prompts file.
Proposed Changes
Add the encoding="utf-8" parameter to the pathlib read_text calls.
Checklist