table with negative data fails to save as image when using .fmt_number #391
Comments
Hey, thanks for raising--I'm having some trouble viewing the .txt file. Do you mind pasting in the python code directly? |
Hello Michael,
No problem for me to post the python code, I will do so in a few minutes.
Thank you for looking into this issue.
mike purtell
|
Here is the python code I wrote to demonstrate the reported bug:
import great_tables
df_pos = pl.DataFrame(
# make df_neg by multiplying all values of df_pos by -1
df_neg = df_pos.with_columns(pl.all()*pl.lit(-1)) |
Hello, I reformatted the code to make it easier to read on GitHub. Hope this helps! By the way, it seems that the from IPython.display import display line is missing.
import great_tables
import polars as pl
from great_tables import GT


def save_gt(df, filename):
    my_gt = (
        GT(df).tab_header(title=f"{filename}", subtitle=f"subtitle")
        # TO TEST THIS BUG, RUN THIS CODE WITH and WITHOUT .fmt_number
        # save table to image fails when .fmt_number with negative values is used
        .fmt_number(columns=df.columns, decimals=1, use_seps=True, sep_mark=",")
    )
    try:
        my_gt.save(filename, window_size=(6, 6))
        print(f"\n ########### SUCCESSFULLY WROTE {filename} ###########\n")
    except:
        print(f"\n ########### FAILED TO WRITE {filename} ###########\n")
    return


df_pos = pl.DataFrame(
    {
        "A": [x for x in list(range(3))],
        "B": [x * 0.5 for x in list(range(3))],
        "C": [x * 1.5 for x in list(range(3))],
    }
)
# make df_neg by multiplying all values of df_pos by -1
df_neg = df_pos.with_columns(pl.all() * pl.lit(-1))
display(df_neg, df_pos)
save_gt(df_neg, "df_neg.png")
save_gt(df_pos, "df_pos.png") |
Thank you for reformatting the python code. Not sure how I get away without using from IPython.display import display. It might be automatically imported by my anaconda environment, or I might be running the native python display command. Thank you for working on this issue, greatly appreciated, and if I can help in any way please don't hesitate to ask. |
Thank you for releasing 0.10. I ran the submitted test case and it worked, very happy about that. In my production code, I still cannot format tables with negative values. My error message indicates that I have an issue with the use of UTF-16 encoding for the minus sign, which is represented as 0x2212. In polars, I tried to cast to UTF-8, then back to Float64, and still have the issue. I also tried multiplying all values by -1 twice to see if this operation would return an acceptable minus sign, also to no avail. I will see if I can produce a usable workaround for now. |
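For readers following the encoding question: 0x2212 is the Unicode code point U+2212 (MINUS SIGN), and it can be encoded in both UTF-8 and UTF-16. A minimal sketch of which codecs accept it is below. It assumes the failing write used a Windows code page such as cp1252, which the 'charmap' wording of the error reported in this thread suggests but which is not confirmed here. The helper can_encode is hypothetical and only for illustration.

```python
# U+2212 MINUS SIGN is a Unicode code point, not something tied to UTF-16.
minus = "\u2212"

utf8_bytes = minus.encode("utf-8")       # three bytes in UTF-8
utf16_bytes = minus.encode("utf-16-be")  # two bytes in big-endian UTF-16


def can_encode(text, codec):
    """Hypothetical helper: True if `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 stands in for a typical Windows default code page (an assumption).
results = {c: can_encode(minus, c) for c in ("utf-8", "utf-16", "cp1252", "ascii")}
print(utf8_bytes, utf16_bytes, results)
```

If this assumption holds, casting the data in polars cannot help, because the minus sign is produced when the number is formatted as text, not when the data is loaded.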
great_tables 0.10.0 has issues with .fmt_number. Verified using python 3.11.9 and polars 1.1.0, with anaconda/spyder and with a python notebook in jupyter lab. A short python script (18 lines) is attached as a txt file. A workaround is to have polars do the rounding instead of great_tables' .fmt_number. This workaround only applies to rounding; it does not cover other features of .fmt_number such as thousands commas. |
Thanks for looking into this (and to @jrycw for the clean up!). I'm having some trouble reproducing :/ . Based on the examples, I ran the code below, but did not hit an error.
import polars as pl
from great_tables import GT
from IPython.display import display

df_pos = pl.DataFrame(
    {
        "A": [x for x in list(range(3))],
        "B": [x * 0.5 for x in list(range(3))],
        "C": [x * 1.5 for x in list(range(3))],
    }
)
# make df_neg by multiplying all values of df_pos by -1
df_neg = df_pos.with_columns(pl.all() * pl.lit(-1))
display(df_neg, df_pos)
(
    GT(df_neg)
    .tab_header(title="a", subtitle="b")
    .fmt_number(columns=df_neg.columns, decimals=1, use_seps=True, sep_mark=",")
    .save("test.png", window_size=(6, 6))
)
Do you mind pasting in the traceback for the error (or the error name)? I'm a bit stumped on what might cause saving a table to fail when formatting negative numbers... 😵 |
Hi Michael,
Please try running this code with .fmt_number commented out (this works for me; the table is saved to Random.png with many digits). Then run it again after uncommenting .fmt_number. That is where I get this error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u2212' in position 7431: character maps to <undefined>
In my work usage, all of my data is read from csv files, so I thought adding Utf8 decoding to polars scan_csv would do the trick. But this test case, which generates the data organically, shows that csv encoding is not the issue.
import random, polars as pl
from great_tables import GT
random.seed(42)
col_1 = [random.uniform(-1.0, 1.0) for a in list(range(7))]
col_2 = [random.uniform(-1.0, 1.0) for a in list(range(7))]
df = pl.DataFrame({'COL_1': col_1,'COL_2': col_2})
print(df.head(7))
my_gt = (
    GT(df)
    .tab_header(title = 'Positive, Negative Cosine')
    # Test with .fmt_number invoked, and with .fmt_number commented out
    # .fmt_number(columns=['COL_1', 'COL_2'], decimals=3)
)
# .save fails when great_table .fmt_number was used
my_gt.save('Random.png', window_size=(6, 6))
In the case of .fmt_number, I work around it by using polars to do the rounding, but I would like to use .fmt_number for thousands commas and other reasons.
Thank you for working on this, I really enjoy great_tables.
Mike Purtell
|
Here is just the code from previous post |
I'm running on Windows 11 as well and cannot reproduce the error with or without .fmt_number.
import random
import polars as pl
from great_tables import GT

random.seed(42)
col_1 = [random.uniform(-1.0, 1.0) for a in list(range(7))]
col_2 = [random.uniform(-1.0, 1.0) for a in list(range(7))]
df = pl.DataFrame({"COL_1": col_1, "COL_2": col_2})
print(df.head(7))
my_gt = (
    GT(df).tab_header(title="Positive, Negative Cosine")
    # Test with .fmt_number invoked, and with .fmt_number commented out
    # .fmt_number(columns=['COL_1', 'COL_2'], decimals=3)
)
# .save fails when great_table .fmt_number was used
my_gt.save("Random.png", window_size=(6, 6)) |
I ran this code on my personal machine and my work PC, both running Win11 with Anaconda/Spyder and great_tables 0.10.0. I get the same error in both cases when I include .fmt_number. The error message indicates that it is unable to encode \u2212, which is UTF-16. Can the lines that deal with negative values be enhanced to support UTF-16, or to cast the negative sign to an equivalent UTF-8 code? Here is the error message: |
Another possible fix would be to set the encoding to UTF-8 while writing in |
Ah, thanks for surfacing! That bit of code definitely looks like the issue, and setting the encoding seems like it should resolve it 😓 |
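A rough sketch of the kind of fix being discussed, not the actual great_tables code: when text containing U+2212 is written to a file opened without an explicit encoding, Python uses the platform default, which on Windows is often a code page such as cp1252. The snippet below passes encoding="cp1252" explicitly to simulate that default on any platform (an assumption about the reporter's setup), then shows the same write succeeding with encoding="utf-8". The HTML string and file name are placeholders.

```python
import os
import tempfile

# Placeholder for rendered table HTML that contains U+2212 MINUS SIGN.
html = "<td>\u22120.5</td>"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "table.html")

    # Simulate a Windows default code page (assumed cp1252) explicitly.
    try:
        with open(path, "w", encoding="cp1252") as f:
            f.write(html)
        legacy_ok = True
    except UnicodeEncodeError:
        legacy_ok = False

    # Same write with an explicit UTF-8 encoding, then read it back.
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print(legacy_ok, round_trip == html)
```

This would also be consistent with the error not reproducing on machines whose default encoding is already UTF-8.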
Description
Saving image of a table to png file fails when the table has negative values, and .fmt_number is used.
Reproducible example - Verified on complex use cases, and the simple example posted here. Notice that the file extension is .txt, please change to .py or paste into a notebook to run this code.
gt_bug_2024_07_03_MP.txt
Development environment
Win11, great_tables 0.9.0, python 3.11.5 with Anaconda/Jupyter Lab, polars 0.20.31
Expected result
Expect that a table with negative data can use .fmt_number to clean the table and can then be saved as an image file. This failed.