Problem Description
My test fails because the cast data and the expected data have slightly different dtypes, such as int32 vs. int64.
I don't want to use assert_frame_equal(df1, df2, check_dtype=False) because it skips the dtype check entirely, which is too permissive.
import pandas as pd

a = pd.DataFrame({'Int': [1, 2, 3], 'Float': [0.57, 0.179, 0.213]})  # Automatic type casting

# Force 32-bit
b = a.copy()
b['Int'] = b['Int'].astype('int32')
b['Float'] = b['Float'].astype('float32')

# Force 64-bit
c = a.copy()
c['Int'] = c['Int'].astype('int64')
c['Float'] = c['Float'].astype('float64')

try:
    pd.testing.assert_frame_equal(b, c)
    print('Success')
except AssertionError as err:
    print(err)
gives
Attributes of DataFrame.iloc[:, 0] (column name="Int") are different
Attribute "dtype" are different
[left]: int32
[right]: int64
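For contrast, here is a minimal sketch of why check_dtype=False is too loose for this case. It accepts the int32/int64 pair, but it also accepts an integer column compared against a float column. The frames and the frames_match helper are made up for illustration and are not part of pandas.

```python
import pandas as pd

ints32 = pd.DataFrame({'Int': [1, 2, 3]}).astype('int32')
ints64 = pd.DataFrame({'Int': [1, 2, 3]}).astype('int64')
floats = pd.DataFrame({'Int': [1.0, 2.0, 3.0]})


def frames_match(left, right, **kwargs) -> bool:
    """Illustrative helper: True if assert_frame_equal passes, False if it raises."""
    try:
        pd.testing.assert_frame_equal(left, right, **kwargs)
    except AssertionError:
        return False
    return True


# Strict check: rejects int32 vs. int64 (the problem described above).
strict_same_kind = frames_match(ints32, ints64)
# Loose check: accepts int32 vs. int64 ...
loose_same_kind = frames_match(ints32, ints64, check_dtype=False)
# ... but also accepts int32 vs. float64, which a test may want to catch.
loose_cross_kind = frames_match(ints32, floats, check_dtype=False)

print(strict_same_kind, loose_same_kind, loose_cross_kind)
```

Neither setting distinguishes "same kind, different width" from "different kind", which is the gap this request is about.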
Feature Description
Something like assert_frame_equal(df1, df2, check_dtype='equiv') would be handy, but it is not supported today because the function relies on the strict check in assert_attr_equal under the hood.
Supporting it would mean either adding a soft attribute check to assert_attr_equal or calling a new function when check_dtype is set to 'equiv'.
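As a rough sketch of what such a soft check could look like: this is not pandas API, dtypes_equivalent is a hypothetical helper name, and defining "equivalent" as "both integer or both float" is an assumption borrowed from the workaround in this issue.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype


def dtypes_equivalent(left_dtype, right_dtype) -> bool:
    """Hypothetical soft dtype check: equal, or both integer, or both float."""
    if left_dtype == right_dtype:
        return True
    if is_integer_dtype(left_dtype) and is_integer_dtype(right_dtype):
        return True
    return is_float_dtype(left_dtype) and is_float_dtype(right_dtype)


int32 = pd.Series([1, 2], dtype='int32').dtype
int64 = pd.Series([1, 2], dtype='int64').dtype
float32 = pd.Series([1.0], dtype='float32').dtype

print(dtypes_equivalent(int32, int64))    # same kind, different width
print(dtypes_equivalent(int64, float32))  # different kind
```

Note that is_integer_dtype also matches unsigned integers, so this sketch would treat int32 and uint8 as equivalent. Whether an 'equiv' mode should do so is an open design question.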
Alternative Solutions
I added a workaround function to my unit tests, which casts the data type of one DataFrame to the other when the types are similar (int, float).
def assert_frame_equiv(left: pd.DataFrame, right: pd.DataFrame) -> None:
    """Convert equivalent data types to same before comparing.

    Parameters
    ----------
    left : DataFrame
        First DataFrame to compare.
    right : DataFrame
        Second DataFrame to compare.

    Raises
    ------
    AssertionError
        If the DataFrames are different.
    """
    # First, check that the columns are the same.
    pd.testing.assert_index_equal(left.columns, right.columns, check_order=False)

    # Knowing columns names are the same, cast the same data type if equivalent.
    for col_name in left.columns:
        lcol = left[col_name]
        rcol = right[col_name]
        if (
            (pd.api.types.is_integer_dtype(lcol) and pd.api.types.is_integer_dtype(rcol))
            or (pd.api.types.is_float_dtype(lcol) and pd.api.types.is_float_dtype(rcol))
        ):
            left[col_name] = lcol.astype(rcol.dtype)

    return pd.testing.assert_frame_equal(left, right, check_like=True)
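A usage sketch of the workaround, with one assumed change that is not in the function above: the variant below casts a copy of left, so the caller's DataFrame keeps its original dtypes. The name assert_frame_equiv_copy is hypothetical, and the sample values are chosen to be exactly representable in float32.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype


def assert_frame_equiv_copy(left: pd.DataFrame, right: pd.DataFrame) -> None:
    """Hypothetical non-mutating variant of the workaround above."""
    pd.testing.assert_index_equal(left.columns, right.columns, check_order=False)
    left = left.copy()  # assumption: leave the caller's frame untouched
    for col_name in left.columns:
        lcol, rcol = left[col_name], right[col_name]
        if (is_integer_dtype(lcol) and is_integer_dtype(rcol)) or (
            is_float_dtype(lcol) and is_float_dtype(rcol)
        ):
            left[col_name] = lcol.astype(rcol.dtype)
    pd.testing.assert_frame_equal(left, right, check_like=True)


b = pd.DataFrame({'Int': [1, 2, 3], 'Float': [0.5, 0.25, 0.125]}).astype(
    {'Int': 'int32', 'Float': 'float32'}
)
c = b.astype({'Int': 'int64', 'Float': 'float64'})

# Same values, same kind of dtype: no exception is raised.
assert_frame_equiv_copy(b, c)

# Integer vs. float is still rejected, unlike with check_dtype=False.
d = c.astype({'Int': 'float64'})
try:
    assert_frame_equiv_copy(b, d)
    cross_kind_passed = True
except AssertionError:
    cross_kind_passed = False

print(cross_kind_passed, b['Int'].dtype)
```

Working on a copy costs one extra allocation per call, which is usually acceptable in unit tests and avoids surprising later assertions that reuse the same fixture.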
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Additional Context
Adapted from my answer on SO.
Thanks for making pandas!