Unnecessary datatype mismatch error | Pandas AI #1231

Open
usmanbinimran-octopus opened this issue Jun 13, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@usmanbinimran-octopus

System Info

OS version: Windows 11 Pro
Python version: 3.9
The current version of pandasai being used: 2.2

🐛 Describe the bug

Hi @gventuri, hope this message finds you well.

I have been running into an unexpected behaviour for a long time. Please look into it:

PROMPT
show last 10 records

RESPONSE
Unfortunately, I was not able to answer your question, because of the following error:
Value type <class 'list'> must match with type dataframe

EXPLANATION
I looked at the previous conversation and saw that you wanted to see the last 10 records. So, I wrote some code that takes the last 10 records from each dataset and puts them into a new dataframe. This way, you can easily see the most recent data without having to look through everything.

CLARIFICATION QUESTIONS
['Which dataframe should the last 10 records be shown from?', 'Are the records supposed to be sorted in any particular order before showing the last 10?', 'Is there a specific column that should be used for sorting the records before displaying the last 10?']

REVISED PROMPT:
Display the most recent 10 entries.

EXECUTED CODE:
dfs[0] = dfs[0].tail(10)
dfs[1] = dfs[1].tail(10)
result = {'type': 'dataframe', 'value': dfs}
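The executed code above declares the result type as 'dataframe' but passes `dfs`, which is a list of dataframes, as the value. That appears to be what the error message is pointing at. The following is a minimal sketch in plain pandas; `matches_declared_type` is a hypothetical stand-in for the output check, not pandasai's actual implementation, and the sample data is made up for illustration.

```python
import pandas as pd


def matches_declared_type(result: dict) -> bool:
    """Hypothetical check: a 'dataframe' result must carry a single DataFrame."""
    if result["type"] == "dataframe":
        return isinstance(result["value"], pd.DataFrame)
    return True


# Illustrative data only; the real datasets are not shown in the report.
dfs = [
    pd.DataFrame({"id": range(20), "amount": range(20)}),
    pd.DataFrame({"id": range(15), "amount": range(15)}),
]

# Take the last 10 rows of each dataframe, as the generated code did.
tails = [df.tail(10) for df in dfs]

# Shape produced by the generated code: the value is a list of dataframes.
as_list = {"type": "dataframe", "value": tails}

# One possible alternative: stack the tails into a single dataframe,
# using `keys` to record which source each row came from.
as_frame = {
    "type": "dataframe",
    "value": pd.concat(tails, keys=["dfs[0]", "dfs[1]"]),
}

list_ok = matches_declared_type(as_list)
frame_ok = matches_declared_type(as_frame)
```

Under this assumption, the list-valued result fails the check while the concatenated one passes, which suggests the mismatch comes from the container type rather than from the data itself.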

OPEN AI TOKEN CB:
Tokens Used: 3350
Prompt Tokens: 2994
Completion Tokens: 356
Total Cost (USD): $ 0.002031

ERROR ON COMMAND PROMPT:
Traceback (most recent call last):
File "W:\OctopusTasks\GenInsight\GenInsight-Phase2\GenInsight-Phase2\env-geninsight-phase2\lib\site-packages\pandasai\pipelines\chat\generate_chat_pipeline.py", line 308, in run
output = (
File "W:\OctopusTasks\GenInsight\GenInsight-Phase2\GenInsight-Phase2\env-geninsight-phase2\lib\site-packages\pandasai\pipelines\pipeline.py", line 137, in run
raise e
File "W:\OctopusTasks\GenInsight\GenInsight-Phase2\GenInsight-Phase2\env-geninsight-phase2\lib\site-packages\pandasai\pipelines\pipeline.py", line 101, in run
step_output = logic.execute(
File "W:\OctopusTasks\GenInsight\GenInsight-Phase2\GenInsight-Phase2\env-geninsight-phase2\lib\site-packages\pandasai\pipelines\chat\code_execution.py", line 113, in execute
raise e
File "W:\OctopusTasks\GenInsight\GenInsight-Phase2\GenInsight-Phase2\env-geninsight-phase2\lib\site-packages\pandasai\pipelines\chat\code_execution.py", line 97, in execute
raise InvalidOutputValueMismatch(
pandasai.exceptions.InvalidOutputValueMismatch: Value type <class 'list'> must match with type dataframe
Exception in APILogger:

<title>Error</title>
Cannot POST /api/log/add

@gventuri,
To the best of my knowledge, everything except the response is working fine. I think there is a minor case that still needs to be handled, because judging by the Explanation and the Executed Code, everything else looks correct.

Looking forward to hearing from you!

Kind regards,
Usman Bin Imran
+92-310-480-160-1


@dosubot dosubot bot added the bug Something isn't working label Jun 13, 2024

dosubot bot commented Jun 13, 2024

Hey @usmanbinimran-octopus! 👋 I'm here to help you with any bugs, questions, or contributions you have in mind. Let's tackle this together!

To resolve the datatype mismatch error when using Pandas AI to display the last 10 records from multiple dataframes, you need to ensure that the data types of the columns in the dataframes are consistent before concatenating or performing operations on them. Here is an example of how you can achieve this:

import pandas as pd
from pandasai import Agent  # pandasai 2.x entry point; the PandasAI class belongs to older releases

# Example dataframes
df1 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['a', 'b', 'c']
})

df2 = pd.DataFrame({
    'A': [4, 5, 6],
    'B': ['d', 'e', 'f']
})

# Ensure the data types are consistent
df1 = df1.astype({'A': 'int', 'B': 'str'})
df2 = df2.astype({'A': 'int', 'B': 'str'})

# Concatenate dataframes
df_combined = pd.concat([df1, df2])

# Initialize the agent (assumes an LLM is already configured, e.g. via an API key)
agent = Agent([df_combined])

# Display the last 10 records using the agent
print(agent.chat("Show the last 10 records"))

This code ensures that the columns 'A' and 'B' in both dataframes df1 and df2 have the same data types before concatenating them. Additionally, it uses the PandasAI library to handle the dataframes and display the last 10 records, which should help in resolving the datatype mismatch error [1][2].
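Before casting, it can help to check whether the column data types actually differ. The following is a minimal sketch in plain pandas; `dtype_mismatches` is a hypothetical helper introduced here for illustration, and the float column in `df2` is an assumed example of a mismatch. Note that the error in the report concerns the type of the returned value (a list), so aligning column dtypes may not resolve it on its own.

```python
import pandas as pd


def dtype_mismatches(left: pd.DataFrame, right: pd.DataFrame) -> dict:
    """Return {column: (left dtype, right dtype)} for shared columns whose dtypes differ."""
    shared = left.columns.intersection(right.columns)
    return {
        col: (left[col].dtype, right[col].dtype)
        for col in shared
        if left[col].dtype != right[col].dtype
    }


df1 = pd.DataFrame({"A": [1, 2, 3], "B": ["a", "b", "c"]})
# Assumed example: column A arrives as floats in the second dataframe.
df2 = pd.DataFrame({"A": [4.0, 5.0, 6.0], "B": ["d", "e", "f"]})

before = dtype_mismatches(df1, df2)

# Cast the mismatched column to the dtype used in df1, then re-check.
df2_aligned = df2.astype({"A": df1["A"].dtype})
after = dtype_mismatches(df1, df2_aligned)
```

Here `before` reports column A as mismatched and `after` is empty once the cast is applied.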


@usmanbinimran-octopus

So will it not work for multiple datasets that have different attributes?

Please confirm @gventuri

Best,
Usman Bin Imran
+92-310-480-160-1
