Pandasai ordering custom order not correct #1222

Open
ssling0817 opened this issue Jun 10, 2024 · 3 comments
Labels
bug Something isn't working


ssling0817 commented Jun 10, 2024

System Info

pandasai version 2.0.42

🐛 Describe the bug

The table I am using has a priority column with the values, from high to low: "P0 - Critical", "P1 - High", "P2 - Medium", "P3 - Low". I asked the agent to order rows from highest to lowest priority, but it always generates code with .sort_values('priority', ascending=False), when it should be .sort_values('priority'). Sometimes it even generates code with nlargest, which fails with TypeError: Column 'priority' has dtype object, cannot use method 'nlargest' with this dtype. I have tried stating this order in the input and in field_descriptions, but neither works. How can I solve this issue?
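The behaviour described above can be reproduced with plain pandas, without pandasai. This is a minimal sketch on made-up sample rows (the task names are assumptions, not the reporter's data). It shows that the labels sort lexicographically, so ascending order already runs from highest to lowest priority, and that nlargest rejects a text column.

```python
import pandas as pd

# Hypothetical sample rows; only the priority labels come from the report.
df = pd.DataFrame({
    "task": ["A", "B", "C", "D"],
    "priority": ["P2 - Medium", "P0 - Critical", "P3 - Low", "P1 - High"],
})

# The labels are plain strings, so sorting is lexicographic: "P0..." < "P1..." < ...
# Ascending order therefore already runs from highest to lowest priority.
ascending = df.sort_values("priority")["priority"].tolist()

# ascending=False reverses that and puts "P3 - Low" first.
descending = df.sort_values("priority", ascending=False)["priority"].tolist()

# nlargest only accepts numeric-like columns; a text column raises TypeError.
try:
    df.nlargest(2, "priority")
    nlargest_error = None
except TypeError as exc:
    nlargest_error = str(exc)

print(ascending)
print(descending)
print(nlargest_error)
```

The mismatch is between the word "highest" in the request and the fact that the highest priority carries the smallest label.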

@dosubot dosubot bot added the bug Something isn't working label Jun 10, 2024

dosubot bot commented Jun 10, 2024

To sort a custom priority column in a DataFrame with values "P0 - Critical", "P1 - High", "P2 - Medium", and "P3 - Low" from highest to lowest priority using pandasai version 2.0.42, you can use the following code:

import pandas as pd

# Sample DataFrame
data = {
    'Task': ['Task1', 'Task2', 'Task3', 'Task4'],
    'Priority': ['P2 - Medium', 'P0 - Critical', 'P3 - Low', 'P1 - High']
}
df = pd.DataFrame(data)

# Define the custom priority order
priority_order = ['P0 - Critical', 'P1 - High', 'P2 - Medium', 'P3 - Low']

# Convert the Priority column to a categorical type with the specified order
df['Priority'] = pd.Categorical(df['Priority'], categories=priority_order, ordered=True)

# Sort the DataFrame by the Priority column
df_sorted = df.sort_values('Priority')

print(df_sorted)

This code will sort the DataFrame df by the Priority column according to the custom order defined in priority_order. The sorted DataFrame df_sorted will have the tasks ordered from highest to lowest priority [1].
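Building on the ordered categorical above, a request such as "the N highest-priority tasks" could be sketched as a sort followed by head. The helper name top_priority and the extra sample row are assumptions made for illustration only.

```python
import pandas as pd

priority_order = ["P0 - Critical", "P1 - High", "P2 - Medium", "P3 - Low"]

# Sample data extended with one extra row to show tie handling.
df = pd.DataFrame({
    "Task": ["Task1", "Task2", "Task3", "Task4", "Task5"],
    "Priority": ["P2 - Medium", "P0 - Critical", "P3 - Low", "P1 - High", "P0 - Critical"],
})
df["Priority"] = pd.Categorical(df["Priority"], categories=priority_order, ordered=True)


def top_priority(frame: pd.DataFrame, n: int) -> pd.DataFrame:
    """Return the n rows whose priority comes earliest in the category order."""
    # kind="stable" keeps the original row order among equal priorities.
    return frame.sort_values("Priority", kind="stable").head(n)


top3 = top_priority(df, 3)
print(top3)
```

Sorting and taking head is used here instead of nlargest or nsmallest because those methods are meant for numeric columns.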


ssling0817 (Author) commented

@dosu I am trying to use the pandasai Agent, not interact with the df directly.

response = agent.chat("Return 5 tasks which have higher priority")


dosubot bot commented Jun 10, 2024

To ensure that the pandasai Agent respects your custom priority order when processing queries, you can extend the agent with a custom pipeline that includes your sorting logic. Here's how you can do it:

  1. Create a Custom Pipeline: Define a custom pipeline that includes your sorting logic.
  2. Integrate the Custom Pipeline with the Agent: Use this custom pipeline with the pandasai Agent.

Here's an example:

from pandasai.agent.base import BaseAgent
from pandasai.pipelines.chat.generate_chat_pipeline import GenerateChatPipeline
import pandas as pd

class CustomSortPipeline(GenerateChatPipeline):
    def __init__(self, context, logger, **callbacks):
        super().__init__(context, logger, **callbacks)

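    def custom_sort(self, df: pd.DataFrame, priority_order: list) -> pd.DataFrame:
        # Build an ordered categorical key; the caller's DataFrame is left unmodified
        sort_key = pd.Categorical(df['Priority'], categories=priority_order, ordered=True)
        # Sort on a temporary column, then drop it so the output keeps the original columns
        sorted_df = df.assign(_sort_key=sort_key).sort_values('_sort_key')
        return sorted_df.drop(columns=['_sort_key'])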

# Sample DataFrame
data = {
    'Task': ['Task1', 'Task2', 'Task3', 'Task4'],
    'Priority': ['P2 - Medium', 'P0 - Critical', 'P3 - Low', 'P1 - High']
}
df = pd.DataFrame(data)

# Define the custom priority order
priority_order = ['P0 - Critical', 'P1 - High', 'P2 - Medium', 'P3 - Low']

# Create an instance of the custom pipeline
custom_pipeline = CustomSortPipeline(context=None, logger=None)

# Create an instance of the agent and assign the custom pipeline
agent = BaseAgent(dfs=df)
agent.pipeline = custom_pipeline

# Use the custom sort method
sorted_df = custom_pipeline.custom_sort(df, priority_order)
print(sorted_df)

In this example, CustomSortPipeline subclasses GenerateChatPipeline and adds a custom_sort method that implements the custom sorting logic. The BaseAgent is then assigned this pipeline. Note that the snippet calls custom_sort directly on the DataFrame; nothing shown here makes agent.chat invoke it, so code generated by the agent is not guaranteed to follow the custom priority order [1][2].
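One approach that does not depend on pipeline internals is to give the DataFrame a numeric rank column before handing it to the agent. The sketch below is pandas-only and has not been tested against pandasai. The column name priority_rank is an assumption, and whether the agent then picks the right sort direction still depends on the model and on how the column is described to it.

```python
import pandas as pd

priority_order = ["P0 - Critical", "P1 - High", "P2 - Medium", "P3 - Low"]
# 0 is the highest priority, 3 the lowest.
rank_by_label = {label: rank for rank, label in enumerate(priority_order)}

df = pd.DataFrame({
    "Task": ["Task1", "Task2", "Task3", "Task4"],
    "Priority": ["P2 - Medium", "P0 - Critical", "P3 - Low", "P1 - High"],
})

# Hypothetical helper column; the original DataFrame is not modified.
prepared = df.assign(priority_rank=df["Priority"].map(rank_by_label))

# Labels missing from priority_order would map to NaN, so fail early.
if prepared["priority_rank"].isna().any():
    raise ValueError("Unknown priority label in the Priority column")

# With a numeric column, both common idioms are valid pandas:
top2_sorted = prepared.sort_values("priority_rank").head(2)
top2_nsmallest = prepared.nsmallest(2, "priority_rank")

print(top2_sorted)
print(top2_nsmallest)

# Assumption: `prepared` would be passed to the agent in place of `df`.
```

A numeric column at least makes nsmallest and nlargest run without the dtype error, though the direction ("smallest rank means highest priority") would still have to be stated in the prompt or field description.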

