
Support Message API for chatbot and chatinterface #8422

Merged: 42 commits into main from openai-message-format on Jul 10, 2024

Conversation

@freddyaboulton (Collaborator) commented May 30, 2024

Description

Main Changes:

  • Adds a msg_format parameter to Chatbot and ChatInterface so that messages can be returned as either the current list of tuples or a dictionary that is a superset of the messages/"OAI" format, e.g. {"role": "user", "content": ""}.
  • Adds a ChatMessage dataclass that can be used instead of a dictionary when `msg_format="messages"`. This is nice because the IDE will autocomplete.
[screenshot: IDE autocompletion for ChatMessage]
  • If msg_format is "messages", then in ChatInterface developers can just yield the next token; they don't have to yield the entire message up to and including that token. I think this makes demos easier to write, and it lets developers simply `yield from` their iterator.
  • Added the ability to parametrize e2e tests. If you create a python file that ends with _testcase.py in a demo directory corresponding to an e2e test, that demo will also be loaded to the e2e app. And you can use go_to_testcase to navigate to that testcase.

Messages format overview

The message format is a dict with two required keys, role and content. There is an optional metadata key that can be used for tools and additional information about the message. Most messages returned from an "OpenAI-compatible" client will be compatible with Gradio.
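For example, a short exchange in this format (using the optional metadata title, which the tool use section below relies on) might look like:

history = [
    {"role": "user", "content": "What is the weather in San Francisco right now?"},
    {
        "role": "assistant",
        "content": "Weather 72 degrees Fahrenheit with 20% chance of rain.",
        "metadata": {"title": "🛠️ Used tool 'Weather'"},
    },
]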

The implementation of the message is below:

from typing import Optional

from pydantic import Field

from gradio.data_classes import FileData, GradioModel


class Metadata(GradioModel):
    title: Optional[str] = None


class Message(GradioModel):
    role: str
    metadata: Metadata = Field(default_factory=Metadata)
    content: str | FileData

Examples

Realistic Inference API Streaming

from huggingface_hub import InferenceClient
import gradio as gr

"""
For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
"""
client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")


def respond(
    prompt: str,
    history,
):
    if not history:
        history = [{"role": "system", "content": "You are a friendly chatbot"}]
    history.append({"role": "user", "content": prompt})

    yield history

    response = {"role": "assistant", "content": ""}
    for message in client.chat_completion(
        history,
        temperature=0.95,
        top_p=0.9,
        max_tokens=512,
        stream=True,
    ):
        response["content"] += message.choices[0].delta.content or ""

        yield history + [response]


with gr.Blocks() as demo:
    gr.Markdown("# Chat with Hugging Face Zephyr 7b 🤗")
    chatbot = gr.Chatbot(
        label="Agent",
        msg_format="messages",
        avatar_images=(
            None,
            "https://em-content.zobj.net/source/twitter/376/hugging-face_1f917.png",
        ),
    )
    prompt = gr.Textbox(lines=1, label="Chat Message")
    prompt.submit(respond, [prompt, chatbot], [chatbot])


if __name__ == "__main__":
    demo.launch()

ChatInterface Streaming

import time
import gradio as gr

def slow_echo(message, history):
    yield f"You typed: "
    for i in range(len(message)):
        time.sleep(0.05)
        yield message[i]


demo = gr.ChatInterface(slow_echo, msg_format="messages").queue()

if __name__ == "__main__":
    demo.launch()

ChatInterface Multimodal

import gradio as gr


def echo(message, history):
    return message["text"]


demo = gr.ChatInterface(
    fn=echo,
    examples=[{"text": "hello"}, {"text": "hola"}, {"text": "merhaba"}],
    title="Echo Bot",
    multimodal=True,
    msg_format="messages",
)
demo.launch()

Chatbot

import gradio as gr
import random
import time

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(msg_format="messages")
    msg = gr.Textbox()
    clear = gr.ClearButton([msg, chatbot])

    def respond(message, chat_history: list):
        bot_message = random.choice(["How are you?", "I love you", "I'm very hungry"])
        chat_history.extend([{"role": "user", "content": message}, {"role": "assistant", "content": bot_message}])
        time.sleep(2)
        return "", chat_history

    msg.submit(respond, [msg, chatbot], [msg, chatbot])

if __name__ == "__main__":
    demo.launch()

Multimodal Chatbot

import gradio as gr
import time

def add_message(history, message):
    for x in message["files"]:
        history.append({"role": "user", "content": {"path": x}})
    if message["text"] is not None:
        history.append({"role": "user", "content": message["text"]})
    return history, gr.MultimodalTextbox(value=None, interactive=False)

def bot(history: list):
    response = "**That's cool!**"
    history.append({"role": "assistant", "content": ""})
    for character in response:
        history[-1]['content'] += character
        time.sleep(0.05)
        yield history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(
        [],
        elem_id="chatbot",
        bubble_full_width=False,
        msg_format="messages"
    )

    chat_input = gr.MultimodalTextbox(interactive=True, file_types=["image"], placeholder="Enter message or upload file...", show_label=False)

    chat_msg = chat_input.submit(add_message, [chatbot, chat_input], [chatbot, chat_input])
    bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response")
    bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input])

demo.queue()
if __name__ == "__main__":
    demo.launch()

Closes: #7118

@gradio-pr-bot (Contributor) commented May 30, 2024

🪼 branch checks and previews

Name      | Status | URL
Spaces    | ready! | Spaces preview
Website   | ready! | Website preview
Storybook | ready! | Storybook preview

🦄 Changes detected! Details

Install Gradio from this PR

pip install https://gradio-builds.s3.amazonaws.com/96d6e61c927fcf15374934cfde976c0a25000db3/gradio-4.37.2-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@96d6e61c927fcf15374934cfde976c0a25000db3#subdirectory=client/python"

Install Gradio JS Client from this PR

npm install https://gradio-builds.s3.amazonaws.com/96d6e61c927fcf15374934cfde976c0a25000db3/gradio-client-1.2.1.tgz

@gradio-pr-bot (Contributor) commented May 30, 2024

🦄 change detected

This Pull Request includes changes to the following packages.

Package         | Version
@gradio/chatbot | minor
@gradio/tootils | minor
gradio          | minor
website         | minor

With the following changelog entry.

Support message format in chatbot 💬

gr.Chatbot and gr.ChatInterface now support the Messages API, which is fully compatible with LLM API providers such as Hugging Face Text Generation Inference, OpenAI's chat completions API, and Llama.cpp server.

Building Gradio applications around these LLM solutions is now even easier!

gr.Chatbot and gr.ChatInterface now have a msg_format parameter that can accept two values: 'tuples' and 'messages'. If set to 'tuples', the default chatbot data format is expected. If set to 'messages', a list of dictionaries with content and role keys is expected. See below:

def chat_greeter(msg, history):
    history.append({"role": "assistant", "content": "Hello!"})
    return history

Additionally, gradio now exposes a gr.ChatMessage dataclass you can use for IDE type hints and auto completion.
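For instance, the chat_greeter function above can be rewritten with the dataclass instead of a raw dictionary:

from gradio import ChatMessage

def chat_greeter(msg, history):
    history.append(ChatMessage(role="assistant", content="Hello!"))
    return history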


Tool use in Chatbot 🛠️

The Gradio Chatbot can now natively display tool usage and intermediate thoughts common in Agent and chain-of-thought workflows!

If you are using the new "messages" format, simply add a metadata key with a dictionary containing a title key and value. This will display the assistant message in an expandable message box to show the result of a tool or intermediate step.

import gradio as gr
from gradio import ChatMessage
import time

def generate_response(history):
    history.append(ChatMessage(role="user", content="What is the weather in San Francisco right now?"))
    yield history
    time.sleep(0.25)

    history.append(ChatMessage(role="assistant", content="In order to find the current weather in San Francisco, I will need to use my weather tool."))
    yield history
    time.sleep(0.25)

    history.append(ChatMessage(role="assistant", content="API Error when connecting to weather service.", metadata={"title": "💥 Error using tool 'Weather'"}))
    yield history
    time.sleep(0.25)

    history.append(ChatMessage(role="assistant", content="I will try again"))
    yield history
    time.sleep(0.25)

    history.append(ChatMessage(role="assistant", content="Weather 72 degrees Fahrenheit with 20% chance of rain.", metadata={"title": "🛠️ Used tool 'Weather'"}))
    yield history
    time.sleep(0.25)

    history.append(ChatMessage(role="assistant", content="Now that the API succeeded I can complete my task."))
    yield history
    time.sleep(0.25)

    history.append(ChatMessage(role="assistant", content="It's a sunny day in San Francisco with a current temperature of 72 degrees Fahrenheit and a 20% chance of rain. Enjoy the weather!"))
    yield history


with gr.Blocks() as demo:
    chatbot = gr.Chatbot(msg_format="messages")
    button = gr.Button("Get San Francisco Weather")
    button.click(generate_response, chatbot, chatbot)

if __name__ == "__main__":
    demo.launch()

[animation: tool-box-demo]

⚠️ The changeset file for this pull request has been modified manually, so the changeset generation bot has been disabled. To go back into automatic mode, delete the changeset file.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@abidlabs (Member) commented Jun 3, 2024

@freddyaboulton this looks great! Just one quibble: I'd suggest not using "openai" as the name of the format. It might be that they modify their message format in a way that we don't want to track. Also, Anthropic and others use the same format. What about "tuples" | "dicts"?

So thinking about how we could add support for components into this: could we modify the content key to accept Component as well?

class Message(GradioModel):
    role: str
    metadata: Metadata = Field(default_factory=Metadata)
    content: str | FileData | Component

wdyt @dawoodkhan82 @freddyaboulton

@freddyaboulton (Collaborator, Author) commented Jun 3, 2024

Yes, makes sense regarding renaming. What about "messages"? That's the name used by TGI/transformers and, from the looks of it, actually the industry-standard name (transformers docs, Anthropic docs).

@abidlabs (Member) commented Jun 3, 2024

"messages" sounds good, compatibility with transformers/tgi makes more sense 👍

@dawoodkhan82 (Collaborator)

This format looks good.

> So thinking how we could add support for components into this: we could modify the content key to accept Component as well?

Regarding this, it would have to be another dict, ComponentMessage or ComponentContent, which stores the component name, the processed value, and the constructor_args.

class ComponentMessage(GradioModel):
    component: str
    value: Any
    constructor_args: List[Dict[str, Any]]

@abidlabs (Member) commented Jun 3, 2024

imo it would be much nicer DX if the ComponentMessage class was only used internally and the developer could just pass in a Component object, e.g. gr.Gallery([..., ...])

@freddyaboulton (Collaborator, Author)

Yes that is what I had in mind, we can handle the conversion from component instance to internal payload format in pre/postprocess

Member:

This demo isn't working for me. Getting a couple of errors saying "Data incompatible with openai format"

Collaborator (Author):

Fixed - issue was a typo in the msg_format parameter.

Member:

Very cool! Just some comments on the UI:

[screenshot of the tool use UI]
  1. It's not clear that you can click on the error message or tool use message and expand it to get more details. I would suggest replacing it with an accordion-like element where the toggle icon is clear.
  2. As seen in the screenshot, the color of the tool use message is the "primary" color even though the rest of the bot message is not that color. In fact, it's the user message that is the primary color, which is confusing.

Member:

Btw it'd be awesome if we could include a real example of tool use with transformers agents or one of the other LLM providers in our docs!

Collaborator (Author):

Will make the font color more consistent! I like the minimal tool box though, and it matches the Hugging Face styling. Does anyone else have thoughts?

Agreed about the demo, but to make it interesting we'd need to add API tokens etc., so I don't want to include it in tests or the repo. Will prepare something for the launch though!

)
return ChatbotData(root=processed_messages)
return ChatbotDataTuples(root=[])
if self.msg_format == "tuples":
Member:

Note: for all of our other components that can accept different types (e.g. Image with multiple types such as PIL images, numpy arrays, string filepaths), we don't enforce the type in postprocess() -- we just automatically figure it out based on value. Potentially, we could do the same thing here
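A rough sketch of what that value-based inference could look like (a hypothetical helper, not something this PR implements):

def infer_msg_format(value: list) -> str:
    # Hypothetical: treat entries that carry a "role" key as the messages
    # format; everything else (e.g. tuples) falls back to the tuples format.
    if value and isinstance(value[0], dict) and "role" in value[0]:
        return "messages"
    return "tuples"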

Collaborator (Author):

We use msg_format to set the data model and API info, so I want to make sure it matches in postprocess. Otherwise, it would be possible for the API info to tell you to expect dictionaries but for you to get tuples.

@abidlabs (Member) commented Jul 3, 2024

Nits:

  1. Not sure if related to this PR, but noticed a UI issue when multiple user messages are one after another:
import gradio as gr

with gr.Blocks() as demo:
    gr.Chatbot([
        gr.ChatMessage(role="user", content="Hello!"),
        gr.ChatMessage(role="user", content="Hello!")        
    ], msg_format="messages")
    
demo.launch()
  2. Any non-"user" role is treated like an "assistant", just fyi:
import gradio as gr

with gr.Blocks() as demo:
    gr.Chatbot([
        gr.ChatMessage(role="abc", content="Hello!"),
    ], msg_format="messages")
    
demo.launch()
  3. Is it expected that this doesn't work? No bot message is printed for me:
import gradio as gr

demo = gr.ChatInterface(lambda x,y:x, msg_format="messages")
    
demo.launch()

@freddyaboulton very nice PR! Made a first pass and left some comments above. Down to do another, deeper review once these comments are addressed.

@freddyaboulton (Collaborator, Author)

Thanks for the review @abidlabs !! I think I got all of the comments (and added some more unit tests because of them :) )

@abidlabs (Member) left a comment:

LGTM @freddyaboulton this looks great! Tested gr.Chatbot and gr.ChatInterface with the new format and everything is working as expected.

I just have three small points of feedback, which I'll leave below

@abidlabs (Member) commented Jul 8, 2024

(1) The first concerns this:

> If msg_format is "messages", then in ChatInterface developers can just yield the next token. They don't have to yield the entire message up to and including that token. I think this makes demos easier to write. And lets developers simply yield from their iterator.

Although I agree that this makes demos easier to write, this introduces a different behavior for iterators in the very special case where you are iterating from a ChatInterface with msg_format="messages". This is likely to confuse users who are used to sending the complete message with yield in all other cases. It also could lead to bugs. For example, if you run

python demo/chatinterface_streaming_echo/messages_testcase.py

and then use it via the client, e.g.

from gradio_client import Client

client = Client("http://127.0.0.1:7864/")
result = client.predict(
    message="Hello!!",
    api_name="/chat"
)
print(result)

You only get "!" (the final token). But if you run the regular version of this demo (with msg_format="tuples"), you get the entire final string: "You typed: Hello!!". This introduces a discrepancy between what a user would observe if they used the Gradio UI and what you get when you make a prediction with the client.
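For comparison, a minimal sketch of the conventional pattern, where each yield carries the complete message so far so the UI and the client agree:

import time

def slow_echo(message, history):
    response = ""
    for token in f"You typed: {message}":
        time.sleep(0.05)
        response += token
        # Yield the full message up to this token, not just the token itself
        yield response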

@abidlabs (Member) commented Jul 8, 2024

(2) Just to reiterate the earlier point about the design of tools, I think we can improve the UI quite a bit

[screenshot of the current tool box UI]
  • We should make the color of the tool "box" match the color of the messages
  • We should provide some visual indication that the tool "box" can be clicked to expand the message

On the second point, I think an accordion would be the best UI. This is one area where I like Streamlit's UI. If it's an accordion, we should keep it open while it is the final message, but collapse it once there are subsequent messages.

cc @pngwn @hannahblair on this front. This isn't a blocker but I think having a nice UI for tools will facilitate some nice viral comms down the road

@abidlabs (Member) commented Jul 8, 2024

(3) Let's add some docs for this, perhaps in the chatbot/chatinterface guides. Excited to do some nice comms here!

@abidlabs (Member) commented Jul 9, 2024

Discussed with @freddyaboulton and we don't need this:

> We should make the color of the tool "box" match the color of the messages

if we are bringing the bubbles back. @pngwn perhaps you could review the design of the chatbot_with_tools demo after you revert the bubbles back to ensure it looks good in both light and dark mode.

@pngwn (Member) left a comment:

Just a few notes/questions!


export let elem_id = "";
export let elem_classes: string[] = [];
export let visible = true;
-export let value: messages = [];
+export let value: TupleFormat | Message[] = [];
Member:

I'd rather we didn't have these different types everywhere, can we transform the tuple format to the standard Message format in the backend before we send them down?

Although I guess this would be breaking for the clients, maybe we can only do this in 5.0.
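A rough sketch of what that backend conversion might look like (hypothetical helper; the tuples format stores (user_message, bot_message) pairs where either side can be None):

def tuples_to_messages(history: list) -> list:
    # Hypothetical: expand each (user, bot) tuple into role/content dicts
    messages = []
    for user_msg, bot_msg in history:
        if user_msg is not None:
            messages.append({"role": "user", "content": user_msg})
        if bot_msg is not None:
            messages.append({"role": "assistant", "content": bot_msg})
    return messages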

Collaborator (Author):

Yes, it would be breaking for clients, so let's do it in 5.0!

Comment on lines +60 to +63
$: _value =
	msg_format === "tuples"
		? normalise_tuples(value as TupleFormat, root)
		: normalise_messages(value as Message[], root);
Member:

Do we need to handle these separately? Can we not have a single function that deals with them all, or is there ambiguity?

Collaborator (Author):

There are slightly different processing steps for each case, so I thought keeping them separate would be best.

	message: NormalisedMessage,
	selected: string | null
): void {
	dispatch("like", {
-		index: [i, j],
		value: message,
+		index: i,
Member:

Is this correct? If the user is using tuples don't we still need both i and j?

Collaborator (Author):

Yes you're right! Will fix

Comment on lines +218 to 245
function group_messages(
	messages: NormalisedMessage[]
): NormalisedMessage[][] {
	const groupedMessages: NormalisedMessage[][] = [];
	let currentGroup: NormalisedMessage[] = [];
	let currentRole: MessageRole | null = null;

	for (const message of messages) {
		if (!(message.role === "assistant" || message.role === "user")) {
			continue;
		}
		if (message.role === currentRole) {
			currentGroup.push(message);
		} else {
			if (currentGroup.length > 0) {
				groupedMessages.push(currentGroup);
			}
			currentGroup = [message];
			currentRole = message.role;
		}
	}

	if (currentGroup.length > 0) {
		groupedMessages.push(currentGroup);
	}

	return groupedMessages;
}
Member:

What is the purpose of grouping messages like this?

Collaborator (Author):

So that they would be displayed in the same bubble, but we got rid of the bubbles lol

Member:

We are bringing the bubbles back, but should they be in the same bubble if they are different messages? Maybe it looks cleaner though. Happy to leave it like this.

Collaborator (Author):

I think for sure the bot messages look cleaner in the same bubble. Happy to iterate with you when we bring them back.

Comment on lines 11 to 12
<!-- svelte-ignore a11y-click-events-have-key-events -->
<!-- svelte-ignore a11y-no-static-element-interactions -->
Member:

I think we should make this a button and remove these ignore comments.

@dawoodkhan82 (Collaborator) left a comment:

lgtm. Thanks for working on this @freddyaboulton

@freddyaboulton freddyaboulton enabled auto-merge (squash) July 10, 2024 10:17
@freddyaboulton freddyaboulton merged commit 4221290 into main Jul 10, 2024
9 checks passed
@freddyaboulton freddyaboulton deleted the openai-message-format branch July 10, 2024 11:08
@pngwn pngwn mentioned this pull request Jul 10, 2024
@freddyaboulton (Collaborator, Author)

Thanks everyone for the reviews! I addressed all comments and updated the chatbot docs and release notes. Will prepare a guide soon.
