Add image-text-to-text task guide #31777

merveenoyan · 2024-07-03T14:55:53Z

Added shortly image-text-to-text task guide that includes streaming and more

HuggingFaceDocBuilderDev · 2024-07-03T15:19:28Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

stevhliu

Very nice job! 🙂

docs/source/en/tasks/image_text_to_text.md

Co-authored-by: Steven Liu <[email protected]>

stevhliu

Awesome work, thanks again!

amyeroberts

Thanks for adding!

Some comments, mostly nits

docs/source/en/tasks/image_text_to_text.md

amyeroberts · 2024-07-09T21:57:55Z

docs/source/en/tasks/image_text_to_text.md

+processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")
+```
+
+This model has a [chat template](./chat_templating) format that's required for the input. Moreover, the model can also accept multiple images as input in a single conversation or message. We will now prepare the inputs. 


This isn't quite right, we don't need to use a chat template for the model inputs. It's just useful to correctly format the prompt in the case of message-style inputs

You likely know better, I thought when fine-tuning these chat templates are included in fine-tuning data, thus it is required to use chat templates no? e.g. Mistral one has <INST> </INST>

amyeroberts · 2024-07-09T21:58:09Z

docs/source/en/tasks/image_text_to_text.md

+img_urls =["https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png",
+           "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"]
+images = [Image.open(requests.get(img_urls[0], stream=True).raw),
+          Image.open(requests.get(img_urls[1], stream=True).raw)]


docs/source/en/tasks/image_text_to_text.md

amyeroberts · 2024-07-09T22:04:25Z

docs/source/en/tasks/image_text_to_text.md

+
+    acc_text = ""
+    for text_token in streamer:
+        time.sleep(0.04)


Why do we need to add this?

otherwise the text flows super fast which is essentially against streaming (and also from my experience it was crashing too)

docs/source/en/tasks/image_text_to_text.md

amyeroberts · 2024-07-09T22:06:07Z

docs/source/en/tasks/image_text_to_text.md

+        target=model.generate,
+        kwargs=generation_args,
+    )
+    thread.start()


Will this be safely closed?

docs/source/en/tasks/image_text_to_text.md

amyeroberts · 2024-07-09T22:08:26Z

docs/source/en/tasks/image_text_to_text.md

+quantized_model = Idefics2ForConditionalGeneration.from_pretrained(model_id, device_map="cuda", quantization_config=quantization_config)
+```
+
+And that's it, we can use the model the same way with no changes.


It would be good here to note what kind of change this makes e.g. x% reduction in memory footprint

Co-authored-by: amyeroberts <[email protected]>

Add image-text-to-text task page

61fc59b

merveenoyan requested a review from stevhliu July 3, 2024 14:56

stevhliu reviewed Jul 3, 2024

View reviewed changes

merveenoyan and others added 13 commits July 4, 2024 02:38

Update docs/source/en/tasks/image_text_to_text.md

c282928

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

9c28150

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

ed0ce47

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

f0227c0

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

ba872f8

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

78a4ee6

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

0755adc

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

91a6ab3

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

43ab484

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

5f8b08b

Co-authored-by: Steven Liu <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

6946805

Co-authored-by: Steven Liu <[email protected]>

Address comments

b4d1028

Fix heading

755374e

merveenoyan requested a review from stevhliu July 4, 2024 08:44

stevhliu approved these changes Jul 8, 2024

View reviewed changes

stevhliu requested a review from amyeroberts July 8, 2024 17:10

amyeroberts reviewed Jul 9, 2024

View reviewed changes

merveenoyan and others added 7 commits July 10, 2024 11:54

Update docs/source/en/tasks/image_text_to_text.md

d2e4dd6

Co-authored-by: amyeroberts <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

5be2b7f

Co-authored-by: amyeroberts <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

f862630

Co-authored-by: amyeroberts <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

0db46a9

Co-authored-by: amyeroberts <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

3a9d5f6

Co-authored-by: amyeroberts <[email protected]>

Update docs/source/en/tasks/image_text_to_text.md

5652ffc

Co-authored-by: amyeroberts <[email protected]>

Address comments

e834f9f

merveenoyan requested a review from amyeroberts July 10, 2024 09:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add image-text-to-text task guide #31777

Add image-text-to-text task guide #31777

merveenoyan commented Jul 3, 2024

HuggingFaceDocBuilderDev commented Jul 3, 2024

stevhliu left a comment

stevhliu left a comment

amyeroberts left a comment

amyeroberts Jul 9, 2024

merveenoyan Jul 10, 2024 •

edited

Loading

amyeroberts Jul 9, 2024

amyeroberts Jul 9, 2024

merveenoyan Jul 10, 2024

amyeroberts Jul 9, 2024

amyeroberts Jul 9, 2024

Add image-text-to-text task guide #31777

Are you sure you want to change the base?

Add image-text-to-text task guide #31777

Conversation

merveenoyan commented Jul 3, 2024

HuggingFaceDocBuilderDev commented Jul 3, 2024

stevhliu left a comment

Choose a reason for hiding this comment

stevhliu left a comment

Choose a reason for hiding this comment

amyeroberts left a comment

Choose a reason for hiding this comment

amyeroberts Jul 9, 2024

Choose a reason for hiding this comment

merveenoyan Jul 10, 2024 • edited Loading

Choose a reason for hiding this comment

amyeroberts Jul 9, 2024

Choose a reason for hiding this comment

amyeroberts Jul 9, 2024

Choose a reason for hiding this comment

merveenoyan Jul 10, 2024

Choose a reason for hiding this comment

amyeroberts Jul 9, 2024

Choose a reason for hiding this comment

amyeroberts Jul 9, 2024

Choose a reason for hiding this comment

merveenoyan Jul 10, 2024 •

edited

Loading