I want to know if it's possible to input multiple bounding boxes and have TrOCR perform OCR only on those specified areas of my image. Could you please advise on this?
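I don't think you need anything special in TrOCR for that. All you need is a utility function, written with OpenCV or some other preferred library, that crops each bounding box out of the image so you can run TrOCR on the crops, like this: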
import cv2

def crop_images_from_bounding_boxes(image, bounding_boxes):
    """
    Crops images from the original image based on the provided bounding boxes.

    Parameters:
        image (numpy.ndarray): The original image.
        bounding_boxes (list of tuples): A list of bounding boxes, where each
            bounding box is represented by a tuple (x, y, width, height).

    Returns:
        list of numpy.ndarray: A list of cropped images.
    """
    cropped_images = []
    for (x, y, w, h) in bounding_boxes:
        # NumPy images are indexed [row, column], i.e. [y, x].
        cropped_image = image[y:y+h, x:x+w]
        cropped_images.append(cropped_image)
    return cropped_images
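To show how the crops could then be passed to TrOCR, here is a minimal sketch, not a definitive implementation. It assumes the Hugging Face `transformers` and Pillow packages are installed, that the image was loaded with OpenCV (so the channels are in BGR order), and that each box covers roughly one line of text. The checkpoint name `microsoft/trocr-base-printed` and the helper names `crop_regions` and `ocr_regions` are my own choices for illustration.

```python
import numpy as np


def crop_regions(image, bounding_boxes):
    """Crop (x, y, w, h) boxes from an H x W x C array, clipped to the image bounds."""
    height, width = image.shape[:2]
    crops = []
    for x, y, w, h in bounding_boxes:
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(width, x + w), min(height, y + h)
        if x1 <= x0 or y1 <= y0:
            raise ValueError(f"box {(x, y, w, h)} lies outside the image")
        crops.append(image[y0:y1, x0:x1])
    return crops


def ocr_regions(image_bgr, bounding_boxes, model_name="microsoft/trocr-base-printed"):
    """Run TrOCR on each box; returns one string per box, in input order."""
    # Imported lazily so the cropping helper works without these packages.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_name)
    model = VisionEncoderDecoderModel.from_pretrained(model_name)

    # OpenCV loads images as BGR; reverse the channel axis to get RGB for PIL.
    crops = [
        Image.fromarray(np.ascontiguousarray(crop[:, :, ::-1]))
        for crop in crop_regions(image_bgr, bounding_boxes)
    ]
    if not crops:
        return []

    # The processor resizes and normalizes every crop into one batch tensor.
    pixel_values = processor(images=crops, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)
```

You would call it as `ocr_regions(cv2.imread("page.png"), [(x, y, w, h), ...])`, where `page.png` is a placeholder path. All crops go through the model as a single batch, and the boxes are clipped to the image because a negative start index in a NumPy slice would otherwise wrap around silently.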
If you are worried about performance, rest assured that because the model only sees the cropped regions instead of the whole image, it will be faster, at least a bit. If you don't have a GPU, you can use multi-threading for this. If you have a GPU, this will be fast as hell.
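For the multi-threading suggestion, something like the following could work as a rough sketch. `recognize_all` and `recognize_crop` are hypothetical names: `recognize_crop` stands for whatever function you write that takes one crop and returns its text. Whether threads actually help depends on how much time the OCR call spends outside the Python interpreter lock, so it is worth timing it against a plain loop.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_all(crops, recognize_crop, max_workers=4):
    """Apply recognize_crop to every crop using a thread pool.

    recognize_crop is a caller-supplied function (hypothetical here) that
    takes one cropped image and returns the recognized text.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs,
        # so each text still lines up with its bounding box.
        return list(pool.map(recognize_crop, crops))
```

Because `pool.map` preserves input order, the returned list can be zipped directly with the original list of bounding boxes.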