The GPU is not used when running detection with YOLOv5 #13171

Angelinnp · 2024-07-06T14:15:08Z

Search before asking

I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Multi-GPU

Bug

When I run the YOLOv5 detection code, it still runs on the CPU, which makes detection slow: I get only 0.4 FPS. CUDA is installed and enabled, but the GPU on the Jetson Nano is still not used. Why does this happen, and what is the solution?
I have CUDA 10.2.300 and PyTorch 2.3.1 installed.
I use a Python 3.8.0 virtual environment. Which versions of PyTorch and CUDA suit this environment? Please help me.

Environment

YOLO: YOLOv5, CUDA 10.2.300, PyTorch 2.3.1, Python 3.8.0

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

github-actions · 2024-07-06T14:15:31Z

👋 Hello @Angelinnp, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

```
git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install
```

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU:
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

```
pip install ultralytics
```

glenn-jocher · 2024-07-06T20:53:31Z

@Angelinnp hello,

Thank you for reaching out and providing detailed information about your issue. It looks like you're experiencing difficulties with GPU utilization on your Jetson Nano while running YOLOv5 detection.

To better assist you, could you please provide a minimal reproducible code example? This will help us understand the context and reproduce the issue on our end. You can find more information on creating a minimal reproducible example here. This step is crucial for us to investigate and provide a solution effectively.

In the meantime, here are a few steps you can take to troubleshoot the issue:

Verify CUDA and PyTorch Compatibility:
Ensure that your CUDA and PyTorch versions are compatible. For Jetson Nano, it is recommended to use the versions provided by NVIDIA's JetPack SDK, which ensures compatibility. You can check the compatibility matrix on the NVIDIA website.
Check GPU Availability in PyTorch:
Run the following code to verify that PyTorch detects your GPU:
```
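import torch

print(torch.cuda.is_available())  # True if PyTorch can use a CUDA device
if torch.cuda.is_available():  # the calls below raise an error when CUDA is unavailable
    print(torch.cuda.current_device())  # index of the active GPU
    print(torch.cuda.get_device_name(0))  # name of the first GPU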
```
If torch.cuda.is_available() returns False, PyTorch cannot see the GPU: either the installed PyTorch build was compiled without CUDA support, or there is an issue with your CUDA installation.
Ensure YOLOv5 is Configured to Use GPU:
When running detection, make sure to specify the --device argument to use the GPU. For example:
```
python detect.py --source your_source --weights yolov5s.pt --device 0
```
This command explicitly tells YOLOv5 to use the first GPU.
Update YOLOv5 and Dependencies:
Ensure you are using the latest version of YOLOv5 and its dependencies. You can update YOLOv5 by running:
```
git pull
pip install -r requirements.txt
```
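
The first two checks above can be combined into one small diagnostic script. This is only a minimal sketch, not part of YOLOv5: `cuda_report` is a hypothetical helper name. It relies on `torch.version.cuda`, which is `None` on CPU-only PyTorch builds, and on `torch.cuda.is_available()`.

```python
import importlib.util
import platform


def cuda_report():
    """Collect basic facts about the local Python and PyTorch install (hypothetical helper)."""
    report = {
        "python": platform.python_version(),
        "machine": platform.machine(),  # e.g. aarch64 on Jetson boards
    }
    if importlib.util.find_spec("torch") is None:
        report["torch"] = None  # PyTorch is not installed in this environment
        return report

    import torch

    report["torch"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None means a CPU-only build
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        # Only query the device when CUDA is usable; otherwise this call raises.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` prints `None`, the installed wheel is a CPU-only build, and no `--device` setting will enable the GPU until a CUDA-enabled build is installed.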

If the issue persists after trying the above steps, please share the minimal reproducible code example, and we will investigate further.

Thank you for your cooperation, and we look forward to resolving this issue with you.

Angelinnp · 2024-07-07T12:59:32Z

The following is the detection code that I run. Please help me solve this problem.
Uploading code deteksi.pdf…

Angelinnp · 2024-07-08T01:20:15Z

The following is the detection code that I run. Please help me solve this problem.

import argparse
import os
import platform
import sys
import serial
from pathlib import Path
from turtle import distance
import pynmea2

import torch
import time

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.common import DetectMultiBackend
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements, colorstr, cv2,
increment_path, non_max_suppression, print_args, scale_boxes, strip_optimizer, set_logging, xyxy2xywh)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import select_device, smart_inference_mode

# GPS sensor configuration
serialport = serial.Serial(
port="/dev/ttyTHS1",
baudrate=9600,
bytesize=serial.EIGHTBITS,
parity=serial.PARITY_NONE,
stopbits=serial.STOPBITS_ONE,
)
time.sleep(1)

# GPS information parsing
def parse_nmea(sentence):
    latitude = longitude = altitude = altitude_units = None

    if sentence.startswith('$GPGGA'):
        msg = pynmea2.parse(sentence)
        latitude = msg.latitude
        longitude = msg.longitude
        altitude = msg.altitude
        altitude_units = msg.altitude_units
    elif sentence.startswith('$GPRMC'):
        msg = pynmea2.parse(sentence)
        latitude = msg.latitude
        longitude = msg.longitude

    return latitude, longitude, altitude, altitude_units

latitude = None
longitude = None

@smart_inference_mode()
def run(
weights=ROOT / 'yolov5s.pt', # model path or triton URL
source=ROOT / 'data/images', # file/dir/URL/glob/screen/0(webcam)
data=ROOT / 'data/coco128.yaml', # dataset.yaml path
imgsz=(640, 640), # inference size (height, width)
conf_thres=0.1, # confidence threshold
iou_thres=0.9, # NMS IOU threshold
max_det=1000, # maximum detections per image
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
view_img=False, # show results
save_txt=False, # save results to *.txt
save_conf=False, # save confidences in --save-txt labels
save_crop=False, # save cropped prediction boxes
nosave=False, # do not save images/videos
classes=None, # filter by class: --class 0, or --class 0 2 3
agnostic_nms=False, # class-agnostic NMS
augment=False, # augmented inference
visualize=False, # visualize features
update=False, # update all models
project=ROOT / 'runs/detect', # save results to project/name
name='exp', # save results to project/name
exist_ok=False, # existing project/name ok, do not increment
line_thickness=3, # bounding box thickness (pixels)
hide_labels=False, # hide labels
hide_conf=False, # hide confidences
half=False, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
vid_stride=1, # video frame-rate stride
):
source = str(source)
save_img = not nosave and not source.endswith('.txt') # save inference images
is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
screenshot = source.lower().startswith('screen')
if is_url and is_file:
    source = check_file(source) # download

# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)  # increment run
(save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

# Initialize
set_logging()
device = select_device('0' if torch.cuda.is_available() else 'cpu')
half &= device.type != 'cpu'  # half precision is only supported on CUDA

# Load model
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
stride, names, pt = model.stride, model.names, model.pt
imgsz = check_img_size(imgsz, s=stride)  # check image size

# Dataloader
bs = 1  # batch_size
if webcam:
    view_img = check_imshow(warn=True)
    dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
    bs = len(dataset)
elif screenshot:
    dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
else:
    dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
vid_path, vid_writer = [None] * bs, [None] * bs

# Run inference
model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz))  # warmup
seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
for path, im, im0s, vid_cap, s in dataset:
    with dt[0]:
        im = torch.from_numpy(im).to(model.device)
        im = im.half() if model.fp16 else im.float()  # uint8 to fp16/32
        im /= 255  # 0 - 255 to 0.0 - 1.0
        if len(im.shape) == 3:
            im = im[None]  # expand for batch dim

    # Inference
    with dt[1]:
        visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
        pred = model(im, augment=augment, visualize=visualize)

    # NMS
    with dt[2]:
        pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)

    # Second-stage classifier (optional)
    # pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)

    # Process predictions
    for i, det in enumerate(pred):  # per image
        seen += 1
        if webcam:  # batch_size >= 1
            p, im0, frame = path[i], im0s[i].copy(), dataset.count
            s += f'{i}: '
        else:
            p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)

        p = Path(p)  # to Path
        save_path = str(save_dir / p.name)  # im.jpg
        txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # im.txt
        s += '%gx%g ' % im.shape[2:]  # print string
        gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
        imc = im0.copy() if save_crop else im0  # for save_crop
        annotator = Annotator(im0, line_width=line_thickness, example=str(names))
        if len(det):
            # Rescale boxes from img_size to im0 size
            det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], im0.shape).round()

            # Print results
            for c in det[:, -1].unique():
                n = (det[:, -1] == c).sum()  # detections per class
                s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string
                detnum = det.cpu().numpy()
                len_det = len(detnum)
                for count in range(len_det):
                    if serialport.in_waiting:
                        data = serialport.readline().decode('utf-8').strip()
                        latitude, longitude, altitude, altitude_units = parse_nmea(data)
                        if latitude is not None and longitude is not None:
                            print(f"Latitude: {latitude:.6f}, Longitude: {longitude:.6f}")
                        elif altitude is not None:
                            print(f"Altitude: {altitude:.2f} {altitude_units if altitude_units else ''}")

		
                    #print("conf=%d ; xmin=%d ; xmax=%d ; ymin=%d ; ymax=%d ; deltax = %d ; deltay = %d ; dist=%d" % (detconf, detxmin, detxmax, detymin, detymax, deltax, deltay, dist2))
                    #print("Jarak = %d" % (dist2))
                    #print("conf=%d ; xmin=%d ; xmax=%d ; ymin=%d ; ymax=%d" % (detconf, detxmin, detxmax, detymin, detymax))

                    #detdist = calcDist(detClass, detw)
                    #print("distance : ", detdist)
                    #for z in range(0,len(detymintemp)):
                        #for y in range(z+1,len(detymintemp)):
                            #if(detymintemp[z]>detymintemp[y]):
                                #temp = detymintemp[z]
                                #detymintemp[z] = detymintemp[y]
                                #detymintemp[y] = temp

                    #print latitude

            # Write results
            for *xyxy, conf, cls in reversed(det):
                if save_txt:  # Write to file
                    xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                    line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
                    with open(f'{txt_path}.txt', 'a') as f:
                        f.write(('%g ' * len(line)).rstrip() % line + '\n')

                if save_img or save_crop or view_img:  # Add bbox to image
                    c = int(cls)  # integer class
                    label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}')
                    annotator.box_label(xyxy, label, color=colors(c, True))
                if save_crop:
                    save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True)

        # Stream results
        im0 = annotator.result()
        # Add the FPS overlay in the top-right corner
        fps = 1 / dt[1].dt  # compute FPS from inference time
        cv2.putText(im0, f"FPS: {fps:.1f}", (im0.shape[1] - 150, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        if view_img:
            if platform.system() == 'Linux' and p not in windows:
                windows.append(p)
                cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # cv2.WINDOW_NORMAL
                cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
            cv2.imshow(str(p), im0)
            cv2.waitKey(1)  # 1 millisecond

        # Save results (image with detections)
        if save_img:
            if dataset.mode == 'image':
                cv2.imwrite(save_path, im0)
            else:  # 'video' or 'stream'
                if vid_path[i] != save_path:  # new video
                    vid_path[i] = save_path
                    if isinstance(vid_writer[i], cv2.VideoWriter):
                        vid_writer[i].release()  # release previous video writer
                    if vid_cap:  # video
                        fps = vid_cap.get(cv2.CAP_PROP_FPS)
                        w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                        h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                    else:  # stream
                        fps, w, h = 30, im0.shape[1], im0.shape[0]
                    save_path = str(Path(save_path).with_suffix('.mp4'))  # force *.mp4 suffix on results videos
                    vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
                vid_writer[i].write(im0)

    # Print time (inference-only)
    LOGGER.info(f"{s}{'' if len(det) else '(no detections), '}{dt[1].dt * 1E3:.1f}ms")

# Print results
t = tuple(x.t / seen * 1E3 for x in dt)  # speeds per image
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
if save_txt or save_img:
    s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
    LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
if update:
    strip_optimizer(weights[0])  # update model (to fix SourceChangeWarning)

def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path or triton URL')
parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob/screen/0(webcam)')
parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='(optional) dataset.yaml path')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
parser.add_argument('--conf-thres', type=float, default=0.2, help='confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.1, help='NMS IoU threshold')
parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='show results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--visualize', action='store_true', help='visualize features')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
opt = parser.parse_args()
opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1 # expand
print_args(vars(opt))
return opt

def calcDist(myClass, myWidth):
    myF = 700
    if(myClass == 0):
        myDist = int(myF * myWidth / 50)
    elif(myClass == 1):
        myDist = int(myF * myWidth / 90)
    else:
        myDist = 0
    return myDist

def sendcalc232(category, confidence, x, y, w, h, dist):
    toWrite = "$" + "," + str(category) + "," + str(confidence) + "," + str(x) + "," + str(y) + "," + str(w) + "," + str(h) + "," + str(dist)
    value = write_read(toWrite)
    print(value)

def write_read(x):
    # send_data.write(bytes(x, 'utf-8'))
    time.sleep(0.05)
    # data = send_data.readline()
    # return data

def main(opt):
    check_requirements(exclude=('tensorboard', 'thop'))
    run(**vars(opt))

if __name__ == "__main__":
    opt = parse_opt()
    main(opt)

glenn-jocher · 2024-07-08T12:24:02Z

Hello @Angelinnp,

Thank you for sharing your detection code. I see that you've integrated GPS sensor data and are running YOLOv5 on a Jetson Nano. Let's address the issue of the GPU not being utilized.

Steps to Ensure GPU Utilization

Verify CUDA and PyTorch Compatibility:
Ensure that your CUDA and PyTorch versions are compatible with each other and with your Jetson Nano. For Jetson Nano, it's recommended to use the versions provided by NVIDIA's JetPack SDK. You can find the compatibility matrix on the NVIDIA website.

Check GPU Availability in PyTorch:
Run the following code snippet to verify that PyTorch detects your GPU:

```
import torch
print(torch.cuda.is_available())  # Should print True
if torch.cuda.is_available():  # the calls below raise an error without CUDA
    print(torch.cuda.current_device())  # Should print the GPU device index
    print(torch.cuda.get_device_name(0))  # Should print the name of your GPU
```

If torch.cuda.is_available() returns False, PyTorch cannot see the GPU: either the installed PyTorch build was compiled without CUDA support, or there is an issue with your CUDA installation.

Ensure YOLOv5 is Configured to Use GPU:
In your run function, you are already using select_device('0' if torch.cuda.is_available() else 'cpu'). Ensure that torch.cuda.is_available() returns True as mentioned above.
Update YOLOv5 and Dependencies:
Make sure you are using the latest version of YOLOv5 and its dependencies. You can update YOLOv5 by running:
```
git pull
pip install -r requirements.txt
```
Specify the CUDA Device Explicitly:
When running the detection script, make sure to specify the --device argument to use the GPU. For example:
```
python detect.py --source your_source --weights yolov5s.pt --device 0
```

Example Code Adjustments

Here are some adjustments to ensure GPU utilization:

Ensure select_device is correctly set:

```
device = select_device('0' if torch.cuda.is_available() else 'cpu')
```

Run the script with the --device argument:

```
python detect.py --source your_source --weights yolov5s.pt --device 0
```
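
One detail in the posted script is worth noting: `run()` accepts a `device` argument but then calls `select_device` with a hard-coded choice, so passing `--device 0` on the command line has no effect there. Below is a minimal sketch of honoring the argument instead. `resolve_device` is a hypothetical helper, not part of YOLOv5, and the CUDA availability flag is passed in as a plain boolean so the logic can be tried without a GPU.

```python
def resolve_device(requested, cuda_available):
    """Return the device string to hand to select_device (hypothetical helper).

    requested:      value of the --device argument ('' means "choose automatically")
    cuda_available: result of torch.cuda.is_available()
    """
    requested = str(requested).strip().lower()
    if requested == "cpu":
        return "cpu"  # an explicit CPU request is always honored
    if not cuda_available:
        if requested:
            # Fail loudly instead of silently falling back to the CPU.
            raise RuntimeError(
                f"--device {requested} was requested but CUDA is unavailable"
            )
        return "cpu"
    return requested or "0"  # default to the first GPU when nothing was requested


print(resolve_device("", True))   # first GPU chosen automatically
print(resolve_device("", False))  # CPU fallback when no GPU is visible
```

In the script, this would replace the hard-coded line with something like `device = select_device(resolve_device(device, torch.cuda.is_available()))`. Raising an error for an unsatisfiable GPU request makes a silent CPU fallback, like the one described in this issue, visible immediately.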

Additional Debugging

If the above steps do not resolve the issue, please provide the output of the following commands:

```
python -c "import torch; print(torch.cuda.is_available())"
python -c "import torch; print(torch.cuda.get_device_name(0))"
```

This will help us understand if PyTorch is correctly detecting your GPU.
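
For context on the reported number: the posted script computes its overlay as `fps = 1 / dt[1].dt`, that is, from the model inference step alone, so pre-processing, NMS, and the GPS serial read are not included in it. A small sketch of the conversion follows; the helper names are hypothetical and only illustrate the arithmetic.

```python
def latency_ms_from_fps(fps):
    """Convert frames per second to milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fps_from_latency(seconds):
    """Convert seconds per frame to frames per second, guarding against zero."""
    return float("inf") if seconds <= 0 else 1.0 / seconds


# The 0.4 FPS reported in this issue corresponds to about 2500 ms per inference.
print(f"{latency_ms_from_fps(0.4):.0f} ms per frame")
```

An inference time of roughly 2.5 seconds per frame is consistent with the model running on the CPU, which matches the symptom described in the issue.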

Thank you for your cooperation, and we look forward to resolving this issue with you. If you have any further questions or need additional assistance, please feel free to ask.

Angelinnp added the "bug" (Something isn't working) label on Jul 6, 2024
