This is a series of foundational computer vision projects that anyone diving into the field must tackle.
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
Implementation of MiDaS from the paper "Towards Robust Monocular Depth Estimation" in PyTorch and Zeta
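For context, a minimal sketch of running pretrained MiDaS depth estimation through the official intel-isl/MiDaS torch.hub entry (not this repo's from-scratch implementation; the image path is a placeholder):

```python
# Minimal MiDaS depth-inference sketch using the official torch.hub entry
# (this loads the intel-isl/MiDaS weights, not this repo's own implementation).
import torch
import numpy as np
from PIL import Image

model = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")  # small variant, CPU-friendly
model.eval()

transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform  # preprocessing matched to MiDaS_small

img = np.array(Image.open("example.jpg").convert("RGB"))  # placeholder image path
batch = transform(img)  # HWC uint8 array -> normalized NCHW tensor

with torch.no_grad():
    depth = model(batch)  # (1, H', W') relative inverse-depth map

print(depth.squeeze().shape)
```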
Generate captions for user-provided images using prompt engineering and generative AI.
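The repo's exact model and prompts aren't specified here; as a representative sketch, image captioning with BLIP from Hugging Face transformers, using a short text prefix as a lightweight prompt:

```python
# Representative image-captioning sketch; BLIP is a stand-in for whichever
# generative model the repo actually uses. The image path is a placeholder.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("user_photo.jpg").convert("RGB")  # hypothetical user input
# An optional text prefix acts as a lightweight prompt steering the caption.
inputs = processor(image, text="a photo of", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))
```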
Enhance your skills in prompt engineering for vision models. Learn to effectively prompt, fine-tune, and track experiments for models like SAM, OWL-ViT, and Stable Diffusion 2.0 to achieve precise image generation, segmentation, and object detection.
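As an example of prompting one of the listed models, a sketch of zero-shot object detection with OWL-ViT via Hugging Face transformers; the checkpoint, image path, and text queries are illustrative choices, not details from the repo:

```python
# Zero-shot object detection with OWL-ViT: free-text queries act as prompts.
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("street.jpg").convert("RGB")  # placeholder test image
texts = [["a photo of a cat", "a photo of a dog"]]  # text "prompts" to detect

inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits/boxes into thresholded detections in pixel coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.1, target_sizes=target_sizes
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(texts[0][label], round(score.item(), 3), box.tolist())
```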
In this repo I fine-tuned a pretrained ResNet18 model from the PyTorch library.
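A minimal sketch of the usual torchvision fine-tuning pattern this entry describes; the class count, learning rate, and dummy batch are placeholders, not the repo's actual settings:

```python
# Typical ResNet18 fine-tuning setup with torchvision: freeze the backbone
# and train a new classification head. Hyperparameters here are assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

for param in model.parameters():
    param.requires_grad = False  # freeze pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 10)  # 10 classes is an assumption

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch to show the loop shape.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```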
A framework to compute threshold sensitivity of deep networks to visual stimuli.
Testing the Moondream tiny vision model
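A hedged sketch of loading Moondream (moondream2) through its Hugging Face remote code; the encode_image/answer_question helpers come from the model's custom code and have changed between revisions, so treat this as illustrative:

```python
# Illustrative Moondream (moondream2) test via Hugging Face remote code.
# encode_image/answer_question are defined by the model's own trust_remote_code
# implementation and may differ in newer revisions.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("vikhyatk/moondream2")

image = Image.open("example.jpg")  # placeholder image path
enc_image = model.encode_image(image)
print(model.answer_question(enc_image, "Describe this image.", tokenizer))
```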
Vision-based swarms in the Presence of Occlusions
Building AVA from Ex Machina: a lightweight multi-modal system from scratch, just for learning and experimentation.