FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
Updated Jul 14, 2024 · CUDA
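The Ozaki scheme splits each FP64 operand into a sum of low-precision "digit" slices whose pairwise products are exact in integer arithmetic, computes all slice-pair GEMMs with int8 inputs and int32 accumulation (the operation Int8 Tensor Cores provide), and recombines the exact partial products in FP64. A minimal NumPy sketch of the idea — the slice count, bit width, and function names here are illustrative, not the repository's actual API:

```python
import numpy as np

def split_int8_slices(M, num_slices=4, bits=7):
    """Split M into int8 slices so that
    M ≈ scale * sum_k slices[k] * 2**(-bits * (k + 1))."""
    m = float(np.abs(M).max())
    scale = m * (1 << bits) / ((1 << bits) - 1) if m else 1.0
    R = M / scale                        # entries now within [-127/128, 127/128]
    slices = []
    for _ in range(num_slices):
        S = np.round(R * (1 << bits))    # fits in int8 by construction
        slices.append(S.astype(np.int8))
        R = R * (1 << bits) - S          # residual carried to the next slice
    return slices, scale

def ozaki_matmul(A, B, num_slices=4, bits=7):
    """FP64-like GEMM assembled from exact int8 x int8 -> int32 products."""
    As, sa = split_int8_slices(A, num_slices, bits)
    Bs, sb = split_int8_slices(B, num_slices, bits)
    C = np.zeros((A.shape[0], B.shape[1]))
    for i, Ai in enumerate(As):
        for j, Bj in enumerate(Bs):
            # int8 inputs accumulated in int32: each partial product is
            # exact, which is what makes the recombination accurate
            P = Ai.astype(np.int32) @ Bj.astype(np.int32)
            C += P * 2.0 ** (-bits * (i + j + 2))
    return C * (sa * sb)
```

With 4 slices of 7 bits each, roughly 28 mantissa bits per operand are captured; more slices buy more accuracy at the cost of more integer GEMMs.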
[ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
A tool for debugging and assessing floating point precision and reproducibility.
Microsoft Automatic Mixed Precision Library
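Automatic mixed precision libraries run the forward and backward passes in FP16 while keeping FP32 master weights, and use loss scaling so tiny gradients don't underflow in half precision. A toy NumPy illustration of why loss scaling helps — the function name and constants are illustrative, not any library's API:

```python
import numpy as np

def fp16_grad(g_true, loss_scale=1.0):
    """Simulate the gradient a half-precision backward pass produces.

    Scaling the loss by `loss_scale` scales every gradient by the same
    factor, lifting tiny values out of fp16's subnormal range; the
    gradient is then unscaled in fp32 before the optimizer step.
    """
    g_half = np.float16(loss_scale * g_true)   # value fp16 backprop stores
    return float(np.float32(g_half)) / loss_scale

# A gradient near fp16's subnormal threshold (~6e-8) is represented far
# more accurately with a large loss scale than without one.
```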
Build, customize, and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6
Code repository for the Korean edition of "Deep Learning with Python, Second Edition" ("Deep Learning from the Creator of Keras, 2nd Edition")
🎯 Accumulated Gradients for TensorFlow 2
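Gradient accumulation simulates a large batch on limited memory: gradients from several micro-batches are summed, and the optimizer applies one averaged update. A framework-free NumPy sketch for a mean-squared-error linear model — the function name and the MSE objective are illustrative, not the repository's code:

```python
import numpy as np

def sgd_step_accumulated(w, X, y, lr=0.1, accum_steps=4):
    """One SGD step using gradients accumulated over micro-batches.

    Splits the batch into `accum_steps` micro-batches, sums their MSE
    gradients, and applies a single averaged update. With equal-sized
    micro-batches this is numerically equivalent to one full-batch step.
    """
    grad_sum = np.zeros_like(w)
    for Xm, ym in zip(np.array_split(X, accum_steps),
                      np.array_split(y, accum_steps)):
        err = Xm @ w - ym
        grad_sum += 2 * Xm.T @ err / len(ym)   # micro-batch MSE gradient
    return w - lr * grad_sum / accum_steps      # averaged update
```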
Fast SGEMM emulation on Tensor Cores
You Only Look Once: Unified, Real-Time Object Detection
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
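Low-precision quantization typically maps each tensor to int8 via an affine transform, x ≈ scale · (q − zero_point); mixed precision then means choosing bit widths per layer. A minimal sketch of the affine int8 transform — helper names are illustrative, not this library's API:

```python
import numpy as np

def quantize_int8(x):
    """Affine (asymmetric) int8 quantization: x ≈ scale * (q - zero_point)."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0           # 256 representable levels
    zero_point = round(-lo / scale) - 128      # maps lo to about -128
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover an approximation of the original tensor."""
    return scale * (q.astype(np.float32) - zero_point)
```

The reconstruction error per element is bounded by about one quantization step (`scale`), which is what calibration tries to minimize by picking good ranges.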
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications
This is the open-source version of HPL-MXP; its performance has been verified on Frontier.
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.
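Saving models at ultra-low precision ultimately comes down to packing several sub-byte codes into each stored byte. A small NumPy sketch of 2-bit packing, four codes per byte — the function names are illustrative, not BitPack's API:

```python
import numpy as np

def pack_2bit(values):
    """Pack an array of 2-bit codes (0..3) into bytes, four codes per byte."""
    v = np.asarray(values, dtype=np.uint8)
    assert v.size % 4 == 0 and v.max() <= 3
    v = v.reshape(-1, 4)
    # code k occupies bits 2k..2k+1 of its byte
    return (v[:, 0] | (v[:, 1] << 2) | (v[:, 2] << 4) | (v[:, 3] << 6))

def unpack_2bit(packed):
    """Inverse of pack_2bit: recover the flat array of 2-bit codes."""
    p = np.asarray(packed, dtype=np.uint8)
    out = np.empty((p.size, 4), dtype=np.uint8)
    for i in range(4):
        out[:, i] = (p >> (2 * i)) & 0b11
    return out.reshape(-1)
```

Relative to FP32 storage this is a 16x size reduction (0.25 bytes per weight instead of 4), before any scale/zero-point metadata.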
High Resolution Style Transfer in PyTorch with Color Control and Mixed Precision 🎨
Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation
Experiments in accelerating PyTorch training on GPU devices
A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation for Efficient Hardware Acceleration on Edge Devices
Pretrained-model loading based on TensorFlow 1.x, with support for single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision. Flexible training, validation, and prediction.
An implementation of the HPL-AI Mixed-Precision Benchmark based on hpl-2.3
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP