identifier "__hisnan" is undefined #524

Open
jimmieliu opened this issue Nov 9, 2023 · 4 comments

@jimmieliu

Hi,

Env: CUDA 11.6, PyTorch 1.11
Installed with:
pip install lightseq

Then test.py:
from lightseq.training.ops.pytorch.quant_linear_layer import LSQuantLinearLayer

When running test.py, the following error occurs.

It looks like a simple issue to me, but I don't know where the __hisnan function is supposed to come from.

Thank you

83 errors detected in the compilation of "/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/dropout_kernels.cu".
[7/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=lightseq_layers_new -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1013" -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/ops_new/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/lsflow/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/layers_new/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -c /opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu -o cuda_util.cuda.o
FAILED: cuda_util.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=lightseq_layers_new -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1013" -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/ops_new/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/lsflow/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/layers_new/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -c /opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu -o cuda_util.cuda.o
/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu(218): error: identifier "__hisnan" is undefined

1 error detected in the compilation of "/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu".
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1726, in _run_ninja_build
subprocess.run(
File "/opt/conda/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "test.py", line 4, in <module>
from lightseq.training.ops.pytorch.quant_linear_layer import LSQuantLinearLayer
File "/opt/conda/lib/python3.8/site-packages/lightseq/training/__init__.py", line 1, in <module>
from lightseq.training.ops.pytorch.transformer_embedding_layer import (
File "/opt/conda/lib/python3.8/site-packages/lightseq/training/ops/pytorch/__init__.py", line 11, in <module>
layer_cuda_module = LayerBuilder().load()
File "/opt/conda/lib/python3.8/site-packages/lightseq/training/ops/pytorch/builder/builder.py", line 203, in load
return self.jit_load(verbose)
File "/opt/conda/lib/python3.8/site-packages/lightseq/training/ops/pytorch/builder/builder.py", line 231, in jit_load
op_module = load(
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1130, in load
return _jit_compile(
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1343, in _jit_compile
_write_ninja_file_and_build_library(
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1455, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1742, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'lightseq_layers_new'

@helson73

Just set the TORCH_CUDA_ARCH_LIST environment variable to an empty value.

export TORCH_CUDA_ARCH_LIST=
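
The same thing can be done from inside test.py, as long as it happens before the lightseq import triggers the JIT build. This is a minimal sketch, not something from the lightseq docs: `clear_arch_list` is a hypothetical helper name, and it assumes torch treats an unset variable the same way as an empty one.

```python
import os


def clear_arch_list(env=os.environ):
    """Remove TORCH_CUDA_ARCH_LIST from env and return its previous value (or None)."""
    return env.pop("TORCH_CUDA_ARCH_LIST", None)


previous = clear_arch_list()
print("previous TORCH_CUDA_ARCH_LIST:", repr(previous))

# Import lightseq only after the variable is cleared, because the import
# itself starts the extension build (see the traceback above).
# from lightseq.training.ops.pytorch.quant_linear_layer import LSQuantLinearLayer
```

Note that `export` in one shell does not affect a Python process started from a different shell or launcher, which is one reason to do it in the script itself.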

Torch's cpp_extension internally adds CUDA flags to the ninja build according to TORCH_CUDA_ARCH_LIST. You got these errors because somehow (perhaps when you installed some other libraries) your TORCH_CUDA_ARCH_LIST environment variable was expanded to all possible archs. That forces the nvcc compiler to disable half-precision operations for compatibility with the old archs.
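
To check whether this applies to your environment, a small script can list which entries of TORCH_CUDA_ARCH_LIST are too old for half-precision intrinsics. This is only a sketch under assumptions: the 5.3 threshold is my understanding of where CUDA starts supporting half arithmetic such as `__hisnan`, the function names are made up for illustration, and named entries like "Pascal" are simply skipped.

```python
import os

# Assumption: half-precision intrinsics such as __hisnan need compute capability 5.3+.
MIN_HALF_CAPABILITY = (5, 3)


def parse_arch_list(value):
    """Split a TORCH_CUDA_ARCH_LIST-style string into (major, minor) tuples."""
    archs = []
    for token in value.replace(";", " ").split():
        number = token.split("+")[0]  # drop a "+PTX" suffix if present
        major, _, minor = number.partition(".")
        if major.isdigit() and (minor.isdigit() or minor == ""):
            archs.append((int(major), int(minor or 0)))
        # non-numeric entries (e.g. architecture names) are ignored in this sketch
    return archs


def archs_without_half_support(value):
    """Return the listed archs that are below the assumed 5.3 threshold."""
    return [arch for arch in parse_arch_list(value) if arch < MIN_HALF_CAPABILITY]


if __name__ == "__main__":
    current = os.environ.get("TORCH_CUDA_ARCH_LIST", "")
    print("TORCH_CUDA_ARCH_LIST =", repr(current))
    print("entries below 5.3:", archs_without_half_support(current))
```

With the arch set from the nvcc command in the log (5.2 through 8.6), the only entry this flags is 5.2, which matches the `compute_52` flag in the failing command.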

@Anychnn
Collaborator

Anychnn commented Jan 15, 2024 via email

@runningabcd

export TORCH_CUDA_ARCH_LIST=

Doesn't work!

@Anychnn
Collaborator

Anychnn commented Apr 15, 2024 via email
