
[BUG]Can't open inference server #354

Closed
limingchina opened this issue Jul 7, 2024 · 3 comments
Labels: bug (Something isn't working)

Comments


limingchina commented Jul 7, 2024

Describe the bug
Can't open inference server.

To Reproduce

  1. Run install_env.bat with USE_MIRROR=false and INSTALL_TYPE=stable.
  2. Edit API_FLAGS.txt to enable "--infer" (see the example below), then run start.bat.
  3. Go to the inference tab and click "open inference server".
  4. Open http://127.0.0.1:7860 in a browser.
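
For reference, after step 2 the API_FLAGS.txt file would contain roughly the following flags, one per line. This is an assumption assembled from the "Debug: flags" line in the log below plus the "--infer" flag, not a verbatim copy of the file:

--infer
--listen 0.0.0.0:8000
--llama-checkpoint-path "checkpoints/fish-speech-1.2"
--decoder-checkpoint-path "checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth"
--decoder-config-name firefly_gan_vq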

Expected behavior
The inference web UI should be shown at http://127.0.0.1:7860

Actual behavior
No inference web UI is shown; the inference service is not running.

Screenshots / log
It seems that Triton complains it can't find the CUDA library. However, according to NVIDIA's documentation (https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html#run-on-cpu-only-system), Triton should be able to run without a GPU as well.

The Python stacktrace appears twice in the following log. The first occurs when starting the WebUI; I can still see that webpage. But when I go to the "inference" tab on the page and click "open inference server", it shows the same stacktrace and the inference server webpage is not shown.

Start WebUI Inference...
Debug: flags = --listen 0.0.0.0:8000 --llama-checkpoint-path "checkpoints/fish-speech-1.2" --decoder-checkpoint-path "checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth" --decoder-config-name firefly_gan_vq
Traceback (most recent call last):
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\username\Downloads\fish-speech\tools\webui.py", line 23, in <module>
    from tools.api import decode_vq_tokens, encode_reference
  File "C:\Users\username\Downloads\fish-speech\tools\api.py", line 34, in <module>
    from fish_speech.models.vqgan.modules.firefly import FireflyArchitecture
  File "C:\Users\username\Downloads\fish-speech\fish_speech\models\vqgan\__init__.py", line 1, in <module>
    from .lit_module import VQGAN
  File "C:\Users\username\Downloads\fish-speech\fish_speech\models\vqgan\lit_module.py", line 5, in <module>
    import lightning as L
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\lightning\__init__.py", line 19, in <module>
    from lightning.fabric.fabric import Fabric  # noqa: E402
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\lightning\fabric\__init__.py", line 30, in <module>
    from lightning.fabric.fabric import Fabric  # noqa: E402
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\lightning\fabric\fabric.py", line 46, in <module>
    from lightning.fabric.loggers import Logger
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\lightning\fabric\loggers\__init__.py", line 15, in <module>
    from lightning.fabric.loggers.tensorboard import TensorBoardLogger  # noqa: F401
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\lightning\fabric\loggers\tensorboard.py", line 31, in <module>
    from lightning.fabric.wrappers import _unwrap_objects
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\lightning\fabric\wrappers.py", line 38, in <module>
    from torch._dynamo import OptimizedModule
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\torch\_dynamo\__init__.py", line 2, in <module>
    from . import convert_frame, eval_frame, resume_execution
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\torch\_dynamo\convert_frame.py", line 41, in <module>
    from . import config, exc, trace_rules
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\torch\_dynamo\exc.py", line 11, in <module>
    from .utils import counters
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\torch\_dynamo\utils.py", line 1031, in <module>
    if has_triton_package():
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\torch\utils\_triton.py", line 8, in has_triton_package
    import triton
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\__init__.py", line 8, in <module>
    from .runtime import (
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\runtime\__init__.py", line 1, in <module>
    from .autotuner import (Autotuner, Config, Heuristics, OutOfResources, autotune, heuristics)
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\runtime\autotuner.py", line 7, in <module>
    from ..testing import do_bench
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\testing.py", line 7, in <module>
    from . import language as tl
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\language\__init__.py", line 6, in <module>
    from .standard import (
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\language\standard.py", line 3, in <module>
    from ..runtime.jit import jit
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\runtime\jit.py", line 10, in <module>
    from ..runtime.driver import driver
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\runtime\driver.py", line 1, in <module>
    from ..backends import backends
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\backends\__init__.py", line 50, in <module>
    backends = _discover_backends()
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\backends\__init__.py", line 43, in _discover_backends
    compiler = _load_module(name, os.path.join(root, name, 'compiler.py'))
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\backends\__init__.py", line 12, in _load_module
    spec.loader.exec_module(module)
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\backends\nvidia\compiler.py", line 3, in <module>
    from triton.backends.nvidia.driver import CudaUtils
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\backends\nvidia\driver.py", line 18, in <module>
    library_dir += [os.path.join(os.environ.get("CUDA_PATH"), "lib", "x64")]
  File "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\ntpath.py", line 104, in join
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType

Next launch the page...
['', 'C:\\Users\\username\\Downloads\\fish-speech\\fish_speech\\webui', 'C:\\Users\\username\\Downloads\\fish-speech', 'C:\\Users\\username\\Downloads\\fish-speech\\fishenv\\env\\python310.zip', 'C:\\Users\\username\\Downloads\\fish-speech\\fishenv\\env\\DLLs', 'C:\\Users\\username\\Downloads\\fish-speech\\fishenv\\env\\lib', 'C:\\Users\\username\\Downloads\\fish-speech\\fishenv\\env', 'C:\\Users\\username\\Downloads\\fish-speech\\fishenv\\env\\lib\\site-packages', '__editable__.fish_speech-0.1.0.finder.__path_hook__']
You are in  C:\Users\username\Downloads\fish-speech
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
2024-07-07 00:14:39.928 | INFO     | __main__:clean_infer_cache:146 - C:\Users\username\AppData\Local\Temp\gradio was not found
(same traceback as above, ending in: TypeError: expected str, bytes or os.PathLike object, not NoneType)
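
For what it's worth, the TypeError in both tracebacks comes from triton's NVIDIA backend unconditionally joining the CUDA_PATH environment variable into a path: os.environ.get("CUDA_PATH") returns None when the variable is unset, and os.path.join(None, ...) raises. A minimal standalone sketch of the failure mode (a diagnostic illustration, not fish-speech code):

import os

# os.environ.get returns None when CUDA_PATH is unset (typical on a
# machine without the CUDA toolkit installed), and os.path.join(None, ...)
# then raises the TypeError seen in the traceback.
cuda_path = os.environ.get("CUDA_PATH")
if cuda_path is None:
    print("CUDA_PATH is not set; triton's nvidia backend will fail on import")
else:
    print("CUDA libraries expected at:", os.path.join(cuda_path, "lib", "x64"))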

Additional context
Windows 11, Intel integrated graphics (no NVIDIA GPU), latest master code.

limingchina added the bug label Jul 7, 2024
AnyaCoder (Collaborator) commented

I think you need a CUDA device...

limingchina (Author) commented

Does that mean I can't run it without an NVIDIA graphics card?

limingchina (Author) commented

OK. I set the CUDA_PATH environment variable to "C:\Users\username\Downloads\fish-speech\fishenv\env\lib\site-packages\triton\backends\nvidia". That Triton error no longer appears. However, later I saw this error:

2024-07-07 19:42:07.597 | INFO     | __main__:clean_infer_cache:146 - C:\Users\china\AppData\Local\Temp\gradio was not found
2024-07-07 19:43:02.397 | INFO     | __main__:<module>:451 - Loading Llama model...
Exception in thread Thread-6 (worker):
Traceback (most recent call last):
  File "C:\Users\china\Downloads\fish-speech\fishenv\env\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\china\Downloads\fish-speech\fishenv\env\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\china\Downloads\fish-speech\tools\llama\generate.py", line 557, in worker
    model, decode_one_token = load_model(
  File "C:\Users\china\Downloads\fish-speech\tools\llama\generate.py", line 346, in load_model
    model = model.to(device=device, dtype=precision)
  File "C:\Users\china\Downloads\fish-speech\fishenv\env\lib\site-packages\torch\nn\modules\module.py", line 1137, in to
    device, dtype, non_blocking, convert_to_format = torch._C._nn._parse_to(*args, **kwargs)
RuntimeError: Device string must not be empty
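
This second failure points at device selection rather than Triton: load_model calls model.to(device=device, ...) with an empty device string, presumably because no CUDA device was detected. As a hedged illustration (plain PyTorch, not fish-speech's actual code), a CPU fallback would look like this:

import torch

# On a machine without an NVIDIA GPU, torch.cuda.is_available() is False;
# falling back to "cpu" avoids passing an empty device string to .to(),
# which is what raises "Device string must not be empty" above.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(4, 4).to(device=device, dtype=torch.float32)
print("model placed on:", device)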

I guess it does require an NVIDIA graphics card. Can you confirm? If so, maybe this issue can be used to improve the requirements section of the documentation and mention this explicitly.

limingchina closed this as not planned Jul 13, 2024