fix-_is_package_available-unify-behavior-for-available-failing-import #31798

Open · wants to merge 1 commit into main
Conversation

@Laz4rz commented Jul 4, 2024

What does this PR do?

Discussion is in #31793. It turns out this does not completely fix that issue, but it allows unified handling of libraries that are installed, so importlib.metadata.version can read their version, yet fail to import with an ImportError. The current behavior of _is_package_available() for these packages is to return True. This is not coherent with how torch is treated in the same function:

except ImportError:
    # If the package can't be imported, it's not available
    package_exists = False

and it also does not align with the function's name.
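
In other words, the PR's direction is to treat every package the way torch is already treated. A minimal sketch of that unified behavior (illustrative only, not the exact diff in this PR; the logging level is the open question mentioned below):

import importlib
import importlib.metadata
import importlib.util
import logging

logger = logging.getLogger(__name__)

def _is_package_available(pkg_name, return_version=False):
    # Cheap check first: is there an importable spec at all?
    package_exists = importlib.util.find_spec(pkg_name) is not None
    package_version = "N/A"
    if package_exists:
        try:
            package_version = importlib.metadata.version(pkg_name)
            # The unifying step: actually import the package, as is already
            # done for torch, so a broken install is detected here.
            importlib.import_module(pkg_name)
        except (importlib.metadata.PackageNotFoundError, ImportError):
            # Metadata may be readable while the import still fails (e.g. a
            # flash_attn wheel with an undefined symbol): report unavailable.
            package_exists = False
            logger.warning(f"{pkg_name} is installed, but it cannot be imported.")
    if return_version:
        return package_exists, package_version
    return package_exists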

I think it would be beneficial to merge this PR: the function's current behavior is a little counterintuitive and causes problems down the line when a developer is not fully aware of how it treats these corner cases.

#31793 could be fixed with this change plus skipping the import guard check in dynamic_module_utils.py, line 181; the model works the same with and without flash_attn. But I think how to handle that is up to the model developers.

Finally, I am not sure which logging level is appropriate there, but some information for the user would also be beneficial.

Fixes the following make fixup failure on main:

(tfix) [mikolaj@0 transformers]$ make fixup
Checking/fixing src/transformers/utils/import_utils.py
All checks passed!
1 file left unchanged
python utils/custom_init_isort.py
python utils/sort_auto_mappings.py
python utils/check_doc_toc.py --fix_and_overwrite
running deps_table_update
updating src/transformers/dependency_versions_table.py
python utils/check_copies.py
python utils/check_table.py
python utils/check_dummies.py
python utils/check_repo.py
2024-07-04 11:49:56.507801: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-04 11:49:56.529334: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-07-04 11:49:56.529360: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-07-04 11:49:56.530168: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-07-04 11:49:56.534028: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-04 11:49:56.979032: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Checking all models are included.
Checking all models are public.
Traceback (most recent call last):
  File "/home/mikolaj/github/transformers/src/transformers/utils/import_utils.py", line 1567, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/home/mikolaj/miniconda3/envs/tfix/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/mikolaj/github/transformers/src/transformers/models/bark/modeling_bark.py", line 57, in <module>
    from flash_attn import flash_attn_func, flash_attn_varlen_func
  File "/home/mikolaj/miniconda3/envs/tfix/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/home/mikolaj/miniconda3/envs/tfix/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: /home/mikolaj/miniconda3/envs/tfix/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mikolaj/github/transformers/utils/check_repo.py", line 1193, in <module>
    check_repo_quality()
  File "/home/mikolaj/github/transformers/utils/check_repo.py", line 1172, in check_repo_quality
    check_models_are_in_init()
  File "/home/mikolaj/github/transformers/utils/check_repo.py", line 497, in check_models_are_in_init
    for module in get_model_modules():
  File "/home/mikolaj/github/transformers/utils/check_repo.py", line 442, in get_model_modules
    modeling_module = getattr(model_module, submodule)
  File "/home/mikolaj/github/transformers/src/transformers/utils/import_utils.py", line 1555, in __getattr__
    value = self._get_module(name)
  File "/home/mikolaj/github/transformers/src/transformers/utils/import_utils.py", line 1569, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.bark.modeling_bark because of the following error (look up to see its traceback):
/home/mikolaj/miniconda3/envs/tfix/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
make: *** [Makefile:41: repo-consistency] Error 1

With PR:


Before submitting

  • Did you write any new necessary tests? No, I don't think that's needed; the change is aligned with the function signature.

Who can review?

@LysandreJik @amyeroberts @muellerzr

@muellerzr (Contributor) commented Jul 4, 2024

I'm 99.9% positive we don't want to do this, as any import from transformers would be bloated by attempting to import every library that's available.

I'd rather see something like what we have in accelerate, where only the specific libraries for which this matters get imported, and only when actually needed.

To test this, could you please report a tuna trace showing before and after timings of import transformers and from transformers import AutoModel?
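
(For reference, a tuna trace is produced by running the import under python -X importtime with stderr redirected to a file, e.g. python -X importtime -c "import transformers" 2> import.log, and then opening the log with tuna import.log; this is tuna's standard workflow rather than anything transformers-specific.)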

@Laz4rz (Author) commented Jul 4, 2024

Of course you were right. I tested with the script below, since plain import transformers and from transformers import AutoModel were yielding strange-looking logs.

For

from transformers import AutoProcessor, AutoModelForCausalLM 

model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)

tuna import profiling

python -X importtime florence.py 2> fix.log

gives:

  • PR: 5.412s
  • Main: 2.757s

Most likely this is due to all the module-level checks:

_accelerate_available, _accelerate_version = _is_package_available("accelerate", return_version=True)
_apex_available = _is_package_available("apex")
_aqlm_available = _is_package_available("aqlm")
_av_available = importlib.util.find_spec("av") is not None
_bitsandbytes_available = _is_package_available("bitsandbytes")
_eetq_available = _is_package_available("eetq")
_galore_torch_available = _is_package_available("galore_torch")
_lomo_available = _is_package_available("lomo_optim")
...

in the import_utils.py file. So I don't think that's a viable option. Still, the mismatch between the function's name, the special treatment of torch, and the problems this causes in some corner cases feels a little wrong to me. Any advice on what direction I could take this?
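
For what it's worth, one shape that accelerate-style direction could take (a hypothetical sketch; _is_package_importable is an illustrative name, not an existing helper) is to keep the cheap metadata check at module import time and pay for a real import only when a specific feature first asks for the package:

import importlib
from functools import lru_cache

@lru_cache(maxsize=None)
def _is_package_importable(pkg_name):
    # Expensive check: actually import the package, which catches broken
    # installs such as the flash_attn wheel above. lru_cache means the
    # cost is paid at most once per package, and only on first use.
    try:
        importlib.import_module(pkg_name)
        return True
    except ImportError:
        return False

A feature gate like is_flash_attn_2_available() could then combine the existing cheap check with _is_package_importable("flash_attn"), keeping a plain import transformers fast.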

@Laz4rz (Author) commented Jul 5, 2024

I also saw @sgugger mentioned in a TODO on this function, so it might be worth tagging them here.
