Superset API import dataset without SSH Tunnel not working ? #29443

Open
3 tasks done
kdaouda opened this issue Jul 1, 2024 · 1 comment
Labels: api (Related to the REST API), data:csv (Related to import/export of CSVs)


kdaouda commented Jul 1, 2024

Bug description

Hi everyone,
I need your help.

When I try to import a CSV dataset through the Superset API endpoint /api/v1/dataset/import/, the response is {"message": "OK"}, which suggests that everything went well.
However, the dataset is not imported, and the following messages appear in the logs:

/app/superset/commands/importers/v1/utils.py:113: SAWarning: TypeDecorator EncryptedType() will not produce a cache key because the ``cache_ok`` attribute is not set to True.  This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions.  Set this attribute to True if this type object's state is safe to use in a cache key, or False to disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)

for uuid, password in db.session.query(Database.uuid, Database.password).all()

/app/superset/commands/importers/v1/utils.py:118: SAWarning: TypeDecorator EncryptedType() will not produce a cache key because the ``cache_ok`` attribute is not set to True.  This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions.  Set this attribute to True if this type object's state is safe to use in a cache key, or False to disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)

for uuid, password in db.session.query(SSHTunnel.uuid, SSHTunnel.password).all()

/app/superset/commands/importers/v1/utils.py:125: SAWarning: TypeDecorator EncryptedType() will not produce a cache key because the ``cache_ok`` attribute is not set to True.  This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions.  Set this attribute to True if this type object's state is safe to use in a cache key, or False to disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)

).all()

/app/superset/commands/importers/v1/utils.py:132: SAWarning: TypeDecorator EncryptedType() will not produce a cache key because the ``cache_ok`` attribute is not set to True.  This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions.  Set this attribute to True if this type object's state is safe to use in a cache key, or False to disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)

).all()

I'm importing without using an SSH tunnel. Is the use of an SSH tunnel mandatory? If not, how can I fix this problem?

Thank you in advance for your help.

How to reproduce the bug

  1. Call the endpoint /api/v1/dataset/import/ with the parameters formData and passwords. Do not supply any SSH tunnel information.
  2. Once authenticated, add the bearer token and the CSRF token, then send the request.
  3. You will receive {"message": "OK"} even though the dataset is not imported.
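
The steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not Superset's own client code: `build_import_request` is a hypothetical helper, and the `X-CSRFToken` header name and the `databases/MyDatabase.yaml` key are assumptions. Only the `passwords` and `overwrite` form fields come from this report and from the handler quoted later in the thread. The file itself would be sent separately as the `formData` multipart field.

```python
import json
from typing import Dict, Optional, Tuple


def build_import_request(
    access_token: str,
    csrf_token: str,
    passwords: Optional[Dict[str, str]] = None,
    overwrite: bool = False,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, form_fields) for POST /api/v1/dataset/import/.

    Hypothetical helper: no SSH tunnel fields are added, mirroring the
    reproduction steps in this report.
    """
    headers = {
        "Authorization": f"Bearer {access_token}",
        "X-CSRFToken": csrf_token,  # assumed header name for the CSRF token
    }
    # The handler compares the form value against the string "true".
    fields = {"overwrite": "true" if overwrite else "false"}
    if passwords:
        # The handler decodes this field with json.loads, so send JSON text.
        fields["passwords"] = json.dumps(passwords)
    return headers, fields


headers, fields = build_import_request(
    "my-access-token",
    "my-csrf-token",
    passwords={"databases/MyDatabase.yaml": "my_password"},  # assumed key
    overwrite=True,
)
```

The returned `headers` and `fields` can be handed to any HTTP client together with the uploaded file.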

Screenshots/recordings

example_import_dataset_superset.zip

Screenshot 2024-07-01 at 14 10 54

Superset version

4.0.1

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
dosubot (bot) added the labels api (Related to the REST API) and data:csv (Related to import/export of CSVs) on Jul 1, 2024

dosubot bot commented Jul 1, 2024

Hey @kdaouda 👋, I'm here to help you with your Superset issue. I'm a bot that can assist you with bugs, questions, and becoming a contributor. Let's squash this bug together! 🤖

The endpoint /api/v1/dataset/import/ returns {"message": "OK"} without importing the dataset, and the logs show warnings about TypeDecorator EncryptedType(). To narrow this down, check that ImportDatasetsCommand actually processes the uploaded contents and that any encryption-related failure is reported rather than swallowed.

Here is a possible solution to add error handling and logging to the ImportDatasetsCommand execution:

# Fragment from inside the import endpoint's handler method. It assumes that
# json, logger, flask.request, ImportDatasetsCommand and
# NoValidFilesFoundError are already imported, and that `contents` holds the
# uploaded files.
if not contents:
    raise NoValidFilesFoundError()

# Every optional form field is JSON-decoded when present, otherwise None.
passwords = (
    json.loads(request.form["passwords"])
    if "passwords" in request.form
    else None
)
overwrite = request.form.get("overwrite") == "true"
sync_columns = request.form.get("sync_columns") == "true"
sync_metrics = request.form.get("sync_metrics") == "true"
ssh_tunnel_passwords = (
    json.loads(request.form["ssh_tunnel_passwords"])
    if "ssh_tunnel_passwords" in request.form
    else None
)
ssh_tunnel_private_keys = (
    json.loads(request.form["ssh_tunnel_private_keys"])
    if "ssh_tunnel_private_keys" in request.form
    else None
)
ssh_tunnel_priv_key_passwords = (
    json.loads(request.form["ssh_tunnel_private_key_passwords"])
    if "ssh_tunnel_private_key_passwords" in request.form
    else None
)

try:
    command = ImportDatasetsCommand(
        contents,
        passwords=passwords,
        overwrite=overwrite,
        sync_columns=sync_columns,
        sync_metrics=sync_metrics,
        ssh_tunnel_passwords=ssh_tunnel_passwords,
        ssh_tunnel_private_keys=ssh_tunnel_private_keys,
        ssh_tunnel_priv_key_passwords=ssh_tunnel_priv_key_passwords,
    )
    command.run()
    return self.response(200, message="OK")
except Exception as ex:
    # Log the full traceback so a failed import is visible in the logs.
    logger.error(
        "Error importing datasets: %s",
        str(ex),
        exc_info=True,
    )
    return self.response_500(message="Failed to import datasets")
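
The snippet above repeats one pattern for each optional field: decode the value as JSON when the field is present, otherwise use None. A minimal sketch of that pattern follows, assuming a plain dict in place of `request.form`. `optional_json_field` is a hypothetical helper and not part of Superset, and the `databases/MyDatabase.yaml` key is only an illustrative value.

```python
import json
from typing import Any, Mapping, Optional


def optional_json_field(form: Mapping[str, str], key: str) -> Optional[Any]:
    """Decode form[key] as JSON, or return None when the field is absent."""
    return json.loads(form[key]) if key in form else None


# A plain dict stands in for request.form in this sketch.
form = {"passwords": '{"databases/MyDatabase.yaml": "my_password"}'}

passwords = optional_json_field(form, "passwords")
# No SSH tunnel fields were submitted, so this resolves to None.
ssh_tunnel_passwords = optional_json_field(form, "ssh_tunnel_passwords")
```

In this sketch, as in the snippet above, a request that leaves out the SSH tunnel fields simply yields None for them.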

Additionally, ensure that the encryption configuration is correctly set up in your application. The EncryptedType requires a SECRET_KEY to be specified in the app configuration. Here is an example of how the encryption setup should be initialized:

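from flask import Flask
from superset.utils.encrypt import EncryptedFieldFactory, SQLAlchemyUtilsAdapter

app = Flask(__name__)
app.config['SECRET_KEY'] = 'your_secret_key'  # placeholder, replace with your own key
app.config['SQLALCHEMY_ENCRYPTED_FIELD_TYPE_ADAPTER'] = SQLAlchemyUtilsAdapter

# Bind the factory to the app so encrypted columns use the configured key.
encrypted_field_factory = EncryptedFieldFactory()
encrypted_field_factory.init_app(app)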

Make sure that the SECRET_KEY is correctly set in your application configuration. This key is essential for the EncryptedType to function properly. If the SECRET_KEY is missing or incorrect, it could lead to issues with the encryption and decryption processes, which might be causing the dataset import to fail silently [1][2][3][4].
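
One way to make a missing key fail loudly is to read it explicitly in the configuration file. The following is a minimal sketch, assuming the key is supplied through an environment variable named `SUPERSET_SECRET_KEY`. Both `read_secret_key` and the variable name are assumptions for illustration and should be adapted to your deployment.

```python
import os
from typing import Mapping


def read_secret_key(env: Mapping[str, str]) -> str:
    """Return the secret key from `env`, raising if it is missing or empty."""
    key = env.get("SUPERSET_SECRET_KEY", "")  # assumed variable name
    if not key:
        raise RuntimeError("SUPERSET_SECRET_KEY is not set")
    return key


# In a configuration file such as superset_config.py, this could be used as:
# SECRET_KEY = read_secret_key(os.environ)
```

Raising at startup surfaces a missing key immediately instead of at the first encrypted read or write.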
