Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cli): use a queue for duplicate and upload #10750

Merged
merged 3 commits into from
Jul 9, 2024

Conversation

masterT
Copy link
Contributor

@masterT masterT commented Jul 2, 2024

Using a queue to process the files makes the file duplicate detection and asset upload more stable and tolerant of network errors. If an error occurs, the whole command will not stop; the task will be retried (3 times) before logging the error and moving to the next step.

The new queue abstraction is using fastq internally.

Fixes: #5839

@github-actions github-actions bot added the cli Tasks related to the Immich CLI label Jul 2, 2024
{ concurrency, retry: 3 },
);

for (const items of chunk(files, concurrency)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reproduced the same behaviour, but I'm unsure if the file should be batched by concurrency; I tested it with 100, and it worked fine. Maybe the batch size should be hardcoded to a greater number than the default concurrency value.

@masterT
Copy link
Contributor Author

masterT commented Jul 2, 2024

I tested those changes with a 62.7 GB upload (12k files), but I did not write any tests; let me know if you would like some tests. 🙂

@zackpollard
Copy link
Contributor

I tested those changes with a 62.7 GB upload (12k files), but I did not write any tests; let me know if you would like some tests. 🙂

If you would be willing to write some tests, that would be awesome! We're always trying to increase our coverage to decrease the chance of regressions in future releases 🙂

@masterT
Copy link
Contributor Author

masterT commented Jul 2, 2024

I tested those changes with a 62.7 GB upload (12k files), but I did not write any tests; let me know if you would like some tests. 🙂

If you would be willing to write some tests, that would be awesome! We're always trying to increase our coverage to decrease the chance of regressions in future releases 🙂

I see no setup for e2e/integration test (testing a command). Would you prefer unit tests or an e2e/integration test?

@jrasm91
Copy link
Contributor

jrasm91 commented Jul 2, 2024

There are e2e tests for the cli inside of the e2e/ folder

Using a queue to process the files makes the file duplicate detection and asset upload more stable and tolerant of network errors. If an error occurs, the whole command will not stop; the task will be retried (3 times) before logging the error and moving to the next step.

The new queue abstraction is using [fastq](https://www.npmjs.com/package/fastq) internally.
@masterT
Copy link
Contributor Author

masterT commented Jul 4, 2024

@zackpollard @jrasm91 I'm not sure what the best strategy for testing the upload retry could be. This test may not be an e2e test. I thought of mocking the HTTP request in the e2e tests with something like vitest-fetch-mock, but since we are executing a shell command to run the CLI, it's not possible.

What do you suggest?

@jrasm91
Copy link
Contributor

jrasm91 commented Jul 4, 2024

Yeah, good point. I don't see any great way to accomplish that in our actual e2e environment. The only thing i can think of is doing something like executing docker restart immich_server or something. Not sure if that is a good, reliable option or not.

@masterT
Copy link
Contributor Author

masterT commented Jul 4, 2024

Yeah, good point. I don't see any great way to accomplish that in our actual e2e environment. The only thing i can think of is doing something like executing docker restart immich_server or something. Not sure if that is a good, reliable option or not.

It feels like it could introduce flaky tests as the request retry will exhaust until the server has restarted.

@jrasm91
Copy link
Contributor

jrasm91 commented Jul 4, 2024

Yeah, probably. You could always add some unit tests that run with vitest where you mock the sdk methods and have one reject with a network error.


return response;
},
{ concurrency, retry: 3 },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be passed as an option through the CLI.

progressBar.increment(filepaths.length);
return results;
},
{ concurrency, retry: 3 },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be passed as an option through the CLI.

@masterT
Copy link
Contributor Author

masterT commented Jul 7, 2024

@zackpollard @jrasm91 This is ready for review.

I did my best to mock with accuracy, inspecting the requests made in the e2e tests.

I'm waiting for your feedback! 🙂

@alextran1502 alextran1502 merged commit eb89208 into immich-app:main Jul 9, 2024
24 checks passed
@masterT masterT deleted the feat/cli-queue branch July 9, 2024 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli Tasks related to the Immich CLI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CLI retry failed request and don't cancel entire flow on error
4 participants