-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cli): use a queue for duplicate and upload #10750
Conversation
{ concurrency, retry: 3 }, | ||
); | ||
|
||
for (const items of chunk(files, concurrency)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reproduced the same behaviour, but I'm unsure if the file should be batched by concurrency
; I tested it with 100, and it worked fine. Maybe the batch size should be hardcoded to a greater number than the default concurrency
value.
I tested those changes with a 62.7 GB upload (12k files), but I did not write any tests; let me know if you would like some tests. 🙂 |
If you would be willing to write some tests, that would be awesome! We're always trying to increase our coverage to decrease the chance of regressions in future releases 🙂 |
I see no setup for e2e/integration test (testing a command). Would you prefer unit tests or an e2e/integration test? |
There are e2e tests for the cli inside of the |
Using a queue to process the files makes the file duplicate detection and asset upload more stable and tolerant of network errors. If an error occurs, the whole command will not stop; the task will be retried (3 times) before logging the error and moving to the next step. The new queue abstraction is using [fastq](https://www.npmjs.com/package/fastq) internally.
@zackpollard @jrasm91 I'm not sure what the best strategy for testing the upload retry could be. This test may not be an e2e test. I thought of mocking the HTTP request in the e2e tests with something like vitest-fetch-mock, but since we are executing a shell command to run the CLI, it's not possible. What do you suggest? |
Yeah, good point. I don't see any great way to accomplish that in our actual e2e environment. The only thing i can think of is doing something like executing |
It feels like it could introduce flaky tests as the request retry will exhaust until the server has restarted. |
Yeah, probably. You could always add some unit tests that run with vitest where you mock the sdk methods and have one reject with a network error. |
|
||
return response; | ||
}, | ||
{ concurrency, retry: 3 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could be passed as an option through the CLI.
progressBar.increment(filepaths.length); | ||
return results; | ||
}, | ||
{ concurrency, retry: 3 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could be passed as an option through the CLI.
@zackpollard @jrasm91 This is ready for review. I did my best to mock with accuracy, inspecting the requests made in the e2e tests. I'm waiting for your feedback! 🙂 |
Using a queue to process the files makes the file duplicate detection and asset upload more stable and tolerant of network errors. If an error occurs, the whole command will not stop; the task will be retried (3 times) before logging the error and moving to the next step.
The new queue abstraction is using fastq internally.
Fixes: #5839