Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No ability to detect dump processing failure #522

Open
brett-matson opened this issue Jan 18, 2021 · 0 comments · May be fixed by #526
Open

No ability to detect dump processing failure #522

brett-matson opened this issue Jan 18, 2021 · 0 comments · May be fixed by #526

Comments

@brett-matson
Copy link
Contributor

brett-matson commented Jan 18, 2021

Hi there,

I'm still pretty new to Java, but from what I can tell, there doesn't appear to be any way for a calling program to detect a failure (e.g. IOException caused by file not found) in DumpProcessingController.processDump(). There's no method return value and the exceptions are caught and logged in the Wikidata-Toolkit (see code below). As far as I can tell, the only way an external program could detect an error would be to try to parse the Wikidata log file.

/**
	 * Processes the contents of the given dump file. All registered processor
	 * objects will be notified of all data. Note that JSON dumps do not
	 * contains any revision information, so that registered
	 * {@link MwRevisionProcessor} objects will not be notified in this case.
	 * Dumps of type {@link DumpContentType#SITES} cannot be processed with this
	 * method; use {@link #getSitesInformation()} to process these dumps.
	 *
	 * @param dumpFile
	 *            the dump to process
	 */
	public void processDump(MwDumpFile dumpFile) {
		if (dumpFile == null) {
			return;
		}

		MwDumpFileProcessor dumpFileProcessor;
		switch (dumpFile.getDumpContentType()) {
		case CURRENT:
		case DAILY:
		case FULL:
			dumpFileProcessor = getRevisionDumpFileProcessor();
			break;
		case JSON:
			dumpFileProcessor = getJsonDumpFileProcessor();
			break;
		case SITES:
		default:
			logger.error("Dumps of type " + dumpFile.getDumpContentType()
					+ " cannot be processed as entity-document dumps.");
			return;
		}

		processDumpFile(dumpFile, dumpFileProcessor);
	}
/**
	 * Processes one dump file with the given dump file processor, handling
	 * exceptions appropriately.
	 *
	 * @param dumpFile
	 *            the dump file to process
	 * @param dumpFileProcessor
	 *            the dump file processor to use
	 */
	void processDumpFile(MwDumpFile dumpFile,
			MwDumpFileProcessor dumpFileProcessor) {
		try (InputStream inputStream = dumpFile.getDumpFileStream()) {
			dumpFileProcessor.processDumpFileContents(inputStream, dumpFile);
		} catch (FileAlreadyExistsException e) {
			logger.error("Dump file "
					+ dumpFile.toString()
					+ " could not be processed since file "
					+ e.getFile()
					+ " already exists. Try deleting the file or dumpfile directory to attempt a new download.");
		} catch (IOException e) {
			logger.error("Dump file " + dumpFile.toString()
					+ " could not be processed: " + e.toString());
		}
	}

@wetneb wetneb linked a pull request Feb 3, 2021 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant