Option to skip on error #6

DaleTrexel · 2023-11-15T14:43:05Z

This is a great utility! It would be even better with a flag that lets the extraction skip individual records that throw an error.

In my case, I have a WARC archive that contains some really long URLs, and during extraction it gets to the point that it throws this error, then stops:

Exception in thread "main" java.nio.file.FileSystemException: test-extract/maps.google.com/index;ll=44.969598%2C-93.247374&spn=0.007658%2C0.03006&ie=UTF8&hl=en_US&z=15&t=roadmap&sll=44.969598%2C-93.247374&sspn=0.007658%2C0.03006&q=414%20Cedar%20Ave%2C%20Minneapolis%2C%20MN%2055454%2C%20USA%20%28Malabari%20Kitchen%20Restaurant%29&output=embed.html: File name too long
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
	at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:261)
	at java.base/java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:482)
	at java.base/java.nio.file.Files.newOutputStream(Files.java:227)
	at org.netpreserve.warc2html.Warc2Html.writeTo(Warc2Html.java:227)
	at org.netpreserve.warc2html.Warc2Html.main(Warc2Html.java:70)

I'm OK with the extraction process skipping this record and proceeding to the next, if that is possible, though it would be good to get output of what records were skipped at the end. As it is now, I've got most of the site that I wanted to pull out of the archive, but I'm missing some of the CSS and JS files used to display it because they presumably occur later in the archive.
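To make the request concrete, here is a minimal sketch of the behaviour I have in mind: catch the I/O error for a single record, remember it, carry on, and report the skipped records at the end. This is not how warc2html is structured internally; the class name, the `writeAll` method, the `skipOnError` parameter, and the map of records are all hypothetical stand-ins for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from warc2html itself.
class SkipOnErrorSketch {

    /**
     * Writes each record (relative path -> body) under outputDir.
     * When skipOnError is true, a record that fails with an IOException
     * (for example "File name too long") is recorded and skipped;
     * otherwise the first failure is rethrown, as happens today.
     * Returns a description of every skipped record.
     */
    static List<String> writeAll(Map<String, byte[]> records, Path outputDir,
                                 boolean skipOnError) throws IOException {
        List<String> skipped = new ArrayList<>();
        for (Map.Entry<String, byte[]> record : records.entrySet()) {
            try {
                Path target = outputDir.resolve(record.getKey());
                Files.createDirectories(target.getParent());
                try (OutputStream out = Files.newOutputStream(target)) {
                    out.write(record.getValue());
                }
            } catch (IOException e) {
                if (!skipOnError) {
                    throw e;
                }
                // Keep the path and the reason so both can be reported at the end.
                skipped.add(record.getKey() + " (" + e + ")");
            }
        }
        return skipped;
    }
}
```

After the loop, the caller could print the returned list to standard error so it is clear which records did not make it into the output directory.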

I'm running warc2html on macOS (Sonoma), which may determine how long a filename can be before it counts as too long in this instance.
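Since the limit depends on the platform and filesystem, a skip option could also check names up front and report them without attempting the write. The sketch below is a rough illustration only: the class and method names are hypothetical, and the 255-byte limit per path component is an assumption (it is a common limit on many filesystems, but I have not confirmed what applies on my machine).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of warc2html.
class FileNameLengthSketch {

    // Assumption: a single path component may be at most 255 bytes.
    static final int ASSUMED_MAX_COMPONENT_BYTES = 255;

    /**
     * Returns true if any "/"-separated component of relativePath is longer
     * than maxBytes when encoded as UTF-8.
     */
    static boolean hasOverlongComponent(String relativePath, int maxBytes) {
        for (String component : relativePath.split("/")) {
            if (component.getBytes(StandardCharsets.UTF_8).length > maxBytes) {
                return true;
            }
        }
        return false;
    }
}
```

A pre-check like this only approximates what the filesystem will accept, so catching the exception at write time would still be needed as the fallback.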
