The Computer Oracle

Escaping query strings with wget --mirror

Full question
https://superuser.com/questions/242597/e...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#linux #wget #mirroring



ACCEPTED ANSWER

Score 18


See the --restrict-file-names option. While not exactly intended for this particular purpose, --restrict-file-names=windows will probably help you along:

--restrict-file-names=modes

Change which characters found in remote URLs must be escaped during generation of local filenames. [...]

When "windows" is given, Wget escapes the characters \, |, /, :, ?, ", *, <, >, and the control characters in the ranges 0--31 and 128--159. In addition to this, Wget in Windows mode uses + instead of : to separate host and port in local file names, and uses @ instead of ? to separate the query portion of the file name from the rest. Therefore, a URL that would be saved as www.xemacs.org:4300/search.pl?input=blah in Unix mode would be saved as www.xemacs.org+4300/search.pl@input=blah in Windows mode.
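To make the quoted rules concrete, here is a rough Python sketch of the mapping the manual describes. It is only an approximation for illustration, not wget's actual implementation: the function names are hypothetical, it works on characters rather than bytes, and the uppercase %XX form used for escaped characters is an assumption.

```python
from urllib.parse import urlsplit

# Characters the manual lists as escaped in "windows" mode.
WINDOWS_RESERVED = set('\\|/:?"*<>')


def escape_component(text):
    """Percent-escape reserved and control characters in one name component."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in WINDOWS_RESERVED or code <= 31 or 128 <= code <= 159:
            out.append(f"%{code:02X}")  # assumed escape format
        else:
            out.append(ch)
    return "".join(out)


def windows_mode_local_path(url):
    """Approximate the local file name described for --restrict-file-names=windows."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}+{parts.port}"  # '+' instead of ':' between host and port
    segments = [escape_component(s) for s in parts.path.split("/") if s]
    local = "/".join([host] + segments)
    if parts.query:
        local += "@" + escape_component(parts.query)  # '@' instead of '?'
    return local


if __name__ == "__main__":
    print(windows_mode_local_path("http://www.xemacs.org:4300/search.pl?input=blah"))
```

Run on the manual's own example, this yields www.xemacs.org+4300/search.pl@input=blah, matching the Windows-mode name quoted above.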




ANSWER 2

Score 1


Your browser will display it fine if you use a URL like

file:///tmp/example.com/post.php%3Fid=1.html

instead of

file:///tmp/example.com/post.php?id=1.html
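If you need to build such a URL programmatically, a minimal Python sketch could look like the following. The helper name local_file_url is hypothetical, and keeping "=" unescaped is a choice made here only to reproduce the form shown above.

```python
from urllib.parse import quote


def local_file_url(path):
    """Build a file:// URL for an absolute local path, percent-encoding '?'."""
    # '/' and '=' are left as-is; '?' becomes %3F so the browser does not
    # treat the rest of the file name as a query string.
    return "file://" + quote(path, safe="/=")


if __name__ == "__main__":
    print(local_file_url("/tmp/example.com/post.php?id=1.html"))
```

For the path above this produces file:///tmp/example.com/post.php%3Fid=1.html.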

Note: if internal links in the downloaded files are broken, it is probably because wget was terminated before it finished downloading. With --convert-links and --html-extension given, wget normally rewrites the links to use %3F instead of ?, but it does this only at the end, after all downloads are complete. If the run is interrupted, none of the links will have been fixed, and you're left in this predicament. Of course, you can always write a script to go through and fix the links, but...
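For what it's worth, such a script might look roughly like the Python sketch below. It is a hedged illustration, not what wget itself does: it only handles the ? to %3F step, it assumes the links are already relative and already point at the correct local file names (it does not append .html or convert absolute links), and it uses a simple regular expression rather than a real HTML parser. The function names are hypothetical.

```python
import re
from pathlib import Path

# Matches href="..." and src="..." attributes with single or double quotes.
LINK_ATTR = re.compile(
    r'''(?P<prefix>\b(?:href|src)\s*=\s*)(?P<quote>["'])(?P<target>.*?)(?P=quote)''',
    re.IGNORECASE | re.DOTALL,
)
HAS_SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def fix_link(target):
    """Escape '?' in a relative link; leave absolute URLs untouched."""
    if HAS_SCHEME.match(target) or target.startswith("//"):
        return target
    base, sep, fragment = target.partition("#")
    return base.replace("?", "%3F") + sep + fragment


def fix_html(text):
    """Rewrite every href/src attribute in an HTML string."""
    def repl(match):
        quote_char = match.group("quote")
        fixed = fix_link(match.group("target"))
        return f"{match.group('prefix')}{quote_char}{fixed}{quote_char}"
    return LINK_ATTR.sub(repl, text)


def fix_tree(root):
    """Apply fix_html in place to every .html file under root."""
    for path in Path(root).rglob("*.html"):
        with path.open(encoding="utf-8", errors="surrogateescape") as handle:
            original = handle.read()
        updated = fix_html(original)
        if updated != original:
            with path.open("w", encoding="utf-8", errors="surrogateescape") as handle:
                handle.write(updated)
```

Calling fix_tree("/tmp/example.com") would rewrite the mirrored files in place, so it is worth trying it on a copy first.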