Hi everybody, I have a question. I usually use curl and wget to download a single file, or files whose names are consecutive numbers, for example 1.jpg, 2.jpg, etc.
But how can I download all the .jpg files at once?
Comments
Note: the following uses shell brace expansion; it works in bash and zsh, though a plain POSIX sh may not support it.
wget 'http://most-of-the-url-goes-here/'{1,2,3,4,5,6,7,8,9}.jpg
That will be the equivalent of doing:
wget 'http://most-of-the-url-goes-here/'1.jpg 'http://most-of-the-url-goes-here/'2.jpg [etcetera]
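You can sanity-check the expansion with echo before letting wget loose; it prints the URLs the shell will generate without downloading anything:

```shell
# Print the brace-expanded URLs; the URL here is just a placeholder.
echo 'http://most-of-the-url-goes-here/'{1,2,3}.jpg
```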
I put the URL part in quotes because sometimes there are question marks or other characters in the URL which the shell wants to treat specially, but shouldn't. I tend to throw --continue on there too. For curl it'll be mostly the same, except you'll probably want to put '-O' before the http part (with unquoted whitespace between them).
Alternatively, you could do this:
for f in 0 1 2 3 4 5 6 7 8 9; do wget 'http://most-of-the-url-goes-here/'"$f".jpg; done
That will execute several independent wget calls (each after the previous finishes). This will be unnoticeably slower than the first version, because wget will keep the connection alive in the first version but will close the connection between each grab in the second.
Also noteworthy: the ranges I put in ({1,2,3,4...} and '0 1 2 3 4...') aren't restricted to numbers. You can put any string in there, but if a string contains spaces you should quote it, e.g.:
wget 'http://url/'{1,"this is my string",2,"lesserString"}.jpg
Originally posted by nico
Thanks for the reply, but my problem is that the file names aren't numbers; they are "names", made of letters. All the files begin with the letter "a", though.
So, what can I do?
Bye
Nico
It shouldn't be too hard to write a script to do this, but I don't have time to do it at the moment. If you're still having trouble with this on Thursday (June 10th) night, let me know and I'll give the script thing a shot for you.
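For what it's worth, here's a rough sketch of what such a script could look like: scrape the .jpg links out of some page that links to the images, then hand the list to wget -i. The sample page and the /tmp paths below are made up for the demo; the real input would be whatever page on the site actually links to the pictures.

```shell
# Made-up sample page standing in for a real page that links to the images.
cat > /tmp/page.html <<'EOF'
<a href="a-first.jpg">x</a> <a href="a-second.jpg">y</a> <a href="other.png">z</a>
EOF

# Extract the .jpg hrefs, strip the href="..." wrapper, and prepend the
# directory URL so the list contains absolute URLs.
grep -o 'href="[^"]*\.jpg"' /tmp/page.html \
  | sed 's/^href="//; s/"$//' \
  | sed 's|^|http://www.fantagiochi.com/xIm/DVD/a/|' > /tmp/list.txt

cat /tmp/list.txt
# To actually download the list:  wget -i /tmp/list.txt
```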
curl can do ranges on its own; the #1 in --output is replaced by whatever the [a-z] pattern matched (quoted so the shell doesn't touch it):
curl --output ~/Documents/Downloads/'#1.jpg' 'http://my.favorite.porn.site/pics/[a-z].jpg'
Sorry, but that doesn't work. Just so you know, the site is http://www.fantagiochi.com/xIm/DVD/a/ and in the directory "a" there are a lot of jpgs. If I try to list the directory, Safari returns the error: "Directory Listing Denied. This Virtual Directory does not allow contents to be listed." So I assume that browsing the directory isn't allowed. But if I try the direct link http://www.fantagiochi.com/xIm/DVD/a...ondo-front.jpg it works.
Thanks
Nico
Originally posted by nico
Sorry, but that doesn't work. Just so you know, the site is http://www.fantagiochi.com/xIm/DVD/a/ and in the directory "a" there are a lot of jpgs. If I try to list the directory, Safari returns the error: "Directory Listing Denied. This Virtual Directory does not allow contents to be listed." So I assume that browsing the directory isn't allowed. But if I try the direct link http://www.fantagiochi.com/xIm/DVD/a...ondo-front.jpg it works.
thanks
Nico
If you do not know the names of the jpg files that you wish to retrieve, it will not be possible to fetch them directly.
Or am I missing something here?
If there are links to the images from html pages then you could use a tool like Interarchy to mirror the whole site...
Yeah, there's no way to just say "Gimme everything in this directory, now! GIMME!" You have to know the individual file URLs ahead of time. Since they've disallowed your directory listing, your *only* recourse is to grab an app like SiteSucker, point it at a known webpage that you think will lead you to all the files you want, and let it crawl the site.
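wget can do that kind of crawl too, if its recursive options suit you better than a GUI app. This is only a sketch (echoed rather than run, so nothing is downloaded), and it assumes some page under the site root actually links to the images:

```shell
# -r crawls linked pages, -l 2 limits the depth, -np refuses to ascend above
# the start URL, -nd drops the directory structure, and -A keeps only files
# whose names match the pattern. Remove the leading echo to actually run it.
echo wget -r -l 2 -np -nd -A 'a*.jpg' 'http://www.fantagiochi.com/'
```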