Yesterday I had to download a 232 MB file. I connect to the Internet through a proxy, and the authorities have capped downloads at 150 MB per file, so I had to find an alternative. Similar restrictions exist in almost all educational institutions.

While I was searching for a solution, somebody told me about JGet (a multithreaded Java program), but it hogs a lot of system resources. I kept looking, and then somebody on ##linux at irc.freenode.net told me about curl. I had heard of it before and had used it to automate web browsing, such as posting form data. But this time, when I went through the curl man page, I realized how powerful it is.

curl has a great many options. I will demonstrate just one, which can help you download files of almost any size.

Let's start off…

In this example I am downloading the 32-bit Fedora Live CD ISO image. The idea is to download one part of the file per connection and, once all the parts are downloaded, concatenate them. This is what almost any download manager does.

To download part of a file, curl has the -r/--range option, which lets us specify the range of bytes to download.

A range can be specified in the following ways:
curl -r 0-499
downloads the first 500 bytes
curl -r 500-999
downloads bytes 500 through 999 (the second 500 bytes)
curl -r -500
downloads the last 500 bytes
curl -r 500-
downloads everything from byte offset 500 onward

There are a few more forms; check the man page. (I have copied only the simple ones here.)
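Working out the ranges by hand gets tedious for big files, so here is a minimal sketch of the arithmetic. The helper name byte_ranges is my own invention (it is not part of curl); it assumes you already know the file size in bytes and simply prints one range per line, leaving the last one open-ended so it runs to the end of the file.

```shell
#!/bin/sh
# byte_ranges SIZE CHUNK
# Hypothetical helper (not a curl feature): print one "start-end" byte
# range per line for a file of SIZE bytes cut into CHUNK-byte pieces.
# Ranges are inclusive, so a CHUNK-byte piece ends at start + CHUNK - 1.
# The final range is printed as "start-" so it covers whatever is left.
byte_ranges() {
    size=$1
    chunk=$2
    start=0
    while [ $((start + chunk)) -lt "$size" ]; do
        echo "$start-$((start + chunk - 1))"
        start=$((start + chunk))
    done
    echo "$start-"
}

# Example: a 1000-byte file in 300-byte pieces.
byte_ranges 1000 300
```

Each printed line can be passed straight to curl -r.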

The Fedora Live CD ISO is 691 MB, so I will download 100 MB per connection (the last part picks up the remainder), like this:
curl -# -r 0-99999999 -o fedora.iso.part1 http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso &

curl -# -r 100000000-199999999 -o fedora.iso.part2 http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso &

curl -# -r 200000000-299999999 -o fedora.iso.part3 http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso &

curl -# -r 300000000-399999999 -o fedora.iso.part4 http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso &

curl -# -r 400000000-499999999 -o fedora.iso.part5 http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso &

curl -# -r 500000000-599999999 -o fedora.iso.part6 http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso &

curl -# -r 600000000- -o fedora.iso.part7 http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso &

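In the commands above, -# replaces the default progress meter with a simple progress bar. You can leave this option out. The trailing & runs each download in the background, so all seven run at the same time.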
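The seven commands differ only in the range and the part number, so they can be generated instead of typed. This is just a sketch: print_part_commands is a made-up helper name, and it assumes the URL and the file prefix contain no spaces or shell special characters. It only prints the commands, so you can read them before running anything.

```shell
#!/bin/sh
# print_part_commands URL PREFIX
# Hypothetical helper: read one byte range per line on stdin and print
# one backgrounded curl command per range, numbering the parts from 1.
# Nothing is downloaded here; the commands are only printed.
print_part_commands() {
    url=$1
    prefix=$2
    n=1
    while read -r range; do
        echo "curl -# -r $range -o $prefix.part$n $url &"
        n=$((n + 1))
    done
    echo "wait"   # makes the generated script block until all parts finish
}

# The same seven ranges used in this post.
print_part_commands \
    http://download.fedoraproject.org/pub/fedora/linux/releases/9/Live/i686/Fedora-9-i686-Live.iso \
    fedora.iso <<'EOF'
0-99999999
100000000-199999999
200000000-299999999
300000000-399999999
400000000-499999999
500000000-599999999
600000000-
EOF
```

To actually start the downloads, pipe the printed commands to sh.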

After all the files are downloaded, we need to concatenate them. This is quite simple.

$ cat fedora.iso.part? > fedora-9-live.iso
Your Live CD ISO image is now ready as fedora-9-live.iso.
To confirm that the file was downloaded and concatenated correctly, verify its checksum before burning it to a CD/DVD.
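One thing to watch: the ? in part? matches exactly one character, so it is fine for seven parts but would skip part10 and beyond, and part* would sort part10 before part2. If you ever need ten or more parts, a loop that joins them in numeric order is safer. A rough sketch follows; join_parts is a name I made up, and it assumes the parts are named PREFIX.part1, PREFIX.part2 and so on.

```shell
#!/bin/sh
# join_parts PREFIX COUNT OUTPUT
# Hypothetical helper: concatenate PREFIX.part1 .. PREFIX.partCOUNT into
# OUTPUT in numeric order. Stops with a non-zero status if a part is missing.
join_parts() {
    prefix=$1
    count=$2
    out=$3
    : > "$out"                       # start from an empty output file
    i=1
    while [ "$i" -le "$count" ]; do
        cat "$prefix.part$i" >> "$out" || return 1
        i=$((i + 1))
    done
}

# Usage for the download in this post (seven parts):
#   join_parts fedora.iso 7 fedora-9-live.iso
```

After joining, something like sha1sum fedora-9-live.iso (assuming sha1sum is installed) prints a checksum that you can compare by eye with the one published for the image.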

This process could be automated with a simple shell script if we could get the file size before starting the download. To that end I went through the man page once again and found a useful option, -I/--head, which fetches only the HTTP headers. The file size is available there as Content-Length. But when I tried this through the proxy server I got X-Squid-Error: ERR_TOO_BIG, so I couldn't determine the correct file size, and hence I have been unable to automate the process. Can anybody help? (Please post it as a comment.)
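One idea that might get around this, though I have not tested it against this proxy: request only the first byte with -r 0-0. A server that supports ranges answers with a Content-Range header of the form "bytes 0-0/TOTAL", and TOTAL is the full file size, while the response itself is only one byte long. The sketch below parses the size out of whatever headers it is given; size_from_headers is a hypothetical helper name, and whether the proxy lets such a request through is purely an assumption.

```shell
#!/bin/sh
# size_from_headers
# Hypothetical helper: read HTTP response headers on stdin and print the
# file size in bytes. Prefers the total from "Content-Range: bytes a-b/TOTAL"
# (seen on ranged requests) and falls back to Content-Length (seen on -I).
# Exits non-zero if neither header gives a size.
size_from_headers() {
    tr -d '\r' | awk '
        tolower($1) == "content-range:"  { n = split($0, a, "/"); total = a[n] }
        tolower($1) == "content-length:" { length_value = $2 }
        END {
            if (total != "" && total != "*") print total
            else if (length_value != "") print length_value
            else exit 1
        }'
}

# Usage (needs network access, so it is left as a comment):
#   -s silences the progress meter, -L follows redirects to a mirror,
#   -D - writes the response headers to stdout, -o /dev/null drops the body.
#   curl -s -L -r 0-0 -D - -o /dev/null "$url" | size_from_headers
```

If the server ignores the range and sends the whole file, the proxy limit would presumably apply again, so treat this as something to try rather than a guaranteed fix.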

Happy Downloading !!

18 Comments to “Download Files Larger Than The Download Limit”

  1. Download Files Larger Than The Download Limit : blog edvdbox | August 18th, 2008 at 11:57 pm

    [...] Original post by spsneo [...]

  2. www.ubuntukungfu.org | August 18th, 2008 at 11:59 pm

    /home/spsneo/blog Download Files Larger Than The Download Limit…

    how to download large files in linux when file download size is limited by proxy servers…

  3. abcdefg | August 20th, 2008 at 11:08 pm

    Nice article there! But I had a query, somewhat unrelated, which is how do you connect to IRC chats from within IITG? I’ve tried it many times, using many hacks into the config files, none of which work. I’m using Ubuntu Linux, and the chat clients that I’ve used are X-Chat, Konversation, and pidgin too, but without any avail. Please help, if you can!

  4. spsneo | August 22nd, 2008 at 9:30 pm

    @abcdefg
    use firefox addon chatzilla

  5. Kshitiz | September 1st, 2008 at 9:10 am

    Sexy solution for those using *nix! Keep it up, ye Geek Warrior!

  6. Anderson | September 30th, 2008 at 12:56 am

    Man, this is awesome! I loved this one and I’m intended to carry on with this. I have the same problem, a bad guy proxy configuration. I’ve tried to download a file beyond its limits, and curl does not write the file. If you try to read the file (sort of “file exists”), then you’ll be able to know when the download has finished.

    I’ve made some tests, and there’s no problem with curl when you try to download a block larger than the specified range.

  7. Anderson | October 2nd, 2008 at 9:44 pm

    Hi!

    I’ve made an automated script to download it. It is very simple, but is useful and I hope its reliable. I’m testing it downloading some Fedora 10 isos.

    Here’s it:
    http://pastie.org/283677

  8. Anderson | October 2nd, 2008 at 9:45 pm

    Oh, and yes, you have to define the name of the downloaded file. I don’t know a thing about string manipulation.

  9. spsneo | October 2nd, 2008 at 9:52 pm

    @Anderson

    Thanks.
    If you wish I can include your script in my blog.

  10. Anderson | October 3rd, 2008 at 6:08 pm

    Feel free to do it. But if you can make some kind of optimization or correction on the script, I will be grateful.

  11. Anderson | October 3rd, 2008 at 10:56 pm

    Hi!
    Here’s a better version, improved, supports resuming:
    http://pastie.org/284370

  12. Toby | October 6th, 2008 at 4:10 am

    What about BitTorrent? (where available – distro images should be no problem)

  13. bart | March 24th, 2009 at 6:40 pm

    need help!!!!!
    when i used curl to download am jus getting some 1 kb file and i wanna download backtrack os of some 700 mb and its an ftp site will this curl work with ftp sites???

  14. Rishi | April 1st, 2009 at 9:12 pm

    Why don’t you just use DownThemAll in Firefox? It would do the same thing for you albeit automatically.

  15. Adriana | May 13th, 2009 at 7:55 pm

    Hi there!

    I have the same problem whith proxy.. but the restricted size to download files is set to 10 MB !!!

    And my problem is the Update Manager, I’m usign Ubuntu 9.04.

    Thanks for your help.

  16. automate curl download by range | No | May 27th, 2009 at 12:56 pm

    [...] is an example of perl script to automate 1 connection to download file. Another could be found on spsneo website. Both couldn’t work for me, so i decided to write a script for my [...]

  17. tito | October 11th, 2009 at 9:25 pm

    hmm, interesting, i have try to implement this method, it works for 2nd chunk till the end. but for first chunk it fail

    for first 2 byte (byte 0-1), still thinking how to download first 2 byte.
    arggghhh…….

  18. spsneo | October 18th, 2009 at 7:05 pm

    @tito

    Download the first 2byte separately.
