Monday, April 30, 2012

Bypass proxy server's file size download limit restriction


   Many organizations and colleges prevent their employees and students from downloading files from the Internet that are larger than a prescribed limit. Where I work, the limit is a mere 14MB. Fret not! There are ways to bypass it, and here is a simple bash script I wrote to download much larger files at my workplace.

Note: This script works only with direct links and with servers that support resuming downloads (HTTP range requests).
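
One way to check in advance whether a server supports resuming is to look for an "Accept-Ranges: bytes" header in its response. Below is a minimal sketch under that assumption (some servers may honor range requests without advertising them). The function name supports_ranges is made up for illustration and is not part of the script in this post.

```shell
#!/bin/bash
# Hypothetical helper: reads HTTP response headers on stdin and succeeds
# (exit status 0) if the server advertises byte-range support.
supports_ranges() {
    grep -qi '^accept-ranges:[[:space:]]*bytes'
}

# Usage (needs network access; set url to the file you want to check):
#   curl --silent --head --location "$url" | supports_ranges && echo "resume supported"
```

If the check fails, the chunked download is unlikely to work for that link.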

  I'm continually working on it, so the latest version will be available on my GitHub account.

How to run it?
  1. Download the following code to a text file named curldownload.sh
  2. Give it executable permissions: chmod +x curldownload.sh
  3. The file size limit, the fsize_limit variable, is set to 14MB. You may change it to your liking.
  4. The script takes two arguments: the first is the URL of the file to be downloaded; the second, which is optional (defaults to "./"), is the output directory.
  5. For example: ./curldownload.sh http://ftp.jaist.ac.jp/pub/mozilla.org/firefox/releases/12.0/linux-i686/en-US/firefox-12.0.tar.bz2 "$HOME/Downloads"
  6. A slightly more complex example, using multiple URLs and two command-line options (-d for the output directory and -u for the User-Agent HTTP header): ./curl-multi-url.sh -d ~/downloads/ -u "Chromium/18.0" http://ftp.jaist.ac.jp/pub/mozilla.org/firefox/releases/11.0/linux-x86_64/en-US/firefox-11.0.tar.bz2 http://ftp.jaist.ac.jp/pub/mozilla.org/firefox/releases/12.0/linux-i686/en-US/firefox-12.0.tar.bz2
#!/bin/bash
#
# Vikas Reddy @
#   http://vikas-reddy.blogspot.in/2012/04/bypass-proxy-servers-file-size-download.html
#
# 
# Usage:
#     ./curl-multi-url.sh -d OUTPUT_DIRECTORY -u USER_AGENT http://url-1/ http://url-2/;
#     Arguments -d and -u are optional
#
#

# Defaults
fsize_limit=$((14*1024*1024))
user_agent="Firefox/10.0"
output_dir="."


# Command-line options
while getopts 'd:u:' opt "$@"; do
    case "$opt" in
        d) output_dir="$OPTARG";;
        u) user_agent="$OPTARG";;
    esac
done
shift $((OPTIND - 1))


# output directory check
if [ -d "$output_dir" ]; then
    echo "Downloading all files to '$output_dir'"
else
    echo "Target directory '$output_dir' doesn't exist. Aborting..."
    exit 1
fi;


for url in "$@"; do
    filename="$(echo "$url" | sed -r 's|^.*/([^/]+)$|\1|')"
    filepath="$output_dir/$filename"

    # Avoid overwriting the file
    if [[ -f "$filepath" ]]; then
        echo -n "'$filepath' already exists. Do you want to overwrite it? [y/n] "; read response
        [ -z "$(echo "$response" | grep -i "^y")" ] && continue
    fi
    # start from an empty file, since the chunks are appended to it below
    : > "$filepath"

    echo -e "\nDownload of $url started..."
    i=1
    while true; do   # infinite loop, until the file is fully downloaded

        # setting the range; byte offsets are inclusive at both ends,
        # so each request asks for exactly fsize_limit bytes
        start=$(( fsize_limit * (i - 1) ))
        stop=$(( fsize_limit * i - 1 ))

        # downloading
        curl --fail --globoff --location --user-agent "$user_agent" --range "$start"-"$stop" "$url" >> "$filepath"

        exit_status="$?"

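        # download finished: with --fail, curl exits with 22 on any HTTP error (>= 400),
        # typically the 416 sent once the range starts past the end of the file
        # (note: other errors such as 403 or 404 end up here too)
        [ $exit_status -eq 22 ] && echo -e "Saved $filepath\n" && break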

        # other exceptions
        [ $exit_status -gt 0 ] && echo -e "Unknown exit status: $exit_status. Aborting...\n" && break

        i=$(($i + 1))
    done
done
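
HTTP byte ranges are inclusive at both ends, so a chunk of exactly N bytes runs from offset N*(i-1) to N*i - 1 for chunk number i. The arithmetic can be sketched as follows. The function name chunk_range and the sizes used are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the inclusive byte range of chunk number $1
# (counting from 1) for a chunk size of $2 bytes.
chunk_range() {
    local i="$1" limit="$2"
    echo "$(( limit * (i - 1) ))-$(( limit * i - 1 ))"
}

# Example: the first three ranges for 14-byte chunks (made-up size)
for i in 1 2 3; do
    chunk_range "$i" 14
done
# prints:
#   0-13
#   14-27
#   28-41
```

For a 35-byte file, a server that supports range requests would answer the third request with bytes 28-34 only, and a fourth request (starting at 42) would fall past the end of the file.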

19 comments:

endi said...

Is there a simpler way besides writing code, such as installing software or something?
Thanks for the information.

vikasreddy said...

@endi,

I'm not sure!

Well, you'd definitely have to download bigger files chunk-wise (e.g., in 14MB chunks in the above code) to bypass your proxy's restrictions. Anyway, this works well, given that cURL, the only external program it uses, is installed by default in most Linux distros.

Cheers,
Vikas

Muhammad Saad Jutt said...

In which language have you written the code?
It's not C/C++.
I would be very glad to have your reply.
Thanks

Vikas Reddy said...

It's Bash, Mr. Jutt!
Bash is the default shell in Mac OS X and many Linux distributions.

Anonymous said...

Thanks a ton dude :)
Was able to bypass a proxy limit of 50MB by using this script.

Try making the downloads parallel if you can, but I guess that will require joining files and more code.

BUNTY ROCKS said...

please explain in a way that even no brainers like me can also use it. how to download the script and how to proceed thereafter.
any help is truly appreciated.

singoc said...

The script works well.
Thanks

Carlos Herrera Fumero said...

Great, this script saved my life.

kanad said...

When I tried to execute the script it gave me the following error-
./cur1download.sh: line 71: syntax error near unexpected token `]]'
./cur1download.sh: line 71: `]]>'
It would be very helpful if you could help regarding this

Timmehta Timmy said...

It brought up this error

curl: (22) The requested URL returned error: 403 Forbidden

vandamp said...

Hi,
Some time ago I had the same problem.
zipped.at is an online service that splits large file downloads into multiple parts so you can bypass firewall size limit settings.

Tsotang Nthako said...

Vandamp, I tried zipped.at, but it's taking too long to load.

Wlx said...

I have the same issue as kanad,

so:

./cur1download.sh: line 71: syntax error near unexpected token `]]'
./cur1download.sh: line 71: `]]>'


I'll be really thankful for the solution

Samuel Waweru said...

please add a "-g" in your code to prevent globbing in the line with the curl command

Samuel Waweru said...

If you can add resume capability, it would be fine.

Samuel Waweru said...

Replace the "]]>" at the end of the file with "done"

Mathieu Vedie said...

Thx a lot