Thoughts. Linux. Scripts. Programming.: Bypass proxy server's file size download limit restriction

Monday, April 30, 2012

Bypass proxy server's file size download limit restriction

Many organizations and colleges prevent their employees and students from downloading files larger than a prescribed limit from the Internet. Where I work, the limit is a mere 14MB. Fret not! There are ways around this, and here is a simple bash script I wrote to download much larger files at my workplace.

Note: This script works only with direct links, and only with servers that support resuming downloads (that is, HTTP range requests).
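
Whether a server is likely to support resuming can be checked from its response headers before starting. The following is a minimal sketch, not part of the original script: supports_ranges is a hypothetical helper that looks for an "Accept-Ranges: bytes" header on standard input. Some servers honour range requests without advertising this header, so a negative result is only a hint.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): reads HTTP response
# headers on stdin and succeeds if the server advertises byte-range support.
supports_ranges() {
    # Strip carriage returns, then look for "Accept-Ranges: bytes"
    # (header names are case-insensitive).
    tr -d '\r' | grep -qiE '^accept-ranges:[[:space:]]*bytes[[:space:]]*$'
}

# Example usage (needs network access, so it is left commented out):
#   curl --silent --head --location "$url" | supports_ranges \
#       && echo "Range requests are advertised"
```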

I'm continually working on it, so the latest version will be available on my GitHub account.

How to run it?

Download the following code to a text file named curldownload.sh
Give it executable permissions: chmod +x curldownload.sh
The file size limit, the fsize_limit variable, is set to 14MB. You may change it to your liking.
The script takes two arguments: the first is the URL of the file to be downloaded; the second, which is optional (defaults to "./"), is the output directory.
For example: ./curldownload.sh http://ftp.jaist.ac.jp/pub/mozilla.org/firefox/releases/12.0/linux-i686/en-US/firefox-12.0.tar.bz2 "$HOME/Downloads"
A slightly more complex version, curl-multi-url.sh (the script listed below), accepts multiple URLs and two command-line options (-d for the output directory, and -u for the User-Agent HTTP header). For example: ./curl-multi-url.sh -d ~/downloads/ -u "Chromium/18.0" http://ftp.jaist.ac.jp/pub/mozilla.org/firefox/releases/11.0/linux-x86_64/en-US/firefox-11.0.tar.bz2 http://ftp.jaist.ac.jp/pub/mozilla.org/firefox/releases/12.0/linux-i686/en-US/firefox-12.0.tar.bz2

#!/bin/bash
#
# Vikas Reddy @
#   http://vikas-reddy.blogspot.in/2012/04/bypass-proxy-servers-file-size-download.html
#
# 
# Usage:
#     ./curl-multi-url.sh -d OUTPUT_DIRECTORY -u USER_AGENT http://url-1/ http://url-2/;
#     Arguments -d and -u are optional
#
#

# Defaults
fsize_limit=$((14*1024*1024))
user_agent="Firefox/10.0"
output_dir="."


# Command-line options
while getopts 'd:u:' opt "$@"; do
    case "$opt" in
        d) output_dir="$OPTARG";;
        u) user_agent="$OPTARG";;
    esac
done
shift $((OPTIND - 1))


# output directory check
if [ -d "$output_dir" ]; then
    echo "Downloading all files to '$output_dir'"
else
    echo "Target directory '$output_dir' doesn't exist. Aborting..."
    exit 1
fi;


for url in "$@"; do
    filename="$(echo "$url" | sed -r 's|^.*/([^/]+)$|\1|')"
    filepath="$output_dir/$filename"

    # Avoid overwriting the file without asking
    if [[ -f "$filepath" ]]; then
        read -r -p "'$filepath' already exists. Do you want to overwrite it? [y/n] " response
        [[ "$response" =~ ^[Yy] ]] || continue
    fi
    # Start from an empty file: chunks are appended below, so an existing
    # file has to be truncated before it is overwritten
    : > "$filepath"

    echo -e "\nDownload of $url started..."
    i=1
    while true; do   # infinite loop, until the file is fully downloaded

        # setting the range; HTTP byte ranges are inclusive at both ends,
        # so each chunk covers exactly $fsize_limit bytes
        start=$(( fsize_limit * (i - 1) ))
        stop=$(( fsize_limit * i - 1 ))

        # downloading (--globoff stops curl from interpreting [] and {} in the URL)
        curl --fail --location --globoff --user-agent "$user_agent" --range "$start-$stop" "$url" >> "$filepath"

        exit_status="$?"

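        # download finished: with --fail, curl exits with 22 on any HTTP error,
        # including the 416 a server sends once the range starts past the end
        # of the file. Other HTTP errors (e.g. 403 Forbidden) also produce 22.
        [ $exit_status -eq 22 ] && echo -e "Saved $filepath\n" && break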

        # other exceptions
        [ $exit_status -gt 0 ] && echo -e "Unknown exit status: $exit_status. Aborting...\n" && break

        i=$(($i + 1))
    done
done
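
The byte-range arithmetic behind the chunked download can be sketched on its own. This is an illustration under stated assumptions, not the script's exact code: chunk_range and chunk_count are hypothetical helpers, and the ranges are chosen so that every chunk is exactly the limit in size (HTTP ranges include both the first and the last byte).

```shell
#!/bin/bash
# Hypothetical helper: print the value for curl's --range option for the
# index-th chunk (1-based), given a chunk size in bytes.
chunk_range() {
    local limit=$1 index=$2
    local start=$(( limit * (index - 1) ))
    local stop=$(( limit * index - 1 ))   # inclusive end of the chunk
    echo "${start}-${stop}"
}

# Hypothetical helper: number of chunks needed for a file of a known size
# (rounds up, so a partial last chunk is counted).
chunk_count() {
    local size=$1 limit=$2
    echo $(( (size + limit - 1) / limit ))
}

# Example: the first three ranges for a 14MB limit
limit=$((14 * 1024 * 1024))
for i in 1 2 3; do
    chunk_range "$limit" "$i"
done
```

The script itself does not know the file size in advance; it simply keeps requesting the next range until curl reports an HTTP error.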

19 comments:

endi said...: Is there a simpler way besides writing code, such as installing software or something?
Thanks for your information; May 7, 2012 at 6:00 PM
vikasreddy said...: @endi,

I'm not sure!

Well, you'd definitely have to download bigger files chunk-wise (e.g., in 14MB chunks in the above code) to bypass your proxy's restrictions. Anyway, this works flawlessly, given that cURL, the only external program it uses, is installed by default in most Linux distros.

Cheers,
Vikas; May 11, 2012 at 8:04 AM
Unknown said...: In which language have you written the code?
It's not C/C++.
I would be very glad to have your reply.
Thanks; July 25, 2012 at 12:53 AM
Unknown said...: It's Bash, Mr. Jutt!
Bash is the default shell in Mac OS and many Linux distributions.; July 26, 2012 at 11:13 AM
Anonymous said...: Thanks a ton dude :)
Was able to bypass a proxy limit of 50MB by using this script.

Try adding parallel downloads if you can, but I guess that will require joining files and more code.; June 29, 2013 at 11:50 AM
BUNTY ROCKS said...: Please explain it in a way that even no-brainers like me can use it: how to download the script and how to proceed thereafter.
Any help is truly appreciated.; November 28, 2013 at 6:48 PM
singoc said...: The script works well,
Thanks; November 5, 2014 at 12:24 PM
Unknown said...: Great, this script saved my life.; February 14, 2015 at 3:39 AM
kanad said...: When I tried to execute the script, it gave me the following error:
./cur1download.sh: line 71: syntax error near unexpected token `]]'
./cur1download.sh: line 71: `]]>'
It would be very helpful if you could help with this; October 19, 2015 at 11:46 AM
Unknown said...: It brought up this error

curl: (22) The requested URL returned error: 403 Forbidden; November 29, 2015 at 10:03 AM
vandamp said...: Hi,
some time ago I had the same problem.
zipped.at is an online service that splits large file downloads into multiple parts so you can bypass firewall size limit settings.; January 27, 2016 at 1:05 AM
Unknown said...: Vandamp, I tried zipped.at, but it's taking too long to load.; January 27, 2016 at 3:19 PM
Wlx said...: I have the same issue as kanad,

so:

./cur1download.sh: line 71: syntax error near unexpected token `]]'
./cur1download.sh: line 71: `]]>'

I'll be really thankful for the solution; May 25, 2016 at 2:10 PM
objectivist said...: Please add a "-g" to the curl command in your code to prevent globbing; May 4, 2017 at 11:37 AM
objectivist said...: It would be good if you could add resume capability.; May 4, 2017 at 2:22 PM
objectivist said...: Replace the "]]>" at the end of the file with "done"; July 13, 2017 at 11:55 AM
Anonymous said...: Thx a lot; September 15, 2017 at 1:38 PM
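
On the resume capability requested in the comments: because chunks are appended to the output file in order, the size of a partial file is also the offset of the next byte to fetch. A minimal sketch under that assumption follows; resume_range is a hypothetical helper, not part of the script above, and it only makes sense if the partial file was not truncated or corrupted.

```shell
#!/bin/bash
# Hypothetical helper: print the next curl --range value for a partially
# downloaded file, given the chunk size in bytes. A missing file yields
# the first chunk.
resume_range() {
    local filepath=$1 limit=$2 size=0
    if [ -f "$filepath" ]; then
        # wc -c counts bytes; the arithmetic expansion drops any padding
        size=$(( $(wc -c < "$filepath") ))
    fi
    echo "${size}-$(( size + limit - 1 ))"
}

# Example usage (needs network access, so it is left commented out):
#   curl --fail --location --range "$(resume_range "$filepath" "$fsize_limit")" \
#       "$url" >> "$filepath"
```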