Tell HN: People forget that you can stick any data at the end of a bash script

This is a neat trick I've used to write self-extracting software or scripts that extract files from archives by just using

    tail -c <number of bytes for the binary> $0

All you have to do is make sure you append an explicit 'exit' to the end of your program before your new 'data section', so that bash won't parse any of the 'data section'.

One thing to bear in mind is that if you append binary data, it will be corrupted if you save the file in most text editors, so when I want to make changes I just delete all the binary and re-append it.
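A minimal sketch of the whole round trip, in case it helps. The file names (payload.bin, installer.sh, extracted.bin) are made up for illustration, and the payload is just a few arbitrary bytes standing in for a real binary:

```shell
#!/usr/bin/env bash
# Sketch: build a self-extracting script whose payload is read back with tail -c.
# All file names here are hypothetical.
workdir=$(mktemp -d)

# A small payload containing a NUL byte and other non-text bytes.
printf 'payload\000\001\377 bytes\n' > "$workdir/payload.bin"
size=$(wc -c < "$workdir/payload.bin" | tr -d ' ')

# The script part. The explicit 'exit' keeps bash from parsing the payload.
# The delimiter is unquoted so $size expands now; \$0 and \$1 stay literal.
cat > "$workdir/installer.sh" <<EOF
#!/usr/bin/env bash
tail -c $size "\$0" > "\$1"
exit 0
EOF

# Append the payload after the 'exit' line.
cat "$workdir/payload.bin" >> "$workdir/installer.sh"

# Run the installer and compare what it extracted with the original payload.
bash "$workdir/installer.sh" "$workdir/extracted.bin"
cmp "$workdir/payload.bin" "$workdir/extracted.bin" && result=identical
echo "$result"
rm -rf "$workdir"
```

The byte count is baked into the script when it is generated, so the payload has to be re-appended (and the count updated) whenever it changes.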

  • If you care less about space efficiency and more about maintainability of the script, you can also encode the binary as base64 and put an

      echo '...base64 data...' | base64 -d > somefile
    
    in your script.

    Or add compression to reclaim at least some of the wasted space:

      echo '...base64 gzipped data...' | base64 -d | gunzip > somefile
    
    Also note that bash accepts line breaks in quoted strings and the base64 utility has an "ignore garbage" option that lets it skip over e.g. whitespace in its input. You can use those to break up the base64 over multiple lines:

      echo '
        ...base64 gzipped data...
        ...more data...
        ...even more data...
      ' | base64 -di | gunzip > somefile
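For anyone wanting to try the encode side as well, here is a rough sketch. It assumes GNU coreutils base64, where -i means "ignore garbage"; other implementations may use -i for something else. The file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: produce the base64 text you would paste into a script, then decode it
# the way the embedded command would. Assumes GNU base64 (-i = ignore garbage).
workdir=$(mktemp -d)
printf 'some binary payload\n' > "$workdir/somefile.orig"

# Encode: gzip first, then base64.
encoded=$(gzip -c "$workdir/somefile.orig" | base64)

# Simulate pasting the text into an indented, multi-line quoted string.
indented=$(printf '%s\n' "$encoded" | sed 's/^/    /')

# Decode: -d decodes, -i skips the leading spaces, gunzip restores the data.
printf '%s\n' "$indented" | base64 -di | gunzip > "$workdir/somefile"

cmp "$workdir/somefile.orig" "$workdir/somefile" && roundtrip=ok
echo "$roundtrip"
rm -rf "$workdir"
```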

  • This trick is used in the demoscene. Instead of using -c, I use -n,

      tail -n +2 $0
    
    The -n +2 option means “starting at line 2”, which is what you want if you cram your script into one line. You can make an executable packed with lzma this way,

      a=`mktemp`;tail -n+2 $0|unxz>$a;chmod +x $a;$a;rm $a;exit
    
    This is the polite way to do it, using mktemp. You can save some bytes if you don’t care about that stuff.
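A sketch of how such a packed file could be assembled and run. Two substitutions to be aware of: gzip stands in for xz/lzma only because it is more commonly installed, and the "executable" is a tiny shell script rather than a real binary. File names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: one-line unpacker header followed by a compressed payload.
# gzip is used instead of xz here; the payload is a stand-in shell script.
workdir=$(mktemp -d)

# Stand-in for the real executable.
printf '#!/bin/sh\necho "hello from payload"\n' > "$workdir/payload"

# Line 1: unpack everything from line 2 onward to a temp file, run it, clean up.
printf '%s\n' 'a=$(mktemp);tail -n +2 "$0"|gunzip>"$a";chmod +x "$a";"$a";rm "$a";exit' > "$workdir/packed.sh"

# Line 2 onward: the compressed payload.
gzip -c "$workdir/payload" >> "$workdir/packed.sh"

output=$(bash "$workdir/packed.sh")
echo "$output"
rm -rf "$workdir"
```

This relies on the temp directory allowing execution; on a system where it is mounted noexec the unpacked file would have to go somewhere else.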

  • Ruby (and earlier, Perl) formalised this with the __END__ section: https://www.honeybadger.io/blog/data-and-end-in-ruby/

  • Shell archive it was called? There used to be a lot of installers like that.

  • Since zip files use a directory at the end, you can make a kind of mullet file - script at the front, archive at the back. I generated single-file runnable Java binaries like that at one point.

  • Ha, turns out I just wrote this helper function a few weeks ago, inspired by Perl and Ruby:

        #!/usr/bin/env bash
    
        # read data starting from the provided section marker up to the next one or EOF
        function section() {
            local section="$1"
            local source="${BASH_SOURCE[0]}"
        
            awk '/^__[A-Z0-9]+__$/{f=0} f{print} /^'"${section}"'$/{f=1}' "${source}"
        }
        
        section __JSON__ | jq
        section __YAML__ | ruby -ryaml -e 'p YAML.load(STDIN.read)'
        
        exit
        
        __JSON__
        { "a": 1 }
        __YAML__
        b:
          - 1
          - 2
          - 3
    
    My only wish is that shellcheck had a directive to stop yelling at me starting at a certain line.

    Usually I augment it with such functions for clarity:

        # whatever raw data
        function data() {
            section __DATA__
        }
    
        # man/perldoc like
        function doc() {
            section __DOC__
        }
    
        # command line help
        function help() {
            section __HELP__
        }

  • In Perl, __DATA__ indicates the beginning of the data section of the file. A portable way to provide test data or sample data.

    https://perldoc.perl.org/functions/__DATA__

  • That's how I made a bash backdoor once. It was just a script somewhere on the FS, until it unpacked itself and executed the rest of the rootkit.

    Long story but trust me that I had good intentions.

  • This is a great trick, but no one should ever run someone else's script that does this unless they have verified the script line by line beforehand.

  • Java JAR files are similar, but reversed. You can add anything you want to the beginning of the JAR file (or is it any ZIP file?) so long as it doesn't include the Zip file header "PK". So, I use this to prepend a bash script that ultimately calls

        java -jar $0
    
    It makes it very easy to set up and use Java-based command line programs on a server.

  • This is my default approach to writing installers for the Unices. The program is compressed and added to the end of the script, and the script does the unpacking and any needed setup/configuration for the specific platform it's getting installed on.

    I don't append it in binary form, though. I uuencode it. That way, there is no danger in using text editors.

  • See https://man.freebsd.org/cgi/man.cgi?query=shar&sektion=1&for... for a tool to generate these types of archives.

  • I can vaguely remember that many programs used to install themselves this way under Linux.

  • "$0" otherwise it won't work for paths with spaces
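A quick way to see the difference, as a sketch with made-up paths. The same script is run twice from a directory whose name contains a space, once with "$0" quoted and once without:

```shell
#!/usr/bin/env bash
# Sketch: quoted vs unquoted $0 when the script path contains a space.
workdir=$(mktemp -d)
mkdir "$workdir/my scripts"
script="$workdir/my scripts/demo.sh"

# A script that prints its own last line, which acts as the data section.
printf '%s\n' '#!/usr/bin/env bash' 'tail -n 1 "$0"' 'exit' 'DATA LINE' > "$script"
quoted=$(bash "$script" 2>/dev/null)

# Same script with the quotes removed: word splitting cuts the path in two,
# so tail is handed two file names that do not exist.
sed 's/"\$0"/$0/' "$script" > "$workdir/my scripts/unquoted.sh"
unquoted=$(bash "$workdir/my scripts/unquoted.sh" 2>/dev/null)

echo "quoted:   '$quoted'"
echo "unquoted: '$unquoted'"
rm -rf "$workdir"
```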

  • This reminds me of ZX Spectrum Basic where all the graphics, sound, and level layouts were defined using DATA lines at the end of the program.

  • Makeself archives are the classic self-extracting tarballs that do exactly that...

  • "All you have to do is make sure you append an explicit 'exit' to the end of your program before your new 'data section', so that bash won't parse any of the 'data section'."

    Or just use exec.

         exec tail -c [number of bytes for the binary] $0

  • ...that's horrid. Why would you do that to your fellow humans?

    just use

        cat >outfile <<EOF
        some
        data
        EOF
    
    add base64 if binary

    edit: after looking through the thread I am deeply disappointed that so few people know of that feature.
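A small sketch of the here-document variant, with hypothetical file names. One detail worth knowing: quoting the delimiter ('EOF') stops the shell from expanding $variables and backticks inside the data, which is usually what you want for embedded content.

```shell
#!/usr/bin/env bash
# Sketch: embedding data with here-documents instead of appending it to the file.
workdir=$(mktemp -d)

# Quoted delimiter: the body is written out verbatim, with no expansion.
cat > "$workdir/outfile" <<'EOF'
some $HOME
data `date`
EOF

# Binary data goes through base64; "aGVsbG8K" decodes to "hello" plus a newline.
base64 -d > "$workdir/binfile" <<'EOF'
aGVsbG8K
EOF

first_line=$(head -n 1 "$workdir/outfile")
decoded=$(cat "$workdir/binfile")
echo "$first_line"
echo "$decoded"
rm -rf "$workdir"
```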

  • One “naughty” thing you can do is write invisible data into the last block of a file…

    - truncate the file to extend it to the end of the last block

    - write data to that area

    - truncate the file back to its original size

    An edit of that file will likely lose you data though.

  • I think this is how GOG ships the Linux version of Battletech.

  • BASIC and Perl had or have something like that too.

    IIRC, Perl copied it from BASIC, because BASIC came much before Perl.

    And, again, IIRC, I've read about the shar (shell archive) method that someone else commented about in this thread (and which even has a Wikipedia entry), in either the classic Kernighan and Pike book, The Unix Programming Environment (which I've recommended here multiple times before), or in some Unix man pages, long ago.

    So it's quite an old method.

  • I did a similar thing for a lowish volume embedded product. The update files are just bash scripts with a tar file cat'd on them. The unit just looks for a particular file on an external flash drive to run and the bash script runs, copies off a tar and checks that it has the right hash. Super simple and flexible when customers need me to do something special. Like extract some specific log onto a flash drive.
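Roughly what such an update file could look like, as a sketch only. The __TAR__ marker, the file names, and the choice of sha256sum are all assumptions for illustration, not details from the comment above.

```shell
#!/usr/bin/env bash
# Sketch: a script with a tar archive appended, verified by hash before unpacking.
# The __TAR__ marker and all file names are hypothetical.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/out"
echo 'new firmware config' > "$workdir/src/config.txt"
tar -C "$workdir/src" -cf "$workdir/update.tar" config.txt
hash=$(sha256sum "$workdir/update.tar" | cut -d ' ' -f 1)

# Assemble: header with the expected hash, unpack logic, marker line, archive.
{
  printf '#!/usr/bin/env bash\n'
  printf 'expected=%s\n' "$hash"
  cat <<'EOF'
start=$(awk '/^__TAR__$/ { print NR + 1; exit }' "$0")
tail -n +"$start" "$0" > "$1/update.tar"
actual=$(sha256sum "$1/update.tar" | cut -d ' ' -f 1)
[ "$actual" = "$expected" ] || { echo "hash mismatch" >&2; exit 1; }
tar -C "$1" -xf "$1/update.tar"
exit 0
__TAR__
EOF
  cat "$workdir/update.tar"
} > "$workdir/update.sh"

# Normal run: the archive is copied off, checked, and unpacked.
bash "$workdir/update.sh" "$workdir/out"
extracted=$(cat "$workdir/out/config.txt")

# Tampered run: one extra byte after the archive makes the hash check fail.
printf 'x' >> "$workdir/update.sh"
if bash "$workdir/update.sh" "$workdir/out" 2>/dev/null; then
  tampered=accepted
else
  tampered=rejected
fi

echo "$extracted / $tampered"
rm -rf "$workdir"
```

Note that a hash stored inside the same file only catches accidental corruption; anyone who can modify the archive can modify the hash line too.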

  • This reminds me of a job I had 15+ years ago where we did code reviews by emailing files to one another with our changes. It worked like this with the first part of the file being a script and the end of the file being a base64 encoded zip of the changed files. We had tooling that would pack them, but unpacking was done by execution.

    What could possibly go wrong with emailing executable scripts?

  • I use this at work for batch scripts that call R code for some of their functionality. It's very handy for giving somebody who isn't very technically literate a single .bat file that Windows is happy to run on a double-click, rather than a directory of files that must be stored together in order to work.

  • It's also good for signed bash scripts.

  • Von Neumann architecture to the extreme :)

  • A very large Electronic Medical Records company shipped an extremely large shell script to us for an install.

    Upon examination it contained binary data and a command to extract it to a file and then installed the application.

    This was the “efficient” way to ship and install the binary.

  • This works for any sh-type script, not just bash :) It will work with sh, ksh and even [t]csh

  • I use a fun little hack, a la awk:

        #!/usr/local/bin/bash

        echo "HELLO"

        TAIL_REMOTE_MARKER=`awk '/^__THE_REMOTE_PART__/{flag=1;next}/^__END_THE_REMOTE_PART__/{flag=0;exit}flag' ${0}`

        eval "$TAIL_REMOTE_MARKER"

        exit 0

        __THE_REMOTE_PART__

        echo "WORLD"

        __END_THE_REMOTE_PART__

  • I seem to recall that you can do the opposite as well: stash some extra data at the end of a binary file. The 'tclkit' system used this to package up an executable with the scripts you wanted to ship.

  • That's what uuencode / uudecode were once used for.

  • PortSwigger does that for the Burp Suite installers.

    https://portswigger-cdn.net/burp/releases/download?product=c...

  • I used to do something similar for Windows executable files. Append a large file to the end as necessary.

  • This is a malware technique.

    I am not saying don't do it. But that is mostly where I see this type of trick.

  • I vaguely remember this is what OCaml does for one format of its executable.

  • Sadly, it won't work with my favourite curl | sh.
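The reason, as far as I can tell, is that a piped script has no file behind $0. A small sketch with a hypothetical demo.sh shows what $0 contains in each case:

```shell
#!/usr/bin/env bash
# Sketch: $0 when a script is run from a file vs piped into the shell.
workdir=$(mktemp -d)
printf '%s\n' 'echo "$0"' 'exit' 'DATA' > "$workdir/demo.sh"

# Run from a file: $0 is the script path, so tail "$0" can find the data.
as_file=$(bash "$workdir/demo.sh")

# Piped in, as with curl | sh: $0 is just the shell's name, not a script file.
as_pipe=$(cat "$workdir/demo.sh" | bash)

echo "file: $as_file"
echo "pipe: $as_pipe"
rm -rf "$workdir"
```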

  • I don't understand this website. It is too hard and I don't understand anything. Can anyone help me with this?