Hacker News Clone

Tell HN: People forget that you can stick any data at the end of a bash script

by BasedAnon on 7/5/2023, 7:32 PM with 141 comments

This is a neat trick I've used to write self-extracting software or scripts that extract files from archives by just using

    tail -c <number of bytes for the binary> $0

All you have to do is make sure you append an explicit 'exit' to the end of your program before your new 'data section', so that bash won't parse any of the 'data section'.

One thing to bear in mind is that if you append binary data, it will be corrupted if you save the file in most text editors, so when I want to make changes I just delete all the binary data and re-append it.
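
A minimal sketch of the whole round trip, assuming common coreutils (`head`, `wc`, `tail`, `cmp`, `mktemp`). The file names (`payload.bin`, `installer.sh`, `extracted.bin`) and the `PAYLOAD_SIZE` variable are made up for illustration; the payload size is written into the stub when the stub is generated.

```shell
#!/usr/bin/env bash
# Sketch: build a self-extracting script and run it (all names are hypothetical).
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# 1. Some binary payload (16 random bytes stand in for a real binary).
head -c 16 /dev/urandom > payload.bin
size=$(wc -c < payload.bin | tr -d ' ')

# 2. The stub: extract the last $size bytes of itself, then exit explicitly
#    so the shell never tries to parse the payload.
cat > installer.sh <<EOF
#!/usr/bin/env bash
PAYLOAD_SIZE=$size
tail -c "\$PAYLOAD_SIZE" "\$0" > extracted.bin
exit 0
EOF

# 3. Append the payload after the stub.
cat payload.bin >> installer.sh

# 4. Run it and compare the extracted bytes with the original.
bash installer.sh
if cmp -s payload.bin extracted.bin; then result=identical; else result=different; fi
echo "$result"
```

Generating the stub and appending the payload in one step avoids opening the combined file in an editor at all.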

by xg15 on 7/5/2023, 8:42 PM
If you care less about space efficiency and more about maintainability of the script, you can also encode the binary as base64 and put an
```
  echo '...base64 data...' | base64 -d > somefile
```
in your script.
Or add compression to reclaim at least some of the wasted space:
```
  echo '...base64 gzipped data...' | base64 -d | gunzip > somefile
```
Also note that bash accepts line breaks in quoted strings and the base64 utility has an "ignore garbage" option that lets it skip over e.g. whitespace in its input. You can use those to break up the base64 over multiple lines:
```
  echo '
    ...base64 gzipped data...
    ...more data...
    ...even more data...
  ' | base64 -di | gunzip > somefile
```
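For completeness, a sketch of producing such a string and checking the round trip. It assumes GNU coreutils `base64` (where `-d` decodes and `-i` ignores characters outside the base64 alphabet) and `gzip`; the sample text and variable names are invented.

```shell
#!/usr/bin/env bash
# Sketch: encode data as gzipped base64, embed it in a quoted string, decode it.
set -eu

original='hello from the data section'

# Encode: gzip, then base64.
encoded=$(printf '%s' "$original" | gzip -c | base64)

# Decode: the quoted string contains line breaks and indentation;
# -i (GNU "ignore garbage") makes base64 skip them.
decoded=$(echo "
    $encoded
" | base64 -di | gunzip)

echo "$decoded"
```

Other `base64` implementations may give `-i` a different meaning, so the flag is worth checking on the target platform.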
by dietrichepp on 7/5/2023, 8:48 PM
This trick is used in the demoscene. Instead of using -c, I use -n,
```
  tail -n +2 $0
```
The -n +2 option means “starting at line 2”, which is what you want if you cram your script into one line. You can make an executable packed with lzma this way,
```
  a=`mktemp`;tail -n+2 $0|unxz>$a;chmod +x $a;$a;rm $a;exit
```
This is the polite way to do it, using mktemp. You can save some bytes if you don’t care about that stuff.
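A sketch of how such a packed file could be built and run. To keep it dependent only on widely installed tools, `gzip`/`gunzip` are swapped in for `xz`/`unxz`, and `$0` is quoted; the names `program` and `packed` are hypothetical, and the example assumes the temp directory allows executing files.

```shell
#!/usr/bin/env bash
# Sketch: pack a program behind a one-line stub (gzip swapped in for xz).
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# A tiny stand-in "program" to pack; a real use case would be a compiled binary.
printf '#!/bin/sh\necho unpacked and running\n' > program

# Line 1 is the stub; everything from line 2 onwards is compressed data.
# The single quotes keep $0, $a and the backticks literal in the stub.
printf '%s\n' 'a=`mktemp`;tail -n+2 "$0"|gunzip>$a;chmod +x $a;$a;rm $a;exit' > packed
gzip -c program >> packed
chmod +x packed

output=$(sh ./packed)
echo "$output"
```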
by jjgreen on 7/5/2023, 7:37 PM
Ruby (and earlier, Perl) formalised this with the __END__ section: https://www.honeybadger.io/blog/data-and-end-in-ruby/
by febed on 7/6/2023, 6:24 AM
It’s better explained here:
https://www.xmodulo.com/embed-binary-file-bash-script.html
by nottorp on 7/5/2023, 8:32 PM
Shell archive it was called? There used to be a lot of installers like that.
by twic on 7/5/2023, 9:33 PM
Since zip files use a directory at the end, you can make a kind of mullet file - script at the front, archive at the back. I generated single-file runnable Java binaries like that at one point.
by rubicks on 7/6/2023, 12:27 AM
See also:
https://makeself.io/
https://manpages.debian.org/bookworm/sharutils/shar.1.en.htm...

by lloeki on 7/6/2023, 8:55 AM

Ha, turns out I just wrote this helper function a few weeks ago, inspired by Perl and Ruby:

    #!/usr/bin/env bash

    # read data starting from the provided section marker up to the next one or EOF
    function section() {
        local section="$1"
        local source="${BASH_SOURCE[0]}"
    
        awk '/^__[A-Z0-9]+__$/{f=0} f{print} /^'"${section}"'$/{f=1}' "${source}"
    }
    
    section __JSON__ | jq
    section __YAML__ | ruby -ryaml -e 'p YAML.load(STDIN.read)'
    
    exit
    
    __JSON__
    { "a": 1 }
    __YAML__
    b:
      - 1
      - 2
      - 3

My only wish is that shellcheck had a directive to stop yelling at me starting at a certain line.

Usually I augment it with such functions for clarity:

    # whatever raw data
    function data() {
        section __DATA__
    }

    # man/perldoc like
    function doc() {
        section __DOC__
    }

    # command line help
    function help() {
        section __HELP__
    }

by heresie-dabord on 7/5/2023, 10:28 PM
In Perl, __DATA__ indicates the beginning of the data section of the file. A portable way to provide test data or sample data.
https://perldoc.perl.org/functions/__DATA__
by INTPenis on 7/5/2023, 8:58 PM
That's how I made a bash backdoor once. It was just a script somewhere on the FS, until it unpacked itself and executed the rest of the rootkit.
Long story but trust me that I had good intentions.
by norir on 7/5/2023, 10:42 PM
This is a great trick, but no one should ever run someone else's script that does this unless they have verified the script line by line beforehand.
by mbreese on 7/5/2023, 11:24 PM
Java JAR files are similar, but reversed. You can add anything you want to the beginning of the JAR file (or is it any ZIP file?) so long as it doesn't include the Zip file header "PK". So, I use this to prepend a bash script that ultimately calls
```
    java -jar $0
```
It makes it very easy to set up and use Java-based command line programs on a server.
by JohnFen on 7/5/2023, 9:09 PM
This is my default approach to writing installers for the Unices. The program is compressed and added to the end of the script, and the script does the unpacking and any needed setup/configuration for the specific platform it's getting installed on.
I don't append it in binary form, though. I uuencode it. That way, there is no danger in using text editors.
by eadler on 7/5/2023, 8:12 PM
See https://man.freebsd.org/cgi/man.cgi?query=shar&sektion=1&for... for a tool to generate these types of archives.
by cocodill on 7/5/2023, 7:49 PM
I can vaguely remember that many programs used to install themselves this way under Linux.
by Gabrys1 on 7/6/2023, 6:04 AM
"$0" otherwise it won't work for paths with spaces
by onion2k on 7/5/2023, 9:10 PM
This reminds me of ZX Spectrum Basic where all the graphics, sound, and level layouts were defined using DATA lines at the end of the program.
by kkfx on 7/5/2023, 8:42 PM
Makeself archives are classic self-extracting tarballs that do exactly that...
by 1vuio0pswjnm7 on 7/6/2023, 9:58 PM
"All you have to do is make sure you append an explicit 'exit' to the end of your program before your new 'data section', so that bash won't parse any of the 'data section'."
Or just use exec.
```
     exec tail -c [number of bytes for the binary] $0
```
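A small sketch of the `exec` variant: because `exec` replaces the shell process with `tail`, nothing after that line is ever parsed, so no `exit` is needed. The script name and data line are invented, and `-n 1` (last line) is used instead of a byte count only to keep the example readable.

```shell
#!/usr/bin/env bash
# Sketch: exec replaces the shell, so the data after it is never parsed.
set -eu

workdir=$(mktemp -d)

# Line 2 hands the process over to tail; line 3 is data, not code.
# (The parentheses in line 3 would be a syntax error if the shell parsed it.)
printf '%s\n' '#!/bin/sh' 'exec tail -n 1 "$0"' 'this line is data (not shell code)' \
    > "$workdir/demo.sh"

out=$(sh "$workdir/demo.sh")
echo "$out"
```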
by ilyt on 7/6/2023, 11:13 AM
...that's horrid. Why would you do that to your fellow humans?
just use
```
    cat >outfile <<EOF
    some
    data
    EOF
```
add base64 if binary
edit: after looking through the thread I am deeply disappointed that so few people know of that feature.
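The here-document approach with base64 for binary data can be sketched as follows, assuming a `base64` that accepts `-d` for decoding. Quoting the delimiter (`'EOF'`) stops the shell from expanding anything inside the here-document; the output file and the encoded bytes are only an example.

```shell
#!/usr/bin/env bash
# Sketch: embed binary data as base64 in a here-document.
set -eu

outfile=$(mktemp)

# 'AAECAwQ=' is base64 for the five bytes 00 01 02 03 04.
base64 -d > "$outfile" <<'EOF'
AAECAwQ=
EOF

# Show the decoded bytes as hex.
hex=$(od -An -tx1 "$outfile" | tr -d ' \n')
echo "$hex"
```

This prints `0001020304`, and the script stays plain text, so it survives editing in any text editor.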
by OnlyMortal on 7/6/2023, 6:38 AM
One “naughty” thing you can do is write invisible data into the last block of a file…
- truncate the file to extend it to the end of the last block
- write data to that area
- truncate the file back to its original size
An edit of that file will likely lose you data though.
by 2OEH8eoCRo0 on 7/5/2023, 9:28 PM
I think this is how GOG ships the Linux version of Battletech.
by vram22 on 7/5/2023, 9:10 PM
BASIC and Perl had or have something like that too.
IIRC, Perl copied it from BASIC, because BASIC came much before Perl.
And, again, IIRC, I've read about the shar (shell archive) method that someone else commented about in this thread (and which even has a Wikipedia entry), in either the classic Kernighan and Pike book, The Unix Programming Environment (which I've recommended here multiple times before), or in some Unix man pages, long ago.
So it's quite an old method.
by karmicthreat on 7/6/2023, 12:06 AM
I did a similar thing for a lowish volume embedded product. The update files are just bash scripts with a tar file cat'd on them. The unit just looks for a particular file on an external flash drive to run and the bash script runs, copies off a tar and checks that it has the right hash. Super simple and flexible when customers need me to do something special. Like extract some specific log onto a flash drive.
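A rough sketch of what such an update file might look like, not the actual product's script. It assumes GNU coreutils (`sha256sum`), `awk`, and `tar`; the names `update.sh`, `__ARCHIVE__`, and `EXPECTED_SHA256` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: an "update file" = shell script + tar archive + hash check.
# All names (update.sh, __ARCHIVE__, EXPECTED_SHA256) are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Build a small tar archive to ship.
mkdir payload
echo 'new firmware config' > payload/config.txt
tar -cf payload.tar payload
expected=$(sha256sum payload.tar | cut -d ' ' -f 1)

# The stub finds the marker line, copies everything after it to a tar file,
# verifies the hash, and only then unpacks.
cat > update.sh <<EOF
#!/usr/bin/env bash
set -eu
EXPECTED_SHA256=$expected
start=\$(awk '/^__ARCHIVE__\$/ { print NR + 1; exit }' "\$0")
tail -n +"\$start" "\$0" > update.tar
actual=\$(sha256sum update.tar | cut -d ' ' -f 1)
[ "\$actual" = "\$EXPECTED_SHA256" ] || { echo 'hash mismatch' >&2; exit 1; }
mkdir -p unpacked
tar -xf update.tar -C unpacked
exit 0
__ARCHIVE__
EOF
cat payload.tar >> update.sh

bash update.sh
restored=$(cat unpacked/payload/config.txt)
echo "$restored"
```

Using a marker line instead of a byte count means the script part can be edited without recomputing offsets, as long as the archive is re-appended afterwards.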
by doktorhladnjak on 7/6/2023, 1:11 AM
This reminds me of a job I had 15+ years ago where we did code reviews by emailing files to one another with our changes. It worked like this with the first part of the file being a script and the end of the file being a base64 encoded zip of the changed files. We had tooling that would pack them, but unpacking was done by execution.
What could possibly go wrong with emailing executable scripts?
by acc_297 on 7/6/2023, 11:00 AM
I use this at work for batch scripts which call R code for some of their functionality. It's very handy for giving somebody who's not very technology-literate a solution that is a single .bat file, which Windows is happy to run by double-clicking, rather than a directory of files which must be stored together in order to work.
by lakomen on 7/6/2023, 2:00 AM
It's also good for signed bash scripts.
by zeroonetwothree on 7/6/2023, 12:38 AM
Von Neumann architecture to the extreme :)
by mogwire on 7/6/2023, 1:39 AM
A very large Electronic Medical Records company shipped an extremely large shell script to us for an install.
Upon examination, it contained binary data and a command to extract it to a file and then install the application.
This was the “efficient” way to ship and install the binary.
by jmclnx on 7/6/2023, 12:39 AM
This works for any sh-type script, not just bash :) It will work with sh, ksh and even [t]csh.
by sumosudo on 7/5/2023, 10:02 PM
I use a fun little hack, a la awk:
```
#!/usr/local/bin/bash
echo "HELLO"
TAIL_REMOTE_MARKER=`awk '/^__THE_REMOTE_PART__/{flag=1;next}/^__END_THE_REMOTE_PART__/{flag=0;exit}flag' ${0}`
eval "$TAIL_REMOTE_MARKER"
exit 0
__THE_REMOTE_PART__
echo "WORLD"
__END_THE_REMOTE_PART__
```
by davidw on 7/5/2023, 9:22 PM
I seem to recall that you can do the opposite as well: stash some extra data at the end of a binary file. The 'tclkit' system used this to package up an executable with the scripts you wanted to ship.
by not2b on 7/6/2023, 12:08 AM
That's what uuencode / uudecode were once used for.
by ShowalkKama on 7/5/2023, 9:36 PM
portswigger does that for the burpsuite installers.
https://portswigger-cdn.net/burp/releases/download?product=c...
by thinkmusic2000 on 7/5/2023, 11:06 PM
I used to do something similar for Windows executable files. Append a large file to the end as necessary.
by RajT88 on 7/6/2023, 12:21 AM
This is a malware technique.
I am not saying don't do it. But that is mostly where I see this type of trick.
by Fudgel on 7/5/2023, 11:11 PM
I vaguely remember this is what Ocaml does for one format of its executable.
by pornel on 7/6/2023, 11:13 AM
Sadly, it won't work with my favourite curl | sh.
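A short sketch of why: when a script is piped into the shell, it is read from standard input, so `$0` is no longer the path of a file that `tail` could read the data from. The temp file below is only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: $0 is a file path when run as a file, but not when piped into sh.
set -u

script=$(mktemp)
printf '%s\n' 'echo "$0"' 'exit' > "$script"

as_file=$(sh "$script")       # $0 is the script's path
as_pipe=$(cat "$script" | sh) # $0 is typically just the shell's own name

echo "as file: $as_file"
echo "as pipe: $as_pipe"
```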
by hey00 on 7/5/2023, 10:37 PM
I dont understand this website it is too hard and i dont understand anything. Anyone help me with this?