Show HN: A minimal C runtime for Linux i386 and x86_64 in 87 SLOC of C
For x86_64 Linux only, here's a single-file crt0 (with arbitrary syscalls working from C-land): https://gist.github.com/lunixbochs/462ee21c3353c56b910f
Build with `gcc -std=c99 -ffreestanding -nostdlib`. After -Os and strip, a.out is 1232 bytes on my system. I got it to 640 bytes with `strip -R .eh_frame -R .eh_frame_hdr -R .comment a.out`.
Starting at ~640 bytes, maybe you could come close to asmutils' httpd in binary size. Failing that, take a look at [1].
You can get pretty far without a real libc, keeping in mind:
- You probably want fprintf, and things can be slower without buffered IO (due to syscall overhead)
- `mmap/munmap` is okay as a stand-in allocator, though it has more overhead for many small allocations.
- You don't get libm for math helper functions.
Of course, you can cherry-pick from diet or musl libc if you need individual functions.
[1] "A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux" http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm...
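To make the `mmap/munmap` point concrete, here is a rough sketch of a stand-in allocator. The names `map_alloc`, `map_free` and `HDR_SIZE` are made up for illustration and are not from the gist. It calls the libc `mmap` wrapper so it can be compiled and tried on a normal system; in a freestanding build you would go through your own syscall wrapper instead. Every allocation costs at least one page plus two syscalls, which is the overhead mentioned above for many small allocations.

```c
#define _GNU_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical header size: the mapping length is stored in front of the
 * user data, and 16 bytes keeps the returned pointer 16-byte aligned. */
#define HDR_SIZE 16

/* Allocate n bytes backed by a fresh anonymous mapping (zero-filled). */
void *map_alloc(size_t n)
{
    size_t len = n + HDR_SIZE;
    if (len < n)               /* size overflow */
        return NULL;

    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    *(size_t *) base = len;    /* remember the length for map_free */
    return (char *) base + HDR_SIZE;
}

/* Return a block obtained from map_alloc to the kernel. */
void map_free(void *ptr)
{
    if (!ptr)
        return;
    char *base = (char *) ptr - HDR_SIZE;
    munmap(base, *(size_t *) base);
}
```

The kernel rounds each mapping up to a whole page, so a 100-byte request still takes a full page. That is fine for a handful of buffers and wasteful for thousands of small objects.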
I like this a lot more than I should. :)
Somehow I've always found system calls far more pleasant to use than "section 3" C library interfaces, and it makes me sad that I'm pulling in some 3MB of library code (libc + libm + pthread; not even counting the dynamically-loaded stuff like nsswitch) that I mostly don't want.
Sadly as a C++ programmer I do at least need libgcc to implement exceptions, which in turn likely pulls in glibc anyway. Sigh. (And I haven't completely cut ties with libstdc++ yet, though I'm close...)
(And yeah, on a typical system these libraries are already resident anyway since other apps are using them, so wanting to avoid them is mostly silly, but it feels nice!)
00_start.c is too hacked on x86_64. it'll work but you're getting a less efficient binary since gcc has to assume _start is called like a normal C function (e.g. it creates a preamble). you should just implement it in assembly.
__init() itself also needs some work. the argument list is weird, linux pushes all of argc, argv, and environ on the stack. why special case argc? also your method of deriving argv and environ from the function argument's address is extremely brittle, and i don't think it actually works on x86_64 (if it does, that's really lucky). you aren't calculating envp using argc, so it's probably wrong. you could get more efficient code from using __attribute__((noreturn)). this would be better:
    /* called from _start */
    void __init(void *initial_stack) __attribute__((noreturn));

    void __init(void *initial_stack)
    {
        int argc = *(int *) initial_stack;
        char **argv = ((char **) initial_stack) + 1;
        /* assert(!argv[argc]); */
        char **envp = __environ = argv + argc + 1;
        _exit(main(argc, argv, envp));
    }
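For anyone who wants to check the pointer arithmetic without writing a `_start`, here is a small hosted sketch of the same layout walk. `struct start_info` and `parse_initial_stack` are hypothetical names for illustration only. It assumes the layout described above: argc, then the argv pointers, a NULL, the environment pointers, and another NULL. It reads argc as a pointer-sized word rather than through an `int *`, which is a choice made here and not something taken from the code above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, not part of librt0. */
struct start_info {
    int    argc;
    char **argv;
    char **envp;
};

/* Walk the initial process stack: [argc][argv...][NULL][envp...][NULL]. */
struct start_info parse_initial_stack(void *initial_stack)
{
    char **words = (char **) initial_stack;
    struct start_info si;

    si.argc = (int) (intptr_t) words[0];  /* first word holds argc */
    si.argv = words + 1;                  /* argv starts right after it */
    si.envp = si.argv + si.argc + 1;      /* skip argv and its NULL terminator */
    return si;
}
```

Because it only takes a pointer, it can be fed a fake stack built as a plain array, which makes the "skip argc + 1 slots to reach envp" step easy to test.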
Nice! As part of some work that I was doing ages ago, I had to build myself a custom libc to statically link executables that would run on Android and WebOS since they are both essentially ARM Linux under the hood.
You can learn a lot by writing yourself a libc. Even building a simple/stupid malloc from scratch is a learning exercise.
My eventual goal is to rewrite asmutils'[0] httpd [1] in C using librt0 and get a binary about 2-3x in size (2-3K). Malloc unnecessary.
[0] http://asm.sourceforge.net/asmutils.html
[1] https://github.com/leto/asmutils/blob/master/src/httpd.asm
Nice. I went down this rabbit hole a while ago. Didn't get very far though. Have fun!
(my crappy code - https://bitbucket.org/gcmurphy/libc/)
Somewhat related: a minimal OS that just prints "Hello world" to the screen https://github.com/olalonde/minios (interesting code is in kmain.c and loader.s). Wrote it while going through http://littleosbook.github.io/ (which is great by the way if you are interested in learning a bit about OS development).
I am not familiar with embedded asm. Can someone explain what the following line does?
"register long r10 __asm__( "r10" ) = a3"
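Roughly: that is GCC's explicit register variable syntax, which pins a C variable to a named register so it can be passed to an inline asm statement. On x86_64 Linux the kernel takes the fourth syscall argument in r10, and GCC has no single-letter constraint for r10, so the register variable is the usual workaround. A minimal sketch follows; `raw_syscall4` is a made-up name and this is not the gist's code, and its argument names are numbered from 1, which may not match the gist's `a3`. On anything other than x86_64 with GCC-style asm it falls back to libc's `syscall()`, so it still compiles there.

```c
#define _GNU_SOURCE            /* for the syscall() fallback declaration */
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical four-argument syscall wrapper. */
long raw_syscall4(long n, long a1, long a2, long a3, long a4)
{
#if defined(__x86_64__) && defined(__GNUC__)
    long ret;
    /* Pin the fourth argument to r10, where the kernel expects it. */
    register long r10 __asm__("r10") = a4;

    __asm__ volatile ("syscall"
                      : "=a"(ret)                 /* result comes back in rax */
                      : "a"(n),                   /* syscall number in rax */
                        "D"(a1), "S"(a2), "d"(a3),/* rdi, rsi, rdx */
                        "r"(r10)                  /* already bound to r10 */
                      : "rcx", "r11", "memory");  /* clobbered by syscall */
    return ret;
#else
    return syscall(n, a1, a2, a3, a4);
#endif
}
```

For example, `raw_syscall4(SYS_getpid, 0, 0, 0, 0)` returns the same value as `getpid()`.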
What does "SLOC" stand for in this context?
Why not just do
gcc -static -Os -fdata-sections -ffunction-sections -Wl,--gc-sections ...
or similar, in the compiler of choice?
I'll admit my ignorance down at this level. Can someone explain what this does and how it can be used?
Ancient history question: What was the main problem the creators of the ELF were trying to address at the time the ELF was adopted?
For the uninitiated, what does this program do and what is its importance?