title:Undefined C
keywords:c,linux,asm

# Undefined C

The code here can be run in an online C compiler such as https://www.onlinegdb.com/online_c_compiler or locally. All examples were checked with the gcc compiler. There are many small tricks around running C code in practice that aren't covered in generic tutorials.

## Compile

__hello_world.c__
```c
#include <stdio.h>

int main() {
    printf("Hello world\n");
}
```

```bash
gcc hello_world.c -o hello_world
gcc -m32 hello_world.c -o hello_world_32 #for 32bit target
```

## Syntax

### Variables

Standard list of available types

#### Check type size

Every type has a size, measured in bytes. Some types, like int and long, are machine dependent: their size differs between 8bit/16bit/32bit/64bit architectures. When machine-independent types are needed, use int32_t/uint32_t/int64_t/uint64_t. Use __sizeof()__ to check the size of a type.

Running on x86 machine
```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main() {
    //sizeof yields size_t, which is printed with %zu
    printf("Sizeof int %zu\n",sizeof(int));
    printf("Sizeof int32_t %zu\n",sizeof(int32_t));
    printf("Sizeof int64_t %zu\n",sizeof(int64_t));
    printf("Sizeof long %zu\n",sizeof(long));
    printf("Sizeof long long %zu\n",sizeof(long long));
}
```

The safest and most portable way is to use the [u]int[8/16/32/64]_t types.

Macros that give the minimum and maximum values of each type are listed at https://en.cppreference.com/w/c/types/limits

```c
#include <stdio.h>
#include <limits.h>

int main() {
    printf("INT_MIN %d\n",INT_MIN);
    printf("INT_MAX %d\n", INT_MAX);
    printf("LONG_MIN %ld\n",LONG_MIN);
}
```

Example from AVR __stdint.h__
https://github.com/avrdudes/avr-libc/blob/main/include/stdint.h

Example from Libc
https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/stdint.h

#### How to shoot yourself in the foot

When code is supposed to run on both 32bit and 64bit platforms, the size of a type may vary. This case needs to be taken into account.

### Functions

Function syntax, there is nothing interesting about functions

```
<return type> <name>(<type> <arg>,..)
{
    <body>
}
```

Write a simple function

```c
int fun1() {
    return -1;
}
```

A function can have multiple return statements. Here is an example of a function with 3 return statements.

```c
int fun2(int i) {
    if (i<0) return -1;
    if (i>0) return 1;
    return 0;
}
```

Get the address of a function

```c
//%p prints a pointer; the value is 64 bits wide on a 64bit platform
printf("fun1 address %p\n",(void *)&fun1);
```

### If statement

```c
if (<condition>) <statement>;
if (<condition>) { <statements> }
```

One way to check the value returned by a function for errors is

```c
if ((c = getfun()) == 0) {
}
```

The simplest, if old-fashioned, use of this pattern is reading input from the command line

```c
#include <stdio.h>

int main() {
    int c;
    char ch;
    while ((c = getchar()) != EOF ) {
        ch = c;
        printf("Typed character %c\n",ch);
    }
}
```

### For cycle

The for loop may involve some trickery, yet it is as simple as

```c
for (;;) {
}
```

Go over values from 1 to 10

```c
int i=0;
for (i=1;i<=10;i++) {
    printf("%d\n",i);
}
```

Now lets do it from 10 down to 1

```c
int i=0;
for (i=10;i>0;i--) {
    printf("%d\n",i);
}
```

Now lets make a one liner

```c
//prints 1..10: i is incremented before printf runs
for (i=0;i<10;i++,printf("%d\n",i));
```

Yes, the comma operator makes it possible to write as many expressions as needed.

### Structure

A structure combines several types under one new type. It is a convenient way to group a set of types and reuse them as one.

```c
struct struct1 {
    uint8_t a;
    uint16_t b;
    uint32_t c;
    uint64_t d;
};
```

Intuitively, the total size of the structure would be

```c
size_t total_size = sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint64_t);
size_t real_size = sizeof(struct struct1);
```

The compiler places members inside a structure so that each of them can be accessed fast, adding padding between them where needed. Some CPU instructions require aligned memory addresses, or pay a penalty when accessing unaligned members. To directly mess with the alignment of types use the attribute

```c
__attribute__ ((aligned (8)))
```

Use the packed attribute to remove the padding, so the layout is not architecture dependent.
```c
struct struct2 {
    uint8_t a;
    uint16_t b;
    uint32_t c;
    uint64_t d;
} __attribute__((packed));
```

Now lets check the size of the structure after it is packed

```c
size_t new_size = sizeof(struct struct2);
```

It is also possible to add alignment to each member of a structure

```c
struct struct3 {
    uint8_t a __attribute__((aligned (8)));
    uint16_t b __attribute__((aligned (8)));
    uint32_t c __attribute__((aligned (8)));
    uint64_t d __attribute__((aligned (8)));
} __attribute__((aligned (8)));
```

Now the size of the structure will be 32. All results are from amd64, other architectures may differ.

#### How to shoot yourself in the foot

Forget that struct size is not consistent across platforms.

### Recursion

Recursion is a technique that can be useful for writing shorter code and for replacing loops. Its drawback is that every call consumes stack memory, and the stack has a default size limit on each platform.

```c
#include <stdio.h>
#include <stdlib.h>

int fun_r(int i) {
    printf("val %d\n",i);
    //no stop condition: recurses until the stack runs out
    fun_r(i+1);
    return 0;
}

int main() {
    fun_r(0);
}
```

The program fails once it runs out of stack. When the default stack limit is increased, it gets further before failing.

Check default stack size

```
ulimit -s
```

Set stack size

```
ulimit -s 16384
```

### Macro

Many things are useful as macros, and there are many macro tricks for emitting useful pieces of code.

Define values, as if it were an enum.

```c
#define VAL_0 0
#define VAL_1 1
#define VAL_LAST VAL_1
```

Multiline macro

```c
#define INC_FUN(TYPE) TYPE inc_##TYPE(TYPE a){\
TYPE c=1;\
return a + c;\
}

INC_FUN(int)
INC_FUN(char)
INC_FUN(double)
//expands like the others, but does not compile: notype is not a type
INC_FUN(notype)
```

To check the expansion of a macro run

```
gcc -E FILE.c
```

http://main.lv/writeup/c_macro_tricks.md
https://jadlevesque.github.io/PPMP-Iceberg/

### Pointers

One of the most loved features of C is pointers. They allow access to addresses without any sanity check and they don't have any lifetime tracking, so anything is possible with them.
A pointer contains an address, which is interpreted according to the pointer type

```c
int c;
int *ptr=&c;
```

Go over an array of chars

```c
#include <stdio.h>
#include <stdlib.h>

int main() {
    char s[]="asd";
    //an array name decays to a pointer to its first element
    char *c=s;
    while (*c != 0) {
        printf("Next char %c addr %p\n",*c,(void *)c);
        c++;
    }
}
```

Go over an array of ints

```c
int i=0;
int arr[] = {9,7,5,3,1};
int *ptr = arr;
while (i<5) {
    printf("Number value %d addr %p\n",*ptr,(void *)ptr);
    ptr++;
    i++;
}
```

Pointer arithmetic like +1 moves to the next address, offset by the size of the type. In the example below the structure size is 12, and incrementing a pointer to that structure advances the address by sizeof of the structure. And yes, the address points to unmapped memory, so it will segfault if accessed.

```c
#include <stdio.h>

struct size12 {
    int a,b,c;
};

int main() {
    struct size12 *s=0;
    s++;
    printf("%p\n",(void *)s);
    s++;
    printf("%p\n",(void *)s);
}
```

Double pointers are pointers to pointers

```c
#include <stdio.h>

int main(int argc, char **argv) {
    char *arg = argv[0];
    printf("Program name %s\n",arg);
}
```

#### How to shoot yourself in the foot

Increment a pointer in a while loop without a stop condition. It will stop only when it segfaults.

Don't initialize a pointer and it will have a random value.

### Allocate memory

From the program's perspective, memory allocation adds an address range that the executable can address. Every malloc should be accompanied by a free, otherwise the program leaks memory.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char *c = malloc(16);
    memset(c,0,16);
    int *arr = malloc(16*sizeof(int));
    memset(arr,0,16*sizeof(int));
    free(c);
    free(arr);
}
```

### Signed/Unsigned

Signed and unsigned variables differ only in how one bit is interpreted. But they behave differently at their minimal and maximal values.
```c
#include <stdio.h>
#include <limits.h>

int main() {
    int i=INT_MAX;
    unsigned int u=UINT_MAX;
    printf("i=%d\n",i);
    printf("u=%u\n",u);
    //signed overflow is undefined behavior, unsigned wraps around to 0
    i++;
    u++;
    printf("i=%d\n",i);
    printf("u=%u\n",u);
    i=0;
    u=0;
    i--;
    u--;
    printf("i=%d\n",i);
    printf("u=%u\n",u);
}
```

### Endianness

```c
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main() {
    //unsigned type, so the values above INT_MAX and the shifts are well defined
    uint32_t arr[4] = {0x00112233,0x44556677,0x8899AABB, 0xCCDDEEFF};
    printf("%08x\n",arr[0]);
    printf("%08x\n",arr[1]);
    printf("%08x\n",arr[2]);
    printf("%08x\n",arr[3]);

    //formatted text output
    FILE *f = fopen("int.hex","w+");
    fprintf(f,"%08x",arr[0]);
    fprintf(f,"%08x",arr[1]);
    fprintf(f,"%08x",arr[2]);
    fprintf(f,"%08x",arr[3]);
    fclose(f);

    //raw memory dump
    int fd=open("int.bin",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IROTH);
    write(fd,arr,sizeof(arr));
    close(fd);

    //raw memory dump with the 16bit halves of each value swapped
    int i;
    fd = open("int.bin2",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IROTH);
    for (i=0;i<4;i++) {
        uint32_t val = (arr[i]>>16) &0x0000ffff;
        val += (arr[i]<<16)&0xffff0000;
        write(fd,&val,sizeof(uint32_t));
    }
    close(fd);
}
```

When saving formatted values to a file you get what you expect

```
$ cat int.hex
00112233445566778899aabbccddeeff
```

Saving a raw memory dump of the values gives a different result, because x86 stores integers in little-endian byte order and hexdump prints the file as 16bit words

```
$ hexdump int.bin
0000000 2233 0011 6677 4455 aabb 8899 eeff ccdd
0000010
```

The 16bit pairs need to be swapped for the dump to look the same as the values

```
$ hexdump int.bin2
0000000 0011 2233 4455 6677 8899 aabb ccdd eeff
0000010
```

### Compiler flags

The compiler has a whole list of command line arguments that can be enabled for different purposes, lets look into some of them
https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html

Lets try to apply some of the flags to the examples above.

The best options to start with are these, they will give you more warnings.
```
-Wall -Wextra
```

Most of the examples here were written in a sloppy style, so extra checks like the ones below will find more issues with the code. Probably every example provided here will show issues with these extra compiler flags

```
-Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference
```

To get all macros expanded in C code add this compiler flag. The output will be C source with all macros expanded

```
-E
```

To output generated assembly instead of a binary add

```
-S
```

More readable output can be obtained with

```
gcc FILE.c -Wa,-adhln=FILE.S -g -fverbose-asm -masm=intel
```

Basic compiler optimisation flags that can speed up the program or make it smaller

```
-O -O0 -O1 -O2 -O3 -Os -Ofast -Og -Oz
```

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options
https://panthema.net/2013/0124-GCC-Output-Assembler-Code/
https://blogs.oracle.com/linux/post/making-code-more-secure-with-gcc-part-1

### Shared library

A shared library is a common way to reuse big chunks of code.

__libshare.c__
```c
#include <stdio.h>

int fun1() {
    return 1;
}

int fun2() {
    printf("Function name fun2\n");
    return 0;
}

int fun3(int a, int b) {
    return a+b;
}
```

```
$ gcc -c -fPIC libshare.c
$ gcc -shared -o libshare.so libshare.o
$ ldd libshare.so
	linux-vdso.so.1 (0x00007ffdb994d000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007f0c39400000)
	/usr/lib64/ld-linux-x86-64.so.2 (0x00007f0c39835000)
```

Now lets link it to our binary

__use_share.c__
```c
#include <stdio.h>

//functions that are implemented in shared lib
int fun1();
int fun2();
int fun3(int a, int b);

int main() {
    fun1();
    fun2();
    fun3(1,2);
}
```

```
$ gcc use_share.c -o use_share -L. -lshare
$ ./use_share
./use_share: error while loading shared libraries: libshare.so: cannot open shared object file: No such file or directory
$ ldd ./use_share
	linux-vdso.so.1 (0x00007ffedcad5000)
	libshare.so => not found
	libc.so.6 => /usr/lib/libc.so.6 (0x00007f7b99a00000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f7b99c90000)
```

The library is not in the search path

```
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`
$ ./use_share
$ ldd use_share
	linux-vdso.so.1 (0x00007fffc415c000)
	libshare.so => /your/path/libshare.so (0x00007f48b03c6000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007f48b0000000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f48b03d2000)
```

Another way is to set a custom library search location in the executable itself. Lets set it to search in the current directory, so there is no need to modify LD_LIBRARY_PATH

```
$ gcc use_share.c -o use_share -L. -lshare -Wl,-rpath=./
$ ldd ./use_share
	linux-vdso.so.1 (0x00007fff5c964000)
	libshare.so => ./libshare.so (0x00007f791000f000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007f790fc00000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f791001b000)
```

So now the executable loads libshare from the local directory. Of course it is also possible to install the shared library into the system's /usr/lib

### Static library

### Static binary

A static binary doesn't use any shared libraries, so it is possible to build it once and distribute it to other systems without the need to install dependencies.
```c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    return 0;
}
```

The first step is to compile the file and see that it is dynamically linked

```
$ gcc static_elf.c -o static_elf
$ file static_elf
static_elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bc6ac706075874858e1c4a8accf77e704f4ea25a, for GNU/Linux 4.4.0, with debug_info, not stripped
$ ldd ./static_elf
	linux-vdso.so.1 (0x00007ffccef49000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007fcbb8800000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fcbb8b63000)
```

After adding the static option we can verify that the tools now report it as statically linked. The size of the binary increases, as all functions required to run the executable are now contained in the binary.

```
$ gcc static_elf.c -static -o static_elf
$ file static_elf
static_elf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=c54d2e4d2a3d11fe920bee9a44af045c6f67ab56, for GNU/Linux 4.4.0, with debug_info, not stripped
$ ldd static_elf
	not a dynamic executable
```

A statically compiled file should work on most Linux systems of the same architecture.

## Basic usage

### File manipulation with libc

Create a file and write data to it using libc functions

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    FILE *f = fopen("file.txt","w+");
    if (!f) return 1;
    char *s = "Hello";
    fwrite(s,1,strlen(s),f);
    fclose(f);
}
```

Open the file and read the data back

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    FILE *f = fopen("file.txt","r");
    if (!f) return 1;
    char buf[128];
    size_t r;
    //leave room for the terminating zero
    r = fread(buf,1,sizeof(buf)-1,f);
    buf[r] = 0;
    printf("->%s\n",buf);
    fclose(f);
}
```

### File manipulation with syscalls

Now lets do the same without libc file functions, using the syscall function to invoke syscalls directly. It is also straightforward to rewrite this example in assembly.
```c
#include <unistd.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <string.h>

int main(void) {
    int fd = syscall(SYS_open, "sys.txt", O_CREAT|O_WRONLY, S_IRWXU|S_IRGRP|S_IXGRP);
    char s[] = "hello syscall\n";
    syscall(SYS_write, fd, s, strlen(s));
    syscall(SYS_close, fd);
    return 0;
}
```

Read data from the file

```c
#include <unistd.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <string.h>

int main(void) {
    int fd = syscall(SYS_open, "sys.txt", O_RDONLY);
    char s[128];
    //leave room for the terminating zero
    int r = syscall(SYS_read, fd, s, sizeof(s)-1);
    if (r < 0) r = 0;
    s[r] = 0;
    syscall(SYS_close, fd);
    //fd 1 is stdout
    syscall(SYS_write, 1, s, r);
    return 0;
}
```

## Advanced topics

### Kernel module

The Linux kernel, the macOS kernel and the *BSD kernels are written in C, so it is possible to write kernel modules in C for some of them. The example at the link may need adjusting to match the specifics of the local distribution.

http://main.lv/writeup/kernel_hello_world.md

### Linking

Linking is one of the most interesting parts of compiling C code. When an object file is created, it contains functions and variables of different kinds, and linking tries to resolve all of them. So it is possible to have fun with linking and with the content of object files.

The first example is a piece of C code that can be compiled to an object file, but cannot be resolved into an executable.

__link_elf.c__
```c
//declared here, defined in another object file
void fun1();
void fun2();

int main() {
    fun1();
    fun2();
}
```

```
gcc -c link_elf.c
```

fun1 and fun2 end up marked as undefined in the object file. Trying to link it into an executable on its own fails, because they cannot be found.
The symbol table of link_elf.o shows them as UND

```
$ readelf -a link_elf.o
Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_elf.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000    31 FUNC    GLOBAL DEFAULT    1 main
     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun2
```

So lets create one more object file

__link_fun1.c__
```c
#include <stdio.h>

void fun1() {
    printf("Hello fun1\n");
}

void fun2() {
    printf("Hello fun2\n");
}
```

Now we have an object file with the functions defined, and we see that it in turn has an undefined printf/puts function.

```
$ readelf -a link_fun1.o
Symbol table '.symtab' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
     6: 0000000000000016    22 FUNC    GLOBAL DEFAULT    1 fun2
```

We can merge both of those files together

```shell
gcc -o link_elf link_elf.o link_fun1.o
```

Functions in object files carry no information about their input and output types. That is why anything can be linked as long as the name matches. Lets rewrite the code like this

```c
#include <stdio.h>

int fun1(int i) {
    printf("Hello fun1\n");
    return 0;
}

int fun2(int i) {
    printf("Hello fun2\n");
    return 0;
}
```

And this links without issue. Treat this as two sets of names that are merged together; only a few things are known at link time. Return type and function arguments aren't exposed when the object file is created.

Two object files can also define functions with the same name, as long as at most one of them is global. (GCC's `__attribute__ ((alias("fun1")))` is a separate mechanism: it has to be attached to a declaration, and its target must be defined in the same file.)

__link_fun2.c__
```c
#include <stdio.h>

//static gives fun2 local binding, other object files cannot see it
static void fun2() {
    printf("hello 2\n");
}
```

Now the function is local.
```
Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    LOCAL  DEFAULT    1 fun2
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
```

Lets link all the objects into an executable. The fun2 from link_fun2.o isn't used in this case,

```
$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
$ ./link_elf
Hello fun1
Hello fun2
```

Lets switch which of the two **fun2** functions is local

```
link_fun1.o
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 000000000000001d    29 FUNC    LOCAL  DEFAULT    1 fun2
     5: 0000000000000000    29 FUNC    GLOBAL DEFAULT    1 fun1
     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts

link_fun2.o
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun2
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
```

```
$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
$ ./link_elf
Hello fun1
hello 2
```

So all of this plays a role in linking object files. There is a more interesting utility called ld, which does things at a lower level than gcc.

```
$ ldd ./link_elf
	linux-vdso.so.1 (0x00007ffd8f5d0000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007f96c0e00000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f96c10d2000)
```

We didn't specify any of those libraries when we compiled the executable, gcc passed them to the linker for us. Lets use **ld** directly for that.
```
ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 link_elf.o link_fun1.o link_fun2.o -o link_elf
```

A complete link usually also needs the C runtime start files (crt1.o, crti.o, crtn.o) and -lc. Their paths depend on the distribution.

### Extern

### Attributes

### Creating shared library

### Create static libraries

### Join all objects together

### Compile with musl

### Inspect elf files

### No standard library

### Memory leaks

### Code coverage

### Profiling

### Canary

### Atomic

### Multithreading

### Write plugins

## Embedding C

### Embed in C++

### Embed in Go

### Embed in Swift

### Embed in JS

### Lua in C

### Python in C

## Multiplatform

### Cross compile

### Different flags

### Check architecture

### ARMv8

### AVR8

### Emscripten

## Graphics

### SDL2

### GTK

### OpenGL

### Shaders

### Generate image