title:Undefined C
keywords:c,linux,asm

# Undefined C

It is possible to run the pieces of code below inside an online C compiler such as https://www.onlinegdb.com/online_c_compiler or to run them locally. All examples were checked with the gcc compiler.

There are many small tricks around running C code in practice that are not covered in generic tutorials, so here is a list of topics that may arise while writing real C code outside of tutorials. Each case gets just a small example, although each of them could take a whole chapter on its own.

## Compile

__hello_world.c__
```c
#include <stdio.h>

int main() {
    printf("Hello world\n");
}
```

```bash
gcc hello_world.c -o hello_world
gcc -m32 hello_world.c -o hello_world_32 #for 32bit target
```

## Syntax

### Variables

Standard list of available types

#### Check type size

All types have a size that is expressed in bytes. Some types are machine dependent, like int/long: 8bit/16bit/32bit/64bit architectures may give them different sizes. If machine independent types are needed, there are int32_t/uint32_t/int64_t/uint64_t.

Use __sizeof()__ to check the size. Running on x86 machine

```c
#include <stdio.h>
#include <stdint.h>

int main() {
    /* sizeof yields a size_t, which is printed with %zu */
    printf("Sizeof int %zu\n",sizeof(int));
    printf("Sizeof int32_t %zu\n",sizeof(int32_t));
    printf("Sizeof int64_t %zu\n",sizeof(int64_t));
    printf("Sizeof long %zu\n",sizeof(long));
    printf("Sizeof long long %zu\n",sizeof(long long));
}
```

The safest and most portable way is to use the [u]int[8/16/32/64]_t types.

The macros that give the min and max values of each type are listed at https://en.cppreference.com/w/c/types/limits

```c
#include <stdio.h>
#include <limits.h>

int main() {
    printf("INT_MIN %d\n",INT_MIN);
    printf("INT_MAX %d\n", INT_MAX);
    printf("LONG_MIN %ld\n",LONG_MIN);
}
```

Example from AVR __stdint.h__ https://github.com/avrdudes/avr-libc/blob/main/include/stdint.h

Example from Libc https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/stdint.h

#### How to shoot the leg

When code is supposed to run on both 32bit and 64bit platforms, the size of a type may vary. For example, long is 4 bytes on 32bit Linux and 8 bytes on 64bit Linux. This case needs to be taken into account.
### Functions

Function syntax, there is nothing interesting about functions

```
<return type> <name>(<arguments>,..) { <body> }
```

Write a simple function

```c
int fun1() {
    return -1;
}
```

A function can have multiple return statements. Here is an example where the function has 3 return statements.

```c
int fun2(int i) {
    if (i<0) return -1;
    if (i>0) return 1;
    return 0;
}
```

Get the address of a function

```c
/* %p is the portable conversion for printing addresses */
printf("fun1 address %p\n",(void *)&fun1);
```

### If statement

```c
if (<condition>) <statement>;
if (<condition>) { <statements> }
```

One of the ways to check the error value returned by a function is

```c
if ((c = getfun()) == 0) {
}
```

The simplest and most outdated use of this is reading input from the command line

```c
#include <stdio.h>

int main() {
    int c;
    char ch;
    /* getchar returns int so that EOF can be told apart from a character */
    while ((c = getchar()) != EOF ) {
        ch = c;
        printf("Typed character %c\n",ch);
    }
}
```

### For cycle

The for loop is one that may involve some trickery, it is as simple as

```c
for (;;) {
}
```

Go over values from 1 to 10

```c
int i=0;
for (i=1;i<=10;i++) {
    printf("%d\n",i);
}
```

Now let's do it from 10 down to 1

```c
int i=0;
for (i=10;i>0;i--) {
    printf("%d\n",i);
}
```

Now let's make a one-liner

```c
for (i=0;i<10;i++,printf("%d\n",i));
```

Yes, it is possible to write as many expressions as needed, separated by the comma operator.

### Structure

A structure combines several types under one new type. It is a convenient way to group a set of types and reuse them as one.

```c
struct struct1 {
    uint8_t a;
    uint16_t b;
    uint32_t c;
    uint64_t d;
};
```

The intuitive total size of the structure would be

```c
int total_size = sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint64_t);
int real_size = sizeof(struct struct1);
```

The compiler places members inside the structure so that access to them is fast, inserting padding where needed. Some CPU instructions require aligned memory addresses, or pay a penalty on unaligned access. On amd64 the intuitive sum above is 15 bytes while sizeof reports 16.

To directly mess with the alignment of types use the attribute

```c
__attribute__ ((aligned (8)))
```

Use the packed attribute to remove the padding, so the layout does not depend on the alignment rules of the architecture.
```c
struct struct2 {
    uint8_t a;
    uint16_t b;
    uint32_t c;
    uint64_t d;
} __attribute__((packed));
```

Now let's check the size of the structure after it is packed

```c
int new_size = sizeof(struct struct2);
```

It is also possible to add alignment to each member of the structure

```c
struct struct3 {
    uint8_t a __attribute__((aligned (8)));
    uint16_t b __attribute__((aligned (8)));
    uint32_t c __attribute__((aligned (8)));
    uint64_t d __attribute__((aligned (8)));
} __attribute__((aligned (8)));
```

Now the size of the structure will be 32. All results are from amd64, other architectures may differ.

### How to shoot leg

Forget that struct size is not consistent.

### Recursion

Recursion is a technique that can be useful to write shorter code and to replace loops. The one thing recursion suffers from is that it consumes stack memory, and the stack has a default limit on each platform.

```c
#include <stdio.h>

int fun_r(int i) {
    printf("val %d\n",i);
    fun_r(i+1);
    return 0;
}

int main() {
    fun_r(0);
}
```

The program will fail once it runs out of stack. When the default stack limit is increased it gets further.

Check the default stack size

```
ulimit -s
```

Set the stack size

```
ulimit -s 16384
```

### Macro

Many things are useful as macros, and there are many macro tricks to emit useful parts of code.

Define values, as if it were an enum.

```c
#define VAL_0 0
#define VAL_1 1
#define VAL_LAST VAL_1
```

Multiline macro

```c
#define INC_FUN(TYPE) TYPE inc_##TYPE(TYPE a){\
    TYPE c=1;\
    return a + c;\
}

INC_FUN(int)
INC_FUN(char)
INC_FUN(double)
INC_FUN(notype) /* notype is not a real type: it expands with -E but does not compile */
```

To check the code expansion of a macro run

```
gcc -E
```

http://main.lv/writeup/c_macro_tricks.md

https://jadlevesque.github.io/PPMP-Iceberg/

### Pointers

One of the most loved features of C is pointers. They allow access to addresses without any sanity check and they do not have any lifetime tracking, so anything is possible with them.
A pointer contains an address, which is interpreted according to the pointer type

```c
int c;
int *ptr=&c;
```

Go over an array of chars

```c
#include <stdio.h>

int main() {
    char s[]="asd";
    char *c=s; /* an array decays to a pointer to its first element */
    while (*c != 0) {
        printf("Next char %c addr %p\n",*c,(void *)c);
        c++;
    }
}
```

Go over an array of ints

```c
int i=0;
int arr[] = {9,7,5,3,1};
int *ptr = arr;
while (i<5) {
    printf("Number value %d addr %p\n",*ptr, (void *)ptr);
    ptr++;
    i++;
}
```

Pointer arithmetic like +1 moves to the next address, offset by the size of the pointed-to type. In the example below the structure size is 12, and incrementing a pointer to that structure increases the address by the size of the structure. And yes, the address points to unmapped memory, so it will segfault if accessed.

```c
#include <stdio.h>

struct size12 {
    int a,b,c;
};

int main() {
    struct size12 *s=0;
    s++;
    printf("%p\n",(void *)s);
    s++;
    printf("%p\n",(void *)s);
}
```

Double pointers are pointers to pointers

```c
#include <stdio.h>

int main(int argc, char **argv) {
    char *arg = argv[0];
    printf("Program name %s\n",arg);
}
```

#### How to shoot the leg

Increment a pointer in a while loop without a stop condition. It will stop only when it segfaults.

Do not initialize a pointer and it will have a random value.

### Allocate memory

From the program's perspective, memory allocation adds an address range that the executable is allowed to access. Every malloc should be accompanied by a free, otherwise the program leaks memory.

```c
#include <stdlib.h>
#include <string.h>

int main() {
    char *c = malloc(16);
    memset(c,0,16);
    int *arr = malloc(16*sizeof(int));
    memset(arr,0,16*sizeof(int));
    free(c);
    free(arr);
}
```

### Signed/Unsigned

Signed and unsigned variables differ just in the interpretation of one bit, but they behave differently at their minimal and maximal values. Unsigned arithmetic wraps around; signed overflow is undefined behaviour according to the standard, although on common platforms it usually wraps as well.
```c
#include <stdio.h>
#include <limits.h>

int main() {
    int i=INT_MAX;
    unsigned int u=UINT_MAX;
    printf("i=%d\n",i);
    printf("u=%u\n",u);
    i++; /* signed overflow: undefined behaviour */
    u++; /* unsigned overflow: wraps to 0 */
    printf("i=%d\n",i);
    printf("u=%u\n",u);
    i=0;
    u=0;
    i--;
    u--;
    printf("i=%d\n",i);
    printf("u=%u\n",u);
}
```

### Endianness

```c
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main() {
    /* uint32_t, because 0x8899AABB and 0xCCDDEEFF do not fit into a signed int */
    uint32_t arr[4] = {0x00112233,0x44556677,0x8899AABB, 0xCCDDEEFF};
    printf("%08x\n",arr[0]);
    printf("%08x\n",arr[1]);
    printf("%08x\n",arr[2]);
    printf("%08x\n",arr[3]);

    FILE *f = fopen("int.hex","w+");
    fprintf(f,"%08x",arr[0]);
    fprintf(f,"%08x",arr[1]);
    fprintf(f,"%08x",arr[2]);
    fprintf(f,"%08x",arr[3]);
    fclose(f);

    int fd=open("int.bin",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
    write(fd,arr,sizeof(arr));
    close(fd);

    int i;
    fd = open("int.bin2",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
    for (i=0;i<4;i++) {
        /* swap the upper and lower 16 bits */
        uint32_t val = (arr[i]>>16) &0x0000ffff;
        val += (arr[i]<<16)&0xffff0000;
        write(fd,&val,sizeof(uint32_t));
    }
    close(fd);
}
```

When saving formatted values to a file you get what you expect

```
$ cat int.hex
00112233445566778899aabbccddeeff
```

Saving just a memory dump of all values gives a different result, because on a little-endian machine the least significant byte is stored first and hexdump prints 16bit words

```
$ hexdump int.bin
0000000 2233 0011 6677 4455 aabb 8899 eeff ccdd
0000010
```

The 16bit pairs need to be swapped for the dump to look the same as the values

```
$ hexdump int.bin2
0000000 0011 2233 4455 6677 8899 aabb ccdd eeff
0000010
```

### Compiler flags

The compiler has a whole list of command line arguments that can be enabled for different purposes, let's look into some of them https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html

Let's try to apply some of the flags to the examples above.

The best options to start with are the following, they will give you more warnings.
```
-Wall -Wextra
```

Most of the examples here were written in a sloppy style, so extra checks like the ones below will find more issues. Probably every example provided here will show warnings with these extra compiler flags

```
-Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference
```

To get all macros expanded in C code add the flag below. The output will be C source with all macros expanded

```
-E
```

To output generated assembly instead of a binary add

```
-S
```

More readable output can be obtained with

```
gcc FILE.c -Wa,-adhln=FILE.S -g -fverbose-asm -masm=intel
```

Basic compiler optimisation flags that can speed up the program or make it smaller

```
-O -O0 -O1 -O2 -O3 -Os -Ofast -Og -Oz
```

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options

https://panthema.net/2013/0124-GCC-Output-Assembler-Code/

https://blogs.oracle.com/linux/post/making-code-more-secure-with-gcc-part-1

### Shared library

A shared library is a common way to reuse big chunks of code.

```c
#include <stdio.h>

int fun1() {
    return 1;
}

int fun2() {
    printf("Function name fun2\n");
    return 0;
}

int fun3(int a, int b) {
    return a+b;
}
```

The output file has to be named libshare.so so that the linker can find it with -lshare later.

```
$ gcc -c -fPIC lib_share.c
$ gcc -shared -o libshare.so lib_share.o
$ ldd libshare.so
    linux-vdso.so.1 (0x00007ffdb994d000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f0c39400000)
    /usr/lib64/ld-linux-x86-64.so.2 (0x00007f0c39835000)
```

Now let's link it to our binary

```c
#include <stdio.h>

//functions that are implemented in shared lib
int fun1();
int fun2();
int fun3(int a, int b);

int main() {
    fun1();
    fun2();
    fun3(1, 2);
}
```

```
$ gcc use_share.c -o use_share -L. -lshare
$ ./use_share
./use_share: error while loading shared libraries: libshare.so: cannot open shared object file: No such file or directory
$ ldd ./use_share
    linux-vdso.so.1 (0x00007ffedcad5000)
    libshare.so => not found
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f7b99a00000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f7b99c90000)
```

The library is not in the search path

```
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`
$ ./use_share
$ ldd use_share
    linux-vdso.so.1 (0x00007fffc415c000)
    libshare.so => /your/path/libshare.so (0x00007f48b03c6000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f48b0000000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f48b03d2000)
```

The other way is to set a custom library search location at link time. Let's set it to search in the current directory, so there is no need to modify LD_LIBRARY_PATH

```
$ gcc use_share.c -o use_share -L. -lshare -Wl,-rpath=./
$ ldd ./use_share
    linux-vdso.so.1 (0x00007fff5c964000)
    libshare.so => ./libshare.so (0x00007f791000f000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f790fc00000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f791001b000)
```

So now the executable loads libshare from the local directory. Of course it is also possible to install the shared library into the system's /usr/lib

### Static library

### Static binary

A static binary does not use any shared libraries, so it is possible to build it once and distribute it to other machines without the need to install dependencies.
```c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    return 0;
}
```

The first step is to compile the file and see that it is dynamically linked

```
$ gcc static_elf.c -o static_elf
$ file static_elf
static_elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bc6ac706075874858e1c4a8accf77e704f4ea25a, for GNU/Linux 4.4.0, with debug_info, not stripped
$ ldd ./static_elf
    linux-vdso.so.1 (0x00007ffccef49000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fcbb8800000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fcbb8b63000)
```

After adding the static option we can verify that the tools now report it as statically linked. The size of the binary increased, as all functions required to run the executable are now contained in the binary.

```
$ gcc static_elf.c -static -o static_elf
$ file static_elf
static_elf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=c54d2e4d2a3d11fe920bee9a44af045c6f67ab56, for GNU/Linux 4.4.0, with debug_info, not stripped
$ ldd static_elf
    not a dynamic executable
```

A statically compiled file should work on most Linux systems of the same architecture.

### Atomic

HERE

### Multithreading

HERE

## Basic usage

### File manipulation with libc

Create a file and write data to it using libc functions

```c
#include <stdio.h>
#include <string.h>

int main() {
    FILE *f = fopen("file.txt","w+");
    char *s = "Hello";
    fwrite(s,1,strlen(s),f);
    fclose(f);
}
```

Open the file and read the data back

```c
#include <stdio.h>
#include <string.h>

int main() {
    FILE *f = fopen("file.txt","r");
    char buf[128];
    size_t r;
    /* leave one byte for the terminating zero */
    r = fread(buf,1,sizeof(buf)-1,f);
    buf[r] = 0;
    printf("->%s\n",buf);
    fclose(f);
}
```

### File manipulation with syscalls

Now let's do the same without libc file functions, using the syscall function to invoke syscalls directly. It is also straightforward to rewrite the example in assembly.
```c
#include <unistd.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <string.h>

int main(void) {
    int fd = syscall(SYS_open, "sys.txt", O_CREAT|O_WRONLY, S_IRWXU|S_IRGRP|S_IXGRP);
    char s[] = "hello syscall\n";
    syscall(SYS_write, fd, s, strlen(s));
    syscall(SYS_close, fd);
    return 0;
}
```

Read data from the file

```c
#include <unistd.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <string.h>

int main(void) {
    int fd = syscall(SYS_open, "sys.txt", O_RDONLY);
    char s[128];
    /* leave one byte for the terminating zero */
    int r = syscall(SYS_read, fd, s, 127);
    if (r < 0) r = 0;
    s[r] = 0;
    syscall(SYS_close, fd);
    syscall(SYS_write, 1, s, r); /* fd 1 is stdout */
    return 0;
}
```

## Advanced topics

### Kernel module

The Linux kernel, the macOS kernel and the *BSD kernels are written in C, so it is possible to write kernel modules in C for some of them. The example may need adjusting to match the specifics of the local distribution.

```c
```

http://main.lv/writeup/kernel_hello_world.md

### Linking

Linking is one of the most interesting parts of compiling C code. When an object file is created it contains functions and variables (symbols) of different types, and linking tries to resolve all of them. So it is possible to have fun with linking and with the content of object files.

The first example is a piece of C code that can be compiled to an object file, but cannot be linked into an executable on its own.

```
gcc -c link_elf.c
```

```c
/* declared here, defined in another object file */
void fun1();
void fun2();

int main() {
    fun1();
    fun2();
}
```

In the object file fun1 and fun2 are marked as undefined (UND), as its symbol table shows. If we try to link it into an executable on its own, the linker will not be able to find them.
```
$ readelf -a link_elf.o
Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_elf.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000    31 FUNC    GLOBAL DEFAULT    1 main
     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun2
```

So let's create one more object file, this time with the missing functions

__link_fun1.c__
```c
#include <stdio.h>

void fun1() {
    printf("Hello fun1\n");
}

void fun2() {
    printf("Hello fun2\n");
}
```

Now we have an object file in which the functions are defined, and we see that it has an undefined puts symbol of its own (the compiler replaced printf with puts).

```
readelf -a link_fun1.o
Symbol table '.symtab' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
     6: 0000000000000016    22 FUNC    GLOBAL DEFAULT    1 fun2
```

We can merge both of those files together

```shell
gcc -o link_elf link_elf.o link_fun1.o
```

The functions in object files carry no information about their input and output types. That is why anything whose name matches can be linked. Let's rewrite the code like this

```c
#include <stdio.h>

int fun1(int i) {
    printf("Hello fun1\n");
    return 0;
}

int fun2(int i) {
    printf("Hello fun2\n");
    return 0;
}
```

And this links without issue. Treat this as two sets of symbols that are merged together; only a few things are known at link time. Return types and function arguments are not recorded when the object file is created.

Functions can have aliases, and static makes a function local to its object file.

__link_fun2.c__
```c
static void fun2() {
    printf("hello 2\n");
}
__attribute__ ((alias("fun1")));
```

Now the function is local.
```
Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    LOCAL  DEFAULT    1 fun2
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
```

Let's link all objects into an executable. The local fun2 from link_fun2.o is not used in this case,

```
$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
$ ./link_elf
Hello fun1
Hello fun2
```

Let's switch which of the two **fun2** functions is local

```
link_fun1.o
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 000000000000001d    29 FUNC    LOCAL  DEFAULT    1 fun2
     5: 0000000000000000    29 FUNC    GLOBAL DEFAULT    1 fun1
     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
link_fun2.o
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun2
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
```

```
$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
$ ./link_elf
Hello fun1
hello 2
```

So all of this plays a role in linking object files. There is also a more interesting utility called ld, which does things at a lower level than gcc.
### Extern

### Attributes

PASS

### Creating shared library

PASS

### Create static libraries

PASS

### Join all objects together

PASS

### Compile with musl

glibc is not the only option for a standard C library, there are a few others, one of them is musl

```
$ musl-gcc hello_world.c -o hello_world_musl
$ file ./hello_world_musl
hello_world_musl: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, not stripped
```

### Inspect elf files

There are a few utilities that help to check whether an elf file is ok.

ldd shows which shared libraries the elf file will try to load

```
$ ldd hello_world
    linux-vdso.so.1 (0x00007fffcb2ae000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007ffb80c00000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ffb80fb9000)
```

readelf allows inspecting the content of elf files, their headers, and interpreting the values in the headers. In a few examples above we already used it to check the content of compiled object files.

```
$ readelf -s ./hello_world
Symbol table '.symtab' contains 37 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS abi-note.c
     2: 000000000000039c    32 OBJECT  LOCAL  DEFAULT    4 __abi_tag
     3: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS init.c
     4: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
     5: 0000000000001070     0 FUNC    LOCAL  DEFAULT   14 deregister_tm_clones
     6: 00000000000010a0     0 FUNC    LOCAL  DEFAULT   14 register_tm_clones
     7: 00000000000010e0     0 FUNC    LOCAL  DEFAULT   14 __do_global_dtors_aux
     8: 0000000000004030     1 OBJECT  LOCAL  DEFAULT   25 completed.0
     9: 0000000000003df0     0 OBJECT  LOCAL  DEFAULT   20 __do_global_dtor[...]
    10: 0000000000001130     0 FUNC    LOCAL  DEFAULT   14 frame_dummy
    11: 0000000000003de8     0 OBJECT  LOCAL  DEFAULT   19 __frame_dummy_in[...]
    12: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello_world.c
    13: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    14: 00000000000020b0     0 OBJECT  LOCAL  DEFAULT   18 __FRAME_END__
    15: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
    16: 0000000000003df8     0 OBJECT  LOCAL  DEFAULT   21 _DYNAMIC
    17: 0000000000002010     0 NOTYPE  LOCAL  DEFAULT   17 __GNU_EH_FRAME_HDR
    18: 0000000000004000     0 OBJECT  LOCAL  DEFAULT   23 _GLOBAL_OFFSET_TABLE_
    19: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_mai[...]
    20: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterT[...]
    21: 0000000000004020     0 NOTYPE  WEAK   DEFAULT   24 data_start
    22: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5
    23: 0000000000004030     0 NOTYPE  GLOBAL DEFAULT   24 _edata
    24: 0000000000001154     0 FUNC    GLOBAL HIDDEN    15 _fini
    25: 0000000000004020     0 NOTYPE  GLOBAL DEFAULT   24 __data_start
    26: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    27: 0000000000004028     0 OBJECT  GLOBAL HIDDEN    24 __dso_handle
    28: 0000000000002000     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
    29: 0000000000004038     0 NOTYPE  GLOBAL DEFAULT   25 _end
    30: 0000000000001040    38 FUNC    GLOBAL DEFAULT   14 _start
    31: 0000000000004030     0 NOTYPE  GLOBAL DEFAULT   25 __bss_start
    32: 0000000000001139    26 FUNC    GLOBAL DEFAULT   14 main
    33: 0000000000004030     0 OBJECT  GLOBAL HIDDEN    24 __TMC_END__
    34: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMC[...]
    35: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@G[...]
    36: 0000000000001000     0 FUNC    GLOBAL HIDDEN    12 _init
```

### No standard library

Let's write hello world without libc.

__noc.c__
```c
void _start() {
}
```

```
$ gcc -c noc.c
$ ld -dynamic-linker /lib/ld-linux.so.2 noc.o -o noc
$ file noc
noc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
```

The next step is to make it do something other than segfault.
```c
void _start() {
    /* exit(0) through the 32bit int 0x80 syscall interface */
    asm (
        "movl $1,%eax\n"
        "xor %ebx,%ebx\n"
        "int $128\n"
    );
}
```

Now this is all about calling syscalls. Let's print a message

```c
signed int write(int fd, const void *buf, unsigned int size)
{
    signed int ret;
    asm volatile
    (
        "syscall"
        : "=a" (ret)
        /* rax=1 selects write   EDI      RSI       RDX */
        : "0"(1), "D"(fd), "S"(buf), "d"(size)
        : "rcx", "r11", "memory"
    );
    return ret;
}

void _start() {
    write(1,"no libc",8);
    asm (
        "movl $1,%eax\n"
        "xor %ebx,%ebx\n"
        "int $128\n"
    );
}
```

http://main.lv/writeup/making_c_executables_smaller.md

### Memory leaks

Memory leaks are a crucial part of the C language. The usual case is allocated memory that was not freed after use. If the amount of such memory keeps increasing, it can eventually fill the whole memory and the system becomes unresponsive. Here is a simple example of how a memory leak is created and how to detect it.

```c
#include <stdlib.h>

int main() {
    char *ptr = malloc(12);
    return 0;
}
```

The best way to detect it is to use valgrind.

```
$ valgrind ./malloc
==778== HEAP SUMMARY:
==778==     in use at exit: 12 bytes in 1 blocks
==778==   total heap usage: 2 allocs, 1 frees, 1,036 bytes allocated
```

It reports 2 allocs and 1 free, and 12 bytes are still in use after exit. So the leak we created is detected.

A more complex example: now we create a leaking function and call it 5 times. In a larger code base it would be nice to see the location of the leaks.

```c
#include <stdlib.h>

int* mem_alloc(int sz) {
    int *ret=NULL;

    if (sz < 0) {
        return NULL;
    }

    ret = malloc(sz*sizeof(int));

    if (sz>10) {
        return NULL;
    }

    return ret;
}

int main() {
    /* the calls with sz 0, 100 and 10 never free their block */

    mem_alloc(0);

    free(mem_alloc(1));

    mem_alloc(100);

    free(mem_alloc(2));

    mem_alloc(10);

    return 0;
}
```

There are 3 blocks that leak. From the output it is possible to guess where they come from, but it would be better to have the exact position of each leak.
```
valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2
==4974== HEAP SUMMARY:
==4974==     in use at exit: 440 bytes in 3 blocks
==4974==   total heap usage: 5 allocs, 2 frees, 452 bytes allocated
==4974==
==4974== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3
==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
==4974==    by 0x10919E: main (in /home/fam/prog/c/undefined_c/memleak2)
==4974==
==4974== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3
==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
==4974==    by 0x1091D6: main (in /home/fam/prog/c/undefined_c/memleak2)
==4974==
==4974== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3
==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
==4974==    by 0x1091BA: main (in /home/fam/prog/c/undefined_c/memleak2)
==4974==
==4974== LEAK SUMMARY:
==4974==    definitely lost: 440 bytes in 3 blocks
==4974==    indirectly lost: 0 bytes in 0 blocks
==4974==      possibly lost: 0 bytes in 0 blocks
==4974==    still reachable: 0 bytes in 0 blocks
==4974==         suppressed: 0 bytes in 0 blocks
```

Add the compilation option __-g3__

```
gcc -g3 memleak2.c -o memleak2
```

Now it shows source lines and the trace from where the leaking code was called. That looks better.
```
valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2
==5073== HEAP SUMMARY:
==5073==     in use at exit: 440 bytes in 3 blocks
==5073==   total heap usage: 5 allocs, 2 frees, 452 bytes allocated
==5073==
==5073== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3
==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5073==    by 0x109179: mem_alloc (memleak2.c:10)
==5073==    by 0x10919E: main (memleak2.c:22)
==5073==
==5073== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3
==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5073==    by 0x109179: mem_alloc (memleak2.c:10)
==5073==    by 0x1091D6: main (memleak2.c:30)
==5073==
==5073== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3
==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5073==    by 0x109179: mem_alloc (memleak2.c:10)
==5073==    by 0x1091BA: main (memleak2.c:26)
==5073==
==5073== LEAK SUMMARY:
==5073==    definitely lost: 440 bytes in 3 blocks
==5073==    indirectly lost: 0 bytes in 0 blocks
==5073==      possibly lost: 0 bytes in 0 blocks
==5073==    still reachable: 0 bytes in 0 blocks
==5073==         suppressed: 0 bytes in 0 blocks
==5073==
```

### Code coverage

Compile the file with extra flags, run the program once so that it writes its coverage data, and generate the gcov output. There is only one branch that is not used, and the coverage report should show which part that is.

```c
#include <stdio.h>

int fun1(int a) {
    if (a < 0) {
        printf("Smaller then zero\n");
    }
    if (a==0) {
        printf("Equails to zero\n");
    }
    if (a>0) {
        printf("Bigger then zero\n");
    }
}

int main() {

    printf("Start\n");
    fun1(0);
    fun1(1);

    return 0;
}
```

```
$ gcc -fprofile-arcs -ftest-coverage coverage.c -o coverage
$ gcov ./coverage
File 'coverage.c'
Lines executed:92.31% of 13
Creating 'coverage.c.gcov'

Lines executed:92.31% of 13
```

The content of the gcov file. Here we can see which line was not executed: it is marked with #####.
```c
        -:    0:Source:coverage.c
        -:    0:Graph:coverage.gcno
        -:    0:Data:coverage.gcda
        -:    0:Runs:1
        -:    1:#include <stdio.h>
        -:    2:
        2:    3:int fun1(int a) {
        2:    4:    if (a < 0) {
    #####:    5:        printf("Smaller then zero\n");
        -:    6:    }
        2:    7:    if (a==0) {
        1:    8:        printf("Equails to zero\n");
        -:    9:    }
        2:   10:    if (a>0) {
        1:   11:        printf("Bigger then zero\n");
        -:   12:    }
        2:   13:}
        -:   14:
        1:   15:int main() {
        -:   16:
        1:   17:    printf("Start\n");
        1:   18:    fun1(0);
        1:   19:    fun1(1);
        -:   20:
        1:   21:    return 0;
        -:   22:}
```

### Profiling

Some parts of the code can take a substantial amount of time, and those parts need to be identified.

```c
#include <math.h>

void slow_sin() {
    float r=0.0f;
    for (int i=0;i<10000000;i++) {
        r += sinf(M_PI/8);
    }
}

void slower_sin() {
    double r=0.0f;
    for (int i=0;i<10000000;i++) {
        r += sin(M_PI/8);
    }
}

void fast_sin() {
    float pre_calc = sinf(M_PI/8);
    float r = 0.0f;
    for (int i=0;i<10000000;i++) {
        r += pre_calc;
    }
}

int main() {
    slow_sin();
    slower_sin();
    fast_sin();
}
```

Compile and run with profiling. Running the program writes the profile data to gmon.out

```
gcc -pg perf_speed.c -o perf_speed -lm
./perf_speed
gprof perf_speed gmon.out
```

### Sanitizer

C as a great language has good features in the standard, such as undefined behaviour. It is also possible to overwrite any data you want with your code.

One of the favorite mistakes is a buffer overrun. It is possible to catch this type of error with stack protection.

In the code below it is possible to write more than 8 characters into an array of size 8, because there is no bounds check. The C runtime is able to detect this kind of thing.

```c
#include <stdio.h>
#include <string.h>

void fun(char *str,int size) {
    char local_var[8];
    memcpy(local_var, str, size);
    printf("Whats inside a stack? %s\n",local_var);
}

int main() {
    char some_str1[] = "Hello!";
    char some_str2[] = "Hello all!!!";
    /* +1 so that the terminating zero is copied too */
    fun(some_str1,strlen(some_str1)+1);
    fun(some_str2,strlen(some_str2)+1);
}
```

```
Whats inside a stack? Hello!
Whats inside a stack? Hello all!!!
*** stack smashing detected ***: terminated
fish: Job 1, './stack_overrun' terminated by signal SIGABRT (Abort)
```

If this is not happening, it is possible to add __-fstack-protector__ to the compile flags.

C has a whole list of undefined behaviours incorporated in the standard https://en.cppreference.com/w/c/language/behavior

In function __f__ below, the variable __a__ is not initialized when x is zero, so returning it is undefined behaviour, but there will still be some value. Run it a few times and __f(0)__ may return a new value each time.

```c
#include <stdio.h>

size_t f(int x)
{
    size_t a;
    if(x) // either x nonzero or UB
        a = 42;
    return a;
}

int main() {
    /* size_t is printed with %zu */
    printf("%zu\n",f(0));
    printf("%zu\n",f(1));
    printf("%zu\n",f(42));
}
```

Division by zero. Function __f__ does not check whether the divisor is 0, so the program is going to abort. Add the flag __-fsanitize=integer-divide-by-zero__ and it will be detected at runtime

```c
#include <stdio.h>

size_t f(int x)
{
    return 10/x;
}

int main() {
    printf("%zu\n",f(0));
    printf("%zu\n",f(1));
    printf("%zu\n",f(42));
}
```

```
undefined_b.c:5:14: runtime error: division by zero
fish: Job 1, './undefined_b' terminated by signal SIGFPE (Floating point exception)
```

### Write plugins

### Preload library

## Embedding C

Most programming languages support embedding C. Because C uses very simple function naming when compiled to object format (no name mangling), it is an easy target when linking with other languages. Most other languages use incompatible naming for functions when compiled to binary.

### Embed in C++

__lib.h__
```c
#include <stdio.h>
#include <stdlib.h>

int fun_secret_1();
```

__lib.c__
```c
#include "lib.h"

int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}
```

The first thing to notice is that when the file is compiled as C++, the function name is in a different format than when it is compiled as C.
```
$ g++ -c lib.c
$ readelf -s lib.o
Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS lib.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    26 FUNC    GLOBAL DEFAULT    1 _Z12fun_secret_1v
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
```

Let's tell C++ that the functions use C naming by adding **extern "C"**

__lib.h__
```c
#include <stdio.h>
#include <stdlib.h>

extern "C" {
int fun_secret_1();
}
```

__lib.c__
```c
#include "lib.h"

extern "C" {
int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}
}
```

Now the compiled object file has C function names.

```
$ g++ lib.c -c
$ readelf -s lib.o
Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS lib.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    26 FUNC    GLOBAL DEFAULT    1 fun_secret_1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
```

__cppembed.cpp__
```cpp
#include "lib.h"

int main() {
    fun_secret_1();
}
```

Doing it the opposite way, running C++ from C [/writeup/wraping_c_plus_plus_exceptions_templates_and_classes_in_c.md](/writeup/wraping_c_plus_plus_exceptions_templates_and_classes_in_c.md)

### Embed in Go

__lib.h__
```c
#include <stdio.h>
#include <stdlib.h>

int fun_secret_1();
```

__lib.c__
```c
#include "lib.h"

int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}
```

__main.go__
```go
package main

// #cgo CFLAGS: -g -Wall
// #include <stdlib.h>
// #include "lib.h"
import "C"
import (
    "fmt"
)

func main() {
    fmt.Println("Start program")
    C.fun_secret_1()
    fmt.Println("End program")
}
```

```
go build
```

[https://karthikkaranth.me/blog/calling-c-code-from-go/](https://karthikkaranth.me/blog/calling-c-code-from-go/)

### Embed in Swift
[/writeup/linux_hello_world_in_swift.md](/writeup/linux_hello_world_in_swift.md)

### Embed in Rust

__lib.c__
```c
#include <stdio.h>
#include <stdlib.h>

int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}
```

```rust
use std::os::raw::c_int;

extern "C" {
    // the C function returns int, so the declaration has to say so
    fn fun_secret_1() -> c_int;
}

//rustc main.rs -o hello
fn main() {
    println!("Start program");
    unsafe { fun_secret_1(); }
    println!("End program");
}
```

Compile with

```
gcc -c -fPIC lib.c
gcc -shared lib.o -o liblib.so
rustc main.rs -l lib -L . -o hello -C link-arg="-Wl,-rpath=./"
```

[https://dev.to/xphoniex/how-to-call-c-code-from-rust-56do](https://dev.to/xphoniex/how-to-call-c-code-from-rust-56do)

### Lua in C

[/writeup/embedding_lua_in_c.md](/writeup/embedding_lua_in_c.md)

### Python in C

## Multiplatform

### Different flags

### Check architecture

```c
```

### AArch64

https://snapshots.linaro.org/gnu-toolchain/13.0-2022.08-1/aarch64-linux-gnu/

Download any of the gcc versions and extract it.

Add the bin directory location to the PATH environment variable

```
export PATH=$PATH:`pwd`
```

__main.c__
```c
#include <stdio.h>

int main() {
    printf("Hello world arm64\n");
}
```

```
$ aarch64-linux-gnu-gcc main.c -o main
$ ./main
qemu-aarch64: Could not open '/lib/ld-linux-aarch64.so.1': No such file or directory
$ file ./main
./main: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=12448d90030e2ad23dbe6b7bc82a4fa7b7de9659, for GNU/Linux 3.7.0, with debug_info, not stripped
```

Download the sysroot image from the linaro page.

Running

```
strace ./main
```

showed that the libraries are searched for in

```
/usr/gnemul/qemu-aarch64/lib/
```

The missing libc and ld-linux-aarch64 were found inside the sysroot archive and copied to the searched location, and now the AArch64 binary runs.

```
$ ./main
Hello world arm64
```

### AVR8

AVR is an 8bit CPU that is quite popular among hobbyists. As a bare-metal device it does not have full libc support, and it needs some setup before it is possible to do basic things with it.
__avr_echo.c__ ```c #include #define FOSC 16000000UL #define BAUD 9600 #define MYUBRR FOSC/16/BAUD-1 void USART_Init( unsigned int ubrr) { UBRRH = (unsigned char)(ubrr>>8); UBRRL = (unsigned char)ubrr; UCSRB = (1< avr_echo.s avr-objcopy -j .text -O ihex avr_echo.out avr_echo.hex avrdude -pm16 -cavrispv2 -Pusb -U flash:w:avr_echo.hex ``` ### Emscripten [/writeup/web_assembly_sdl_example.md](/writeup/web_assembly_sdl_example.md)