Diffstat (limited to 'md/notes/undefined_c/tutorial.md')
-rw-r--r-- | md/notes/undefined_c/tutorial.md | 1749 |
1 files changed, 1749 insertions, 0 deletions
diff --git a/md/notes/undefined_c/tutorial.md b/md/notes/undefined_c/tutorial.md new file mode 100644 index 0000000..731d42c --- /dev/null +++ b/md/notes/undefined_c/tutorial.md @@ -0,0 +1,1749 @@ +title:Undefined C +keywords:c,linux,asm + +# Undefined C + +The code here can be run in an online C compiler such as https://www.onlinegdb.com/online_c_compiler +or locally. The examples were checked with the gcc compiler. There are many small tricks around running C code +in practice that aren't covered in generic tutorials, so here is a list of topics that may come up while +writing real C code outside of tutorials. Each case gets just a small example, although each of them could +take a whole chapter on its own. + +## Compile + + +__hello_world.c__ +```c +#include <stdio.h> + +int main() { + printf("Hello world\n"); +} +``` + +```bash +gcc hello_world.c -o hello_world +gcc -m32 hello_world.c -o hello_world_32 #for 32bit target +``` + +## Syntax + +### Variables + +Standard list of available types + +#### Check type size + +Every type has a size measured in bytes. Some types, such as int and long, are machine dependent. +If machine-independent types are needed, use int32_t/uint32_t/int64_t/uint64_t. + +The machine-dependent types can have a different size on each 8bit/16bit/32bit/64bit architecture. + +Use __sizeof()__ + +Running on x86 machine +```c +#include <stdint.h> +#include <stdlib.h> +#include <stdio.h> +int main() { + /* sizeof yields size_t, which is printed with %zu */ + printf("Sizeof int %zu\n",sizeof(int)); + printf("Sizeof int32_t %zu\n",sizeof(int32_t)); + printf("Sizeof int64_t %zu\n",sizeof(int64_t)); + printf("Sizeof long %zu\n",sizeof(long)); + printf("Sizeof long long %zu\n",sizeof(long long)); +} +``` + +The safest and most portable way is to use the [u]int[8/16/32/64]_t types.
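To make the point about fixed-width types concrete, here is a minimal sketch. It assumes a C11 compiler and 8-bit bytes; the helper name `format_u64` is made up for illustration. The `PRIu64` macro from `<inttypes.h>` expands to the right `printf` conversion for `uint64_t`, so the same source should print correctly on 32bit and 64bit targets.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Compile-time checks: with 8-bit bytes the exact-width types have the same size everywhere. */
_Static_assert(sizeof(int32_t) == 4, "int32_t is 4 bytes");
_Static_assert(sizeof(uint64_t) == 8, "uint64_t is 8 bytes");

/* Hypothetical helper: writes v as decimal text into buf and returns
   the number of characters the text needs, not counting the final NUL. */
int format_u64(char *buf, size_t len, uint64_t v) {
    return snprintf(buf, len, "%" PRIu64, v);
}
```

Using `%lu` or `%llu` directly would tie the format string to one platform's definition of `uint64_t`; the macro avoids that.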
+ +The macros that give the maximum and minimum values of each type are listed at + +https://en.cppreference.com/w/c/types/limits + +```c +#include <limits.h> +#include <stdio.h> +int main() { + printf("INT_MIN %d\n",INT_MIN); + printf("INT_MAX %d\n", INT_MAX); + printf("LONG_MIN %ld\n",LONG_MIN); +} +``` + +Example from AVR __stdint.h__ +https://github.com/avrdudes/avr-libc/blob/main/include/stdint.h +Example from glibc +https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/stdint.h + + + +#### How to shoot the leg + +When code is supposed to run on both 32bit and 64bit platforms, the size of a type may vary. +This case needs to be taken into account. + + + + + +### Functions + +Function syntax; there is nothing surprising about functions + +``` +<RETURN_TYPE> <NAME>(<TYPE> <NAME>,..) { + <EXPR> +} +``` + +Write a simple function + +```c +int fun1() { + return -1; +} +``` + +A function can have multiple return statements. +Here is an example where a function has 3 return statements. +```c +int fun2(int i) { + if (i<0) return -1; + if (i>0) return 1; + return 0; +} +``` + +Get the address of a function + +```c +/* %p expects a void pointer; casting a function pointer to void * works on POSIX platforms */ +printf("fun1 address %p\n",(void *)fun1); +``` + +### If statement + +```c +if (<CONDITION>) ; +if (<CONDITION>) {} +``` + +One way to check the value returned by a function for an error is + +```c +if ((c = getfun()) == 0) { +} +``` + +The simplest, and rather old-fashioned, place where this shows up is reading input from the command line +```c +#include <stdio.h> +int main() { + int c; /* int, not char, so that EOF can be told apart from real characters */ + char ch; + while ((c = getchar()) != EOF ) { + ch = c; + printf("Typed character %c\n",ch); + } +} +``` + +### For cycle + +The for loop is one that may involve some trickery, but its basic form is +as simple as + +```c +for (<INITIAL>;<CONTINUE CONDITION>;<AFTER CYCLE>) { +} +``` + +Go over values from 1 till 10 + +```c +int i=0; +for (i=1;i<=10;i++) { + printf("%d\n",i); +} +``` + +Now lets do it from 10 till 1 + +```c +int i=0; +for (i=10;i>0;i--) { + printf("%d\n",i); +} +``` + +Now lets make a one liner + +```c +/* the increment runs before printf, so this prints 1 till 10 */ +for (i=0;i<10;i++,printf("%d\n",i)); +``` + +Yes, it is possible to write as many expressions as needed.
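The comma operator also lets one loop drive two counters at once. A small sketch, with `reverse_ints` as an illustrative name that is not part of the original text:

```c
#include <stddef.h>

/* Reverses arr in place. The init clause declares both counters;
   the third clause uses the comma operator to step them together:
   i walks up from the front while j walks down from the back. */
void reverse_ints(int *arr, size_t n) {
    if (n < 2) {
        return;              /* nothing to swap, and n - 1 must not wrap */
    }
    for (size_t i = 0, j = n - 1; i < j; i++, j--) {
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
}
```

Calling `reverse_ints(a, 4)` on `{1,2,3,4}` leaves `a` holding `{4,3,2,1}`.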
+ + +### Structure + +A structure combines several types under one new type. It is a convenient way to group a set +of types and reuse them as one. + +```c +struct struct1 { + uint8_t a; + uint16_t b; + uint32_t c; + uint64_t d; +}; +``` + +The intuitive total size of the structure would be the sum of its members, but the real size can be larger +```c +int total_size = sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint64_t); /* 15 */ +int real_size = sizeof(struct struct1); /* 16 on amd64 because of padding */ +``` + +Members are placed inside the structure so that access to them is fast. Some CPU instructions require +aligned memory addresses, or pay a penalty when accessing unaligned members, so the compiler inserts padding. + +To directly mess with the alignment of types use the attribute +```c +__attribute__ ((aligned (8))) +``` + + +Use the packed attribute to remove padding, so the layout does not depend on the architecture's alignment rules. + +```c +struct struct2 { + uint8_t a; + uint16_t b; + uint32_t c; + uint64_t d; +} __attribute__((packed)); +``` + +Now lets check the size of the structure after it is packed + +```c +int new_size = sizeof(struct struct2); /* 15 */ +``` + +It is also possible to add alignment to each item in the structure +```c +struct struct3 { + uint8_t a __attribute__((aligned (8))); + uint16_t b __attribute__((aligned (8))); + uint32_t c __attribute__((aligned (8))); + uint64_t d __attribute__((aligned (8))); +} __attribute__((aligned (8))); +``` + +Now the size of the structure will be 32. + +All results are from amd64, other architectures may differ. + +### How to shoot leg +Forget that struct size is not consistent across architectures. + +### Recursion + +Recursion is a technique that can be useful for writing shorter code +and dealing with cycles. One thing recursion suffers from is that it consumes +stack memory, and the stack has a default size limit on each platform. + +```c +#include <stdio.h> +#include <stdlib.h> + +int fun_r(int i) { + printf("val %d\n",i); + fun_r(i+1); + return 0; +} + +int main() +{ + fun_r(0); +} +``` + +The program will fail once it runs out of stack. +After increasing the default stack limit it gets further before failing.
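One hedge against running off the end of the stack is to give the recursion an explicit depth limit. The sketch below assumes the caller picks a sensible `max_depth`; `fun_r_limited` is an illustrative variant of `fun_r`, not a replacement for raising the stack limit.

```c
/* Recurses like fun_r, but stops at max_depth and returns the depth reached. */
int fun_r_limited(int i, int max_depth) {
    if (i >= max_depth) {
        return i;            /* base case: stop before the stack runs out */
    }
    return fun_r_limited(i + 1, max_depth);
}
```

With a base case the function always returns, so `fun_r_limited(0, 1000)` yields 1000 instead of crashing.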
+ + +Check the default stack size + +``` +ulimit -s +``` + +Set the stack size + +``` +ulimit -s 16384 +``` + +### Macro + +Macros are useful for many things, and there are many macro tricks for emitting +useful pieces of code. + +Define values, as if they were an enum. +```c +#define VAL_0 0 +#define VAL_1 1 +#define VAL_LAST VAL_1 +``` + +Multiline macro +```c +#define INC_FUN(TYPE) TYPE inc_##TYPE(TYPE a){\ + TYPE c=1;\ + return a + c;\ +} + +INC_FUN(int) +INC_FUN(char) +INC_FUN(double) +INC_FUN(notype) /* expands fine with -E, but does not compile: notype is not a type */ +``` + +To check the macro expansion run + +``` +gcc -E <SOURCE_FILE> +``` + + + +http://main.lv/writeup/c_macro_tricks.md + + +https://jadlevesque.github.io/PPMP-Iceberg/ + + +### Pointers + +One of C's most loved features is pointers. They allow access to addresses without any sanity check +and they do not track the lifetime of what they point to, so anything is possible with them. + +A pointer contains an address, which is interpreted according to the pointer type + +```c +int c; +int *ptr=&c; +``` + +Go over array of chars +```c +#include <stdio.h> +#include <stdlib.h> + +int main() { + char s[]="asd"; + char *c=s; /* the array decays to a pointer to its first element */ + while (*c != 0) { + printf("Next char %c addr %p\n",*c,(void *)c); + c++; + } +} +``` +Go over array of ints +```c + int i=0; + int arr[] = {9,7,5,3,1}; + int *ptr = arr; + while (i<5) { + printf("Number value %d addr %p\n",*ptr, (void *)ptr); + ptr++; + i++; + } +``` + +Pointer arithmetic like +1 moves to the next address, offset by the size of the type. +In the example below the structure size is 12, and incrementing a pointer to that structure +advances the address by sizeof the structure. And yes, the address points to unmapped memory, so it +will segfault if accessed.
+ +```c +#include <stdio.h> + +struct size12 { + int a,b,c; +}; + +int main() { + struct size12 *s=0; + s++; /* only the address is printed, the pointer is never dereferenced */ + printf("%p\n",(void *)s); + s++; + printf("%p\n",(void *)s); +} +``` + +Double pointers are pointers to pointers + +```c +#include <stdio.h> + +int main(int argc, char **argv) { + char *arg = argv[0]; + printf("Program name %s\n",arg); +} +``` + +#### How to shoot the leg +Increment a pointer in a while loop with no end condition. It will stop only when it segfaults. + +Leave a pointer uninitialized and it will hold an indeterminate value. + + + +### Allocate memory + +From the program's perspective, memory allocation adds an address range that the program can address. + +Every malloc should be paired with a free, otherwise the program leaks memory. + +```c +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +int main() { + char *c = malloc(16); + int *arr = malloc(16*sizeof(int)); + if (c == NULL || arr == NULL) { /* malloc returns NULL on failure */ + free(c); + free(arr); + return 1; + } + memset(c,0,16); + memset(arr,0,16*sizeof(int)); + free(c); + free(arr); +} +``` + +### Signed/Unsigned + +Signed and unsigned variables differ only in how one bit is interpreted, but they behave differently at their minimal and maximal values.
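One caveat before running the example that follows: unsigned arithmetic wraps around by definition, while overflowing a signed `int` is undefined behaviour. The wrapped value printed for `i` is what gcc on x86 typically produces, not something the standard promises. A sketch of checking first, with illustrative helper names:

```c
#include <limits.h>
#include <stdbool.h>

/* Stores v + 1 in *out and returns true, or returns false if that would overflow. */
bool checked_inc(int v, int *out) {
    if (v == INT_MAX) {
        return false;        /* v + 1 would be undefined behaviour */
    }
    *out = v + 1;
    return true;
}

/* Unsigned increment is always defined: UINT_MAX + 1 wraps to 0. */
unsigned int wrapping_inc(unsigned int u) {
    return u + 1u;
}
```

The check happens before the addition on purpose: testing `v + 1 < v` afterwards would itself rely on the overflow it is trying to detect.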
+ + +```c +#include <stdio.h> +#include <limits.h> +int main() +{ + int i=INT_MAX; + unsigned int u=UINT_MAX; + + printf("i=%d\n",i); + printf("u=%u\n",u); + + i++; + u++; + printf("i=%d\n",i); + printf("u=%u\n",u); + i=0; + u=0; + i--; + u--; + printf("i=%d\n",i); + printf("u=%u\n",u); + +} +``` + +### Endianess + + +```c +#include <stdlib.h> +#include <stdio.h> +#include <fcntl.h> +#include <unistd.h> +#include <stdint.h> + +int main() { + int arr[4] = {0x00112233,0x44556677,0x8899AABB, 0xCCDDEEFF}; + printf("%08x\n",arr[0]); + printf("%08x\n",arr[1]); + printf("%08x\n",arr[2]); + printf("%08x\n",arr[3]); + + FILE *f = fopen("int.hex","w+"); + fprintf(f,"%08x",arr[0]); + fprintf(f,"%08x",arr[1]); + fprintf(f,"%08x",arr[2]); + fprintf(f,"%08x",arr[3]); + fclose(f); + + int fd=open("int.bin",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO); + write(fd,arr,sizeof(arr)); + close(fd); + + int i; + fd = open("int.bin2",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO); + for (i=0;i<4;i++) { + uint32_t val = (arr[i]>>16) &0x0000ffff; + val += (arr[i]<<16)&0xffff0000; + write(fd,&val,sizeof(uint32_t)); + } + close(fd); +} +``` + +While saving formated values to file you will get what you expect +``` +$ cat int.hex +00112233445566778899aabbccddeeff +``` + +Saving just memory dump of all values, will give you different result +``` +$ hexdump int.bin +0000000 2233 0011 6677 4455 aabb 8899 eeff ccdd +0000010 +``` + +Need to swap 16bit pairs to look same as value memory dump +``` +$ hexdump int.bin2 +0000000 0011 2233 4455 6677 8899 aabb ccdd eeff +0000010 +``` + +### Compiler flags + +Compiler have whole list of command line arguments that you can enable for different purposes, lets look into some of them +https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html + +Lets try to apply some of the flags to examples above. + +Best starte options is, those will give you more warnings. 
+ +``` +-Wall -Wextra +``` + +Most of the examples here were written in a sloppy style, so adding extra checks like these will find more issues in the code; probably +all of the provided examples will show issues with these extra compiler flags + +``` +-Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference +``` + +To get all macros expanded in C code add this compiler flag. The output will be C source with all macros expanded +``` +-E +``` + +To output generated assembly instead of a binary add +``` +-S +``` + +More readable output can be obtained with + +``` +gcc FILE.c -Wa,-adhln=FILE.S -g -fverbose-asm -masm=intel +``` + +Basic compiler optimisation flags that can speed up the program or make it smaller + +``` +-O -O0 -O1 -O2 -O3 -Os -Ofast -Og -Oz +``` + +https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options + +https://panthema.net/2013/0124-GCC-Output-Assembler-Code/ +https://blogs.oracle.com/linux/post/making-code-more-secure-with-gcc-part-1 + + +### Shared library + +A shared library is a common way to reuse big chunks of code. + +```c +#include <stdio.h> +int fun1() { + return 1; +} + +int fun2() { + printf("Function name fun2\n"); + return 0; +} + +int fun3(int a, int b) { + return a+b; +} +``` + +``` +$ gcc -c -fPIC lib_share.c +$ gcc -shared -o libshare.so lib_share.o +$ ldd libshare.so + linux-vdso.so.1 (0x00007ffdb994d000) + libc.so.6 => /usr/lib/libc.so.6 (0x00007f0c39400000) + /usr/lib64/ld-linux-x86-64.so.2 (0x00007f0c39835000) +``` + +Now lets link it to our binary +```c +#include <stdio.h> + +//functions that are implemented in shared lib +int fun1(); +int fun2(); +int fun3(int a, int b); + +int main() { + fun1(); + fun2(); + fun3(1, 2); +} +``` + +``` +$ gcc -L. 
-lshare use_share.c -o use_share +./use_share +./use_share: error while loading shared libraries: libshare.so: cannot open shared object file: No such file or directory +ldd ./use_share + linux-vdso.so.1 (0x00007ffedcad5000) + libshare.so => not found + libc.so.6 => /usr/lib/libc.so.6 (0x00007f7b99a00000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f7b99c90000) +``` + +Library is not in search path +``` +$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd` +$ ./use_share +$ ldd use_share + linux-vdso.so.1 (0x00007fffc415c000) + libshare.so => /your/path/libshare.so (0x00007f48b03c6000) + libc.so.6 => /usr/lib/libc.so.6 (0x00007f48b0000000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f48b03d2000) +``` + +Other way is to set custom library search location. Lets set it to search in current directory. +And no need to modify LD_LIBRARY_PATH + +``` +$ gcc use_share.c -o use_share -L. -lshare -Wl,-rpath=./ +$ ldd ./use_share + linux-vdso.so.1 (0x00007fff5c964000) + libshare.so => ./libshare.so (0x00007f791000f000) + libc.so.6 => /usr/lib/libc.so.6 (0x00007f790fc00000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f791001b000) +``` + +So now executable runs libshare from local directory. Ofc there is possible to install shared library into systems /usr/lib + +### Static library + + + + +### Static binary + +Static binary don't use any shared libraries, and its possible to built it once and distribute on other platforms +without need to install dependencies. 
+ + +```c +#include <stdio.h> +#include <stdlib.h> + +int main(int argc, char **argv) { + return 0; +} +``` + +The first step is to compile the file and see that it is dynamically linked +``` +$ gcc static_elf.c -o static_elf +$ file static_elf +static_elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bc6ac706075874858e1c4a8accf77e704f4ea25a, for GNU/Linux 4.4.0, with debug_info, not stripped +$ ldd ./static_elf + linux-vdso.so.1 (0x00007ffccef49000) + libc.so.6 => /usr/lib/libc.so.6 (0x00007fcbb8800000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fcbb8b63000) + +``` + +After adding the static option we can verify that the tools now report it as statically linked. The size of the binary increases because all functions +required to run the executable are now contained in the binary. + +``` +$ gcc static_elf.c -static -o static_elf +$ file static_elf +static_elf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=c54d2e4d2a3d11fe920bee9a44af045c6f67ab56, for GNU/Linux 4.4.0, with debug_info, not stripped +$ ldd static_elf + not a dynamic executable +``` + +A statically compiled file should run on most Linux systems of the same architecture.
+ + + +### Atomic +HERE + +### Multithreading +HERE + + +<!-- +### stdin,stdout,stderr +### Styles + +---> + + + + +## Basic usage + +### File manipulation with libc + +Create a file and write data to it using libc functions + +```c +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +int main() { + FILE *f = fopen("file.txt","w+"); + if (f == NULL) return 1; /* fopen returns NULL on failure */ + char *s = "Hello"; + fwrite(s,1,strlen(s),f); + fclose(f); +} +``` + +Open the file and read the data back + +```c +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +int main() { + FILE *f = fopen("file.txt","r"); + if (f == NULL) return 1; + char buf[128]; + size_t r; + r = fread(buf,1,sizeof(buf)-1,f); /* leave room for the terminating 0 */ + buf[r] = 0; + printf("->%s\n",buf); + fclose(f); +} +``` + +### File manipulation with syscalls + +Now lets do the same without libc wrappers, using the syscall function to invoke syscalls directly. +It is also straightforward to rewrite this example in assembly. + +```c +#include <unistd.h> +#include <fcntl.h> +#include <sys/stat.h> +#include <sys/syscall.h> +#include <string.h> + +int main(void) { + int fd = syscall(SYS_open, "sys.txt", O_CREAT|O_WRONLY, S_IRWXU|S_IRGRP|S_IXGRP); + char s[] = "hello syscall\n"; + syscall(SYS_write, fd, s, strlen(s)); + syscall(SYS_close, fd); + return 0; +} +``` + + +Read data from file + +```c +#include <unistd.h> +#include <fcntl.h> +#include <sys/syscall.h> +#include <string.h> + +int main(void) { + int fd = syscall(SYS_open, "sys.txt", O_RDONLY); + char s[128]; + int r = syscall(SYS_read, fd, s, sizeof(s)-1); + if (r < 0) r = 0; /* read failed, print nothing */ + s[r] = 0; + syscall(SYS_close, fd); + syscall(SYS_write, 1, s, r); /* fd 1 is stdout */ + return 0; +} +``` + +## Advanced topics + +### Kernel module + +The Linux kernel, the macOS kernel and the *BSD kernels are written in C, +so it is possible to write kernel modules in C for some of those. + +The example may need adjusting for specifics of the local distribution. + +```c + +``` + +http://main.lv/writeup/kernel_hello_world.md + +### Linking + +Linking is one of the most interesting parts of compiling C code.
When object file is created +it contains functions and variables that can be of different type. And linking tries to resolve +all of those. So there is possible to have fun with linking and content of object files. + + +First example is piece of C code that can be compiled to object file, but it will not able to +resolve to executable. +``` +gcc -c link_elf.c +``` +```c +int main() { + fun1(); + fun2(); +} +``` +So we can see that fun1 and fun2 are marked as undefined in object file. If we try compile it will not able to find those. +So lets create one more object file +``` +$ readelf -a link_elf.o + +Symbol table '.symtab' contains 6 entries: + Num: Value Size Type Bind Vis Ndx Name + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS link_elf.c + 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text + 3: 0000000000000000 31 FUNC GLOBAL DEFAULT 1 main + 4: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND fun1 + 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND fun2 + +``` +__link_fun1.c__ +```c +void fun1() { + printf("Hello fun1\n"); +} +void fun2() { + printf("Hello fun2\n"); +} +``` + +So now we have object file with funtions that are defined. and we see that its now have undefine pritnf/puts function there. + +``` +readelf -a link_fun1.o +Symbol table '.symtab' contains 7 entries: + Num: Value Size Type Bind Vis Ndx Name + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS link_fun1.c + 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text + 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata + 4: 0000000000000000 22 FUNC GLOBAL DEFAULT 1 fun1 + 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts + 6: 0000000000000016 22 FUNC GLOBAL DEFAULT 1 fun2 + +``` + +we can merge both of those files together +```shell +gcc -o link_elf link_elf.o link_fun1.o +``` +The function in object files dont have any idea about input output types. 
That why anything can be linked that just match name +lets rewrite code like this + +```c +int fun1(int i) { + printf("Hello fun1\n"); +} +int fun2(int i) { + printf("Hello fun2\n"); +} +``` +And this links without issue. Theat this as 2 sets that are merge together only few thins know when linking things. +Return type, and function arguments arent exposed when object file is created. + +Functions can have aliases. + +__link_fun2.c__ + +```c +static void fun2() { + printf("hello 2\n"); +} __attribute__ ((alias("fun1"))); +``` + +Now function is local. + +``` +Symbol table '.symtab' contains 6 entries: + Num: Value Size Type Bind Vis Ndx Name + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS link_fun2.c + 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text + 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata + 4: 0000000000000000 22 FUNC LOCAL DEFAULT 1 fun2 + 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts +``` + +Lets compile all object to executable. 
And the function fun2 isnt used in this case, + + + +``` +$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf +$ ./link_elf +Hello fun1 +Hello fun2 + +``` + + + +lets witch aliasing between 2 functions **fun2** + + +``` +link_fun1.o + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS link_fun1.c + 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text + 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata + 4: 000000000000001d 29 FUNC LOCAL DEFAULT 1 fun2 + 5: 0000000000000000 29 FUNC GLOBAL DEFAULT 1 fun1 + 6: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts + +link_fun2.o + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS link_fun2.c + 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text + 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata + 4: 0000000000000000 22 FUNC GLOBAL DEFAULT 1 fun2 + 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts + +``` + +``` +$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf +$ ./link_elf +Hello fun1 +hello 2 +``` + +So all of this plays role in linking object files. +There is more interesting utilit called ld its doing things on lower level then gcc. + + +### Extern + +### Attributes +PASS +### Creating shared library +PASS +### Create static libraries +PASS +### Join all objects together +PASS +### Compile with musl + +The libc is not the only option as standard c library, there is few others one of them is musl + +``` +$ musl-gcc hello_world.c -o hello_world +$ file ./hello_world +hello_world_musl: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, not stripped +``` + + +### Inspect elf files + +There is few utilities that help to check if elf file is ok. 
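A very rough first check can also be done by hand: every ELF file starts with the four magic bytes 0x7f 'E' 'L' 'F'. The sketch below only tests that prefix and says nothing about whether the rest of the file is valid; `has_elf_magic` is an illustrative name, not a library function.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the ELF magic number, 0 otherwise. */
int has_elf_magic(const unsigned char *buf, size_t len) {
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

In practice the buffer would hold the first bytes read from the file, for example with `fread`.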
+ +ldd show what kind of shared libraries elf will try to load + +``` +$ ldd hello_world + linux-vdso.so.1 (0x00007fffcb2ae000) + libc.so.6 => /usr/lib/libc.so.6 (0x00007ffb80c00000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ffb80fb9000) + +``` + +Readelf allows to inspect content of elf files, headers and interpret values in headers. +In few example above we allready used that feature to check content of compiled objectfiles. + +``` +$ readelf -s ./hello_world +Symbol table '.symtab' contains 37 entries: + Num: Value Size Type Bind Vis Ndx Name + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS abi-note.c + 2: 000000000000039c 32 OBJECT LOCAL DEFAULT 4 __abi_tag + 3: 0000000000000000 0 FILE LOCAL DEFAULT ABS init.c + 4: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c + 5: 0000000000001070 0 FUNC LOCAL DEFAULT 14 deregister_tm_clones + 6: 00000000000010a0 0 FUNC LOCAL DEFAULT 14 register_tm_clones + 7: 00000000000010e0 0 FUNC LOCAL DEFAULT 14 __do_global_dtors_aux + 8: 0000000000004030 1 OBJECT LOCAL DEFAULT 25 completed.0 + 9: 0000000000003df0 0 OBJECT LOCAL DEFAULT 20 __do_global_dtor[...] + 10: 0000000000001130 0 FUNC LOCAL DEFAULT 14 frame_dummy + 11: 0000000000003de8 0 OBJECT LOCAL DEFAULT 19 __frame_dummy_in[...] + 12: 0000000000000000 0 FILE LOCAL DEFAULT ABS hello_world.c + 13: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c + 14: 00000000000020b0 0 OBJECT LOCAL DEFAULT 18 __FRAME_END__ + 15: 0000000000000000 0 FILE LOCAL DEFAULT ABS + 16: 0000000000003df8 0 OBJECT LOCAL DEFAULT 21 _DYNAMIC + 17: 0000000000002010 0 NOTYPE LOCAL DEFAULT 17 __GNU_EH_FRAME_HDR + 18: 0000000000004000 0 OBJECT LOCAL DEFAULT 23 _GLOBAL_OFFSET_TABLE_ + 19: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_mai[...] + 20: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterT[...] 
+ 21: 0000000000004020 0 NOTYPE WEAK DEFAULT 24 data_start + 22: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 + 23: 0000000000004030 0 NOTYPE GLOBAL DEFAULT 24 _edata + 24: 0000000000001154 0 FUNC GLOBAL HIDDEN 15 _fini + 25: 0000000000004020 0 NOTYPE GLOBAL DEFAULT 24 __data_start + 26: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__ + 27: 0000000000004028 0 OBJECT GLOBAL HIDDEN 24 __dso_handle + 28: 0000000000002000 4 OBJECT GLOBAL DEFAULT 16 _IO_stdin_used + 29: 0000000000004038 0 NOTYPE GLOBAL DEFAULT 25 _end + 30: 0000000000001040 38 FUNC GLOBAL DEFAULT 14 _start + 31: 0000000000004030 0 NOTYPE GLOBAL DEFAULT 25 __bss_start + 32: 0000000000001139 26 FUNC GLOBAL DEFAULT 14 main + 33: 0000000000004030 0 OBJECT GLOBAL HIDDEN 24 __TMC_END__ + 34: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMC[...] + 35: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@G[...] + 36: 0000000000001000 0 FUNC GLOBAL HIDDEN 12 _init +``` + +### No standard library + +Lets write hello world without libc. + +__noc.c__ +```c +void _start() { + +} +``` + +``` +$ gcc -c noc.c +$ ld -dynamic-linker /lib/ld-linux.so.2 noc.o -o noc +$ file noc +noc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped +``` + +Next step to make it more working then segfaulting. 
+ +```c +void _start() { + /* exit(0) through the 32-bit int 0x80 interface: eax=1 is exit, ebx is the status */ + asm ( \ + "movl $1,%eax\n" \ + "xor %ebx,%ebx\n" \ + "int $128\n" \ + ); +} +``` + +From here on it is all about calling syscalls + +Lets print a message +```c +signed int write(int fd, const void *buf, unsigned int size) +{ + signed int ret; + /* "0"(1) puts syscall number 1 (write on x86-64) into the same register as ret (eax) */ + asm volatile + ( + "syscall" + : "=a" (ret) + // EDI RSI RDX + : "0"(1), "D"(fd), "S"(buf), "d"(size) + : "rcx", "r11", "memory" + ); + return ret; +} + +void _start() { + write(1,"no libc\n",8); /* 8 bytes: 7 characters plus the newline */ + asm ( \ + "movl $1,%eax\n" \ + "xor %ebx,%ebx\n" \ + "int $128\n" \ + ); +} +``` + +http://main.lv/writeup/making_c_executables_smaller.md + +### Memory leaks + +Memory leaks are a crucial topic in C. The usual case is +allocated memory that was not freed after use. If the amount of such memory keeps growing, +it can eventually fill the whole memory and the system becomes unresponsive. Here is a simple example +of how a memory leak is created and how to detect it. + +```c +#include <stdlib.h> + +int main() { + + char *ptr = malloc(12); + + return 0; +} +``` + +The best way to detect it is to use valgrind. + +``` +$ valgrind ./malloc + +==778== HEAP SUMMARY: +==778== in use at exit: 12 bytes in 1 blocks +==778== total heap usage: 2 allocs, 1 frees, 1,036 bytes allocated +``` + +The summary shows 2 allocs and 1 free, and 12 bytes are still in use at exit, so the leak we created is detected. +Now a more complex example: a leaking function that is called 5 times. In a larger code +base it would be nice to see the location of the leaks.
+ +```c +#include <stdlib.h> + +int* mem_alloc(int sz) { + int *ret=NULL; + + if (sz < 0) { + return NULL; + } + + ret = malloc(sz*sizeof(int)); + + if (sz>10) { + return NULL; + } + + return ret; + +} + +int main() { + + mem_alloc(0); + + free(mem_alloc(1)); + + mem_alloc(100); + + free(mem_alloc(2)); + + mem_alloc(10); + + return 0; +} +``` + +There is 3 blocks that leaks, and we see where its comming from there is possible to guess but it would better +to have position of where leak located. + +``` +valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2 + +==4974== HEAP SUMMARY: +==4974== in use at exit: 440 bytes in 3 blocks +==4974== total heap usage: 5 allocs, 2 frees, 452 bytes allocated +==4974== +==4974== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3 +==4974== at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) +==4974== by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2) +==4974== by 0x10919E: main (in /home/fam/prog/c/undefined_c/memleak2) +==4974== +==4974== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3 +==4974== at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) +==4974== by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2) +==4974== by 0x1091D6: main (in /home/fam/prog/c/undefined_c/memleak2) +==4974== +==4974== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3 +==4974== at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) +==4974== by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2) +==4974== by 0x1091BA: main (in /home/fam/prog/c/undefined_c/memleak2) +==4974== +==4974== LEAK SUMMARY: +==4974== definitely lost: 440 bytes in 3 blocks +==4974== indirectly lost: 0 bytes in 0 blocks +==4974== possibly lost: 0 bytes in 0 blocks +==4974== still reachable: 0 bytes in 0 blocks +==4974== suppressed: 0 bytes in 0 blocks +``` + +Add compilation option __g3__ + +``` +gcc -g3 
memleak2.c -o memleak2 +``` + +Now it shows source lines and trace from where the leaking code where called. Thats looks better now. + +``` +valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2 + +==5073== HEAP SUMMARY: +==5073== in use at exit: 440 bytes in 3 blocks +==5073== total heap usage: 5 allocs, 2 frees, 452 bytes allocated +==5073== +==5073== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3 +==5073== at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) +==5073== by 0x109179: mem_alloc (memleak2.c:10) +==5073== by 0x10919E: main (memleak2.c:22) +==5073== +==5073== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3 +==5073== at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) +==5073== by 0x109179: mem_alloc (memleak2.c:10) +==5073== by 0x1091D6: main (memleak2.c:30) +==5073== +==5073== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3 +==5073== at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) +==5073== by 0x109179: mem_alloc (memleak2.c:10) +==5073== by 0x1091BA: main (memleak2.c:26) +==5073== +==5073== LEAK SUMMARY: +==5073== definitely lost: 440 bytes in 3 blocks +==5073== indirectly lost: 0 bytes in 0 blocks +==5073== possibly lost: 0 bytes in 0 blocks +==5073== still reachable: 0 bytes in 0 blocks +==5073== suppressed: 0 bytes in 0 blocks +==5073== +``` + + +### Code coverage + +Compile file with extra flags and generate gcov file output. +Ther is only one branch not used. Coverage should show with part isnt used. 
+ +```c +#include <stdio.h> + +int fun1(int a) { + if (a < 0) { + printf("Smaller then zero\n"); + } + if (a==0) { + printf("Equails to zero\n"); + } + if (a>0) { + printf("Bigger then zero\n"); + } +} + +int main() { + + printf("Start\n"); + fun1(0); + fun1(1); + + return 0; +} +``` + +``` +$ gcc -fprofile-arcs -ftest-coverage coverage.c -o coverage +$ gcov ./coverage +File 'coverage.c' +Lines executed:92.31% of 13 +Creating 'coverage.c.gcov' + +Lines executed:92.31% of 13 + +``` + +Gcov file content. The program has to be run once before gcov so that the coverage data file exists. Lines marked with ##### were never executed. + +```c + -: 0:Source:coverage.c + -: 0:Graph:coverage.gcno + -: 0:Data:coverage.gcda + -: 0:Runs:1 + -: 1:#include <stdio.h> + -: 2: + 2: 3:int fun1(int a) { + 2: 4: if (a < 0) { + #####: 5: printf("Smaller then zero\n"); + -: 6: } + 2: 7: if (a==0) { + 1: 8: printf("Equails to zero\n"); + -: 9: } + 2: 10: if (a>0) { + 1: 11: printf("Bigger then zero\n"); + -: 12: } + 2: 13:} + -: 14: + 1: 15:int main() { + -: 16: + 1: 17: printf("Start\n"); + 1: 18: fun1(0); + 1: 19: fun1(1); + -: 20: + 1: 21: return 0; + -: 22:} +``` + +### Profiling + +Some parts of code can take a substantial amount of time and those parts need to be identified. + +```c +#include <stdio.h> +#include <stdlib.h> +#include <math.h> + +void slow_sin() { + float r=0.0f; + for (int i=0;i<10000000;i++) { + r += sinf(M_PI/8); + } +} + +void slower_sin() { + double r=0.0; + for (int i=0;i<10000000;i++) { + r += sin(M_PI/8); + } +} + +void fast_sin() { + float pre_calc = sinf(M_PI/8); + float r = 0.0f; + for (int i=0;i<10000000;i++) { + r += pre_calc; + } +} + +int main() { + slow_sin(); + slower_sin(); + fast_sin(); +} +``` + +Compile and run with profiling. Running the program writes the profile to gmon.out + +``` +gcc -pg perf_speed.c -o perf_speed -lm +./perf_speed +gprof perf_speed gmon.out +``` + +### Sanitizer + +C, as a great language, has interesting features in its standard, such as undefined behaviour. +It is also possible to overwrite any data you want with your code.
One of the favorite +mistakes is a buffer overrun. It's possible to catch this type of error with +stack protection + +So in the code below it is possible to write more than 8 characters into an array of size 8. This is because there is no boundary check. +The C runtime will be able to detect this kind of thing. + + +```c +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +void fun(char *str,int size) { + char local_var[8]; + memcpy(local_var, str, size); + printf("Whats inside a stack? %s\n",local_var); +} + +int main() { + char some_str1[] = "Hello!"; + char some_str2[] = "Hello all!!!"; + + fun(some_str1,strlen(some_str1)); + fun(some_str2,strlen(some_str2)); +} +``` + +``` +Whats inside a stack? Hello! +Whats inside a stack? Hello all!!! +*** stack smashing detected ***: terminated +fish: Job 1, './stack_overrun' terminated by signal SIGABRT (Abort) +``` + +If this isn't happening, it is possible to add __-fstack-protector__ to the compile flags. + +C has a whole list of undefined behaviours incorporated in the standard +https://en.cppreference.com/w/c/language/behavior + + + +In function __f__ the variable __a__ isn't initialized when __x__ is 0, so it's undefined behaviour, but there will still be some value. Run it a few +times and __f(0)__ may return a different value each time. +```c +#include <stdio.h> + +size_t f(int x) +{ + size_t a; + if(x) // either x nonzero or UB + a = 42; + return a; +} + +int main() { + printf("%zu\n",f(0)); + printf("%zu\n",f(1)); + printf("%zu\n",f(42)); +} +``` + +Division by zero. Function __f__ doesn't check if the divisor is 0. The program is going to abort.
+Add the flag __-fsanitize=integer-divide-by-zero__ and it will be detected at runtime + +```c +#include <stdio.h> + +size_t f(int x) +{ + return 10/x; +} + +int main() { + printf("%zu\n",f(0)); + printf("%zu\n",f(1)); + printf("%zu\n",f(42)); +} +``` + +``` +undefined_b.c:5:14: runtime error: division by zero +fish: Job 1, './undefined_b' terminated by signal SIGFPE (Floating point exception) +``` + +<!-- +### FARMA-C +---> + + + + +### Write plugins +### Preload library + + +## Embedding C + +Most programming languages support embedding C. Because C has very simple +function naming when compiled to object format, it is an easy target when +linking with other languages. Most other languages have incompatible naming for +functions when compiled to binary. + +### Embed in C++ + +__lib.h__ +```c +#include <stdlib.h> +#include <stdio.h> + +int fun_secret_1(); +``` + +__lib.c__ +```c +#include "lib.h" + +int fun_secret_1() { + printf("Hello from C\n"); + return -1; +} +``` + +The first thing to notice when the file is compiled with C++ is that the function name is in a different format +than when it's compiled with C. +``` +$ g++ -c lib.c +$ readelf -s lib.o +Symbol table '.symtab' contains 6 entries: + Num: Value Size Type Bind Vis Ndx Name + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS lib.c + 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text + 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata + 4: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 _Z12fun_secret_1v + 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts + +``` + +Let's tell C++ that the file is C language by adding **extern "C"** + +__lib.h__ +```c +#include <stdlib.h> +#include <stdio.h> + +extern "C" { + +int fun_secret_1(); + +} +``` + +__lib.c__ +```c +#include "lib.h" + +extern "C" { + +int fun_secret_1() { + printf("Hello from C\n"); + return -1; +} + +} +``` + +Now the compiled object file has C function names.
+ +``` +$ g++ lib.c -c +$ readelf -s lib.o + +Symbol table '.symtab' contains 6 entries: + Num: Value Size Type Bind Vis Ndx Name + 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND + 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS lib.c + 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text + 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata + 4: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 fun_secret_1 + 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts + +``` + +__cppembed.cpp__ +```cpp +#include "lib.h" + +int main() { + fun_secret_1(); +} +``` + +Doing it the opposite way, running C++ from C +[/writeup/wraping_c_plus_plus_exceptions_templates_and_classes_in_c.md](/writeup/wraping_c_plus_plus_exceptions_templates_and_classes_in_c.md) + +### Embed in Go + +__lib.h__ +```c +#include <stdlib.h> +#include <stdio.h> + +int fun_secret_1(); +``` +__lib.c__ +```c +#include "lib.h" + +int fun_secret_1() { + printf("Hello from C\n"); + return -1; +} +``` + +__main.go__ +```go +package main +// #cgo CFLAGS: -g -Wall +// #include <stdlib.h> +// #include "lib.h" +import "C" +import ( + "fmt" + +) +func main() { + fmt.Println("Start program") + C.fun_secret_1() + fmt.Println("End program") +} +``` + +``` +go build +``` + +[https://karthikkaranth.me/blog/calling-c-code-from-go/](https://karthikkaranth.me/blog/calling-c-code-from-go/) + + +### Embed in Swift + +[/writeup/linux_hello_world_in_swift.md](/writeup/linux_hello_world_in_swift.md) + +### Embed in Rust + +__lib.c__ +```c +#include <stdio.h> +#include <stdlib.h> + +int fun_secret_1() { + printf("Hello from C\n"); + return -1; +} +``` + +```rust +use std::os::raw::c_int; + +extern "C" { + fn fun_secret_1() -> c_int; // matches the C declaration: int fun_secret_1() +} + +//rustc main.rs -o hello +fn main() { + println!("Start program"); + unsafe { fun_secret_1(); } // return value is ignored + println!("End program"); +} +``` + +Compile with + +``` +gcc -c lib.c +gcc -shared lib.o -o liblib.so +rustc main.rs -l lib -L .
-o hello -C link-arg="-Wl,-rpath=./" +``` + +[https://dev.to/xphoniex/how-to-call-c-code-from-rust-56do](https://dev.to/xphoniex/how-to-call-c-code-from-rust-56do) + +### Lua in C + +[/writeup/embedding_lua_in_c.md](/writeup/embedding_lua_in_c.md) + +### Python in C + + + +## Multiplatform + +### Different flags + +### Check architecture + + + +```c +``` + +### AArch64 + +https://snapshots.linaro.org/gnu-toolchain/13.0-2022.08-1/aarch64-linux-gnu/ + +Download any version of gcc and extract it + +Add the bin directory location to the PATH environment variable + +``` +export PATH=$PATH:`pwd` +``` + +__main.c__ +```c +#include <stdio.h> + +int main() { + printf("Hello world arm64\n"); +} +``` + +``` +$ aarch64-linux-gnu-gcc main.c -o main +$ ./main +qemu-aarch64: Could not open '/lib/ld-linux-aarch64.so.1': No such file or directory +$ file ./main +./main: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=12448d90030e2ad23dbe6b7bc82a4fa7b7de9659, for GNU/Linux 3.7.0, with debug_info, not stripped +``` + +Download the sysroot image from the Linaro page. +Running + +``` +strace ./main +``` + +showed that the searched path for libraries is + +``` +/usr/gnemul/qemu-aarch64/lib/ +``` + +Found the missing libc and ld-linux-aarch64 inside the sysroot archive and copied them to the searched location, and now the AArch64 binary is running. + +``` +$ ./main +Hello world arm64 +``` + +### AVR8 + +AVR is an 8bit CPU that is quite popular with hobbyists. As a baremetal device it doesn't have full libc support, +and needs some setup before it's possible to do basic things with it.
+ +__avr_echo.c__ +```c +#include <avr/io.h> + +#define FOSC 16000000UL +#define BAUD 9600 +#define MYUBRR (FOSC/16/BAUD-1) + +void USART_Init( unsigned int ubrr) +{ + UBRRH = (unsigned char)(ubrr>>8); + UBRRL = (unsigned char)ubrr; + UCSRB = (1<<RXEN)|(1<<TXEN); + UCSRC = (1<<URSEL)|(1<<USBS)|(3<<UCSZ0); +} + +int main() +{ + char c; + USART_Init( MYUBRR ); + while(1) + { + while ( !(UCSRA & (1<<RXC))){}; + c = UDR; + while (!(UCSRA & (1<<UDRE))){}; + UDR = c; + } + return 0; +} +``` + +``` +avr-gcc avr_echo.c -mmcu=atmega16 -Wall -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -o avr_echo.out +``` + +The next step would be to program it, in case you have an ISPv2 programmer and an ATmega16 chip + +```bash +avr-objdump -s --disassemble avr_echo.out > avr_echo.s +avr-objcopy -j .text -O ihex avr_echo.out avr_echo.hex +avrdude -pm16 -cavrispv2 -Pusb -U flash:w:avr_echo.hex +``` + +### Emscripten + +[/writeup/web_assembly_sdl_example.md](/writeup/web_assembly_sdl_example.md) + + + + + + +