From 51e6cef5d338c8ea74572f44c92dab9c0ac13dda Mon Sep 17 00:00:00 2001
From: FreeArtMan
Date: Fri, 26 Aug 2022 10:03:34 +0100
Subject: Renamed undefined c

---
 md/notes/undefined_c/titles.md   | 1749 --------------------------------------
 md/notes/undefined_c/tutorial.md | 1749 ++++++++++++++++++++++++++++++++++++++
 md/random.md                     |    2 +-
 3 files changed, 1750 insertions(+), 1750 deletions(-)
 delete mode 100644 md/notes/undefined_c/titles.md
 create mode 100644 md/notes/undefined_c/tutorial.md

diff --git a/md/notes/undefined_c/titles.md b/md/notes/undefined_c/titles.md
deleted file mode 100644
index 731d42c..0000000
--- a/md/notes/undefined_c/titles.md
+++ /dev/null
@@ -1,1749 +0,0 @@
-title:Undefined C
-keywords:c,linux,asm
-
-# Undefined C
-
-The code snippets can be run in an online C compiler such as https://www.onlinegdb.com/online_c_compiler
-or locally. All examples were checked with the gcc compiler. There are many small tricks around running C code
-in practice that aren't covered in generic tutorials, so here is a list of topics that may come up while
-writing real C code outside of tutorials. For each case there is just a small example; each of them could
-take a whole chapter on its own.
-
-## Compile
-
-
-__hello_world.c__
-```c
-#include <stdio.h>
-int main() {
-    printf("Hello world\n");
-}
-```
-
-```bash
-gcc hello_world.c -o hello_world
-gcc -m32 hello_world.c -o hello_world_32 #for 32bit target
-```
-
-## Syntax
-
-### Variables
-
-Standard list of available types
-
-#### Check type size
-
-All types have a size that is measured in bytes. Some of the types are machine dependent.
-For example int and long; if machine-independent types are needed, there are int32_t/uint32_t/int64_t/uint64_t.
-
-Each architecture (8bit/16bit/32bit/64bit) may have different sizes for those types.
-
-Use __sizeof()__
-
-Running on x86 machine
-```c
-#include <stdio.h>
-#include <stdlib.h>
-#include <stdint.h>
-int main() {
-    // sizeof yields size_t, which is printed with %zu
-    printf("Sizeof int %zu\n",sizeof(int));
-    printf("Sizeof int32_t %zu\n",sizeof(int32_t));
-    printf("Sizeof int64_t %zu\n",sizeof(int64_t));
-    printf("Sizeof long %zu\n",sizeof(long));
-    printf("Sizeof long long %zu\n",sizeof(long long));
-}
-```
-
-The safest and most portable way is to use the [u]int[8/16/32/64]_t types.
-
-The macros that give the maximum and minimum values of each type are listed at
-
-https://en.cppreference.com/w/c/types/limits
-
-```c
-#include <stdio.h>
-#include <limits.h>
-int main() {
-    printf("INT_MIN %d\n",INT_MIN);
-    printf("INT_MAX %d\n", INT_MAX);
-    printf("LONG_MIN %ld\n",LONG_MIN);
-}
-```
-
-Example from AVR __stdint.h__
-https://github.com/avrdudes/avr-libc/blob/main/include/stdint.h
-Example from Libc
-https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/stdint.h
-
-
-
-#### How to shoot the leg
-
-When code is supposed to run on both 32bit and 64bit platforms, the size of a type may vary.
-This case needs to be taken into account.
-
-
-
-
-
-### Functions
-
-Function syntax; there is nothing surprising about functions
-
-```
-return_type name(arg_type arg, ..) {
-    body
-}
-```
-
-Write a simple function
-
-```c
-int fun1() {
-    return -1;
-}
-```
-
-A function can have multiple return statements.
-Here is an example where a function has 3 return statements.
-```c
-int fun2(int i) {
-    if (i<0) return -1;
-    if (i>0) return 1;
-    return 0;
-}
-```
-
-Get the address of a function
-
-```c
-printf("fun1 address %p\n",(void *)fun1); // %p prints a pointer on any platform
-```
-
-### If statement
-
-```c
-if (condition) statement;
-if (condition) { statements }
-```
-
-One way to check the value returned by a function for an error is
-
-```c
-if ((c = getfun()) == 0) {
-}
-```
-
-The simplest, old-fashioned use of this pattern is reading input from the command line
-```c
-#include <stdio.h>
-int main() {
-    int c;
-    char ch;
-    while ((c = getchar()) != EOF ) {
-        ch = c;
-        printf("Typed character %c\n",ch);
-    }
-}
-```
-
-### For cycle
-
-The for loop is one construct that may involve some trickery. It is
-as simple as
-
-```c
-for (;;) {
-}
-```
-
-Go over values from 1 to 10
-
-```c
-int i=0;
-for (i=1;i<=10;i++) {
-    printf("%d\n",i);
-}
-```
-
-Now let's do it from 10 to 1
-
-```c
-int i=0;
-for (i=10;i>0;i--) {
-    printf("%d\n",i);
-}
-```
-
-Now let's make a one-liner
-
-```c
-for (i=0;i<10;i++,printf("%d\n",i));
-```
-
-Yes, it is possible to write as many comma-separated expressions as needed.
-
-
-### Structure
-
-A structure combines several types under one new type. It is a convenient way to group a set
-of types and reuse them as one.
-
-```c
-struct struct1 {
-    uint8_t a;
-    uint16_t b;
-    uint32_t c;
-    uint64_t d;
-};
-```
-
-The intuitive total size of the structure would be
-```c
-int total_size = sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint64_t);
-int real_size = sizeof(struct struct1);
-```
-
-Members are placed inside the structure so that access to them is fast. Some CPU instructions require
-aligned memory addresses, or pay a penalty on unaligned access, so the compiler adds padding between members.
-
-To directly mess with the alignment of types use the attribute
-```c
-__attribute__ ((aligned (8)))
-```
-
-
-Use attributes to pack the structure so that its layout does not depend on the architecture.
-
-```c
-struct struct2 {
-    uint8_t a;
-    uint16_t b;
-    uint32_t c;
-    uint64_t d;
-} __attribute__((packed));
-```
-
-Now let's check the size of the structure after it is packed
-
-```c
-int new_size = sizeof(struct struct2);
-```
-
-It is also possible to add alignment to each member of the structure
-```c
-struct struct3 {
-    uint8_t a __attribute__((aligned (8)));
-    uint16_t b __attribute__((aligned (8)));
-    uint32_t c __attribute__((aligned (8)));
-    uint64_t d __attribute__((aligned (8)));
-} __attribute__((aligned (8)));
-```
-
-Now the size of the structure will be 32.
-
-All results are from amd64, other architectures may differ.
-
-### How to shoot leg
-Forget that struct size is not consistent.
-
-### Recursion
-
-Recursion is a technique that can be useful for writing shorter code
-and for replacing loops. One drawback of recursion is that it consumes
-stack memory, which has a default limit on each platform.
-
-```c
-#include <stdio.h>
-#include <stdlib.h>
-
-int fun_r(int i) {
-    printf("val %d\n",i);
-    fun_r(i+1);
-    return 0;
-}
-
-int main()
-{
-    fun_r(0);
-}
-```
-
-The program fails once it runs out of stack.
-After increasing the default stack limit it gets further.
-
-
-Check default stack size
-
-```
-ulimit -s
-```
-
-Set stack size
-
-```
-ulimit -s 16384
-```
-
-### Macro
-
-Many things are useful as macros, and there are many macro tricks for emitting
-useful pieces of code.
-
-Define values, as if it were an enum.
-```c
-#define VAL_0 0
-#define VAL_1 1
-#define VAL_LAST VAL_1
-```
-
-Multiline macro
-```c
-#define INC_FUN(TYPE) TYPE inc_##TYPE(TYPE a){\
-    TYPE c=1;\
-    return a + c;\
-}
-
-INC_FUN(int)
-INC_FUN(char)
-INC_FUN(double)
-INC_FUN(notype) // expands like the others, but does not compile: notype is not a type
-```
-
-To check the code expansion of a macro run
-
-```
-gcc -E
-```
-
-
-
-http://main.lv/writeup/c_macro_tricks.md
-
-
-https://jadlevesque.github.io/PPMP-Iceberg/
-
-
-### Pointers
-
-One of C's most loved features is pointers. They allow access to addresses without any sanity check
-and they carry no lifetime information, so anything is possible with them.
-
-A pointer contains an address, which is interpreted according to the pointer type
-
-```c
-int c;
-int *ptr=&c;
-```
-
-Go over an array of chars
-```c
-#include <stdio.h>
-#include <stdlib.h>
-
-int main() {
-    char s[]="asd";
-    char *c=s;
-    while (*c != 0) {
-        printf("Next char %c addr %p\n",*c,(void *)c);
-        c++;
-    }
-}
-```
-Go over an array of ints
-```c
-    int i=0;
-    int arr[] = {9,7,5,3,1};
-    int *ptr = arr;
-    while (i<5) {
-        printf("Number value %d addr %p\n",*ptr, (void *)ptr);
-        ptr++;
-        i++;
-    }
-```
-
-Pointer arithmetic like +1 moves to the next address, offset by the size of the pointed-to type.
-In the example below the structure size is 12, so incrementing a pointer to that structure
-advances the address by the size of the structure. And yes, the address points to unmapped memory, so it
-will segfault if accessed.
-
-```c
-#include <stdio.h>
-
-struct size12 {
-    int a,b,c;
-};
-
-int main() {
-    struct size12 *s=0;
-    s++;
-    printf("%p\n",(void *)s);
-    s++;
-    printf("%p\n",(void *)s);
-}
-```
-
-Double pointers are pointers to pointers
-
-```c
-#include <stdio.h>
-
-int main(int argc, char **argv) {
-    char *arg = argv[0];
-    printf("Program name %s\n",arg);
-}
-```
-
-#### How to shoot the leg
-Increment a pointer in a while loop without a stop condition. It will stop only when it segfaults.
-
-Don't initialize a pointer and it will have an arbitrary value.
-
-
-
-### Allocate memory
-
-From the program's perspective, memory allocation adds a range of addresses that the process is allowed to access.
-
-Every malloc should be paired with a free, otherwise the program leaks memory.
-
-```c
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-int main() {
-    char *c = malloc(16);
-    memset(c,0,16);
-    int *arr = malloc(16*sizeof(int));
-    memset(arr,0,16*sizeof(int));
-    free(c);
-    free(arr);
-}
-```
-
-### Signed/Unsigned
-
-Signed and unsigned variables differ only in how one bit is interpreted, but they behave differently at their minimum and maximum values.
-
-
-```c
-#include <stdio.h>
-#include <limits.h>
-int main()
-{
-    int i=INT_MAX;
-    unsigned int u=UINT_MAX;
-
-    printf("i=%d\n",i);
-    printf("u=%u\n",u);
-
-    i++; // signed overflow is undefined behaviour; in practice it usually wraps to INT_MIN
-    u++; // unsigned overflow is defined and wraps to 0
-    printf("i=%d\n",i);
-    printf("u=%u\n",u);
-    i=0;
-    u=0;
-    i--;
-    u--; // wraps to UINT_MAX
-    printf("i=%d\n",i);
-    printf("u=%u\n",u);
-
-}
-```
-
-### Endianness
-
-
-```c
-#include <stdio.h>
-#include <stdint.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/stat.h>
-
-int main() {
-    uint32_t arr[4] = {0x00112233,0x44556677,0x8899AABB, 0xCCDDEEFF};
-    printf("%08x\n",arr[0]);
-    printf("%08x\n",arr[1]);
-    printf("%08x\n",arr[2]);
-    printf("%08x\n",arr[3]);
-
-    FILE *f = fopen("int.hex","w+");
-    fprintf(f,"%08x",arr[0]);
-    fprintf(f,"%08x",arr[1]);
-    fprintf(f,"%08x",arr[2]);
-    fprintf(f,"%08x",arr[3]);
-    fclose(f);
-
-    int fd=open("int.bin",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
-    write(fd,arr,sizeof(arr));
-    close(fd);
-
-    int i;
-    fd = open("int.bin2",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
-    for (i=0;i<4;i++) {
-        uint32_t val = (arr[i]>>16) &0x0000ffff;
-        val += (arr[i]<<16)&0xffff0000;
-        write(fd,&val,sizeof(uint32_t));
-    }
-    close(fd);
-}
-```
-
-When saving formatted values to a file you get what you expect
-```
-$ cat int.hex
-00112233445566778899aabbccddeeff
-```
-
-Saving a raw memory dump of the values gives a different result (little-endian byte order, shown by hexdump as 16-bit words)
-```
-$ hexdump int.bin
-0000000 2233 0011 6677 4455 aabb 8899 eeff ccdd
-0000010
-```
-
-The 16-bit halves need to be swapped for the dump to look the same as the values
-```
-$ hexdump int.bin2
-0000000 0011 2233 4455 6677 8899 aabb ccdd eeff
-0000010
-```
-
-### Compiler flags
-
-The compiler has a whole list of command line arguments that can be enabled for different purposes; let's look into some of them
-https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html
-
-Let's try to apply some of the flags to the examples above.
-
-The best starter options are these; they will give you more warnings.
-
-```
--Wall -Wextra
-```
-
-Most of the examples here were written in a sloppy style, so adding extra checks like these will find more issues in the code; probably
-all of the provided examples will show issues with these extra compiler flags
-
-```
--Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference
-```
-
-To get all macros expanded in C code, add this compiler flag. The output will be C source with all macros expanded
-```
--E
-```
-
-To output generated assembly instead of a binary, add
-```
--S
-```
-
-More readable output can be obtained with
-
-```
-gcc FILE.c -Wa,-adhln=FILE.S -g -fverbose-asm -masm=intel
-```
-
-Basic compiler optimisation flags that can speed up the program or make it smaller
-
-```
--O -O0 -O1 -O2 -O3 -Os -Ofast -Og -Oz
-```
-
-https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options
-
-https://panthema.net/2013/0124-GCC-Output-Assembler-Code/
-https://blogs.oracle.com/linux/post/making-code-more-secure-with-gcc-part-1
-
-
-### Shared library
-
-A shared library is a common way to reuse big chunks of code.
-
-```c
-#include <stdio.h>
-int fun1() {
-    return 1;
-}
-
-int fun2() {
-    printf("Function name fun2\n");
-    return 0;
-}
-
-int fun3(int a, int b) {
-    return a+b;
-}
-```
-
-```
-$ gcc -c libshare.c
-$ gcc -shared -o libshare.so libshare.o
-$ ldd libshare.so
-    linux-vdso.so.1 (0x00007ffdb994d000)
-    libc.so.6 => /usr/lib/libc.so.6 (0x00007f0c39400000)
-    /usr/lib64/ld-linux-x86-64.so.2 (0x00007f0c39835000)
-```
-
-Now let's link it to our binary
-```c
-#include <stdio.h>
-
-//functions that are implemented in shared lib
-int fun1();
-int fun2();
-int fun3(int a, int b);
-
-int main() {
-    fun1();
-    fun2();
-    fun3(1, 2);
-}
-```
-
-```
-$ gcc -L. 
-lshare use_share.c -o use_share
-./use_share
-./use_share: error while loading shared libraries: libshare.so: cannot open shared object file: No such file or directory
-ldd ./use_share
-    linux-vdso.so.1 (0x00007ffedcad5000)
-    libshare.so => not found
-    libc.so.6 => /usr/lib/libc.so.6 (0x00007f7b99a00000)
-    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f7b99c90000)
-```
-
-The library is not in the search path
-```
-$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`
-$ ./use_share
-$ ldd use_share
-    linux-vdso.so.1 (0x00007fffc415c000)
-    libshare.so => /your/path/libshare.so (0x00007f48b03c6000)
-    libc.so.6 => /usr/lib/libc.so.6 (0x00007f48b0000000)
-    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f48b03d2000)
-```
-
-Another way is to set a custom library search location. Let's set it to search in the current directory,
-so there is no need to modify LD_LIBRARY_PATH
-
-```
-$ gcc use_share.c -o use_share -L. -lshare -Wl,-rpath=./
-$ ldd ./use_share
-    linux-vdso.so.1 (0x00007fff5c964000)
-    libshare.so => ./libshare.so (0x00007f791000f000)
-    libc.so.6 => /usr/lib/libc.so.6 (0x00007f790fc00000)
-    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f791001b000)
-```
-
-So now the executable loads libshare from the local directory. Of course it is also possible to install the shared library into the system's /usr/lib
-
-### Static library
-
-
-
-
-### Static binary
-
-A static binary doesn't use any shared libraries, so it's possible to build it once and distribute it to other systems
-without needing to install dependencies.
-
-
-```c
-#include <stdio.h>
-#include <stdlib.h>
-
-int main(int argc, char **argv) {
-    return 0;
-}
-```
-
-The first step is to compile the file and see that it is dynamically linked
-```
-$ gcc static_elf.c -o static_elf
-$ file static_elf
-static_elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bc6ac706075874858e1c4a8accf77e704f4ea25a, for GNU/Linux 4.4.0, with debug_info, not stripped
-$ ldd ./static_elf
-    linux-vdso.so.1 (0x00007ffccef49000)
-    libc.so.6 => /usr/lib/libc.so.6 (0x00007fcbb8800000)
-    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fcbb8b63000)
-
-```
-
-After adding the static option we can verify that the tools now report it as statically linked. The size of the binary increases, as all functions
-that are required to run the executable are now contained in the binary.
-
-```
-$ gcc static_elf.c -static -o static_elf
-$ file static_elf
-static_elf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=c54d2e4d2a3d11fe920bee9a44af045c6f67ab56, for GNU/Linux 4.4.0, with debug_info, not stripped
-$ ldd static_elf
-    not a dynamic executable
-```
-
-A statically compiled file should work on most Linux systems with the same CPU architecture.
-
-
-
-### Atomic
-HERE
-
-### Multithreading
-HERE
-
-
-
-
-
-
-
-## Basic usage
-
-### File manipulation with libc
-
-Create a file and write data using libc functions
-
-```c
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-int main() {
-    FILE *f = fopen("file.txt","w+");
-    char *s = "Hello";
-    fwrite(s,1,strlen(s),f);
-    fclose(f);
-}
-```
-
-Open the file and read the data back
-
-```c
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-int main() {
-    FILE *f = fopen("file.txt","r");
-    char buf[128];
-    int r;
-    r = fread(buf,1,sizeof(buf)-1,f); // leave room for the terminating 0
-    buf[r] = 0;
-    printf("->%s\n",buf);
-    fclose(f);
-}
-```
-
-### File manipulation with syscalls
-
-Now let's do the same without libc file functions, using the syscall function to invoke syscalls directly.
-It's also straightforward to rewrite the example in assembly.
-
-```c
-#include <unistd.h>
-#include <sys/syscall.h>
-#include <fcntl.h>
-#include <string.h>
-
-int main(void) {
-    // SYS_open exists on x86-64; some newer architectures only provide SYS_openat
-    int fd = syscall(SYS_open, "sys.txt", O_CREAT|O_WRONLY, S_IRWXU|S_IRGRP|S_IXGRP);
-    char s[] = "hello syscall\n";
-    syscall(SYS_write, fd, s, strlen(s));
-    syscall(SYS_close, fd);
-    return 0;
-}
-```
-
-
-Read data from the file
-
-```c
-#include <unistd.h>
-#include <sys/syscall.h>
-#include <fcntl.h>
-#include <string.h>
-
-int main(void) {
-    int fd = syscall(SYS_open, "sys.txt", O_RDONLY);
-    char s[128];
-    int r = syscall(SYS_read, fd, s, sizeof(s)-1); // leave room for the terminating 0
-    if (r < 0) r = 0; // read failed, print nothing
-    s[r] = 0;
-    syscall(SYS_close, fd);
-    syscall(SYS_write, 1, s, r); // 1 is stdout
-    return 0;
-}
-```
-
-## Advanced topics
-
-### Kernel module
-
-The Linux kernel, the macOS kernel and the *BSD kernels are written in C,
-so it is possible to write kernel modules in C for some of those.
-
-The example may need adjusting to the specifics of the local distribution.
-
-```c
-
-```
-
-http://main.lv/writeup/kernel_hello_world.md
-
-### Linking
-
-Linking is one of the most interesting parts of compiling C code. When an object file is created
-it contains functions and variables that can be of different types, and linking tries to resolve
-all of those. So it is possible to have fun with linking and the content of object files.
-
-
-The first example is a piece of C code that can be compiled to an object file, but cannot be
-linked into an executable on its own.
-```
-gcc -c link_elf.c
-```
-```c
-void fun1(void);
-void fun2(void);
-
-int main() {
-    fun1();
-    fun2();
-}
-```
-So we can see that fun1 and fun2 are marked as undefined in the object file. If we try to link it, the linker will not be able to find them.
-So let's create one more object file
-```
-$ readelf -a link_elf.o
-
-Symbol table '.symtab' contains 6 entries:
-   Num:    Value          Size Type    Bind   Vis      Ndx Name
-     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
-     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_elf.c
-     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
-     3: 0000000000000000    31 FUNC    GLOBAL DEFAULT    1 main
-     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun1
-     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun2
-
-```
-__link_fun1.c__
-```c
-#include <stdio.h>
-
-void fun1() {
-    printf("Hello fun1\n");
-}
-void fun2() {
-    printf("Hello fun2\n");
-}
-```
-
-So now we have an object file in which the functions are defined, and we see that it in turn has an undefined printf/puts function.
-
-```
-readelf -a link_fun1.o
-Symbol table '.symtab' contains 7 entries:
-   Num:    Value          Size Type    Bind   Vis      Ndx Name
-     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
-     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
-     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
-     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
-     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun1
-     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
-     6: 0000000000000016    22 FUNC    GLOBAL DEFAULT    1 fun2
-
-```
-
-We can merge both of those files together
-```shell
-gcc -o link_elf link_elf.o link_fun1.o
-```
-Functions in object files carry no information about their input and output types. That is why anything whose name matches can be linked.
-Let's rewrite the code like this
-
-```c
-int fun1(int i) {
-    printf("Hello fun1\n");
-    return 0;
-}
-int fun2(int i) {
-    printf("Hello fun2\n");
-    return 0;
-}
-```
-And this links without issue. Treat this as two sets of symbols that are merged together; only a few things are known at link time.
-The return type and the function arguments aren't exposed when the object file is created.
-
-Functions can have aliases.
-
-__link_fun2.c__
-
-```c
-static void fun2() {
- printf("hello 2\n");
-} __attribute__ ((alias("fun1")));
-```
-
-Now the function is local.
-
-```
-Symbol table '.symtab' contains 6 entries:
-   Num:    Value          Size Type    Bind   Vis      Ndx Name
-     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
-     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
-     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
-     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
-     4: 0000000000000000    22 FUNC    LOCAL  DEFAULT    1 fun2
-     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
-```
-
-Let's link all the objects into an executable. The local fun2 from link_fun2.o isn't used in this case.
-
-
-
-```
-$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
-$ ./link_elf
-Hello fun1
-Hello fun2
-
-```
-
-
-
-Let's switch which of the 2 **fun2** functions is local
-
-
-```
-link_fun1.o
-     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
-     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
-     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
-     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
-     4: 000000000000001d    29 FUNC    LOCAL  DEFAULT    1 fun2
-     5: 0000000000000000    29 FUNC    GLOBAL DEFAULT    1 fun1
-     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
-
-link_fun2.o
-     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
-     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
-     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
-     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
-     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun2
-     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
-
-```
-
-```
-$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
-$ ./link_elf
-Hello fun1
-hello 2
-```
-
-So all of this plays a role in linking object files.
-There is a more interesting utility called ld; it does things at a lower level than gcc.
-
-
-### Extern
-
-### Attributes
-PASS
-### Creating shared library
-PASS
-### Create static libraries
-PASS
-### Join all objects together
-PASS
-### Compile with musl
-
-glibc is not the only option for the standard C library; there are a few others, and one of them is musl
-
-```
-$ musl-gcc hello_world.c -o hello_world_musl
-$ file ./hello_world_musl
-hello_world_musl: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, not stripped
-```
-
-
-### Inspect elf files
-
-There are a few utilities that help to check whether an ELF file is OK.
-
-ldd shows which shared libraries the ELF file will try to load
-
-```
-$ ldd hello_world
-    linux-vdso.so.1 (0x00007fffcb2ae000)
-    libc.so.6 => /usr/lib/libc.so.6 (0x00007ffb80c00000)
-    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ffb80fb9000)
-
-```
-
-readelf allows inspecting the content of ELF files and their headers, and interprets the values in the headers.
-In a few examples above we already used it to check the content of compiled object files.
-
-```
-$ readelf -s ./hello_world
-Symbol table '.symtab' contains 37 entries:
-   Num:    Value          Size Type    Bind   Vis      Ndx Name
-     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
-     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS abi-note.c
-     2: 000000000000039c    32 OBJECT  LOCAL  DEFAULT    4 __abi_tag
-     3: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS init.c
-     4: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
-     5: 0000000000001070     0 FUNC    LOCAL  DEFAULT   14 deregister_tm_clones
-     6: 00000000000010a0     0 FUNC    LOCAL  DEFAULT   14 register_tm_clones
-     7: 00000000000010e0     0 FUNC    LOCAL  DEFAULT   14 __do_global_dtors_aux
-     8: 0000000000004030     1 OBJECT  LOCAL  DEFAULT   25 completed.0
-     9: 0000000000003df0     0 OBJECT  LOCAL  DEFAULT   20 __do_global_dtor[...]
-    10: 0000000000001130     0 FUNC    LOCAL  DEFAULT   14 frame_dummy
-    11: 0000000000003de8     0 OBJECT  LOCAL  DEFAULT   19 __frame_dummy_in[...]
-    12: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello_world.c
-    13: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
-    14: 00000000000020b0     0 OBJECT  LOCAL  DEFAULT   18 __FRAME_END__
-    15: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
-    16: 0000000000003df8     0 OBJECT  LOCAL  DEFAULT   21 _DYNAMIC
-    17: 0000000000002010     0 NOTYPE  LOCAL  DEFAULT   17 __GNU_EH_FRAME_HDR
-    18: 0000000000004000     0 OBJECT  LOCAL  DEFAULT   23 _GLOBAL_OFFSET_TABLE_
-    19: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_mai[...]
-    20: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterT[...]
-    21: 0000000000004020     0 NOTYPE  WEAK   DEFAULT   24 data_start
-    22: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5
-    23: 0000000000004030     0 NOTYPE  GLOBAL DEFAULT   24 _edata
-    24: 0000000000001154     0 FUNC    GLOBAL HIDDEN    15 _fini
-    25: 0000000000004020     0 NOTYPE  GLOBAL DEFAULT   24 __data_start
-    26: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
-    27: 0000000000004028     0 OBJECT  GLOBAL HIDDEN    24 __dso_handle
-    28: 0000000000002000     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
-    29: 0000000000004038     0 NOTYPE  GLOBAL DEFAULT   25 _end
-    30: 0000000000001040    38 FUNC    GLOBAL DEFAULT   14 _start
-    31: 0000000000004030     0 NOTYPE  GLOBAL DEFAULT   25 __bss_start
-    32: 0000000000001139    26 FUNC    GLOBAL DEFAULT   14 main
-    33: 0000000000004030     0 OBJECT  GLOBAL HIDDEN    24 __TMC_END__
-    34: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMC[...]
-    35: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@G[...]
-    36: 0000000000001000     0 FUNC    GLOBAL HIDDEN    12 _init
-```
-
-### No standard library
-
-Let's write hello world without libc.
-
-__noc.c__
-```c
-void _start() {
-
-}
-```
-
-```
-$ gcc -c noc.c
-$ ld -dynamic-linker /lib/ld-linux.so.2 noc.o -o noc
-$ file noc
-noc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
-```
-
-The next step is to make it do something more useful than segfault.
-
-```c
-void _start() {
-    asm ( \
-    "movl $1,%eax\n" \
-    "xor %ebx,%ebx\n" \
-    "int $128\n" \
-    );
-}
-```
-
-Now it is all about calling syscalls
-
-Let's print a message
-```c
-signed int write(int fd, const void *buf, unsigned int size)
-{
-    signed int ret;
-    asm volatile
-    (
-        "syscall"
-        : "=a" (ret)
-        // EDI RSI RDX
-        : "0"(1), "D"(fd), "S"(buf), "d"(size)
-        : "rcx", "r11", "memory"
-    );
-    return ret;
-}
-
-void _start() {
-    write(1,"no libc",8);
-    asm ( \
-    "movl $1,%eax\n" \
-    "xor %ebx,%ebx\n" \
-    "int $128\n" \
-    );
-}
-```
-
-http://main.lv/writeup/making_c_executables_smaller.md
-
-### Memory leaks
-
-Memory leaks are a crucial topic in C. The typical case is
-allocated memory that wasn't freed after use. If the amount of such memory keeps growing,
-it can eventually fill the whole memory and the system becomes unresponsive. Here is a simple example
-of how a memory leak is created and how to detect it.
-
-```c
-#include <stdlib.h>
-
-int main() {
-
-    char *ptr = malloc(12);
-
-    return 0;
-}
-```
-
-The best way to detect it is to use valgrind.
-
-```
-$ valgrind ./malloc
-
-==778== HEAP SUMMARY:
-==778==     in use at exit: 12 bytes in 1 blocks
-==778==   total heap usage: 2 allocs, 1 frees, 1,036 bytes allocated
-```
-
-We see 2 allocs and 1 free, and 12 bytes still in use at exit. So the leak we created is detected.
-Now a more complex example: we create a leaking function and call it 5 times. In a larger code
-base it would be nice to see the location of the leaks.
-
-```c
-#include <stdlib.h>
-
-int* mem_alloc(int sz) {
-    int *ret=NULL;
-
-    if (sz < 0) {
-        return NULL;
-    }
-
-    ret = malloc(sz*sizeof(int));
-
-    if (sz>10) {
-        return NULL;
-    }
-
-    return ret;
-
-}
-
-int main() {
-
-    mem_alloc(0);
-
-    free(mem_alloc(1));
-
-    mem_alloc(100);
-
-    free(mem_alloc(2));
-
-    mem_alloc(10);
-
-    return 0;
-}
-```
-
-There are 3 blocks that leak. Where they come from can be guessed from the addresses, but it would be better
-to have the source position of each leak.
-
-```
-valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2
-
-==4974== HEAP SUMMARY:
-==4974==     in use at exit: 440 bytes in 3 blocks
-==4974==   total heap usage: 5 allocs, 2 frees, 452 bytes allocated
-==4974==
-==4974== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3
-==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
-==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
-==4974==    by 0x10919E: main (in /home/fam/prog/c/undefined_c/memleak2)
-==4974==
-==4974== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3
-==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
-==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
-==4974==    by 0x1091D6: main (in /home/fam/prog/c/undefined_c/memleak2)
-==4974==
-==4974== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3
-==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
-==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
-==4974==    by 0x1091BA: main (in /home/fam/prog/c/undefined_c/memleak2)
-==4974==
-==4974== LEAK SUMMARY:
-==4974==    definitely lost: 440 bytes in 3 blocks
-==4974==    indirectly lost: 0 bytes in 0 blocks
-==4974==      possibly lost: 0 bytes in 0 blocks
-==4974==    still reachable: 0 bytes in 0 blocks
-==4974==         suppressed: 0 bytes in 0 blocks
-```
-
-Add the compilation option __-g3__
-
-```
-gcc -g3 memleak2.c -o memleak2
-```
-
-Now it shows the source lines and the trace of where the leaking code was called from. That looks better.
-
-```
-valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2
-
-==5073== HEAP SUMMARY:
-==5073==     in use at exit: 440 bytes in 3 blocks
-==5073==   total heap usage: 5 allocs, 2 frees, 452 bytes allocated
-==5073==
-==5073== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3
-==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
-==5073==    by 0x109179: mem_alloc (memleak2.c:10)
-==5073==    by 0x10919E: main (memleak2.c:22)
-==5073==
-==5073== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3
-==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
-==5073==    by 0x109179: mem_alloc (memleak2.c:10)
-==5073==    by 0x1091D6: main (memleak2.c:30)
-==5073==
-==5073== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3
-==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
-==5073==    by 0x109179: mem_alloc (memleak2.c:10)
-==5073==    by 0x1091BA: main (memleak2.c:26)
-==5073==
-==5073== LEAK SUMMARY:
-==5073==    definitely lost: 440 bytes in 3 blocks
-==5073==    indirectly lost: 0 bytes in 0 blocks
-==5073==      possibly lost: 0 bytes in 0 blocks
-==5073==    still reachable: 0 bytes in 0 blocks
-==5073==         suppressed: 0 bytes in 0 blocks
-==5073==
-```
-
-
-### Code coverage
-
-Compile the file with extra flags and generate the gcov output file.
-There is only one branch that is not used. Coverage should show which part isn't used.
-
-```c
-#include <stdio.h>
-
-int fun1(int a) {
-    if (a < 0) {
-        printf("Smaller then zero\n");
-    }
-    if (a==0) {
-        printf("Equails to zero\n");
-    }
-    if (a>0) {
-        printf("Bigger then zero\n");
-    }
-}
-
-int main() {
-
-    printf("Start\n");
-    fun1(0);
-    fun1(1);
-
-    return 0;
-}
-```
-
-```
-$ gcc -fprofile-arcs -ftest-coverage coverage.c -o coverage
-$ gcov ./coverage
-File 'coverage.c'
-Lines executed:92.31% of 13
-Creating 'coverage.c.gcov'
-
-Lines executed:92.31% of 13
-
-```
-
-Gcov file content.
So we can see which line wasn't executed.
-
-```c
-        -:    0:Source:coverage.c
-        -:    0:Graph:coverage.gcno
-        -:    0:Data:coverage.gcda
-        -:    0:Runs:1
-        -:    1:#include <stdio.h>
-        -:    2:
-        2:    3:int fun1(int a) {
-        2:    4:    if (a < 0) {
-    #####:    5:        printf("Smaller then zero\n");
-        -:    6:    }
-        2:    7:    if (a==0) {
-        1:    8:        printf("Equails to zero\n");
-        -:    9:    }
-        2:   10:    if (a>0) {
-        1:   11:        printf("Bigger then zero\n");
-        -:   12:    }
-        2:   13:}
-        -:   14:
-        1:   15:int main() {
-        -:   16:
-        1:   17:    printf("Start\n");
-        1:   18:    fun1(0);
-        1:   19:    fun1(1);
-        -:   20:
-        1:   21:    return 0;
-        -:   22:}
-```
-
-### Profiling
-
-Some parts of the code can take a substantial amount of time, and those parts need to be identified.
-
-```c
-#include <stdio.h>
-#include <stdlib.h>
-#include <math.h>
-
-void slow_sin() {
-    float r=0.0f;
-    for (int i=0;i<10000000;i++) {
-        r += sinf(M_PI/8);
-    }
-}
-
-void slower_sin() {
-    double r=0.0f;
-    for (int i=0;i<10000000;i++) {
-        r += sin(M_PI/8);
-    }
-}
-
-void fast_sin() {
-    float pre_calc = sinf(M_PI/8);
-    float r = 0.0f;
-    for (int i=0;i<10000000;i++) {
-        r += pre_calc;
-    }
-}
-
-int main() {
-    slow_sin();
-    slower_sin();
-    fast_sin();
-}
-```
-
-Compile and run with profiling
-
-```
-gcc -pg perf_speed.c -o perf_speed -lm
-./perf_speed
-gprof perf_speed gmon.out
-```
-
-### Sanitizer
-
-C, as a great language, has such nice features in its standard as undefined behaviour. It is
-also possible to overwrite any data you want from your code. One of the favourite
-mistakes is a buffer overrun. This type of error can be caught with
-stack protection
-
-In the code below it is possible to write more than 8 characters into an array of size 8, because there is no bounds check.
-With stack protection enabled, the C runtime is able to detect this kind of thing.
-
-
-```c
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-void fun(char *str,int size) {
-    char local_var[8];
-    memcpy(local_var, str, size);
-    printf("Whats inside a stack? 
%s\n",local_var); -} - -int main() { - char some_str1[] = "Hello!"; - char some_str2[] = "Hello all!!!"; - - fun(some_str1,strlen(some_str1)); - fun(some_str2,strlen(some_str2)); -} -``` - -``` -Whats inside a stack? Hello! -Whats inside a stack? Hello all!!! -*** stack smashing detected ***: terminated -fish: Job 1, './stack_overrun' terminated by signal SIGABRT (Abort) -``` - -If this isnt happening there is possible to add __-fstack-protector__ to compile flags. - -C have whole list of undefined behaviours incorporated in standard -https://en.cppreference.com/w/c/language/behavior - - - -functions __f__ variable __a__ isnt initialized so its undefined behaviour but there still will be some value. Run few -times and each time it returns new value when __f(0)__. -```c -#include - -size_t f(int x) -{ - size_t a; - if(x) // either x nonzero or UB - a = 42; - return a; -} - -int main() { - printf("%d\n",f(0)); - printf("%d\n",f(1)); - printf("%d\n",f(42)); -} -``` - -Division by zero. Function __f__ dont check if divisor is 0. Programm going to abort. -add flag __-fsanitize=integer-divide-by-zero__ and it will be detected at runtime - -```c -#include - -size_t f(int x) -{ - return 10/x; -} - -int main() { - printf("%d\n",f(0)); - printf("%d\n",f(1)); - printf("%d\n",f(42)); -} -``` - -``` -undefined_b.c:5:14: runtime error: division by zero -fish: Job 1, './undefined_b' terminated by signal SIGFPE (Floating point exception) -``` - - - - - - -### Write plugins -### Preload library - - -## Embedding C - -Most of the programming languages support embeding C. As C language have where simple -functiong naming when its mangled to object format it makes it easy target when -linking with other languages. Most of other languages have incompatible naming for -functions when compiled to binary. 
- -### Embed in C++ - -__lib.h__ -```c -#include -#include - -int fun_secret_1(); -``` - -__lib.c__ -```c -#include "lib.h" - -int fun_secret_1() { - printf("Hello from C\n"); - return -1; -} -``` - -First thing to notice is when file is compiled with C++ is that the name of the function are in different format -then when its compiled with C. -``` -$ g++ -c lib.c -$ readelf -s lib.o -Symbol table '.symtab' contains 6 entries: - Num: Value Size Type Bind Vis Ndx Name - 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND - 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS lib.c - 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text - 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata - 4: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 _Z12fun_secret_1v - 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts - -``` - -Lets tell C++ that file is C language by adding **extern "c"** - -__lib.h__ -```c -#include -#include - -extern "C" { - -int fun_secret_1(); - -} -``` - -__lib.c__ -```c -#include "lib.h" - -extern "C" { - -int fun_secret_1() { - printf("Hello from C\n"); - return -1; -} - -} -``` - -Now compiled object file have C function names. 
- -``` -$ g++ lib.c -c -$ readelf -s lib.o - -Symbol table '.symtab' contains 6 entries: - Num: Value Size Type Bind Vis Ndx Name - 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND - 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS lib.c - 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text - 3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata - 4: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 fun_secret_1 - 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts - -``` - -__cppembed.cpp___ -```cpp -#include "lib.h" - -int main() { - fun_secret_1(); -} -``` - -Doing oposite way running C++ from C -[/writeup/wraping_c_plus_plus_exceptions_templates_and_classes_in_c.md](/writeup/wraping_c_plus_plus_exceptions_templates_and_classes_in_c.md) - -### Embed in Go - -__lib.h__ -```c -#include -#include - -int fun_secret_1(); -``` -__lib.c__ -```c -#include "lib.h" - -int fun_secret_1() { - printf("Hello from C\n"); - return -1; -} -``` - -__main.go__ -```go -package main -// #cgo CFLAGS: -g -Wall -// #include -// #include "lib.h" -import "C" -import ( - "fmt" - -) -func main() { - fmt.Println("Start program") - C.fun_secret_1() - fmt.Println("End program") -} -``` - -``` -go build -``` - -[https://karthikkaranth.me/blog/calling-c-code-from-go/](https://karthikkaranth.me/blog/calling-c-code-from-go/) - - -### Embed in Swift - -[/writeup/linux_hello_world_in_swift.md](/writeup/linux_hello_world_in_swift.md) - -### Embed in Rust - -__lib.c__ -```c -#include -#include - -int fun_secret_1() { - printf("Hello from C\n"); - return -1; -} -``` - -```rust -extern "C" { - fn fun_secret_1(); -} - -//rustc main.rs -o hello -fn main() { - println!("Start program"); - unsafe {fun_secret_1()} - println!("End program"); -} -``` - -Compile with - -``` -gcc -c lib.c -gcc -shared lib.o -o liblib.so -rustc main.rs -l lib -L . 
-o hello -C link-arg="-Wl,-rpath=./" -``` - -[https://dev.to/xphoniex/how-to-call-c-code-from-rust-56do](https://dev.to/xphoniex/how-to-call-c-code-from-rust-56do) - -### Lua in C - -[/writeup/embedding_lua_in_c.md](/writeup/embedding_lua_in_c.md) - -### Python in C - - - -## Multiplatform - -### Different flags - -### Check architecture - - - -```c -``` - -### AArch64 - -https://snapshots.linaro.org/gnu-toolchain/13.0-2022.08-1/aarch64-linux-gnu/ - -download any of the version of gcc and extract - -Add bin directory location to env variable PATH - -``` -export PATH=$PATH:`pwd` -``` - -___main.c__ -```c -#include - -int main() { - printf("Hello world arm64\n"); -} -``` - -``` -$ arch64-linux-gnu-gcc main.c -o main -$ ./main -qemu-aarch64: Could not open '/lib/ld-linux-aarch64.so.1': No such file or directory -$ file ./main -./main: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=12448d90030e2ad23dbe6b7bc82a4fa7b7de9659, for GNU/Linux 3.7.0, with debug_info, not stripped -``` - -Download sysroot image from linaro page. -With running - -``` -strace ./main -``` - -It showed that searched path for libraries are in - -``` -/usr/gnemul/qemu-aarch64/lib/ -``` - -Found missing libc and ld-linux-aarch64 inside sysroot archive and copied at searched location amd now AArch64 binary is running. - -``` -$ ./main -Hello world arm64 -``` - -### AVR8 - -AVR is 8bit CPU that is quite popular for hobbiest. As baremetal device its doesnt have full libc support, -and needs some setup before its possible to do basics things with it. 
- -__avr_echo.c__ -```c -#include - -#define FOSC 16000000UL -#define BAUD 9600 -#define MYUBRR FOSC/16/BAUD-1 - -void USART_Init( unsigned int ubrr) -{ - UBRRH = (unsigned char)(ubrr>>8); - UBRRL = (unsigned char)ubrr; - UCSRB = (1< avr_echo.s -avr-objcopy -j .text -O ihex avr_echo.out avr_echo.hex -avrdude -pm16 -cavrispv2 -Pusb -U flash:w:avr_echo.hex -``` - -### Emscripten - -[/writeup/web_assembly_sdl_example.md](/writeup/web_assembly_sdl_example.md) - - - - - - - diff --git a/md/notes/undefined_c/tutorial.md b/md/notes/undefined_c/tutorial.md new file mode 100644 index 0000000..731d42c --- /dev/null +++ b/md/notes/undefined_c/tutorial.md @@ -0,0 +1,1749 @@ +title:Undefined C +keywords:c,linux,asm + +# Undefined C + +There is possible to run piece of code inside online c compiler like https://www.onlinegdb.com/online_c_compiler +Or run locally. With base check is done with gcc compiler. There are many small tricks around running C code +in practice that aren't covered in any generic tutorials, so here is list of topics that may arise while +coding real C code outside of tutorials. For each case there is just small example, each of those could +take whole chapter on its own. + +## Compile + + +__hello_world.c__ +```c +int main() { + printf("Hello world\n"); +} +``` + +```bash +gcc hello_world.c -o hello_world +gcc -m32 hello_world.c -o hello_world_32 #for 32bit target +``` + +## Syntax + +### Variables + +Standard list of available types + +#### Check type size + +All types have size that are declared in bytes. Some of the types are machine dependents. 
+For example, int and long vary between machines; if machine-independent types are needed, use int32_t/uint32_t/int64_t/uint64_t.
+
+Each architecture (8bit/16bit/32bit/64bit) may have a different size for those types.
+
+Use __sizeof()__
+
+Running on x86 machine
+```c
+#include <stdio.h>
+#include <stdint.h>
+int main() {
+    // sizeof yields size_t, which is printed with %zu
+    printf("Sizeof int %zu\n",sizeof(int));
+    printf("Sizeof int32_t %zu\n",sizeof(int32_t));
+    printf("Sizeof int64_t %zu\n",sizeof(int64_t));
+    printf("Sizeof long %zu\n",sizeof(long));
+    printf("Sizeof long long %zu\n",sizeof(long long));
+}
+```
+
+The safest and most portable way is to use the [u]int[8/16/32/64]_t types.
+
+The macros that give the max and min values of each type are listed at
+
+https://en.cppreference.com/w/c/types/limits
+
+```c
+#include <stdio.h>
+#include <limits.h>
+int main() {
+    printf("INT_MIN %d\n",INT_MIN);
+    printf("INT_MAX %d\n", INT_MAX);
+    printf("LONG_MIN %ld\n",LONG_MIN);
+}
+```
+
+Example from AVR __stdint.h__
+https://github.com/avrdudes/avr-libc/blob/main/include/stdint.h
+Example from Libc
+https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/stdint.h
+
+
+
+#### How to shoot the leg
+
+When code is supposed to run on both 32bit and 64bit platforms, the size of a type may vary.
+This case needs to be taken into account.
+
+
+
+
+
+### Functions
+
+Function syntax; there is nothing surprising about functions
+
+```
+<return type> <name>(<argument type> <argument name>,..) {
+    <body>
+}
+```
+
+Write a simple function
+
+```c
+int fun1() {
+    return -1;
+}
+```
+
+A function can have multiple return statements.
+Here is an example where the function has 3 return statements.
+```c
+int fun2(int i) {
+    if (i<0) return -1;
+    if (i>0) return 1;
+    return 0;
+}
+```
+
+Get the address of a function
+
+```c
+printf("fun1 address %p\n",(void *)fun1); // %p prints a pointer on any platform
+```
+
+### If statement
+
+```c
+if (<condition>) <statement>;
+if (<condition>) { <statements> }
+```
+
+One way to check the value returned by a function for an error is
+
+```c
+if ((c = getfun()) == 0) {
+}
+```
+
+The simplest and most old-fashioned use of this is reading input from the command line
+```c
+#include <stdio.h>
+int main() {
+    int c;
+    char ch;
+    while ((c = getchar()) != EOF ) {
+        ch = c;
+        printf("Typed character %c\n",ch);
+    }
+}
+```
+
+### For cycle
+
+The for loop can involve some trickery. At its simplest it is
+
+```c
+for (;;) {
+}
+```
+
+Go over values from 1 to 10
+
+```c
+int i=0;
+for (i=1;i<=10;i++) {
+    printf("%d\n",i);
+}
+```
+
+Now let's do it from 10 down to 1
+
+```c
+int i=0;
+for (i=10;i>0;i--) {
+    printf("%d\n",i);
+}
+```
+
+Now let's make a one-liner
+
+```c
+for (i=0;i<10;i++,printf("%d\n",i)); // prints 1 to 10: i is incremented before it is printed
+```
+
+Yes, it is possible to write as many comma-separated expressions as needed.
+
+
+### Structure
+
+A structure combines several types under one new type. It is a convenient way to group a set
+of types and reuse them as one.
+
+```c
+struct struct1 {
+    uint8_t a;
+    uint16_t b;
+    uint32_t c;
+    uint64_t d;
+};
+```
+
+The intuitive total size of the structure would be the sum of its members, but the real size can be larger because of padding
+```c
+size_t total_size = sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint64_t);
+size_t real_size = sizeof(struct struct1);
+```
+
+Members are placed inside the structure at aligned offsets so that access to them is fast. Some CPU instructions
+require aligned memory addresses, or pay a penalty when accessing unaligned members.
+
+To directly mess with the alignment of types use the attribute
+```c
+__attribute__ ((aligned (8)))
+```
+
+
+Use the packed attribute to remove padding, so the layout does not depend on the architecture's alignment rules.
+
+```c
+struct struct2 {
+    uint8_t a;
+    uint16_t b;
+    uint32_t c;
+    uint64_t d;
+} __attribute__((packed));
+```
+
+Now let's check the size of the structure after it is packed
+
+```c
+size_t new_size = sizeof(struct struct2);
+```
+
+It is also possible to add alignment to each member of the structure
+```c
+struct struct3 {
+    uint8_t a __attribute__((aligned (8)));
+    uint16_t b __attribute__((aligned (8)));
+    uint32_t c __attribute__((aligned (8)));
+    uint64_t d __attribute__((aligned (8)));
+} __attribute__((aligned (8)));
+```
+
+Now the size of the structure will be 32.
+
+All results are from amd64; other architectures may differ.
+
+### How to shoot leg
+Forget that struct size is not consistent across platforms.
+
+### Recursion
+
+Recursion is a technique that can be useful for writing shorter code
+and for replacing loops. One drawback of recursion is that it consumes
+stack memory, which has a default limit on each platform.
+
+```c
+#include <stdio.h>
+
+int fun_r(int i) {
+    printf("val %d\n",i);
+    fun_r(i+1); // no stop condition: recurses until the stack runs out
+    return 0;
+}
+
+int main()
+{
+    fun_r(0);
+}
+```
+
+The program will fail once it runs out of stack.
+When the default stack limit is increased it gets further.
+
+
+Check the default stack size
+
+```
+ulimit -s
+```
+
+Set the stack size
+
+```
+ulimit -s 16384
+```
+
+### Macro
+
+Macros are useful for many things. There are many macro tricks for emitting
+useful pieces of code.
+
+Define values, as if they were an enum.
+```c
+#define VAL_0 0
+#define VAL_1 1
+#define VAL_LAST VAL_1
+```
+
+Multiline macro
+```c
+#define INC_FUN(TYPE) TYPE inc_##TYPE(TYPE a){\
+    TYPE c=1;\
+    return a + c;\
+}
+
+INC_FUN(int)
+INC_FUN(char)
+INC_FUN(double)
+INC_FUN(notype) // expands fine, but does not compile: notype is not a type
+```
+
+To check the expansion of a macro run
+
+```
+gcc -E
+```
+
+
+
+http://main.lv/writeup/c_macro_tricks.md
+
+
+https://jadlevesque.github.io/PPMP-Iceberg/
+
+
+### Pointers
+
+One of the most loved C features is pointers. They allow access to addresses without any sanity check
+and they have no lifetime tracking, so anything is possible with them.
+
+A pointer contains an address, which is interpreted according to the pointer type
+
+```c
+int c;
+int *ptr=&c;
+```
+
+Go over an array of chars
+```c
+#include <stdio.h>
+
+int main() {
+    char s[]="asd";
+    char *c=s; // the array name decays to a pointer to its first element
+    while (*c != 0) {
+        printf("Next char %c addr %p\n",*c,(void *)c);
+        c++;
+    }
+}
+```
+Go over an array of ints
+```c
+    int i=0;
+    int arr[] = {9,7,5,3,1};
+    int *ptr = arr;
+    while (i<5) {
+        printf("Number value %d addr %p\n",*ptr,(void *)ptr);
+        ptr++;
+        i++;
+    }
+```
+
+Pointer arithmetic such as +1 moves to the next address, offset by the size of the type.
+In the example below the structure size is 12, so incrementing a pointer to that structure
+advances the address by sizeof the structure. And yes, the address points to unmapped memory, so it
+will segfault if accessed.
+
+```c
+#include <stdio.h>
+
+struct size12 {
+    int a,b,c;
+};
+
+int main() {
+    struct size12 *s=0;
+    s++; // only the address is computed, it is never dereferenced
+    printf("%p\n",(void *)s);
+    s++;
+    printf("%p\n",(void *)s);
+}
+```
+
+Double pointers are pointers to pointers
+
+```c
+#include <stdio.h>
+
+int main(int argc, char **argv) {
+    char *arg = argv[0];
+    printf("Program name %s\n",arg);
+}
+```
+
+#### How to shoot the leg
+Increment a pointer in a while loop without a stop condition. It will stop only when it segfaults.
+
+Leave a pointer uninitialized and it will hold an indeterminate value.
+
+
+
+### Allocate memory
+
+From the program's perspective, memory allocation adds an address range that the executable can then address.
+
+Every malloc should be paired with a free, otherwise the program leaks memory.
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int main() {
+    char *c = malloc(16);
+    memset(c,0,16);
+    int *arr = malloc(16*sizeof(int));
+    memset(arr,0,16*sizeof(int));
+    free(c);
+    free(arr);
+}
+```
+
+### Signed/Unsigned
+
+Signed and unsigned variables differ only in how one bit is interpreted. But they behave differently at their minimal and maximal values.
+
+
+```c
+#include <stdio.h>
+#include <limits.h>
+int main()
+{
+    int i=INT_MAX;
+    unsigned int u=UINT_MAX;
+
+    printf("i=%d\n",i);
+    printf("u=%u\n",u);
+
+    i++; // signed overflow: undefined behaviour, usually wraps to INT_MIN
+    u++; // unsigned overflow: defined, wraps to 0
+    printf("i=%d\n",i);
+    printf("u=%u\n",u);
+    i=0;
+    u=0;
+    i--;
+    u--; // wraps to UINT_MAX
+    printf("i=%d\n",i);
+    printf("u=%u\n",u);
+
+}
+```
+
+### Endianess
+
+
+```c
+#include <stdio.h>
+#include <stdint.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/stat.h>
+
+int main() {
+    // uint32_t: the last two values do not fit into a signed int
+    uint32_t arr[4] = {0x00112233,0x44556677,0x8899AABB, 0xCCDDEEFF};
+    printf("%08x\n",arr[0]);
+    printf("%08x\n",arr[1]);
+    printf("%08x\n",arr[2]);
+    printf("%08x\n",arr[3]);
+
+    FILE *f = fopen("int.hex","w+");
+    fprintf(f,"%08x",arr[0]);
+    fprintf(f,"%08x",arr[1]);
+    fprintf(f,"%08x",arr[2]);
+    fprintf(f,"%08x",arr[3]);
+    fclose(f);
+
+    int fd=open("int.bin",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
+    write(fd,arr,sizeof(arr));
+    close(fd);
+
+    int i;
+    fd = open("int.bin2",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
+    for (i=0;i<4;i++) {
+        // swap the two 16bit halves of each value
+        uint32_t val = (arr[i]>>16) &0x0000ffff;
+        val += (arr[i]<<16)&0xffff0000;
+        write(fd,&val,sizeof(uint32_t));
+    }
+    close(fd);
+}
+```
+
+When saving formatted values to a file you get what you expect
+```
+$ cat int.hex
+00112233445566778899aabbccddeeff
+```
+
+Saving a plain memory dump of the values gives a different result on a little-endian machine
+```
+$ hexdump int.bin
+0000000 2233 0011 6677 4455 aabb 8899 eeff ccdd
+0000010
+```
+
+The 16bit halves need to be swapped for the dump to look the same as the values
+```
+$ hexdump int.bin2
+0000000 0011 2233 4455 6677 8899 aabb ccdd eeff
+0000010
+```
+
+### Compiler flags
+
+The compiler has a whole list of command line arguments that can be enabled for different purposes; let's look into some of them
+https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html
+
+Let's try to apply some of the flags to the examples above.
+
+The best options to start with are these; they will give you more warnings.
+
+```
+-Wall -Wextra
+```
+
+Most of the examples here were written in a sloppy style, so adding extra checks like these will find more issues in the code; probably
+all of the provided examples will show issues with these extra compiler flags
+
+```
+-Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference
+```
+
+To get all macros expanded in the C code add this compiler flag. The output will be C source with all macros expanded
+```
+-E
+```
+
+To output generated assembly instead of a binary add
+```
+-S
+```
+
+More readable output can be obtained with
+
+```
+gcc FILE.c -Wa,-adhln=FILE.S -g -fverbose-asm -masm=intel
+```
+
+Basic compiler optimisation flags that can speed up a program or make it smaller
+
+```
+-O -O0 -O1 -O2 -O3 -Os -Ofast -Og -Oz
+```
+
+https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options
+
+https://panthema.net/2013/0124-GCC-Output-Assembler-Code/
+https://blogs.oracle.com/linux/post/making-code-more-secure-with-gcc-part-1
+
+
+### Shared library
+
+A shared library is a common way to reuse big chunks of code.
+
+```c
+#include <stdio.h>
+int fun1() {
+    return 1;
+}
+
+int fun2() {
+    printf("Function name fun2\n");
+    return 0;
+}
+
+int fun3(int a, int b) {
+    return a+b;
+}
+```
+
+```
+$ gcc -fPIC -c lib_share.c
+$ gcc -shared -o libshare.so lib_share.o
+$ ldd libshare.so
+    linux-vdso.so.1 (0x00007ffdb994d000)
+    libc.so.6 => /usr/lib/libc.so.6 (0x00007f0c39400000)
+    /usr/lib64/ld-linux-x86-64.so.2 (0x00007f0c39835000)
+```
+
+Now let's link it to our binary
+```c
+#include <stdio.h>
+
+//functions that are implemented in shared lib
+int fun1();
+int fun2();
+int fun3(int a, int b);
+
+int main() {
+    fun1();
+    fun2();
+    fun3(1, 2);
+}
+```
+
+```
+$ gcc -L.
-lshare use_share.c -o use_share +./use_share +./use_share: error while loading shared libraries: libshare.so: cannot open shared object file: No such file or directory +ldd ./use_share + linux-vdso.so.1 (0x00007ffedcad5000) + libshare.so => not found + libc.so.6 => /usr/lib/libc.so.6 (0x00007f7b99a00000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f7b99c90000) +``` + +Library is not in search path +``` +$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd` +$ ./use_share +$ ldd use_share + linux-vdso.so.1 (0x00007fffc415c000) + libshare.so => /your/path/libshare.so (0x00007f48b03c6000) + libc.so.6 => /usr/lib/libc.so.6 (0x00007f48b0000000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f48b03d2000) +``` + +Other way is to set custom library search location. Lets set it to search in current directory. +And no need to modify LD_LIBRARY_PATH + +``` +$ gcc use_share.c -o use_share -L. -lshare -Wl,-rpath=./ +$ ldd ./use_share + linux-vdso.so.1 (0x00007fff5c964000) + libshare.so => ./libshare.so (0x00007f791000f000) + libc.so.6 => /usr/lib/libc.so.6 (0x00007f790fc00000) + /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f791001b000) +``` + +So now executable runs libshare from local directory. Ofc there is possible to install shared library into systems /usr/lib + +### Static library + + + + +### Static binary + +Static binary don't use any shared libraries, and its possible to built it once and distribute on other platforms +without need to install dependencies. 
+
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+
+int main(int argc, char **argv) {
+    return 0;
+}
+```
+
+The first step is to compile the file and see that it is dynamically linked
+```
+$ gcc static_elf.c -o static_elf
+$ file static_elf
+static_elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bc6ac706075874858e1c4a8accf77e704f4ea25a, for GNU/Linux 4.4.0, with debug_info, not stripped
+$ ldd ./static_elf
+    linux-vdso.so.1 (0x00007ffccef49000)
+    libc.so.6 => /usr/lib/libc.so.6 (0x00007fcbb8800000)
+    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fcbb8b63000)
+
+```
+
+After adding the static option we can verify that the tools now report it as statically linked. The size of the binary increases, as all functions
+required to run the executable are now contained in the binary.
+
+```
+$ gcc static_elf.c -static -o static_elf
+$ file static_elf
+static_elf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=c54d2e4d2a3d11fe920bee9a44af045c6f67ab56, for GNU/Linux 4.4.0, with debug_info, not stripped
+$ ldd static_elf
+    not a dynamic executable
+```
+
+A statically compiled file should work on most Linux systems of the same architecture.
+
+
+
+### Atomic
+HERE
+
+### Multithreading
+HERE
+
+
+
+
+
+
+
+## Basic usage
+
+### File manipulation with libc
+
+Create a file and write data to it using libc functions
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int main() {
+    FILE *f = fopen("file.txt","w+");
+    char *s = "Hello";
+    fwrite(s,1,strlen(s),f);
+    fclose(f);
+}
+```
+
+Open the file and read the data back
+
+```c
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int main() {
+    FILE *f = fopen("file.txt","r");
+    char buf[128];
+    size_t r;
+    r = fread(buf,1,sizeof(buf)-1,f); // leave room for the terminator
+    buf[r] = 0;
+    printf("->%s\n",buf);
+    fclose(f);
+}
+```
+
+### File manipulation with syscalls
+
+Now let's do the same without the libc file functions, using the syscall function to invoke syscalls directly.
+It is also straightforward to rewrite the example in assembly.
+
+```c
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <string.h>
+
+int main(void) {
+    int fd = syscall(SYS_open, "sys.txt", O_CREAT|O_WRONLY, S_IRWXU|S_IRGRP|S_IXGRP);
+    char s[] = "hello syscall\n";
+    syscall(SYS_write, fd, s, strlen(s));
+    syscall(SYS_close, fd);
+    return 0;
+}
+```
+
+
+Read data from the file
+
+```c
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <fcntl.h>
+
+int main(void) {
+    int fd = syscall(SYS_open, "sys.txt", O_RDONLY);
+    char s[128];
+    long r = syscall(SYS_read, fd, s, sizeof(s)-1); // leave room for the terminator
+    if (r < 0) r = 0;
+    s[r] = 0;
+    syscall(SYS_close, fd);
+    syscall(SYS_write, 1, s, r); // fd 1 is stdout
+    return 0;
+}
+```
+
+## Advanced topics
+
+### Kernel module
+
+The Linux kernel, the macOS kernel and the *BSD kernels are written in C,
+so it is possible to write kernel modules in C for some of those.
+
+The example may not match details specific to the local distribution.
+
+```c
+
+```
+
+http://main.lv/writeup/kernel_hello_world.md
+
+### Linking
+
+Linking is one of the most interesting parts of compiling C code. When an object file is created
+it contains functions and variables of different kinds, and linking tries to resolve
+all of them. So it is possible to have fun with linking and with the content of object files.
+
+
+The first example is a piece of C code that can be compiled to an object file, but cannot be
+linked into an executable on its own.
+```
+gcc -c link_elf.c
+```
+```c
+int main() {
+    fun1();
+    fun2();
+}
+```
+We can see that fun1 and fun2 are marked as undefined in the object file. If we try to link it, the linker will not be able to find them.
+So let's create one more object file
+```
+$ readelf -a link_elf.o
+
+Symbol table '.symtab' contains 6 entries:
+   Num:    Value          Size Type    Bind   Vis      Ndx Name
+     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
+     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_elf.c
+     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
+     3: 0000000000000000    31 FUNC    GLOBAL DEFAULT    1 main
+     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun1
+     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun2
+
+```
+__link_fun1.c__
+```c
+#include <stdio.h>
+
+void fun1() {
+    printf("Hello fun1\n");
+}
+void fun2() {
+    printf("Hello fun2\n");
+}
+```
+
+So now we have an object file with the functions defined, and we see that it in turn has an undefined printf/puts function.
+
+```
+readelf -a link_fun1.o
+Symbol table '.symtab' contains 7 entries:
+   Num:    Value          Size Type    Bind   Vis      Ndx Name
+     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
+     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
+     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
+     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
+     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun1
+     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
+     6: 0000000000000016    22 FUNC    GLOBAL DEFAULT    1 fun2
+
+```
+
+We can link both of those files together
+```shell
+gcc -o link_elf link_elf.o link_fun1.o
+```
+The symbols in object files carry no information about argument or return types. That is why anything can be linked as long as the name matches.
+Let's rewrite the code like this
+
+```c
+int fun1(int i) {
+    printf("Hello fun1\n");
+}
+int fun2(int i) {
+    printf("Hello fun2\n");
+}
+```
+And this links without issue. Treat this as two sets of names that are merged together; the linker knows only a few things when linking.
+Return types and function arguments are not exposed when an object file is created.
+
+Functions can have aliases.
+
+__link_fun2.c__
+
+```c
+static void fun2() {
+    printf("hello 2\n");
+} __attribute__ ((alias("fun1")));
+```
+
+Now the function is local.
+
+```
+Symbol table '.symtab' contains 6 entries:
+   Num:    Value          Size Type    Bind   Vis      Ndx Name
+     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
+     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
+     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
+     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
+     4: 0000000000000000    22 FUNC    LOCAL  DEFAULT    1 fun2
+     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
+```
+
+Let's link all objects into an executable. The local fun2 from link_fun2.o is not used in this case; the global one from link_fun1.o is called.
+
+
+
+```
+$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
+$ ./link_elf
+Hello fun1
+Hello fun2
+
+```
+
+
+
+Now let's switch which of the two **fun2** functions is local: the one in link_fun1.o becomes local and the one in link_fun2.o becomes global
+
+
+```
+link_fun1.o
+     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
+     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
+     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
+     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
+     4: 000000000000001d    29 FUNC    LOCAL  DEFAULT    1 fun2
+     5: 0000000000000000    29 FUNC    GLOBAL DEFAULT    1 fun1
+     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
+
+link_fun2.o
+     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
+     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
+     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
+     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
+     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun2
+     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
+
+```
+
+```
+$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
+$ ./link_elf
+Hello fun1
+hello 2
+```
+
+So all of this plays a role in linking object files.
+There is an even more interesting utility called ld, which does things at a lower level than gcc.
+
+
+### Extern
+
+### Attributes
+PASS
+### Creating shared library
+PASS
+### Create static libraries
+PASS
+### Join all objects together
+PASS
+### Compile with musl
+
+glibc is not the only option for a standard C library; there are a few others, and one of them is musl
+
+```
+$ musl-gcc hello_world.c -o hello_world_musl
+$ file ./hello_world_musl
+hello_world_musl: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, not stripped
+```
+
+
+### Inspect elf files
+
+There are a few utilities that help to check whether an elf file is ok.
+
+ldd shows which shared libraries the elf file will try to load
+
+```
+$ ldd hello_world
+    linux-vdso.so.1 (0x00007fffcb2ae000)
+    libc.so.6 => /usr/lib/libc.so.6 (0x00007ffb80c00000)
+    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ffb80fb9000)
+
+```
+
+Readelf allows inspecting the content of elf files and their headers, and interprets the values in the headers.
+In a few examples above we already used that feature to check the content of compi