Preparation for Creating Process 1

3. Creation and Execution of Process 1

3.1.1 Preparation for Creating Process 1

In Linux, any new process is created by calling fork(). The procedure is shown in Figure 3.1.

The code is as follows:

//code path:init/main.c:
……
static inline _syscall0(int,fork)      //corresponds to fork()
static inline _syscall0(int,pause)
static inline _syscall1(int,setup,void *,BIOS)
……
void main(void) {
    sti();
    move_to_user_mode();
    if (!fork()) {      /* we count on this going ok */
        init();
    }
/*
 * NOTE!! For any other task 'pause()' would mean we have to get a
 * signal to awaken, but task0 is the sole exception (see 'schedule()')
 * as task 0 gets activated at every idle moment (when no other tasks
 * can run). For task0 'pause()' just means we go check if some other
 * task can run, and if not we return here.
 */
    for(;;) pause();
}

The fork() in main.c indicates that the execution is actually transferred to the _syscall0 macro in unistd.h. The code is shown as follows:

//code path:include/unistd.h:

……

#define __NR_setup 0 /* used only by init, to get system going */

#define __NR_exit 1

#define __NR_fork 2

#define __NR_read 3

#define __NR_write 4

#define __NR_open 5

#define __NR_close 6

……

#define _syscall0(type,name) \
type name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
    : "=a" (__res) \
    : "0" (__NR_##name)); \
if (__res >= 0) \
    return (type) __res; \
errno = -__res; \
return -1; \
}

……

volatile void _exit(int status);

int fcntl(int fildes, int cmd,...);

int fork(void);

int getpid(void);

int getuid(void);

int geteuid(void);

……

//code path:include/linux/sys.h:

extern int sys_setup();
extern int sys_exit();
extern int sys_fork();     //corresponds to _sys_fork in system_call.s: the assembler
                           //name of a C function has an underscore "_" in front,
                           //e.g., _sys_fork is the assembler name of sys_fork
extern int sys_read();
extern int sys_write();
extern int sys_open();
……
fn_ptr sys_call_table[] = {sys_setup, sys_exit, sys_fork, sys_read,  //sys_fork is the third item
sys_write, sys_open, sys_close, sys_waitpid, sys_creat, sys_link,    //of _sys_call_table
sys_unlink, sys_execve, sys_chdir, sys_time, sys_mknod, sys_chmod,

……

After expansion, _syscall0(int,fork) looks like the following:

int fork(void)                      //see the annotations on embedded assembly
{                                   //in Sections 2.5, 2.9, and 2.14
    long __res;
    __asm__ volatile ("int $0x80"   //int 0x80 is the common entry of all system calls,
                                    //one of which is fork(); see Section 2.9
        : "=a" (__res)              //output part: on return, the value of eax is stored in __res
        : "0" (__NR_fork));         //input part: "0" refers to eax; __NR_fork is 2, which is loaded into eax
    if (__res >= 0)                 //this line is executed after the return from int 0x80
        return (int) __res;
    errno = -__res;
    return -1;
}
//Note: int 0x80 makes the hardware automatically push ss, esp, eflags, cs, and eip!
//See the explanation in Section 2.14.

[Figure 3.1: memory layout from 0x00000 to 0xFFFFFF (kernel code area, kernel data area, ROM BIOS and VGA), with the IDT (entries 0 to 255, pointing to system_call and sys_fork) and init_task, shown before and after the system call. Step 1: int 0x80 (soft interrupt). Step 2: jump from the IDT to execute sys_fork; after entering system_call, the stack data is located after init_task. Process status: process 0 is ready and is the current process.]

Figure 3.1 Preparation for creating process 1.

The handling of int 0x80 is a long procedure. Figure 3.2 illustrates the general process; the detailed steps are as follows:

The input operand “0” (__NR_fork) takes effect first. The value of __NR_fork, namely 2, is assigned to eax. This value is the function number of fork, that is, the index of the sys_fork entry in sys_call_table[].

[Figure 3.2: at privilege level 3, fork() declared in unistd.h executes int 0x80; through the IDT, execution enters _system_call at privilege level 0, which indexes sys_call_table (0: sys_setup, 1: sys_exit, 2: sys_fork, 3: sys_read, 4: sys_write) and calls _sys_fork. The functions on the path are find_empty_process(), copy_process(), get_free_page(), copy_mem(), get_limit(), get_base(), set_base(), and copy_page_tables().]

Figure 3.2 The calling path of system call.

After that, “int $0x80” is executed. It triggers a software interrupt, and the central processing unit (CPU) switches from executing the process code at privilege level 3 to executing the kernel code at privilege level 0. The hardware automatically pushes ss, esp, eflags, cs, and eip onto the kernel stack of process 0 in init_task, as shown in Figure 3.1. Pay attention to the red stripe after the init_task structure, which holds the values of these five registers. The pushes in move_to_user_mode, mentioned before, imitate this hardware push on an interrupt. The pushed data will be used to initialize the TSS of process 1 in the subsequent copy_process function.

Note that the pushed eip points to the instruction that follows “int $0x80”, which is the line if (__res >= 0). Process 0 will continue from this line after returning from the interrupt in fork(). In Section 3.3, we will see that this line is also where process 1 starts to execute. Keep this point in mind!

According to the set_system_gate(0x80, &system_call) setting made in sched_init(), explained in Section 2.9, after these automatic pushes the CPU jumps to _system_call in system_call.s and goes on to push ds, es, fs, edx, ecx, and ebx, which prepares for initializing the TSS of process 1 when copy_process() is called. Finally, using the value 2 in eax as the index, the kernel looks up sys_call_table[] and finds that the corresponding function is sys_fork. The assembler name of a C function has an underscore “_” in front (e.g., _sys_fork in assembler corresponds to sys_fork in C); thus, the kernel starts to execute _sys_fork.

Tip:

Here, the function parameters are not passed by an ordinary call made by the function's caller; they are set up by other code through pushes onto the stack. This is one of the main distinctions between OS code and application code. A clear understanding of how C code is compiled and executed helps in grasping this method.

At run time, the parameters of a C function live on the stack; hence, system designers can place values on the stack in the right order and have the function treat them as its parameters.

The code is as follows:

//Code path:kernel/system_call.s:
……
_system_call:                       # int 0x80: the common entry of all system calls
    cmpl $nr_system_calls-1,%eax
    ja bad_sys_call
    push %ds                        # these 6 pushes are used as parameters of copy_process();
    push %es                        # remember their order, and do not forget that int 0x80
    push %fs                        # has already pushed 5 values before them
    pushl %edx
    pushl %ecx                      # push %ebx,%ecx,%edx as parameters
    pushl %ebx                      # to the system call
    movl $0x10,%edx                 # set up ds,es to kernel space
    mov %dx,%ds
    mov %dx,%es
    movl $0x17,%edx                 # fs points to local data space
    mov %dx,%fs
    call _sys_call_table(,%eax,4)   # eax is 2, so this line is equal to call (_sys_call_table + 2×4),
                                    # namely the entry of _sys_fork
    pushl %eax
    movl _current,%eax
    cmpl $0,state(%eax)             # state
    jne reschedule
    cmpl $0,counter(%eax)           # counter
    je reschedule
ret_from_sys_call:
    movl _current,%eax              # task[0] cannot have signals
    cmpl _task,%eax
    je 3f
    cmpw $0x0f,CS(%esp)             # was old code segment supervisor ?
    jne 3f
    cmpw $0x17,OLDSS(%esp)          # was stack segment = 0x17 ?
    jne 3f
    movl signal(%eax),%ebx
    movl blocked(%eax),%ecx
    notl %ecx
    andl %ebx,%ecx
    bsfl %ecx,%ecx
    je 3f
    btrl %ecx,%ebx
    movl %ebx,signal(%eax)
    incl %ecx
    pushl %ecx
    call _do_signal
    popl %eax
3:  popl %eax
    popl %ebx
    popl %ecx
    popl %edx
    pop %fs
    pop %es
    pop %ds
    iret
……
_sys_fork:                          # the entry of sys_fork()
……

In the line call _sys_call_table(,%eax,4), the value of eax is 2. This line can be read as a call through the entry at address _sys_call_table + 2 × 4, in which the value 4 is the size in bytes of each item of _sys_call_table[]. It is equal to calling _sys_call_table[2], namely, sys_fork.

Note: The instruction call _sys_call_table(,%eax,4) itself pushes its return address onto the stack to preserve the point of return, and the sixth parameter of copy_process(), long none, corresponds to this push. The execution code is as follows:

//Code path:kernel/system_call.s:
……
_system_call:
……
_sys_fork:
    call _find_empty_process
    testl %eax,%eax                 # if the return value is -EAGAIN (EAGAIN is 11), there are
                                    # already 64 processes in execution
    js 1f
    push %gs                        # the values of the following five registers are pushed
                                    # onto the stack as the first parameters of copy_process()
    pushl %esi
    pushl %edi
    pushl %ebp
    pushl %eax
    call _copy_process
    addl $20,%esp
1:  ret
……
