3. Creation and Execution of Process 1
3.1.1 Preparation for Creating Process 1
In Linux, any new process is created by calling fork(). The procedure is shown in Figure 3.1.
The code is as follows:
//code path:init/main.c:
……
static inline _syscall0(int,fork)    //corresponds to fork()
static inline _syscall0(int,pause)
static inline _syscall1(int,setup,void *,BIOS)
……
void main(void) {
    sti();
    move_to_user_mode();
    if (!fork()) {    /* we count on this going ok */
        init();
    }
/*
 * NOTE!! For any other task 'pause()' would mean we have to get a
 * signal to awaken, but task0 is the sole exception (see 'schedule()')
 * as task 0 gets activated at every idle moment (when no other tasks
 * can run). For task0 'pause()' just means we go check if some other
 * task can run, and if not we return here.
 */
    for(;;) pause();
}
The fork() call in main.c is actually generated by the _syscall0 macro in unistd.h. The code is as follows:
//code path:include/unistd.h:
……
#define __NR_setup 0 /* used only by init, to get system going */
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
……
#define _syscall0(type,name) \
type name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
    : "=a" (__res) \
    : "0" (__NR_##name)); \
if (__res >= 0) \
    return (type) __res; \
errno = -__res; \
return -1; \
}
……
volatile void _exit(int status);
int fcntl(int fildes, int cmd,...);
int fork(void);
int getpid(void);
int getuid(void);
int geteuid(void);
……
//code path:include/linux/sys.h:
extern int sys_setup();
extern int sys_exit();
extern int sys_fork();    //corresponds to _sys_fork in system_call.s. In assembly, a
extern int sys_read();    //function carries an underscore "_" in front of its C name;
extern int sys_write();   //for example, _sys_fork is the assembly counterpart of sys_fork.
extern int sys_open();
……
fn_ptr sys_call_table[] = {sys_setup, sys_exit, sys_fork, sys_read,    //sys_fork is the third item
sys_write, sys_open, sys_close, sys_waitpid, sys_creat, sys_link,      //of sys_call_table (index 2)
sys_unlink, sys_execve, sys_chdir, sys_time, sys_mknod, sys_chmod,
……
After expansion, _syscall0(int,fork) looks like the following:
int fork(void)                    //refer to the code annotations on embedded assembly
{                                 //in Sections 2.5, 2.9, and 2.14
long __res;
__asm__ volatile ("int $0x80"     //int 0x80 is the common entry of all system calls, one of which
                                  //is fork(). Refer to the explanation in Section 2.9
    : "=a" (__res)                //output part: the value of eax is assigned to __res.
    : "0" (__NR_fork));           //input part: "0" refers to eax; __NR_fork is 2, which is assigned to eax.
if (__res >= 0)                   //this line is executed after returning from int 0x80
    return (int) __res;
errno = -__res;
return -1;
}
//Attention: on int 0x80, the hardware automatically pushes ss, esp, eflags, cs, and eip!
//Refer to the explanation in Section 2.14
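The expansion mechanism can be tried out in user space. The following is a minimal sketch, not the kernel's code: it keeps the token pasting (`NR_##name`) and the convention of turning a negative result into errno and -1, but replaces the `int $0x80` instruction with an ordinary function. The names demo_trap, DEMO_SYSCALL0, DEMO_NR_answer, DEMO_NR_refuse, answer, and refuse are all hypothetical and exist only for this illustration.

```c
#include <errno.h>

/* Hypothetical call numbers, used only in this sketch. */
#define DEMO_NR_answer 0
#define DEMO_NR_refuse 1

/* Stand-in for "int $0x80": takes a call number and returns either a
 * non-negative result or a negative error code. */
static long demo_trap(long nr)
{
    if (nr == DEMO_NR_answer)
        return 42;
    if (nr == DEMO_NR_refuse)
        return -EAGAIN;
    return -ENOSYS;
}

/* Same shape as _syscall0: the macro pastes the function name onto a
 * prefix to obtain the call number, then converts a negative result
 * into errno plus a return value of -1. */
#define DEMO_SYSCALL0(type, name) \
type name(void) \
{ \
    long res = demo_trap(DEMO_NR_##name); \
    if (res >= 0) \
        return (type) res; \
    errno = (int) -res; \
    return -1; \
}

DEMO_SYSCALL0(int, answer)   /* defines int answer(void) */
DEMO_SYSCALL0(int, refuse)   /* defines int refuse(void) */
```

Calling answer() returns the value produced by the stand-in trap, while refuse() returns -1 and leaves EAGAIN in errno, which is the same caller-visible behavior that the expanded fork() provides.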
[Figure 3.1: memory layout from 0x00000 to 0xFFFFFF (kernel code area, kernel data area, ROM BIOS and VGA), shown before and after the system call is raised. Step 1: int 0x80 raises a soft interrupt. Step 2: control passes through the IDT (entries 0 to 255) to system_call and then to sys_fork. After entering system_call, the pushed stack data sits just after init_task. Process 0 is the current process, in the ready state.]
Figure 3.1 Preparation for creating process 1.
The int 0x80 procedure is long; its overall flow is illustrated in Figure 3.2.
The detailed steps are as follows:
The input operand "0" (__NR_fork) takes effect first: the value of __NR_fork, namely 2, is loaded into eax. This value is the number of fork in sys_call_table[], that is, the index of the sys_fork function in the table.
[Figure 3.2: at privilege level 3, library functions such as _exit(), fcntl(), fork(), getpid(), and getuid() execute int 0x80. At privilege level 0, the IDT leads to _system_call, which indexes sys_call_table (0 sys_setup, 1 sys_exit, 2 sys_fork, 3 sys_read, 4 sys_write) and calls _sys_fork. _sys_fork calls find_empty_process() and copy_process(); further down the path are get_free_page(), copy_mem(), get_limit(), get_base(), set_base(), and copy_page_tables().]
Figure 3.2 The calling path of system call.
Next, "int $0x80" is executed. It triggers a soft interrupt, and the central processing unit (CPU) switches from executing process code at privilege level 3 to executing kernel code at privilege level 0. The hardware automatically pushes ss, esp, eflags, cs, and eip onto the kernel stack of process 0, which follows init_task, as shown in Figure 3.1. Note the red stripe after the init_task structure; it holds the values of these five registers. The pushes performed in move_to_user_mode, described earlier, imitate this hardware push on an interrupt. The pushed data will be used to initialize the TSS of process 1 in the subsequent copy_process function.
Note that the pushed eip points to the line following the instruction "int $0x80", which is the line if (__res >= 0). Process 0 will continue from this line after returning from the interrupt in fork(). In Section 3.3, we will see that this line is also the first instruction that process 1 executes. Keep this point in mind!
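The effect of both processes resuming at the same point can be observed from user space on a POSIX system. The sketch below is an illustration under that assumption and is not kernel code; the function name demo_fork_resume and the exit value 7 are hypothetical. Parent and child both return from the same fork() call and are told apart only by the return value, just as `if (!fork())` in main.c sends the new process into init().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once and returns the child's exit status as seen by the parent,
 * or -1 if fork() or waitpid() fails. */
int demo_fork_resume(void)
{
    pid_t pid = fork();   /* both processes continue from this call */

    if (pid < 0)
        return -1;        /* fork failed, no child was created */
    if (pid == 0)
        _exit(7);         /* child: fork() returned 0 */

    /* parent: fork() returned the child's pid */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the parent, the call returns the status that the child passed to _exit(), which shows that the child took the `pid == 0` branch after resuming from the same fork() call.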
According to the set_system_gate(0x80, &system_call) setting made in sched_init(), explained in Section 2.9, the CPU jumps to _system_call in system_call.s after the automatic pushes. There it goes on to push ds, es, fs, edx, ecx, and ebx, again in preparation for initializing the TSS of process 1 when copy_process() is called. Finally, using the offset value 2 in eax, the kernel looks up sys_call_table[] and finds that the corresponding function is sys_fork. In assembly, a C function's name carries a leading underscore "_" (e.g., _sys_fork in assembly corresponds to sys_fork in C); thus, the kernel starts to execute _sys_fork.
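The table lookup described above can be sketched in plain C as an array of function pointers indexed by the call number. This is a simplified illustration, not the kernel's implementation: the names demo_fn, demo_call_table, demo_dispatch, and the three demo_ handlers are hypothetical, and the handlers merely return marker values. The range check plays the role of the `cmpl $nr_system_calls-1,%eax` and `ja bad_sys_call` pair at the top of _system_call.

```c
#include <stddef.h>

typedef int (*demo_fn)(void);

/* Placeholder handlers; each returns a distinct marker value. */
static int demo_setup(void) { return 100; }
static int demo_exit(void)  { return 101; }
static int demo_fork(void)  { return 102; }

/* Index 2 holds the fork handler, mirroring __NR_fork == 2. */
static demo_fn demo_call_table[] = { demo_setup, demo_exit, demo_fork };

#define DEMO_NR_CALLS (sizeof demo_call_table / sizeof demo_call_table[0])

/* Rejects out-of-range numbers, then calls the table entry at index nr.
 * The entry lives at byte offset nr * sizeof(demo_fn) in the table,
 * which is nr * 4 when pointers are 4 bytes wide. */
int demo_dispatch(size_t nr)
{
    if (nr > DEMO_NR_CALLS - 1)
        return -1;
    return demo_call_table[nr]();
}
```

With this table, demo_dispatch(2) reaches demo_fork in the same way that a value of 2 in eax selects sys_fork.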
Tip:
Here, a function's parameters are not supplied by an ordinary caller but are set up by other code through pushes, which is one of the main differences between OS code and application code. A clear understanding of how C code is compiled and executed helps one grasp this method.
At run time, the parameters of a C function live on the stack; hence, system designers can push values onto the stack in the right order and have the function treat them as its parameters.
The code is as follows:
//Code path:kernel/system_call.s:
……
_system_call:                      # int 0x80: common entry of all system calls
    cmpl $nr_system_calls-1,%eax
    ja bad_sys_call
    push %ds                       # all six pushes become parameters of
                                   # copy_process();
    push %es                       # remember the order, and do not forget that
    push %fs                       # int 0x80 has already pushed five values.
    pushl %edx
    pushl %ecx                     # push %ebx,%ecx,%edx as parameters
    pushl %ebx                     # to the system call
    movl $0x10,%edx                # set up ds,es to kernel space
    mov %dx,%ds
    mov %dx,%es
    movl $0x17,%edx                # fs points to local data space
    mov %dx,%fs
    call _sys_call_table(,%eax,4)  # eax is 2, so this calls the entry at
                                   # _sys_call_table + 2×4, namely _sys_fork
    pushl %eax
    movl _current,%eax
    cmpl $0,state(%eax)            # state
    jne reschedule
    cmpl $0,counter(%eax)          # counter
    je reschedule
ret_from_sys_call:
    movl _current,%eax             # task[0] cannot have signals
    cmpl _task,%eax
    je 3f
    cmpw $0x0f,CS(%esp)            # was old code segment supervisor ?
    jne 3f
    cmpw $0x17,OLDSS(%esp)         # was stack segment = 0x17 ?
    jne 3f
    movl signal(%eax),%ebx
    movl blocked(%eax),%ecx
    notl %ecx
    andl %ebx,%ecx
    bsfl %ecx,%ecx
    je 3f
    btrl %ecx,%ebx
    movl %ebx,signal(%eax)
    incl %ecx
    pushl %ecx
    call _do_signal
    popl %eax
3:  popl %eax
    popl %ebx
    popl %ecx
    popl %edx
    pop %fs
    pop %es
    pop %ds
    iret
……
_sys_fork:                         # the entry of sys_fork()
……
In the line call _sys_call_table(,%eax,4), the value of eax is 2. The line can be read as call _sys_call_table + 2 × 4, where 4 is the size in bytes of each item in _sys_call_table[]. It is equivalent to calling _sys_call_table[2], namely, sys_fork.
Note: The instruction call _sys_call_table(,%eax,4) itself pushes its return address onto the stack to preserve the return point; the sixth parameter of copy_process(), long none, corresponds to this push. The code of _sys_fork is as follows:
//Code path:kernel/system_call.s:
……
_system_call:
……
_sys_fork:
    call _find_empty_process
    testl %eax,%eax                # a return value of -EAGAIN (EAGAIN is 11) means that
                                   # 64 processes already exist
    js 1f
    push %gs                       # the five register values pushed from here on
                                   # become the first parameters of copy_process()
    pushl %esi
    pushl %edi
    pushl %ebp
    pushl %eax
    call _copy_process
    addl $20,%esp
1:  ret
……
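The pushes shown in the listings above can be replayed on a simulated stack to see how they line up as the parameters of copy_process(). The sketch below is an illustration only: demo_frame, demo_stack, demo_push, and demo_build_frame are hypothetical names, every slot is modeled as a long, and each push stores its sequence number (1 for the first push, 17 for the last) instead of a real register value. The field order is the one implied by the push order in this section, with none in sixth position as the text states.

```c
#include <string.h>

/* Fields from the lowest stack address upward, i.e., in parameter order. */
struct demo_frame {
    long nr, ebp, edi, esi, gs;      /* pushed last, by _sys_fork (nr is eax) */
    long none;                       /* return address pushed by the call instruction */
    long ebx, ecx, edx, fs, es, ds;  /* pushed by _system_call */
    long eip, cs, eflags, esp, ss;   /* pushed first, by the hardware on int 0x80 */
};

#define DEMO_STACK_WORDS 32
static long demo_stack[DEMO_STACK_WORDS];
static long *demo_sp;

static void demo_push(long value)
{
    *--demo_sp = value;   /* the stack grows toward lower addresses */
}

/* Replays the 17 pushes and copies the top of the stack into a frame. */
struct demo_frame demo_build_frame(void)
{
    struct demo_frame frame;
    long marker = 0;
    int i;

    demo_sp = demo_stack + DEMO_STACK_WORDS;        /* start with an empty stack */

    for (i = 0; i < 5; i++) demo_push(++marker);    /* ss, esp, eflags, cs, eip: 1..5 */
    for (i = 0; i < 6; i++) demo_push(++marker);    /* ds, es, fs, edx, ecx, ebx: 6..11 */
    demo_push(++marker);                            /* return address of call: 12 */
    for (i = 0; i < 5; i++) demo_push(++marker);    /* gs, esi, edi, ebp, eax: 13..17 */

    memcpy(&frame, demo_sp, sizeof frame);
    return frame;
}
```

Because the last value pushed sits at the lowest address, it becomes the first parameter (nr), and the value pushed first by the hardware (ss) becomes the last. The five values pushed by _sys_fork occupy 5 × 4 = 20 bytes on a 32-bit stack, which is why the listing discards them with addl $20,%esp after copy_process() returns.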