zhuzilin's Blog

about

6.828 笔记2 x86 and PC architecture

date: 2019-02-13
tags: OS 6.828

这里会记录阅读6.828课程lecture note的我的个人笔记。可能会中英混杂，不是很适合外人阅读，也请见谅。

Lecture 2: x86 and PC architecture

PC architecture

一个完整的CPU有：

x86 CPU与其寄存器，执行单元和内存管理部分
CPU chip pins, include address and data signals
内存
硬盘
键盘
显示器
其他资源：BIOS, ROM, clock, ...

我们从零来一步一步得到一个CPU:

可以把CPU就想成一个无穷循环

for(;;){
	run next instruction
}

为了更多空间，加入寄存器

加入4个16位寄存器AX, BX, CX, DX，每个都可以分成两个8位的，如AH和AL。他们速度非常快。

为了更多空间，加入内存

内存，通过CPU sends out address on address lines (wires, one bit per wire)(这个是memory bus嘛？)，以读写数据。

movl  %eax, %edx   # edx = eax; register mode
movl $0x123 %edx   # edx = 0x123; immediate
movl 0x123, %edx   # edx = *(int32_t*)0x123; direct
movl (%ebx), %edx  # edx = *(int32_t*)ebx; indirect
movl 4(%ebx), %edx # edx = *(int32_t*)(ebx+4); displaced

上面几行具体的address mode的解释可以看这里。大致区别是：

register mode: 寄存器到寄存器
immediate mode: 指令里头带的常数到寄存器
后面几种都是计算到内存位置的

address registers (栈与和栈类似的东西)

为了满足如上需求，就需要有address registers，他们的作用是pointers into memory，有以下4个

SP：stack pointer，指向当前的栈顶
BP：frame base pointer，指向当前函数的帧顶
SI：source index
DI

读取指令

需要注意，instructions是存在内存中的，用EIP指向instruction，每执行一个就会增加EIP，而CALL, RET, JMP等汇编指令可以改变IP。

FLAGS

除了以上内容，我们还需要条件语句进行conditional jump，这就需要FLAGS，有各种各样的FLAGS。

有了以上功能的CPU仍然是一个没什么意思的程序，因为其没有IO。

// write a byte to line printer
#define DATA_PORT    0x378
#define STATUS_PORT  0x379
#define   BUSY 0x80
#define CONTROL_PORT 0x37A
#define   STROBE 0x01
void
lpt_putc(int c)
{
  /* wait for printer to consume previous byte */
  while((inb(STATUS_PORT) & BUSY) == 0)
    ;

  /* put the byte on the parallel lines */
  outb(DATA_PORT, c);

  /* tell the printer to look at the data */
  outb(CONTROL_PORT, STROBE);
  outb(CONTROL_PORT, 0);
}

上面的这段代码是传统的PC architecture: use dedicated I/O space.

没太看懂，但是感觉这里的outb和inb和manul里的是不同的。。。

这种方式有如下的几个特点：

和memory access类似，但是需要set I/O signal (代码里面的strobe是这个signal？)
只有1024 I/O address，(1024是0x3ff，所以和上面PORT挺对应的)。
需要特殊的指令（IN, OUT）

而现在都是用Memory-Mapped I/O，大致就是把I/O装置当成物理内存之外扩充的一部分内存，从而把I/O就当成内存用。每个PORT耶直接映射到了内存地址上。

其特点是：

Use normal physical memory addresses
- Gets around limited size of I/O address space
- No need for special instructions
- System controller routes to appropriate device
Works like "magic'' memory:
- Addressed and accessed like memory, but ...
- ... does not behave like memory!
- Reads and writes can have ``side effects''
- Read results can change due to external events
如何使用不止2^16bytes的内存

虽然IP只有16位，但是加入了CS之后就可以有20位了。

注意，这个20位的情况只出现在real mode，也就是一个用于兼容之前的16位cpu用的。这个模式没有任何保护，只能访问1M内存，可以直接访问BIOS及周边内存，现在只有在boot的过程中才是用这种情况。而之后下面提到支持32位地址之后，在Boot完成之后，会转化为protected mode，就直接只用一个32位的eip寄存器了。

希望有超过16位的address

80386首次支持32位地址。Boot的时候是16位，通过boot.S转换为32位。

其寄存器也都是32位的，所以叫EAX而不是AX了。在32位模式下，通过前面加0x66前缀来toggle between 16-bit and 32-bit。而boot.s中的.code32正是做的这一点。如：

b8 cd ab		16-bit CPU,  AX <- 0xabcd
b8 34 12 cd ab  32-bit CPU, EAX <- 0xabcd1234
66 b8 cd ab		32-bit CPU,  AX <- 0xabcd

x86 Physical Memory Map

The physical address space mostly looks like ordinary RAM
Except some low-memory addresses actually refer to other things
Writes to VGA memory appear on the screen
Reset or power-on jumps to ROM at 0xfffffff0 (so must be ROM at top...)

+------------------+  <- 0xFFFFFFFF (4GB)
|      32-bit      |
|  memory mapped   |
|     devices      |
|                  |
/\/\/\/\/\/\/\/\/\/\

/\/\/\/\/\/\/\/\/\/\
|                  |
|      Unused      |
|                  |
+------------------+  <- depends on amount of RAM
|                  |
|                  |
| Extended Memory  |
|                  |
|                  |
+------------------+  <- 0x00100000 (1MB)
|     BIOS ROM     |
+------------------+  <- 0x000F0000 (960KB)
|  16-bit devices, |
|  expansion ROMs  |
+------------------+  <- 0x000C0000 (768KB)
|   VGA Display    |
+------------------+  <- 0x000A0000 (640KB)
|                  |
|    Low Memory    |
|                  |
+------------------+  <- 0x00000000

这里的图在lab1中出现过，在lab1中有更详细的讲解。

x86 Instruction Set

对于本课使用的AT&T(gcc/gas) syntax: op src, dst。

使用b, w, l来表示不同大小的操作。

data movement: MOV, PUSH, POP, ...
arithmetic: TEST, SHL, ADD, AND, ...
i/o: IN, OUT, ...
control: JMP, JZ, JNZ, CALL, RET
string: REP MOVSB, ...
system: IRET, INT

gcc x86 calling conventions

这里主要讲解了如何调用函数，也就是如何使用栈。

pushl %eax   <=> subl $4, %esp # esp -= 4
			     movl %eax, (%esp) # *(esp) = eax
popl %eax    <=> movl (%esp), %eax # eax = *(esp)
			     addl $4, %esp # esp += 4
call 0x12345 <=> pushl %eip(*) # 存起来call指令的地址
				 movl $0x12345, %eip(*) # 下一个指令执行函数
ret 		 <=> popl %eip(*) # 回到call指令
# (*) mean it is not real instruction

GCC dictates 该如何使用栈. 在x86上，caller和callee之间的协议(Contract)如下:

在函数的入口处 (i.e. just after call):
- %eip 指向函数的首个指令的地址
- %esp+4 指首个参数（其实就是首个参数被push了）
- %esp 指向返回地址（return address）
运行ret指令之后（函数返回之后）:
- %eip 包含%esp的值，也就是返回地址
- %esp 因为重新被addl 4, 所以指向首个参数
- 函数可能有trashed argument（没懂什么意思）
- %eax (and %edx, if return type is 64-bit) 存有返回值 (or trash if function is void)
- %eax, %edx (above), and %ecx may be trashed
- %ebp, %ebx, %esi, %edi must contain contents from time of call（比如ebx是输入参数，ebp是caller的frame address）
Terminology:
- %eax, %ecx, %edx are "caller save" registers
- %ebp, %ebx, %esi, %edi are "callee save" registers

这里原note中写的非常清楚，应仔细阅读。

只要不违反上述contract，Function什么都可以做，根据习惯，GCC会：

each function has a stack frame marked by %ebp, %esp

       +------------+   |
       | arg 2      |   \
       +------------+    >- previous function's stack frame
       | arg 1      |   /
       +------------+   |
       | ret %eip   |   /
       +============+   
       | saved %ebp |   \
%ebp-> +------------+   |
       |            |   |
       |   local    |   \
       | variables, |    >- current function's stack frame
       |    etc.    |   /
       |            |   |
       |            |   |
%esp-> +------------+   /

可以通过移动%esp来增大减小（应该就是通过push和pop，那到底有没有一个frame大小作为限制呢？）
%ebp指向之前函数的%ebp，从而形成链（可以看后面两部分的代码）

function prologue:

pushl %ebp  # 把当前的frame address存在esp对应的地址里
movl %esp, %ebp	 # 把现在的esp的值存在ebp里，
				 # 可以理解为*(ebp) = ebp_old
				 # 从而组成了一个frame address的链

enter $0, $0

enter usually not used: 4 bytes vs 3 for pushl+movl, not on hardware fast-path anymore

function epilogue can easily find return EIP on stack:

movl %ebp, %esp # esp = ebp，注意这时*(ebp) = ebp_old
popl %ebp # 把*(ebp)也就是ebp_old重新赋给ebp, ebp = *(ebp)
		  # 同时esp也恢复到调用这个函数之前的值

leave

leave used often because it's 1 byte, vs 3 for movl+popl

这里有一个简单的调用的例子：

C code

int main(void) { return f(8)+1; }
int f(int x) { return g(x); }
int g(int x) { return x+3; }

assembler

_main:
				prologue
	pushl %ebp
	movl %esp, %ebp
				body
	pushl $8 # 在调用前，把8，也就是参数推进栈
    call _f # 调用
    addl $1, %eax # 返回值存在eax中
    			epilogue
	movl %ebp, %esp
	popl %ebp
	ret
_f:
				prologue
	pushl %ebp # ebp为main的frame address
	movl %esp, %ebp # ebp为f的frame address，指向main的
				body
	pushl 8(%esp) # 这里的8(%esp)是x的地址对应的值，是调用函数前放进去的
				  # 把传入的x再次放入栈中
	call _g
				epilogue
	movl %ebp, %esp
	popl %ebp
	ret

_g:
				prologue
	pushl %ebp
	movl %esp, %ebp
				save %ebx
	pushl %ebx # 这里不懂。。。
				body
	movl 8(%ebp), %ebx # 取出传入的x
	addl $3, %ebx
	movl %ebx, %eax # 把计算结果存给eax
				restore %ebx
	popl %ebx
				epilogue
	movl %ebp, %esp
	popl %ebp
	ret

注意，如果使用objdump -d进行反汇编，会有出入。

Super-small _g:

_g:
	movl 4(%esp), %eax
	addl $3, %eax
	ret

编译

编译语言的方式可以见我之前的帖子How is python run.

PC emulation

The Bochs emulator(和qemu是不同的软件，不过差不多) works by

和PC做完全一样的事
全部只用软件实现

而实际上只是host的一个普通进程。

用进程的存储来模拟硬件状态，如：

把寄存器存为全局变量

int32_t regs[8];
#define REG_EAX 1;
#define REG_EBX 2;
#define REG_ECX 3;
...
int32_t eip;
int16_t segregs[4];
...

用一个数组模拟物理内存。
```
char mem[256*1024*1024]         
```

通过死循环来执行指令（可以和上面的CPU最开始对比）

for (;;) {
	read_instruction();
	switch (decode_instruction_opcode()) {
		case OPCODE_ADD:
			int src = decode_src_reg();
			int dst = decode_dst_reg();
			regs[dst] = regs[dst] + regs[src];
			break;
		case OPCODE_SUB:
			int src = decode_src_reg();
			int dst = decode_dst_reg();
			regs[dst] = regs[dst] - regs[src];
			break;
		...
	}
	eip += instruction_length;
}

内存的布局和实际上的物理内存一样（分成lower, BIOS, extension...）

#define KB		1024
#define MB		1024*1024

#define LOW_MEMORY	640*KB
#define EXT_MEMORY	10*MB

uint8_t low_mem[LOW_MEMORY];
uint8_t ext_mem[EXT_MEMORY];
uint8_t bios_rom[64*KB];

uint8_t read_byte(uint32_t phys_addr) {
	if (phys_addr < LOW_MEMORY)
		return low_mem[phys_addr];
	else if (phys_addr >= 960*KB && phys_addr < 1*MB)
		return rom_bios[phys_addr - 960*KB];
	else if (phys_addr >= 1*MB && phys_addr < 1*MB+EXT_MEMORY) {
		return ext_mem[phys_addr-1*MB];
	else ...
}

void write_byte(uint32_t phys_addr, uint8_t val) {
	if (phys_addr < LOW_MEMORY)
		low_mem[phys_addr] = val;
	else if (phys_addr >= 960*KB && phys_addr < 1*MB)
		; /* ignore attempted write to ROM! */
	else if (phys_addr >= 1*MB && phys_addr < 1*MB+EXT_MEMORY) {
		ext_mem[phys_addr-1*MB] = val;
	else ...
}

通过检查对“特殊”内存及I/O space的访问并按照真是情况进行模拟来模拟I/O等，如,

把对模拟的硬盘的读写转化成对host机上的文件的读写。
对模拟VGA硬件的写入转化为drawing into an X window
模拟的键盘读入转化为reads from X input event queue