documenting

anon
2024-07-17 00:47:45 +02:00
parent 5ac9e9a3f9
commit 7ee985ea91
3 changed files with 94 additions and 58 deletions

39
documentation/Guide.md Normal file

@ -0,0 +1,39 @@
# Source files
## main.c
Responsible for dispatching initialization,
deinitialization
and the compiler.
Deals only with the highest level of abstraction
and is kept clean.
## eaxhla.l|y|c|h
Flex/Bison scanner/parser respectively.
The C source contains definitions which
store the abstract state and/or
are required to construct it.
## compile.c|h
Responsible for transforming the abstract state
of eaxhla.c to something that can be understood
by the assembler,
dispatching it
and creating the executable.
## assembler.c|h
Creates machine code from an array of tokens.
## debug.h
Defines various debug output functions, or
no-op alternatives for them in non-debug builds.
The two sides must be kept in symmetrical balance:
whatever exists under `#if DEBUG == 1` needs a counterpart for `DEBUG != 1`.
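A minimal sketch of that symmetry, not the project's actual header. The name `debug_puts` and the fallback of `DEBUG` to 0 are assumptions made for illustration.

```c
#include <stdio.h>

/* Assumption: DEBUG falls back to 0 when the build does not define it. */
#ifndef DEBUG
# define DEBUG 0
#endif

#if DEBUG == 1
/* Debug build: print the message to stderr and report that it was printed. */
static int debug_puts(const char *message) {
    fputs(message, stderr);
    fputc('\n', stderr);
    return 1;
}
#else
/* Non-debug build: same name and signature, but a no-op. */
static int debug_puts(const char *message) {
    (void)message;
    return 0;
}
#endif
```

Because both branches define the same signature, callers never need their own `#if DEBUG` guards.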
# Builds
We like GNU Make.
Debug builds can be created by defining `DEBUG` as 1.
The Makefile respects `DEBUG` in the environment.
Some default values are determined as appropriate for
the compiling system.

@ -1,5 +1,10 @@
# General
## The following instructions are fully supported:
```asm
; XXX fillin
```
## 2 argument instructions (t6)
### add, or, adc, sbb, and, sub, xor, cmp;
@ -55,3 +60,26 @@ mov eax x // MOV D32 REG R0 REL 69
...
u32 x = 420 // ASMDIRMEM 69 ASMDIRIMM D32 420
```
The core instructions we must support, no matter what, are:
```
mov <REG/MEM> <REG/MEM/IMM> -- immediate value can be up to 64 bits.
add or adc sbb and sub xor cmp <REG/MEM> <REG/MEM/IMM> -- immediate value can be up to 32 bits.
inc dec not neg mul imul div idiv <REG/MEM> -- no immediates allowed here.
jmp jCC cmovCC <> <> -- conditional instructions, important!
enter leave pop push -- stack focused instructions.
sysenter sysexit syscall sysret -- kernel/system focused instructions.
in out nop call bswap sal sar shr shl rol ror xchg loop -- ease-of-use focused instructions.
```
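The immediate limits in the list can be sketched as a width check. This assumes "up to N bits" means the value fits in N bits when read as unsigned; the document does not say how signed immediates are treated. `immediate_fits` and the two constants are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Widths taken from the list: 64 bits for mov, 32 bits for add/or/.../cmp. */
enum { MOV_IMMEDIATE_BITS = 64, ARITHMETIC_IMMEDIATE_BITS = 32 };

/* Hypothetical helper: does `value` fit in `bits` bits when read as unsigned? */
static bool immediate_fits(uint64_t value, unsigned bits) {
    if (bits >= 64) {
        return true; /* every uint64_t fits; shifting by 64 would be undefined */
    }
    return (value >> bits) == 0;
}
```

For example, `immediate_fits(x, ARITHMETIC_IMMEDIATE_BITS)` could gate whether `add` accepts `x` as an immediate.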
- The available operand forms are listed below; 2 argument instructions have 5 combinations.
```
ins REG REG -- mov, add, cmp, xor, and
ins REG MEM -- ^
ins REG IMM -- ^
ins MEM REG -- ^
ins MEM IMM -- ^
ins REG -- inc, dec, not, div, mul
ins MEM -- ^
ins -- syscall, pause, hlt, ret, leave
```
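The 5 combinations for 2 argument instructions can be expressed as a small validity check. This is a sketch; `operand_kind` and `is_valid_two_operand_form` are hypothetical names, not taken from the assembler.

```c
#include <stdbool.h>

/* Hypothetical operand kinds; the real token names may differ. */
typedef enum { OPERAND_REG, OPERAND_MEM, OPERAND_IMM } operand_kind;

/* True for the 5 listed forms: REG REG, REG MEM, REG IMM, MEM REG, MEM IMM. */
static bool is_valid_two_operand_form(operand_kind destination, operand_kind source) {
    if (destination == OPERAND_IMM) {
        return false; /* an immediate is never listed as a destination */
    }
    if (destination == OPERAND_MEM && source == OPERAND_MEM) {
        return false; /* MEM MEM is not among the listed forms */
    }
    return true;
}
```

Rejecting the two unlisted shapes explicitly keeps the check readable next to the list it mirrors.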

@ -18,7 +18,7 @@ NOTE: the compiler front-end should be able to handle the preprocessing someway,
### implementation
1. flex parsing
2. bison creates partial syntax trees (since we don't optimize, we can render in relatively small chunks because not much context is needed)
3. xolatile magic
3. xolatile magick
## Command line interface
```
@ -30,10 +30,31 @@ options:
-o | --output <file>
-a | --architecture <architecture>
```
## Keywords -- XOLATILE EDIT PLEASE SOMEONE SANE REFACTOR LATER
- Legacy 8-bit register access (ah, ch, dh, bh) is not supported; all other access options are available.
- Please keep in mind that rCx and rDx go before rBx, order matters a lot, assembler trusts that user didn't make a mistake.
- Register accessing process (these are 64 keywords for registers):
## Syntax
### Macros
+ fuck macros
+ use a preprocessor
### Comments
```c
// single line comment
/* multi
line
comment */
```
HLA uses C/C++ comments,
but C comments cannot be multilined.
Nested multiline comments are still not allowed.
### Asm
+ no ',' argument delimiters
For the specifics of the supported instructions consult
[Instruction\_reference.md](Instruction\_reference.md).
#### Registers
```
| NUM | QWORD | DWORD | WORD | BYTE |
| NUM | 64 BIT | 32 BIT | 16 BIT | 8 BIT |
@ -54,37 +75,9 @@ options:
| E | r14 | r14d | r14w | r14b |
| F | r15 | r15d | r15w | r15b |
```
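The NUM column appears to follow the standard x86-64 register encoding, where rcx and rdx come before rbx. A sketch of that numbering as a C enum follows; the identifier names and the two helpers are hypothetical.

```c
/* Standard x86-64 register numbers; note rcx (1) and rdx (2) precede rbx (3). */
typedef enum {
    REG_RAX = 0x0, REG_RCX = 0x1, REG_RDX = 0x2, REG_RBX = 0x3,
    REG_RSP = 0x4, REG_RBP = 0x5, REG_RSI = 0x6, REG_RDI = 0x7,
    REG_R8  = 0x8, REG_R9  = 0x9, REG_R10 = 0xA, REG_R11 = 0xB,
    REG_R12 = 0xC, REG_R13 = 0xD, REG_R14 = 0xE, REG_R15 = 0xF
} register_number;

/* Low 3 bits of the number: the part that fits in a ModR/M register field. */
static int register_low_bits(register_number reg) {
    return (int)reg & 7;
}

/* Registers r8..r15 need the extra REX extension bit when encoded. */
static int register_needs_extension(register_number reg) {
    return (int)reg >= 8;
}
```

Keeping the enum values equal to the encoding means a register keyword can be emitted into machine code without a lookup table.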
- For legacy and conventional reasons, we should however adopt stupid decisions made before we were born.
## Instructions
The core instructions we must support, no matter what, are:
```
mov <REG/MEM> <REG/MEM/IMM> -- immediate value can be up to 64 bits.
add or adc sbb and sub xor cmp <REG/MEM> <REG/MEM/IMM> -- immediate value can be up to 32 bits.
inc dec not neg mul imul div idiv <REG/MEM> -- no immediates allowed here.
jmp jCC cmovCC <> <> -- conditional instructions, important!
enter leave pop push -- stack focused instructions.
sysenter sysexit syscall sysret -- kernel/system focused instructions.
in out nop call bswap sal sar shr shl rol ror xchg loop -- ease-of-use focused instructions.
```
- ? ANON: Note that we can use 'loope' or 'loopne' instructions, since 'loop' is our keyword, but it can cause confusion...
- ? Keep in mind that for most of these instructions both destination and source can't be "memory addresses".
- The available operand forms are listed below; 2 argument instructions have 5 combinations.
```
ins REG REG -- mov, add, cmp, xor, and
ins REG MEM -- ^
ins REG IMM -- ^
ins MEM REG -- ^
ins MEM IMM -- ^
ins REG -- inc, dec, not, div, mul
ins MEM -- ^
ins -- syscall, pause, hlt, ret, leave
```
- ANON & EMIL: I leave other, HLA-related keywords to you guys, these above are important if we want powerful language.
- I'll implement a lot more instructions into assembler, but you can choose what to directly support by the HLA!
- For example, I'll have a procedure to generate machine code for 'loop' instruction, you can simply not use it when doing HLA->ASM.
## Types
### Types
```
<int prefix><int size>
<float prefix><float size>
@ -120,29 +113,6 @@ then we'll utilize the flag (`-fterry-types`) to enable them.
// if so, why are they not listed above?
// if not, does that mean no variables can be declared?
## Syntax
### Macros
+ fuck macros
+ use a preprocessor
### Comments
```c
// single line comment
/* multi
line
comment */
```
HLA uses C/C++ comments,
but C comments cannot be multilined.
Nested multiline comments are still not allowed.
### Asm
+ no ',' argument delimiters
#### The following instructions are fully supported:
```asm
; XXX fillin
```
### Machine code
```
@ -192,7 +162,6 @@ type:
```C
my_label:
```
Labels act like variables,
but should not be dereferenced.
Feel free to use them inside jump instructions.