tbsp/README.md
2024-09-14 14:39:16 +02:00


# TBSP
> Tree-Based Source-Processing language
## Notes
I stole the idea from here:
[https://github.com/oppiliappan/tbsp](https://github.com/oppiliappan/tbsp)
Now, there are some obvious problems with the original project:
+ it's written in Rust
+ it tries to be a general-purpose language for no reason
+ >"[ ] bytecode VM?"; seriously?
I have tried contacting the owner; the response is pending.
I have tried hacking Bison into this behaviour; it's too noisy.
I firmly believe code generation is the way to go, not just here,
but for DSLs in general.
This project will depend heavily on tree-sitter;
there is no sense pretending otherwise with decoupling.
## Language semantics
Modelled half after the original, half after Flex/Bison.
```
<declaration-section>
%%
<rule-section>
%%
<code-section>
```
### Declaration section
```
%top { <...> } // code to be pasted at the top of the source file
%language <lang> // tree-sitter language name (for the right includes)
```
### Rule section
```
[enter|close]+ <node-type> { <...> } // code to run when a tree-sitter node of type <node-type> is encountered (enter) or popped (close)
```
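To make the enter/close distinction concrete, here is a minimal sketch of how a single rule `enter close call { <...> }` could fire during a depth-first walk. It does not use tree-sitter: `struct node`, `log_event` and `walk` are hypothetical stand-ins invented for this illustration, and the real generated code would presumably walk tree-sitter nodes instead.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a tree-sitter node: a type name plus children. */
struct node {
    const char *type;
    const struct node *children;
    size_t child_count;
};

/* Append an event such as "+call " to a NUL-terminated log; snprintf truncates if full. */
static void log_event(char *log, size_t cap, char tag, const char *type) {
    size_t used = strlen(log);
    snprintf(log + used, cap - used, "%c%s ", tag, type);
}

/* Depth-first walk with one hard-coded rule for node type "call". */
void walk(const struct node *n, char *log, size_t cap) {
    int matches = strcmp(n->type, "call") == 0;

    if (matches) log_event(log, cap, '+', n->type);   /* "enter" body runs here */
    for (size_t i = 0; i < n->child_count; i++)
        walk(&n->children[i], log, cap);
    if (matches) log_event(log, cap, '-', n->type);   /* "close" body runs here */
}
```

The enter body runs before the children are visited and the close body after them, so nested matches produce properly bracketed events.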
### Code section
The code section is pasted verbatim at the end of the output file.
#### Globals
```C
int tbtraverse(const char * const code); // master function; rules are evaluated here
```
#### In tbtraverse
```C
GET_TBTEXT; // macro that returns a `char *` to the current node's text value (not ts_node_string); it's the programmer's responsibility to free() it
int tblen; // string length of tbtext; XXX probably broken?
// XXX: these should probably be renamed
TSNode current_node; // node corresponding to the rule
// XXX need a macro bool for leave/enter
```
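The ownership rule for `GET_TBTEXT` can be illustrated with a small sketch. The helper name `copy_node_text` is hypothetical and is not how the macro is actually defined. It assumes the node's text is a byte range of the source passed to `tbtraverse`. With tree-sitter, those offsets would come from `ts_node_start_byte()` and `ts_node_end_byte()` on the current node.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy bytes [start, end) of `code` into a fresh,
 * NUL-terminated heap string. Returns NULL if allocation fails. */
char *copy_node_text(const char *code, size_t start, size_t end) {
    assert(start <= end);
    size_t len = end - start;       /* the value tblen would be expected to hold */
    char *text = malloc(len + 1);
    if (text == NULL) return NULL;
    memcpy(text, code + start, len);
    text[len] = '\0';
    return text;                    /* the caller owns this and must free() it */
}
```

Because every call allocates, a rule body that reads the text has to pair it with a `free()`, which is the responsibility the comment above refers to.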
### Thinking area
```C
// This should be allowed to mean 'a' or 'b'
enter a b { <...> }
// In the node type, globbing should probably be allowed, however regex sounds like overkill
/* A query language should also exist
* $0-><name>
* Where <name> is a named field of the rule's node.
* Something like this could be useful because
* such queries, when performed by hand, can easily segfault if left unchecked,
* while the required checks make them very unergonomic.
* For error handling, something like this could be employed:
* enter a { ; } catch { ; }
* Where 'catch' could be implemented as a goto.
* I am unsure whether this would be too generic to be useful or not.
*/
```
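As a rough sketch of the "catch as a goto" idea, a checked field query could expand to a lookup followed by a jump when the field is missing. Everything here is hypothetical: `struct field`, `find_field`, `TB_QUERY` and `run_rule` are invented for illustration, and the field table stands in for a real tree-sitter node.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a node's named fields. */
struct field { const char *name; const char *text; };

/* Returns the field's text, or NULL when the field is absent. */
static const char *find_field(const struct field *fields, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(fields[i].name, name) == 0) return fields[i].text;
    return NULL;
}

/* Hypothetical expansion of `$0-><name>`: look up, jump to the catch label if missing. */
#define TB_QUERY(out, fields, n, name) \
    do { (out) = find_field((fields), (n), (name)); if (!(out)) goto tb_catch; } while (0)

/* Sketch of generated code for: enter a { <uses $0->name> } catch { <...> } */
int run_rule(const struct field *fields, size_t n, const char **result) {
    const char *value;
    assert(result != NULL);
    TB_QUERY(value, fields, n, "name");
    *result = value;    /* rule body: the field is known to exist here */
    return 0;
tb_catch:
    *result = NULL;     /* catch body: the field was missing */
    return 1;
}
```

The rule body never sees a missing field, so it needs no checks of its own. The cost is one shared catch label per rule, which may be part of why this feels too generic.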