

 
Cower
Created : 13 March 2010
 
Language : Blitz Max

BlitzMax Lexer Module

Module for tokenizing BlitzMax source code

I originally wrote this in Ruby, but there is a rather annoying issue with writing any code in Ruby: using it anywhere else is an immense pain. If you've ever had to work with the C API to embed Ruby in something, you're probably aware of this. You may also be insane if you're going "I did it and I thoroughly enjoyed the experience." I can't help those people; they're clearly lost causes.

Anyhow, I ported the code to C, and overall I think it's an improvement because it's a little less messy. There aren't a lot of comments: there are four total on the C side of things, and only a handful in the BlitzMax code, mostly because BlitzMax sucks at working with C code and sometimes I need to make a note about what type something really is. The C API is private here, mostly because I think most BlitzMax users would find it terrifying even though it's relatively simple.

The BlitzMax API is fairly simple, so I don't think I need to explain what each method does or what the fields of each type are. If a name has an _ in front of it, don't touch it.

If you need to parse BlitzMax code, this is probably a decent starting point: you don't have to concern yourself with the annoying string parsing crap you'd otherwise have to do, and can just focus on structure and chunks of code. If you want to tweak the lexer to match other things, that's probably fairly easy to do, and it could be a decent base for something else. Most of what you'd change is likely covered by the token singles/pairs arrays, which you can edit to match your own preferences. Case sensitivity options are in there too, so you could work that in as well.
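
For a rough idea of what table-driven matching with singles/pairs arrays can look like, here's a minimal sketch in C. It is not the module's actual code: the type names, token kinds, table contents and both functions (match_operator, keyword_equals) are made up for illustration, and the real arrays may be laid out quite differently. The point is just that adding or removing an operator means editing a table entry rather than touching the matching logic.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; the real module's token set is not shown here. */
typedef enum {
    TOK_INVALID = 0,
    TOK_PLUS, TOK_MINUS, TOK_LT, TOK_GT, TOK_EQ, TOK_COLON,
    TOK_LE, TOK_GE, TOK_NE, TOK_ADD_ASSIGN
} token_kind_t;

typedef struct { char ch; token_kind_t kind; } single_t;
typedef struct { char text[3]; token_kind_t kind; } pair_t;

/* Assumed tables: one-character tokens and two-character tokens. */
static const single_t singles[] = {
    { '+', TOK_PLUS }, { '-', TOK_MINUS }, { '<', TOK_LT },
    { '>', TOK_GT },   { '=', TOK_EQ },    { ':', TOK_COLON }
};

static const pair_t pairs[] = {
    { "<=", TOK_LE }, { ">=", TOK_GE }, { "<>", TOK_NE }, { ":+", TOK_ADD_ASSIGN }
};

/* Match an operator at the start of src. Stores the number of characters
   consumed in *len (0 if nothing matched) and returns the token kind. */
token_kind_t match_operator(const char *src, size_t *len)
{
    size_t i;
    assert(src != NULL && len != NULL);

    /* Try two-character tokens first so "<=" is not split into "<" and "=". */
    if (src[0] != '\0' && src[1] != '\0') {
        for (i = 0; i < sizeof pairs / sizeof pairs[0]; i++) {
            if (src[0] == pairs[i].text[0] && src[1] == pairs[i].text[1]) {
                *len = 2;
                return pairs[i].kind;
            }
        }
    }

    /* Fall back to one-character tokens. */
    for (i = 0; i < sizeof singles / sizeof singles[0]; i++) {
        if (src[0] == singles[i].ch) {
            *len = 1;
            return singles[i].kind;
        }
    }

    *len = 0;
    return TOK_INVALID;
}

/* Compare two keywords, optionally ignoring ASCII case.
   Returns 1 if they are equal, 0 otherwise. */
int keyword_equals(const char *a, const char *b, int case_sensitive)
{
    assert(a != NULL && b != NULL);

    if (case_sensitive)
        return strcmp(a, b) == 0;

    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}
```

With tables like these, calling match_operator on "<=1" consumes two characters and yields the pair token, while "<1" falls through to the single-character table. BlitzMax itself ignores case in keywords, so for BlitzMax source you'd call keyword_equals with case_sensitive set to 0. Flip it to 1 if you're adapting the lexer to a case-sensitive language.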

Anyhow, the C side of things...

lexer.h


lexer.c


bmxlexer.bmx

 
