22. pyxc: Type Aliases
Where We Are
Chapter 21 added string literals, so we can write "hello" and pass it to puts. The parameter type, though, is still the C-style ptr[int8], which is tedious to write everywhere. After this chapter we can write:
type string = ptr[int8]
extern def puts(s: string) -> int
def greet(name: string) -> int:
    return puts(name)
def main() -> int:
    greet("world")
    return 0
Source Code
git clone --depth 1 https://github.com/alankarmisra/pyxc-llvm-tutorial
cd pyxc-llvm-tutorial/code/chapter-22
Grammar
This chapter adds two new productions (typealias, aliastype) and extends two existing ones (top gains a typealias alternative; type gains an aliastype alternative).
code/chapter-22/pyxc.ebnf
program = [ eols ] [ top { eols top } ] [ eols ] ;
top = typealias | structdef | definition | decorateddef | external | toplevelexpr ; -- new
typealias = "type" identifier "=" type ; -- new
...
type = builtintype | aliastype | structtype | pointertype ; -- new (aliastype)
aliastype = identifier ; -- new
aliastype and structtype are both written as identifier. The parser tries TypeAliases first, then StructTypes, and rejects the identifier if neither lookup succeeds.
Full Grammar
code/chapter-22/pyxc.ebnf
program = [ eols ] [ top { eols top } ] [ eols ] ;
eols = eol { eol } ;
top = typealias | structdef | definition | decorateddef | external | toplevelexpr ;
typealias = "type" identifier "=" type ;
structdef = "struct" identifier ":" eols structblock ;
structblock = indent fielddecl { eols fielddecl } dedent ;
fielddecl = identifier ":" type ;
definition = "def" prototype [ "->" type ] ":" ( simplestmt | eols block ) ;
decorateddef = binarydecorator eols "def" binaryopprototype [ "->" type ] ":" ( simplestmt | eols block )
             | unarydecorator eols "def" unaryopprototype [ "->" type ] ":" ( simplestmt | eols block ) ;
binarydecorator = "@" "binary" "(" integer ")" ;
unarydecorator = "@" "unary" ;
binaryopprototype = customopchar "(" typedparam "," typedparam ")" ;
unaryopprototype = customopchar "(" typedparam ")" ;
external = "extern" "def" prototype [ "->" type ] ;
toplevelexpr = expression ;
prototype = identifier "(" [ typedparam { "," typedparam } ] ")" ;
typedparam = identifier ":" type ;
ifstmt = "if" expression ":" suite
         [ eols "else" ":" suite ] ;
forstmt = "for"
          ( "var" identifier ":" type | identifier )
          "=" expression "," expression "," expression ":" suite ;
varstmt = "var" varbinding { "," varbinding } ;
assignstmt = lvalue "=" expression ;
simplestmt = returnstmt | varstmt | assignstmt | expression ;
compoundstmt = ifstmt | forstmt ;
statement = simplestmt | compoundstmt ;
suite = simplestmt | compoundstmt | eols block ;
returnstmt = "return" [ expression ] ;
block = indent statement { eols statement } dedent ;
expression = unaryexpr binoprhs ;
binoprhs = { binaryop unaryexpr } ;
lvalue = identifier | fieldaccess | indexexpr ;
varbinding = identifier ":" type [ "=" expression ] ;
unaryexpr = unaryop unaryexpr | primary ;
unaryop = "-" | userdefunaryop ;
primary = castexpr | sizeofexpr | addrexpr | stringliteral | identifierexpr | fieldaccess | indexexpr | numberexpr | bool_literal | parenexpr ;
castexpr = casttype "(" expression ")" ;
sizeofexpr = "sizeof" "(" type ")" ;
addrexpr = "addr" "(" lvalue ")" ;
identifierexpr = identifier | callexpr ;
callexpr = identifier "(" [ expression { "," expression } ] ")" ;
fieldaccess = identifier "." identifier { "." identifier } ;
indexexpr = identifier "[" expression "]" ;
numberexpr = number ;
stringliteral = "\"" { ? any char except " and newline ? | escape } "\"" ;
escape = "\\" ( "\\" | "\"" | "n" | "t" | "0" ) ;
parenexpr = "(" expression ")" ;
binaryop = builtinbinaryop | userdefbinaryop ;
indent = INDENT ;
dedent = DEDENT ;
builtinbinaryop = "+" | "-" | "*" | "<" | "<=" | ">" | ">=" | "==" | "!=" ;
userdefbinaryop = ? any opchar defined as a custom binary operator ? ;
userdefunaryop = ? any opchar defined as a custom unary operator ? ;
customopchar = ? any opchar that is not "-" or a builtinbinaryop,
                 and not already defined as a custom operator ? ;
opchar = ? any single ASCII punctuation character ? ;
identifier = (letter | "_") { letter | digit | "_" } ;
builtintype = "int" | "int8" | "int16" | "int32" | "int64"
            | "float" | "float32" | "float64"
            | "bool" | "None" ;
aliastype = identifier ;
structtype = identifier ;
pointertype = "ptr" "[" type "]" ;
type = builtintype | aliastype | structtype | pointertype ;
casttype = "int" | "int8" | "int16" | "int32" | "int64"
         | "float" | "float32" | "float64"
         | "bool" | pointertype ;
integer = digit { digit } ;
number = digit { digit } [ "." { digit } ]
       | "." digit { digit } ;
bool_literal = "True" | "False" ;
letter = "A".."Z" | "a".."z" ;
digit = "0".."9" ;
eol = "\r\n" | "\r" | "\n" ;
ws = " " | "\t" ;
INDENT = ? synthetic token emitted by lexer ? ;
DEDENT = ? synthetic token emitted by lexer ? ;
New Keyword: type
tok_type = -39,
Registered in the keyword table:
{"type", tok_type}
The TypeAliases Map
static std::map<string, std::pair<ValueType, string>> TypeAliases;
This maps an alias name to the fully resolved type it stands for. The pair (ValueType, StructName) is the same two-field representation used in chapter 18.
TypeAliases.clear() is called at the start of each new module (inside ResetParserStateForFile), alongside StructTypes.clear(). For now, aliases do not persist across compilation units. This will change once we get into multi-file and import territory.
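To make the shape of the table concrete, here is a minimal standalone sketch of the two tables and the per-file reset. Only TypeAliases, StructTypes, and the (ValueType, struct name) pair come from the text. The ValueType enumerators shown, the use of std::set for StructTypes, and the helper name ResetTablesForFile are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical subset of the compiler's ValueType enum (illustration only).
enum class ValueType { Error, Int, Int64, Struct };

// Alias name -> (resolved kind, struct name). The struct name is empty
// unless the resolved kind is Struct.
static std::map<std::string, std::pair<ValueType, std::string>> TypeAliases;

// Stand-in for the real struct table; only the names matter here.
static std::set<std::string> StructTypes;

// Hypothetical model of the reset that runs at the start of each new module:
// both tables are emptied together, so nothing leaks between files.
void ResetTablesForFile() {
  TypeAliases.clear();
  StructTypes.clear();
  assert(TypeAliases.empty() && StructTypes.empty());
}
```

An alias to a builtin is stored with an empty struct name, for example (Int64, ""), while an alias to a struct carries the struct's own name, for example (Struct, "Point").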
Extending ParseTypeToken
ParseTypeToken is the single function that all type annotations in the language go through — parameter types, return types, var declarations, sizeof operands, cast targets. That means adding alias resolution in one place makes aliases work everywhere.
Before this chapter, an unknown identifier in ParseTypeToken was an immediate error. Now there is a lookup before the error:
case tok_identifier: {
  string TyName = IdentifierStr;
  auto AliasIt = TypeAliases.find(TyName);
  if (AliasIt != TypeAliases.end()) {
    getNextToken();
    if (StructName)
      *StructName = AliasIt->second.second;
    return AliasIt->second.first;
  }
  if (!StructTypes.count(TyName)) {
    LogError(("Unknown type '" + TyName + "'").c_str());
    return ValueType::Error;
  }
  getNextToken();
  if (StructName)
    *StructName = TyName;
  return ValueType::Struct;
}
If the identifier matches an alias, the resolved type and struct name are returned directly. Nothing downstream of ParseTypeToken needs to change: from this point on the alias is transparent.
ParseTypeAliasDefinition
static bool ParseTypeAliasDefinition() {
  getNextToken(); // eat 'type'
  // expect identifier
  string AliasName = IdentifierStr;
  if (TypeAliases.count(AliasName))
    return LogError("Type alias 'X' is already defined");
  if (StructTypes.count(AliasName))
    return LogError("Name 'X' is already defined as a struct");
  getNextToken(); // eat alias name
  // expect '='
  getNextToken(); // eat '='
  string AliasStructName;
  ValueType AliasType = ParseTypeToken(&AliasStructName);
  if (AliasType == ValueType::Error)
    return false;
  TypeAliases[AliasName] = {AliasType, AliasStructName};
  LastTopLevelEndedWithBlock = false;
  return true;
}
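The parser eats type, validates the alias name against both TypeAliases and StructTypes, eats the =, then calls ParseTypeToken to resolve the right-hand side. Whatever ParseTypeToken returns, after fully resolving any chain of aliases, is stored directly. There is no stored pointer to the original name. The 'X' in the error strings and the two "expect" comments are shorthand in this listing: the reported messages carry the actual alias name, as the examples under Conflict Rules show.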
HandleTypeAliasDef wraps this in the standard top-level handler and is wired into both the file-mode and REPL-mode dispatch loops under tok_type.
Resolution happens at definition time, not at use time. When type Score = MyInt is processed, ParseTypeToken runs immediately on the right-hand side and looks up MyInt in TypeAliases. If MyInt is already defined as (Int, ""), then Score is stored as (Int, ""). There is no indirection at use time: from the moment the alias is defined, Score and the type behind MyInt are identical to the compiler.
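The following standalone sketch models definition-time resolution. It is not the chapter's parser: ResolveTypeName and DefineAlias are hypothetical helpers that take type names as strings instead of reading tokens, and the ValueType enumerators are an assumed subset. What it illustrates is that the right-hand side is resolved once and the result is stored, so a chain of aliases never needs to be followed later.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical subset of ValueType; the real enum has more members.
enum class ValueType { Error, Int, Float };

using Resolved = std::pair<ValueType, std::string>; // (kind, struct name)

static std::map<std::string, Resolved> TypeAliases;

// Hypothetical stand-in for ParseTypeToken: resolves a type name.
// Two builtins are modelled, then aliases; anything else is an error.
Resolved ResolveTypeName(const std::string &Name) {
  if (Name == "int")
    return Resolved(ValueType::Int, "");
  if (Name == "float")
    return Resolved(ValueType::Float, "");
  auto It = TypeAliases.find(Name);
  if (It != TypeAliases.end())
    return It->second; // stored entries are already fully resolved
  return Resolved(ValueType::Error, "");
}

// Models `type Alias = Rhs`: resolve the right-hand side now, store the result.
bool DefineAlias(const std::string &Alias, const std::string &Rhs) {
  Resolved R = ResolveTypeName(Rhs);
  if (R.first == ValueType::Error)
    return false; // unknown right-hand side, which includes forward references
  TypeAliases[Alias] = R;
  return true;
}
```

After DefineAlias("MyInt", "int") and DefineAlias("Score", "MyInt"), the entry for Score holds (Int, "") and carries no mention of MyInt.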
Conflict Rules
Aliases and struct types share a single namespace. Three conflicts are checked:
Alias redefinition. Defining the same alias name twice is an error:
type Foo = int
type Foo = int64 → Error: Type alias 'Foo' is already defined
Alias name collides with a struct. If a struct is already defined under that name, the alias is rejected:
struct Foo:
    x: int
type Foo = int → Error: Name 'Foo' is already defined as a struct
Struct name collides with an alias. ParseStructDefinition checks TypeAliases before accepting the struct name:
type Foo = int
struct Foo: → Error: Name 'Foo' is already defined as a type alias
    x: int
Forward references are not supported. Using an alias before defining it gives "Unknown type 'X'" — ParseTypeToken only searches entries that already exist in the map.
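The three conflict checks and the forward-reference rule can be sketched as a small name registry. This is a simplified model under stated assumptions: DeclareAlias, DeclareStruct, and UseType are hypothetical helpers, the tables track names only, builtin types are left out, and struct redefinition is not modelled because the text does not describe it. The error strings follow the messages quoted above.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

// Hypothetical, simplified tables: only names matter for conflict checks.
static std::set<std::string> AliasNames;
static std::set<std::string> StructNames;

// Each helper returns an error message, or std::nullopt on success.

// Models `type Name = ...`: rejects a repeated alias or a clash with a struct.
std::optional<std::string> DeclareAlias(const std::string &Name) {
  if (AliasNames.count(Name))
    return "Type alias '" + Name + "' is already defined";
  if (StructNames.count(Name))
    return "Name '" + Name + "' is already defined as a struct";
  AliasNames.insert(Name);
  return std::nullopt;
}

// Models `struct Name:`: rejects a clash with an existing alias.
std::optional<std::string> DeclareStruct(const std::string &Name) {
  if (AliasNames.count(Name))
    return "Name '" + Name + "' is already defined as a type alias";
  StructNames.insert(Name);
  return std::nullopt;
}

// Models a type annotation: only names declared earlier are visible.
std::optional<std::string> UseType(const std::string &Name) {
  if (AliasNames.count(Name) || StructNames.count(Name))
    return std::nullopt;
  return "Unknown type '" + Name + "'";
}
```

Because every check consults only what has already been inserted, declaration order alone decides which of the two colliding definitions is rejected.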
IR Transparency
Aliases produce no IR. They are resolved entirely during parsing and leave no trace in the generated output.
type Score = int64
def id(x: Score) -> Score:
    return x
define i64 @id(i64 %x) {
entry:
  %x.addr = alloca i64
  store i64 %x, ptr %x.addr
  %x1 = load i64, ptr %x.addr
  ret i64 %x1
}
Score does not appear. LLVM sees i64 exactly as if int64 had been written directly. The same holds for pointer aliases:
type string = ptr[int8]
def say(msg: string) -> int:
    return puts(msg)
def greeting() -> string:
    return "hello"
The IR for both functions is identical to what you would get with ptr[int8] in every annotation.
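A small model shows why the alias cannot leak into the output: by the time a type name is turned into LLVM's spelling, only the resolved kind is left. Everything here is a hypothetical illustration rather than the chapter's code generator. ResolveAnnotation, LLVMTypeName, and DefineLine are invented helpers, the enumerators are an assumed subset, and the header line is built as a plain string instead of through the LLVM API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical subset of ValueType; illustration only.
enum class ValueType { Error, Int8, Int32, Int64 };

static std::map<std::string, ValueType> Aliases; // alias name -> resolved kind

// Resolves a source-level annotation to a kind. The alias name is dropped here.
ValueType ResolveAnnotation(const std::string &Name) {
  if (Name == "int8")
    return ValueType::Int8;
  if (Name == "int32")
    return ValueType::Int32;
  if (Name == "int64")
    return ValueType::Int64;
  auto It = Aliases.find(Name);
  return It == Aliases.end() ? ValueType::Error : It->second;
}

// LLVM spells these integer types i8, i32 and i64.
std::string LLVMTypeName(ValueType T) {
  switch (T) {
  case ValueType::Int8:
    return "i8";
  case ValueType::Int32:
    return "i32";
  case ValueType::Int64:
    return "i64";
  default:
    return "<error>";
  }
}

// Builds the header line of a one-parameter function from its annotations.
std::string DefineLine(const std::string &Fn, const std::string &ParamTy,
                       const std::string &Param, const std::string &RetTy) {
  return "define " + LLVMTypeName(ResolveAnnotation(RetTy)) + " @" + Fn + "(" +
         LLVMTypeName(ResolveAnnotation(ParamTy)) + " %" + Param + ")";
}
```

With Score registered as an alias for Int64, DefineLine("id", "Score", "x", "Score") and DefineLine("id", "int64", "x", "int64") produce the same string.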
Build and Run
cd code/chapter-22
cmake -S . -B build && cmake --build build
Try It
string as a type
extern def puts(s: ptr[int8]) -> int
type string = ptr[int8]
def greet(name: string) -> int:
    return puts(name)
def main() -> int:
    greet("world")
    return 0
world
string is accepted wherever ptr[int8] was, here as a parameter type. The IR uses ptr throughout.
IR transparency
type Score = int64
def id(x: Score) -> Score:
    return x
pyxc --emit llvm-ir -o out.ll program.pyxc
grep 'define' out.ll
define i64 @id(i64 %x)
Score is gone. The function signature is plain i64.
Alias chain
type MyInt = int
type Score = MyInt
Score resolves to (Int, "") at definition time. type Score = MyInt calls ParseTypeToken("MyInt"), which finds the alias and returns (Int, "") immediately. The chain is collapsed to a single lookup in the map.
Alias for a struct type
struct Point:
    x: int
    y: int
type Vec2 = Point
After this, Vec2 can be used as a parameter type, return type, or var type, and the compiler treats it exactly like Point.
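For a struct alias, the detail that matters is which name ends up in the StructName out-parameter. The sketch below mirrors the identifier branch of ParseTypeToken with two simplifications that are assumptions of this example: LookupNamedType is a hypothetical helper that takes the name as an argument instead of reading the token stream, and StructTypes is reduced to a set of names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical subset of ValueType; illustration only.
enum class ValueType { Error, Struct };

static std::map<std::string, std::pair<ValueType, std::string>> TypeAliases;
static std::set<std::string> StructTypes; // stand-in for the real struct table

// Aliases are tried first, then structs; otherwise the name is unknown.
ValueType LookupNamedType(const std::string &TyName, std::string *StructName) {
  auto AliasIt = TypeAliases.find(TyName);
  if (AliasIt != TypeAliases.end()) {
    if (StructName)
      *StructName = AliasIt->second.second; // the struct's name, not the alias
    return AliasIt->second.first;
  }
  if (!StructTypes.count(TyName))
    return ValueType::Error;
  if (StructName)
    *StructName = TyName;
  return ValueType::Struct;
}
```

If Vec2 is stored as (Struct, "Point"), a lookup of Vec2 reports the struct name Point, so later stages never see the spelling Vec2.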
Forward reference error
def use_it(x: Meters) -> Meters:
    return x
type Meters = int64
Error: Unknown type 'Meters'
The alias must be defined before it is used. There are no forward references.
Known Limitations
No forward references. The alias must appear before any use. type List = ptr[List] would fail because List is not yet in TypeAliases when the right-hand side is parsed.
No recursive aliases. A consequence of no forward references — self-referential alias definitions are not possible.
Aliases are purely syntactic. There is no nominal typing. Score and int64 are the same type to the compiler; a function expecting Score will accept an int64 without complaint.
No parameterized aliases. type Pair[T] = ... is not supported. Type parameters are outside the scope of this chapter.
No re-export or scoping. All aliases are global to the module. There is no way to limit visibility of an alias to a single function.
What's Next
Chapter 23 adds fixed-size arrays (T[N]), stack allocation, indexing, and array literals — completing the types and memory phase.
Need Help?
Build issues? Questions?
- GitHub Issues: Report problems
- Discussions: Ask questions
Include:
- Your OS and version
- Full error message
- Output of cmake --version, ninja --version, and llvm-config --version
We'll figure it out.