39. pyxc: Assignment as Expression
Where We Are
Chapter 38 added unsigned integer types. pyxc can call getchar(), but the canonical K&R idiom for reading until EOF still doesn't compile:
# What we want to write:
while (c = getchar()) != EOF:
    ...
Error: Assignment target must be assignable
The problem is that = is a statement in pyxc — it cannot appear inside an expression like a while condition. After this chapter it can:
extern def getchar() -> int32
extern def printd(x: float64)
var EOF: int32 = -1
def main() -> int:
    var c: int32
    var blanks: int
    while (c = getchar()) != EOF:
        if c == ' ':
            blanks += 1
    printd(float64(blanks))
    return 0
Source Code
git clone --depth 1 https://github.com/alankarmisra/pyxc-llvm-tutorial
cd pyxc-llvm-tutorial/code/chapter-39
ParseExpression — Tail Assignment Check
ParseExpression already parsed the unaryexpr binoprhs production. The change adds a check after ParseBinOpRHS returns: if the next token is = or a compound-assignment operator, the expression parsed so far is treated as the assignment target and the right-hand side is parsed by a recursive call to ParseExpression:
static unique_ptr<ExprAST> ParseExpression() {
  auto LHS = ParseUnary();
  if (!LHS)
    return nullptr;
  LHS = ParseBinOpRHS(0, std::move(LHS));
  if (!LHS)
    return nullptr;
  // Assignment tail: fires only if '=' or '+=' etc. follows
  if (CurTok != '=' && !IsCompoundAssignTok(CurTok))
    return LHS;
  int AssignTok = CurTok;
  getNextToken(); // eat assignment operator
  ExpectedLiteralTypeGuard Guard(LHS->getType(), LHS->getStructName());
  auto RHS = ParseExpression(); // right-associative: recurse
  if (!RHS)
    return nullptr;
  return BuildAssignmentExpr(AssignTok, std::move(LHS), std::move(RHS));
}
Because the right-hand side recurses to ParseExpression, chained assignment is automatically right-associative: a = b = 4 parses as a = (b = 4).
ExpectedLiteralTypeGuard propagates the lvalue's type into the RHS parse, so x = 5 when x is int32 will treat 5 as int32 without an explicit cast.
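The definition of ExpectedLiteralTypeGuard is not shown in this chapter. As a rough illustration of the idea only, here is a minimal RAII sketch. Every name in it (ExpectedLiteralTypeSketch, ExpectedTypeGuardSketch, TypeOfIntegerLiteral, TypeOfLiteralAssignedTo) is hypothetical, the type is modelled as a plain string, and the fallback type "int" is an assumption made for the example rather than pyxc's documented default.

```cpp
#include <string>

// Hypothetical global that a literal parser could consult.
// An empty string means "no type hint is active".
std::string ExpectedLiteralTypeSketch;

// RAII sketch: install a type hint while the guard is alive,
// then restore whatever hint was active before it.
class ExpectedTypeGuardSketch {
  std::string Saved;

public:
  explicit ExpectedTypeGuardSketch(const std::string &Hint)
      : Saved(ExpectedLiteralTypeSketch) {
    ExpectedLiteralTypeSketch = Hint;
  }
  ~ExpectedTypeGuardSketch() { ExpectedLiteralTypeSketch = Saved; }

  ExpectedTypeGuardSketch(const ExpectedTypeGuardSketch &) = delete;
  ExpectedTypeGuardSketch &operator=(const ExpectedTypeGuardSketch &) = delete;
};

// Stand-in for parsing an integer literal: use the hint if there is one,
// otherwise fall back to an assumed default type.
std::string TypeOfIntegerLiteral() {
  return ExpectedLiteralTypeSketch.empty() ? "int" : ExpectedLiteralTypeSketch;
}

// Simulates parsing the right-hand side of `x = 5` where x has type LHSType.
std::string TypeOfLiteralAssignedTo(const std::string &LHSType) {
  ExpectedTypeGuardSketch Guard(LHSType); // hint active for this scope only
  return TypeOfIntegerLiteral();
}
```

In this sketch, TypeOfLiteralAssignedTo("int32") yields "int32", and the hint is gone again as soon as the function returns, which is the property that makes a guard object convenient in a recursive parser.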
BuildAssignmentExpr — Lvalue Validation and Node Construction
All lvalue validation is done in BuildAssignmentExpr, a new helper extracted from the old statement-level assignment parser. It pattern-matches on the node type of LHS:
static unique_ptr<ExprAST> BuildAssignmentExpr(int AssignTok,
                                               unique_ptr<ExprAST> LHS,
                                               unique_ptr<ExprAST> RHS) {
  if (!LHS || !RHS)
    return nullptr;
  if (auto *Var = dynamic_cast<VariableExprAST *>(LHS.get())) {
    // plain variable: produce AssignmentExprAST or CompoundAssignmentExprAST
    ...
  }
  if (auto *Field = dynamic_cast<FieldExprAST *>(LHS.get())) {
    // field access: produce FieldAssignmentExprAST or FieldCompoundAssignmentExprAST
    ...
  }
  if (auto *Idx = dynamic_cast<IndexExprAST *>(LHS.get())) {
    // array index: produce IndexAssignmentExprAST or IndexCompoundAssignmentExprAST
    ...
  }
  if (auto *IdxField = dynamic_cast<IndexedFieldExprAST *>(LHS.get())) {
    // indexed field: produce IndexedFieldAssignmentExprAST or variant
    ...
  }
  return LogError("Assignment target must be assignable");
}
Four lvalue kinds are recognised: variable, field, array index, and indexed field. Any other expression kind, including x + 1 or a function call, falls through every dynamic_cast check and reaches the final LogError. The existing AssignmentExprAST, CompoundAssignmentExprAST, and their field/index variants are reused unchanged, so no new AST node types are needed; apart from the tail check in ParseExpression, BuildAssignmentExpr is the only new code.
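The dispatch pattern can be tried in isolation. The sketch below is a simplified stand-in, not the pyxc source: the node types (ExprSketch, VariableSketch, and so on) are empty hypothetical classes, and ClassifyAssignTarget returns a label instead of building an assignment node.

```cpp
#include <string>

// Hypothetical, empty stand-ins for the AST node classes.
struct ExprSketch {
  virtual ~ExprSketch() = default; // polymorphic base so dynamic_cast works
};
struct VariableSketch : ExprSketch {};
struct FieldSketch : ExprSketch {};
struct IndexSketch : ExprSketch {};
struct IndexedFieldSketch : ExprSketch {};
struct BinarySketch : ExprSketch {}; // e.g. x + 1
struct CallSketch : ExprSketch {};   // e.g. f()

// Mirrors the shape of the lvalue checks: try each assignable node kind
// in turn, and report an error if none of them matches.
std::string ClassifyAssignTarget(const ExprSketch *LHS) {
  if (!LHS)
    return "error: missing target";
  if (dynamic_cast<const VariableSketch *>(LHS))
    return "variable";
  if (dynamic_cast<const FieldSketch *>(LHS))
    return "field";
  if (dynamic_cast<const IndexSketch *>(LHS))
    return "index";
  if (dynamic_cast<const IndexedFieldSketch *>(LHS))
    return "indexed field";
  return "error: Assignment target must be assignable";
}
```

Passing a VariableSketch yields "variable", while a BinarySketch or CallSketch falls through to the error string, which is the same fall-through behaviour described above.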
The Value of an Assignment Expression
AssignmentExprAST::codegen already stores the value and returns it. That return value is now visible to the caller because ParseExpression can produce an assignment node at any nesting level:
var result: int = (c = 5) + 1 # result is 6; c is 5
Compound assignment also works as an expression. (n -= 1) produces the new value of n.
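C++ gives assignment expressions the same value semantics described here, so the behaviour can be checked with a small C++ analogue. This is only a comparison sketch; the function names AssignThenAdd and CountDownSteps are made up for the example and are not part of pyxc.

```cpp
// (C = 5) stores 5 into C and yields the stored value, so the sum is 6.
int AssignThenAdd(int &C) {
  return (C = 5) + 1;
}

// (N -= 1) yields the new value of N, so it can drive a loop condition.
// Returns how many times the loop body runs.
int CountDownSteps(int N) {
  int Steps = 0;
  while ((N -= 1) > 0)
    ++Steps;
  return Steps;
}
```

Calling AssignThenAdd on a variable leaves that variable holding 5 and returns 6, matching the pyxc line above. CountDownSteps(3) runs its body for N equal to 2 and 1, then stops when N reaches 0.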
Right Associativity
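The assignment check runs only after ParseBinOpRHS has consumed every binary operator, so = binds more loosely than all of them. Because the right-hand side is parsed by a recursive call to ParseExpression (not ParseBinOpRHS), assignments chain to the right: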
a = b = 4 # parsed as: a = (b = 4)
b = 4 evaluates first — stores 4 into b, produces 4 — then a = 4 stores that value into a.
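The effect of recursing into ParseExpression can be seen with a toy parser. The sketch below is not the pyxc parser: ToyParser and ParseToString are hypothetical names, tokens are single characters, '+' is the only binary operator (standing in for ParseBinOpRHS), and the "AST" is just a fully parenthesised string.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Toy parser over single-character tokens; whitespace is skipped.
struct ToyParser {
  std::string Src;
  std::size_t Pos = 0;

  char Peek() {
    while (Pos < Src.size() &&
           std::isspace(static_cast<unsigned char>(Src[Pos])))
      ++Pos;
    return Pos < Src.size() ? Src[Pos] : '\0';
  }

  char Next() {
    char C = Peek();
    if (C != '\0')
      ++Pos;
    return C;
  }

  // primary = letter | digit
  std::string ParsePrimary() { return std::string(1, Next()); }

  // Stand-in for ParseBinOpRHS: a left-associative chain of '+'.
  std::string ParseBinary() {
    std::string LHS = ParsePrimary();
    while (Peek() == '+') {
      Next(); // eat '+'
      std::string RHS = ParsePrimary();
      LHS = "(" + LHS + " + " + RHS + ")";
    }
    return LHS;
  }

  // expression = binary [ '=' expression ]
  std::string ParseExpression() {
    std::string LHS = ParseBinary();
    if (Peek() != '=')
      return LHS;
    Next();                              // eat '='
    std::string RHS = ParseExpression(); // recursion makes '=' right-associative
    return "(" + LHS + " = " + RHS + ")";
  }
};

// Convenience wrapper: parse Src and return the parenthesised form.
std::string ParseToString(const std::string &Src) {
  ToyParser P;
  P.Src = Src;
  return P.ParseExpression();
}
```

ParseToString("a = b = 4") produces "(a = (b = 4))", while ParseToString("a + b + c") produces "((a + b) + c)": the loop gives left associativity for '+', and the recursion gives right associativity for '='.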
Parentheses Are Transparent for Lvalues
Parenthesised expressions do not get a wrapper node of their own: ParseParenExpr parses the inner expression and ParsePrimary returns that inner node directly. By the time BuildAssignmentExpr runs, (x) is therefore an ordinary VariableExprAST, so assignment through (x) works correctly:
(x) = 2 # valid: same as x = 2
(p[i]) = v # valid: same as p[i] = v
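A minimal sketch of why this works, assuming a parser in which parentheses return the inner node rather than wrapping it. The names NodeSketch, VarNodeSketch, ParsePrimarySketch, and IsPlainVariable are hypothetical, and the grammar is cut down to single-letter variables and nested parentheses.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-ins for the AST base class and the variable node.
struct NodeSketch {
  virtual ~NodeSketch() = default;
};
struct VarNodeSketch : NodeSketch {
  char Name;
  explicit VarNodeSketch(char N) : Name(N) {}
};

// primary = '(' primary ')' | letter
std::unique_ptr<NodeSketch> ParsePrimarySketch(const std::string &Src,
                                               std::size_t &Pos) {
  if (Pos < Src.size() && Src[Pos] == '(') {
    ++Pos; // eat '('
    std::unique_ptr<NodeSketch> Inner = ParsePrimarySketch(Src, Pos);
    if (!Inner || Pos >= Src.size() || Src[Pos] != ')')
      return nullptr; // missing operand or missing ')'
    ++Pos;            // eat ')'
    return Inner;     // return the inner node itself: no wrapper node
  }
  if (Pos < Src.size() && std::isalpha(static_cast<unsigned char>(Src[Pos])))
    return std::unique_ptr<NodeSketch>(new VarNodeSketch(Src[Pos++]));
  return nullptr;
}

// True if the whole input parses to a variable node, with or without parens.
bool IsPlainVariable(const std::string &Src) {
  std::size_t Pos = 0;
  std::unique_ptr<NodeSketch> Node = ParsePrimarySketch(Src, Pos);
  return Node && Pos == Src.size() &&
         dynamic_cast<VarNodeSketch *>(Node.get()) != nullptr;
}
```

Because the parenthesised branch hands back the inner pointer, "x", "(x)", and "((x))" all end up as the same node kind, so a later dynamic_cast to the variable node type succeeds for each of them.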
Grammar
expression = lvalue assignop expression | unaryexpr binoprhs ; -- changed
The assignstmt rule is unchanged — assignment as a statement still works as before.
Error Cases
Non-lvalue target:
(x + 1) = 3 # Error: Assignment target must be assignable
Precedence trap without parens:
while c = getchar() != EOF:
    ...
# Parses as: c = (getchar() != EOF)
# If c is int32, this is a type error (assigning bool to int32).
# Always use: while (c = getchar()) != EOF:
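C++ has the same precedence relationship between = and !=, so the trap can be reproduced there. One difference matters: C++ silently converts the bool to an int, so the bug compiles and stores 0 or 1, whereas the text above says pyxc reports a type error. The helper names below are made up for the example, and the character is passed in as a parameter instead of being read with getchar() so the sketch needs no input.

```cpp
#include <cstdio> // EOF

// Without parentheses: parsed as C = (Ch != EOF), so C receives 0 or 1.
int StoreWithoutParens(int Ch) {
  int C;
  C = Ch != EOF;
  return C;
}

// With parentheses: C receives the character, and the comparison then
// tests the stored value.
int StoreWithParens(int Ch) {
  int C;
  bool NotEof = (C = Ch) != EOF;
  (void)NotEof; // only the stored value matters in this sketch
  return C;
}
```

For the input 'x', StoreWithoutParens returns 1 while StoreWithParens returns the character code of 'x', which is the difference the parentheses make in the while condition.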
Things Worth Knowing
Parentheses are required when combining assignment with another operator. (c = getchar()) != EOF is correct. c = getchar() != EOF parses as c = (getchar() != EOF), which tries to assign a bool to c.
Assignment expressions do not print in the REPL. Top-level assignments set variables silently, exactly as before.
Compound assignment also works as an expression. (n -= 1) produces the new value of n. Idioms like while (n -= 1) > 0: work.
What's Next
Chapter 40 adds variadic extern declarations, enabling printf, scanf, and other C variadic functions.
Need Help?
Build issues? Questions?
- GitHub Issues: Report problems
- Discussions: Ask questions
Include:
- Your OS and version
- Full error message
- Output of cmake --version, ninja --version, and llvm-config --version
We'll figure it out.