DSL internals

The DSL pipeline: string → AST → polars.Expr. See Expression DSL for the expression language reference.

Entry point

schemashift.dsl.parse_and_compile(expression)

Parse a DSL expression string and compile it to a polars.Expr.

Return type:

Expr

Parser

Recursive-descent parser for the schemashift DSL.

Converts a DSL expression string into an AST composed of nodes from schemashift.dsl.ast_nodes. Raises schemashift.errors.DSLSyntaxError for any invalid input.

class schemashift.dsl.parser.TT(*values)

Bases: Enum

Token type.

AMP = 19
COLON = 23
COMMA = 7
DOT = 4
EOF = 24
EQ = 13
GE = 17
GT = 15
IDENT = 3
LBRACE = 21
LE = 18
LPAREN = 5
LT = 16
MINUS = 9
NE = 14
NUMBER = 1
PERCENT = 12
PIPE = 20
PLUS = 8
RBRACE = 22
RPAREN = 6
SLASH = 11
STAR = 10
STRING = 2

class schemashift.dsl.parser.Token(type, value, pos)

Bases: NamedTuple

pos: int

Alias for field number 2

type: TT

Alias for field number 0

value: object

Alias for field number 1

schemashift.dsl.parser.parse_dsl(expression)

Parse expression and return the root AST node.

Raises schemashift.errors.DSLSyntaxError on any syntax error.

Return type:

Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup

schemashift.dsl.parser.tokenize(expression)

Convert expression to a flat list of tokens (excluding whitespace).

Return type:

list[Token]
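
To illustrate the Token shape (type, value, pos) and the flat list that tokenize returns, here is a minimal standard-library sketch of a regex-driven tokenizer. It is not the schemashift implementation: the token subset, the regex patterns, the trailing EOF token, the ValueError (a stand-in for DSLSyntaxError), and the sample input are all assumptions made for illustration.

```python
import re
from enum import Enum, auto
from typing import NamedTuple


class TT(Enum):
    # Illustrative subset of the documented token types; values are arbitrary here.
    NUMBER = auto()
    STRING = auto()
    IDENT = auto()
    DOT = auto()
    LPAREN = auto()
    RPAREN = auto()
    COMMA = auto()
    PLUS = auto()
    EOF = auto()


class Token(NamedTuple):
    type: TT
    value: object
    pos: int


# Order matters: NUMBER is tried before DOT so that "1.5" becomes one token.
_TOKEN_SPEC = [
    (TT.NUMBER, r"\d+(?:\.\d+)?"),
    (TT.STRING, r'"[^"]*"'),
    (TT.IDENT, r"[A-Za-z_]\w*"),
    (TT.DOT, r"\."),
    (TT.LPAREN, r"\("),
    (TT.RPAREN, r"\)"),
    (TT.COMMA, r","),
    (TT.PLUS, r"\+"),
]
_MASTER = re.compile("|".join(f"(?P<{tt.name}>{pat})" for tt, pat in _TOKEN_SPEC))


def tokenize(expression: str) -> list[Token]:
    """Return a flat token list, skipping whitespace (assumed to end with EOF)."""
    tokens = []
    pos = 0
    while pos < len(expression):
        if expression[pos].isspace():
            pos += 1
            continue
        m = _MASTER.match(expression, pos)
        if m is None:
            # Stand-in for schemashift.errors.DSLSyntaxError.
            raise ValueError(f"unexpected character {expression[pos]!r} at {pos}")
        tt = TT[m.lastgroup]
        text = m.group()
        if tt is TT.NUMBER:
            value = float(text) if "." in text else int(text)
        elif tt is TT.STRING:
            value = text[1:-1]  # strip the surrounding quotes
        else:
            value = text
        tokens.append(Token(tt, value, pos))
        pos = m.end()
    tokens.append(Token(TT.EOF, None, pos))
    return tokens
```

Each token records the offset at which it starts, which is what lets a parser report the position of a syntax error.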

Compiler

schemashift.dsl.compiler.compile_dsl(node)

Compile node to a polars.Expr.

Raises schemashift.errors.DSLSyntaxError for invalid node structure and schemashift.errors.DSLRuntimeError for unsupported operations.

Return type:

Expr
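
A compiler over an AST like this one typically walks the tree recursively and dispatches on node type. The sketch below shows only that shape. To stay runnable without polars it evaluates nodes against a single row (a plain dict) instead of building a polars.Expr, so it is a stand-in and not compile_dsl itself. The node classes are simplified copies of the documented fields; the operator table, the first-match semantics of the when-chain, and the use of TypeError and NotImplementedError in place of DSLSyntaxError and DSLRuntimeError are assumptions.

```python
import operator
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Literal:
    value: Any


@dataclass(frozen=True)
class ColRef:
    name: str


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: Any
    right: Any


@dataclass(frozen=True)
class Coalesce:
    exprs: tuple


@dataclass(frozen=True)
class WhenClause:
    condition: Any
    value: Any


@dataclass(frozen=True)
class WhenChain:
    whens: tuple
    otherwise: Any


# Assumed operator spellings; the real DSL's operator set may differ.
_BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul, ">": operator.gt}


def evaluate(node, row: dict):
    """Row-level stand-in for a node-to-expression compiler."""
    if isinstance(node, Literal):
        return node.value
    if isinstance(node, ColRef):
        return row[node.name]
    if isinstance(node, BinaryOp):
        if node.op not in _BINARY:
            # Stand-in for DSLRuntimeError (unsupported operation).
            raise NotImplementedError(f"unsupported operator: {node.op!r}")
        return _BINARY[node.op](evaluate(node.left, row), evaluate(node.right, row))
    if isinstance(node, Coalesce):
        # First non-null value wins.
        for expr in node.exprs:
            value = evaluate(expr, row)
            if value is not None:
                return value
        return None
    if isinstance(node, WhenChain):
        # The first clause whose condition holds supplies the result.
        for clause in node.whens:
            if evaluate(clause.condition, row):
                return evaluate(clause.value, row)
        return evaluate(node.otherwise, row)
    # Stand-in for DSLSyntaxError (invalid node structure).
    raise TypeError(f"unknown node type: {type(node).__name__}")


tree = WhenChain(
    whens=(WhenClause(BinaryOp(">", ColRef("qty"), Literal(10)), Literal("bulk")),),
    otherwise=Coalesce((ColRef("label"), Literal("single"))),
)
```

With this tree, a row whose qty exceeds 10 takes the when branch; any other row falls through to the coalesce in otherwise.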

AST nodes

AST node definitions for the schemashift DSL.

class schemashift.dsl.ast_nodes.BinaryOp(op, left, right)

Bases: object

A binary operator expression.

left: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup
op: str
right: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup

class schemashift.dsl.ast_nodes.Coalesce(exprs)

Bases: object

Return the first non-null value across a list of expressions.

exprs: tuple[Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup, ...]

class schemashift.dsl.ast_nodes.ColRef(name)

Bases: object

A reference to a DataFrame column by name.

name: str

class schemashift.dsl.ast_nodes.CustomLookup(expr, mapping, base_table=None)

Bases: object

Map column values through a user-defined dict, optionally extending a named table.

base_table: str | None = None
expr: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup
mapping: tuple[tuple[Literal, Literal], ...]

class schemashift.dsl.ast_nodes.Literal(value)

Bases: object

A literal value: int, float, str, bool, or None.

value: Any

class schemashift.dsl.ast_nodes.Lookup(expr, table_name)

Bases: object

Map column values through a named built-in lookup table.

expr: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup
table_name: str

class schemashift.dsl.ast_nodes.MethodCall(obj, method, args=<factory>)

Bases: object

A method call on an expression object.

The method field uses dot-separated names for methods that live in a sub-namespace, e.g. "str.lower" or "dt.year". Top-level methods use plain names such as "round" or "abs".

args: tuple[Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup, ...]
method: str
obj: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup
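
One plausible way to turn a method string such as "str.lower" into a callable is to split it on the dot and walk the attributes in order. The sketch below shows that idea with a hypothetical FakeExpr stand-in, so it runs without polars. FakeExpr, _StrNamespace, and resolve_method are illustrative names, and nothing here is confirmed to match how schemashift resolves methods.

```python
from functools import reduce


class _StrNamespace:
    """Hypothetical stand-in for an expression's ``str`` sub-namespace."""

    def __init__(self, value):
        self._value = value

    def lower(self):
        return self._value.lower()


class FakeExpr:
    """Hypothetical stand-in for an expression object; not a polars type."""

    def __init__(self, value):
        self._value = value

    @property
    def str(self):
        return _StrNamespace(self._value)

    def abs(self):
        return abs(self._value)


def resolve_method(obj, method: str):
    """Walk a dot-separated method name and return the bound callable."""
    return reduce(getattr, method.split("."), obj)


lowered = resolve_method(FakeExpr("ABC"), "str.lower")()
magnitude = resolve_method(FakeExpr(-3), "abs")()
```

A production compiler would more likely check the name against an allow-list before resolving it, since unrestricted getattr exposes every attribute of the target object.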

class schemashift.dsl.ast_nodes.UnaryOp(op, operand)

Bases: object

A unary operator expression.

op: str
operand: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup

class schemashift.dsl.ast_nodes.WhenChain(whens, otherwise)

Bases: object

A complete when/otherwise conditional expression.

otherwise: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup
whens: tuple[WhenClause, ...]

class schemashift.dsl.ast_nodes.WhenClause(condition, value)

Bases: object

A single when/then pair inside a when-chain.

condition: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup
value: Literal | ColRef | BinaryOp | UnaryOp | MethodCall | WhenClause | WhenChain | Coalesce | Lookup | CustomLookup
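
Because every node holds its children either directly or in a tuple, a tree can be built by hand and traversed generically. The sketch below assumes the nodes behave like frozen dataclasses, which the signatures above suggest but do not confirm, and redefines a few of them locally so that it runs on its own. The referenced_columns helper is hypothetical and not part of schemashift.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any


@dataclass(frozen=True)
class Literal:
    value: Any


@dataclass(frozen=True)
class ColRef:
    name: str


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: Any
    right: Any


@dataclass(frozen=True)
class MethodCall:
    obj: Any
    method: str
    args: tuple = field(default_factory=tuple)


def referenced_columns(node) -> set[str]:
    """Collect the names of all ColRef nodes reachable from ``node``."""
    if isinstance(node, ColRef):
        return {node.name}
    found: set[str] = set()
    if is_dataclass(node):
        # Recurse into every field; non-node values contribute nothing.
        for f in fields(node):
            found |= referenced_columns(getattr(node, f.name))
    elif isinstance(node, tuple):
        for item in node:
            found |= referenced_columns(item)
    return found


# Hand-built tree: (price rounded to 2 places) + tax
tree = BinaryOp(
    op="+",
    left=MethodCall(ColRef("price"), "round", (Literal(2),)),
    right=ColRef("tax"),
)
columns = referenced_columns(tree)
```

A traversal like this can be used, for example, to check that every column an expression needs exists before the expression is compiled.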