Introduction

Overview

Sun is a dynamically typed, interpreted language built entirely from scratch. This documentation covers both the language itself and the internals of the interpreter, which is useful for anyone who wants to understand the implementation.

The language uses Java as its core, ahead-of-time (AOT) compilation for executable generation, and AST-based compilation. The primary objective of the project is to learn the basics of how an interpreter works.

Installation

Download the machine executable directly from the releases page: https://github.com/Sunny-esc/Sun-Lang/releases
Dependency requirement: JDK version 8+
To run from the source itself:
git clone https://github.com/Sunny-esc/Sun-Lang
cd Sun-Lang && cd Sun
java Lox.java

Hello, World

// Hello world in the Sun language
print "Hello, world!";

Syntax

Basics

The basic syntax of Sun includes comments, whitespace, identifiers, and literals. Whitespace is ignored and used only for readability. Comments help document code, and literals represent fixed values such as numbers, strings, and booleans.

// Single-line comment
/* Block comment */

// Variable declarations with literals
var x = 42;            // number
var name = "YourLang"; // string
var flag = true;       // boolean
var nothing = nil;     // null value

Expressions

Expressions produce values. Sun supports arithmetic, comparison, and logical expressions to perform calculations and make decisions in programs.

Arithmetic expressions
1 + 2; // 3
5 - 3; // 2
4 * 2; // 8
8 / 2; // 4

Comparison expressions
3 < 5;          // true
5 <= 5;         // true
7 > 2;          // true
4 >= 6;         // false
1 == 2;         // false
"cat" != "dog"; // true

Logical expressions
!true;          // false
!false;         // true
true and false; // false
true or false;  // true

Statements

Statements perform actions in a program. In Sun, common statements include variable declarations, printing values, expressions, and block statements for grouping multiple operations together.

// Variable declaration
var x = 10;
var name = "Sunny";

// Print statement
print x;
print name;

// Expression statement
x = x + 5;

// Block statement (scope)
{
  var y = 20;
  print y;
}

// Conditional statement
if (x > 10) {
  print "x is greater than 10";
} else {
  print "x is small";
}

Functions

Functions allow you to group reusable logic. In Sun, functions are declared using the fun keyword. They can take parameters, perform operations, and optionally return values.

// Function without return value
fun printSum(a, b) {
  print a + b;
}

// Function with return value
fun returnSum(a, b) {
  return a + b;
}

// Calling functions
printSum(1, 2);
var result = returnSum(2, 3);
print result;

Control Flow

if (x > 0) {
  // TODO
} else {
  // TODO
}

while (x < 10) {
  x = x + 1;
}

for (var a = 1; a < 10; a = a + 1) {
  print a;
}

Class

A class defines a blueprint for creating objects by bundling behavior (methods) and state (fields). In Sun, classes are first-class values, meaning they can be stored in variables, passed to functions, and invoked like functions to create instances.

class Breakfast {
  cook() {
    print "Eggs a-fryin'!";
  }

  serve(who) {
    print "Enjoy your breakfast, " + who + ".";
  }
}

// Classes are first-class values
var someVariable = Breakfast;
someFunction(Breakfast);

// Creating an instance
var breakfast = Breakfast();
print breakfast; // "Breakfast instance".

// Adding fields dynamically
breakfast.meat = "sausage";
breakfast.bread = "sourdough";

// Using 'this' inside methods
class Breakfast {
  serve(who) {
    print "Enjoy your " + this.meat + " and " + this.bread + ", " + who + ".";
  }
}

// Initializer (constructor)
class Breakfast {
  init(meat, bread) {
    this.meat = meat;
    this.bread = bread;
  }
}

var baconAndToast = Breakfast("bacon", "toast");
baconAndToast.serve("Dear Reader");
// "Enjoy your bacon and toast, Dear Reader."

Inheritance

Inheritance allows a class to reuse behavior from another class. In Sun, a subclass is defined using the < operator, where the subclass inherits all methods from its superclass. This enables code reuse and extension of existing behavior.

class Brunch < Breakfast {
  drink() {
    print "How about a Bloody Mary?";
  }
}

// Creating an instance of subclass
var benedict = Brunch("ham", "English muffin");
benedict.serve("Noble Reader");

// Using super to call superclass methods
class Brunch < Breakfast {
  init(meat, bread, drink) {
    super.init(meat, bread);
    this.drink = drink;
  }
}

Dev Notes / parsing

Ambiguity & Expression Parsing

Design note

Ambiguity in Parsing
When parsing expressions, ambiguity arises when a single sequence of tokens can be interpreted in multiple ways. The parser’s role is not only to validate syntax but also to determine how different parts of the input relate to the grammar. Without clear rules, the parser may construct different syntax trees for the same expression, leading to different evaluation results.

Operator Precedence
Precedence defines the order in which different operators are evaluated. Operators with higher precedence are evaluated before those with lower precedence—they “bind tighter” to their operands. For example, in an expression combining division and subtraction, division is evaluated first due to its higher precedence.

Associativity
Associativity determines evaluation order when multiple operators of the same type appear in sequence.

  • Left-associative: evaluation proceeds from left to right
  • Right-associative: evaluation proceeds from right to left
For instance:
5 - 3 - 1
is interpreted as:
(5 - 3) - 1

Without well-defined precedence and associativity rules, expressions become ambiguous and unreliable.

Operator Hierarchy
Listed from lowest to highest precedence:
Name        Operators    Associativity
Equality    == !=        Left
Comparison  > >= < <=    Left
Term        - +          Left
Factor      / *          Left
Unary       ! -          Right

Dev Notes / parsing-technique

Recursive Descent Parsing

Design note

There are many parsing techniques—such as LL, LR, LALR, parser combinators, and others—but for this interpreter, a simpler and highly effective approach is used: recursive descent parsing.

Recursive descent is a top-down parsing technique. It starts from the highest-level grammar rule (typically expression) and progressively breaks it down into smaller sub-expressions until reaching the most basic elements of the syntax tree.

This method relies on straightforward, handwritten code instead of parser generators like Yacc, Bison, or ANTLR. Despite its simplicity, recursive descent is:

  • Efficient and fast
  • Easy to understand and maintain
  • Capable of handling complex grammar structures
  • Well-suited for detailed error reporting
The parser’s control flow naturally mirrors the structure of the grammar, making the implementation intuitive and closely aligned with the language design.
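To illustrate how one method per grammar rule yields the precedence and associativity described earlier, here is a minimal sketch. The class MiniParser and its methods are hypothetical and not taken from the Sun source. It covers only term, factor, unary, and primary over numeric input, and it evaluates directly instead of building a syntax tree to stay short.

```java
// Hypothetical sketch: one method per grammar rule, lowest precedence first.
//   term    -> factor ( ( "-" | "+" ) factor )* ;
//   factor  -> unary ( ( "/" | "*" ) unary )* ;
//   unary   -> "-" unary | primary ;
//   primary -> NUMBER | "(" term ")" ;
final class MiniParser {
    private final String src;
    private int pos = 0;

    MiniParser(String src) { this.src = src.replace(" ", ""); }

    static double eval(String src) {
        MiniParser p = new MiniParser(src);
        double value = p.term();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return value;
    }

    // Consumes the next character only if it is the expected one.
    private boolean match(char c) {
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    // The loop folds operands from the left, so "-" and "+" are left-associative.
    private double term() {
        double left = factor();
        while (true) {
            if (match('-')) left = left - factor();
            else if (match('+')) left = left + factor();
            else return left;
        }
    }

    // Called from term(), so "/" and "*" bind tighter than "-" and "+".
    private double factor() {
        double left = unary();
        while (true) {
            if (match('/')) left = left / unary();
            else if (match('*')) left = left * unary();
            else return left;
        }
    }

    // Recursing into unary() makes prefix "-" right-associative.
    private double unary() {
        if (match('-')) return -unary();
        return primary();
    }

    private double primary() {
        if (match('(')) {
            double value = term();
            if (!match(')')) throw new IllegalArgumentException("Expected ')' at " + pos);
            return value;
        }
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected number at " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }
}
```

With this sketch, MiniParser.eval("5 - 3 - 1") gives 1.0 because the loop in term() groups it as (5 - 3) - 1, and MiniParser.eval("8 - 6 / 2") gives 5.0 because factor() consumes the division first.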

Dev Notes / errors

Syntax Errors & Recovery

Design note

Role of the Parser
A parser has two primary responsibilities:

  • Generate a syntax tree for valid input
  • Detect and report errors for invalid input
In real-world development environments, parsers frequently process incomplete or incorrect code. Therefore, robust error handling is essential for a good user experience.

Error Handling Requirements
A well-designed parser should:
  • Detect and clearly report syntax errors
  • Avoid crashing or entering infinite loops
  • Continue parsing after encountering errors when possible
  • Report multiple errors in a single pass
  • Minimize cascading errors caused by earlier failures

Error Recovery
Error recovery is the mechanism that allows the parser to continue processing after encountering an error.

Panic Mode Recovery
In panic mode, the parser immediately stops processing the current construct when an error is detected. It then skips tokens until it reaches a point where parsing can safely resume. This process is called synchronization.

Entering Panic Mode
For example, while parsing a parenthesized expression, if the parser fails to find the expected closing ), it triggers an error and enters panic mode.

Synchronization in Recursive Descent
In recursive descent parsing, the parser’s state is implicitly stored in the call stack. Each active grammar rule corresponds to a function call. To recover from an error:
  • The parser unwinds the call stack
  • Skips tokens until a safe synchronization point is found
  • Resumes parsing from a stable state
This strategy allows the parser to recover gracefully while continuing to provide useful feedback to the user.
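The token-skipping step can be sketched as follows. This is an illustration under assumptions, not the Sun implementation: tokens are plain strings, the class PanicRecovery is hypothetical, and the set of statement-starting keywords is a guess based on the syntax shown in this documentation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PanicRecovery {
    // Keywords assumed to begin a statement; the real set depends on the grammar.
    private static final Set<String> STATEMENT_STARTS = new HashSet<>(
        Arrays.asList("class", "fun", "var", "for", "if", "while", "print", "return"));

    // Returns the index at which parsing can resume after an error at `current`.
    static int synchronize(List<String> tokens, int current) {
        current++; // discard the token that triggered the error
        while (current < tokens.size()) {
            // A ";" just behind us means the broken statement has ended.
            if (tokens.get(current - 1).equals(";")) return current;
            // A statement keyword ahead is also a safe place to restart.
            if (STATEMENT_STARTS.contains(tokens.get(current))) return current;
            current++;
        }
        return current; // reached the end of input
    }
}
```

In a recursive descent parser, the unwinding itself is commonly done by throwing an exception from the failing rule and catching it in the method that parses a declaration or statement, which then calls a routine like the one above.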

Dev Notes / statements

Statements & Expressions

Design note

Expression Statements
An expression statement allows an expression to appear where a statement is expected. These are primarily used when evaluating expressions that produce side effects, such as function calls.

someFunction();

Print Statements
A print statement evaluates an expression and displays its result to the user.
print 2 + 1;

Grammar

program   → statement* EOF ;

statement → exprStmt
          | printStmt ;

exprStmt  → expression ";" ;
printStmt → "print" expression ";" ;
      

Expressions vs Statements
  • Expressions produce values and can be nested.
  • Statements perform actions and control execution.
This separation ensures a clean and predictable structure in both parsing and execution.

Dev Notes / ast

Statement Syntax Trees

Design note

Expressions and statements are represented using separate class hierarchies in the AST.

  • Expr → represents expressions (value-producing)
  • Stmt → represents statements (execution-oriented)
This separation improves:
  • Type safety (compile-time validation)
  • Code clarity and maintainability
  • Clear separation of responsibilities

Base Class
abstract class Stmt {}
The AST generator is extended to include statements, producing specific subclasses such as print and expression statements.

Dev Notes / variables

Variables & Declarations

Design note

Variable Declaration
A variable declaration introduces a new binding between a name and a value.

var beverage = "espresso";

Variable Access
A variable expression retrieves the value associated with a name.
print beverage;

Grammar

program      → declaration* EOF ;

declaration  → varDecl
             | statement ;

varDecl      → "var" IDENTIFIER ( "=" expression )? ";" ;

primary      → IDENTIFIER | ... ;
      

If no initializer is provided, the variable is assigned a default value (nil). Accessing a variable before it is defined results in a runtime error.

Dev Notes / environment

Environment & Variable Storage

Design note

Variable bindings are stored in a structure called an environment. Internally, this behaves like a map:

  • Keys → variable names
  • Values → runtime values
The interpreter maintains an environment instance to store variables during execution.

private Environment environment = new Environment();
      

Operations supported:
  • Define a variable
  • Retrieve a variable’s value
  • Update an existing variable
This structure allows variables to persist throughout program execution.
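The three operations listed above can be sketched as a thin wrapper around a map. Only the class name Environment appears in this documentation; the method names define, get, and assign, and the error messages, are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class Environment {
    private final Map<String, Object> values = new HashMap<>();

    // Define a variable; redefining an existing name simply overwrites it here.
    void define(String name, Object value) {
        values.put(name, value);
    }

    // Retrieve a variable's value. containsKey is used because nil is stored as null.
    Object get(String name) {
        if (values.containsKey(name)) return values.get(name);
        throw new RuntimeException("Undefined variable '" + name + "'.");
    }

    // Update an existing variable; assignment never creates a new one.
    void assign(String name, Object value) {
        if (!values.containsKey(name)) {
            throw new RuntimeException("Undefined variable '" + name + "'.");
        }
        values.put(name, value);
    }
}
```

Keeping define and assign separate is what lets the interpreter report an error when a program assigns to a name that was never declared.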

Dev Notes / assignment

Assignment

Design note

Assignment allows updating the value of an existing variable.

a = 2;

In this language, assignment is an expression, not a statement. It has the lowest precedence and is right-associative.

expression → assignment ;

assignment → IDENTIFIER "=" assignment
           | equality ;
      

Key Concept
  • l-value: location being assigned to
  • r-value: value being assigned
Only valid assignment targets (like variables) are allowed. Invalid targets result in a syntax error. Assignment expressions return the assigned value, allowing chaining:
print a = 2; // prints 2

Dev Notes / scope

Scope & Block Execution

Design note

A scope defines where a variable is accessible. This language uses lexical scope, meaning variable resolution is determined by the structure of the code.

Block Scope


{
  var a = "inside";
}
print a; // Error
      
Variables declared inside a block are only accessible within that block.

Nested Scope & Shadowing

var a = "global";
{
  var a = "local";
  print a; // local
}
print a; // global
      
Inner variables can shadow outer variables without modifying them.

Environment Chaining
Each block creates a new environment linked to its enclosing one. Variable lookup proceeds from:
  • Current (innermost) scope
  • Outward through enclosing scopes
This ensures correct resolution of both local and global variables.
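A minimal sketch of the chained lookup, assuming each environment holds a reference to its enclosing one. The class name ChainedEnvironment and the field name enclosing are hypothetical and chosen only for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class ChainedEnvironment {
    private final ChainedEnvironment enclosing; // null for the global scope
    private final Map<String, Object> values = new HashMap<>();

    ChainedEnvironment(ChainedEnvironment enclosing) {
        this.enclosing = enclosing;
    }

    // New variables always go into the innermost scope, which is how shadowing works.
    void define(String name, Object value) {
        values.put(name, value);
    }

    // Walk from the innermost scope outward until the name is found.
    Object get(String name) {
        for (ChainedEnvironment env = this; env != null; env = env.enclosing) {
            if (env.values.containsKey(name)) return env.values.get(name);
        }
        throw new RuntimeException("Undefined variable '" + name + "'.");
    }
}
```

Executing a block would then amount to creating a new ChainedEnvironment whose enclosing scope is the current one, running the block's declarations in it, and discarding it afterwards.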

Block Grammar

statement → exprStmt
          | printStmt
          | block ;

block     → "{" declaration* "}" ;
      

Dev Notes / control-flow

Control Flow Overview

Design note

Any sufficiently expressive programming language is capable of performing arbitrary computation. This idea is formalized through models like Turing machines and lambda calculus.

In practice, control flow determines how a program executes. It can be broadly divided into:

  • Conditional flow: executes code selectively
  • Looping flow: repeats execution of code
These constructs allow programs to make decisions and perform repeated computations.

Dev Notes / conditional

Conditional Execution

Design note

The if statement enables conditional execution based on a boolean expression.


statement → exprStmt
          | ifStmt
          | printStmt
          | block ;

ifStmt    → "if" "(" expression ")" statement
           ( "else" statement )? ;
      

Behavior
  • If the condition is truthy → execute the first statement
  • If falsey and else exists → execute the alternative statement

Dangling Else Problem
When nested if statements omit braces, it can be unclear which if an else belongs to. This ambiguity is resolved by associating the else with the nearest preceding if.

Dev Notes / logical

Logical Operators

Design note

Logical operators and and or are also control flow constructs, as they determine whether expressions are evaluated.


expression → assignment ;
assignment → IDENTIFIER "=" assignment
           | logic_or ;

logic_or   → logic_and ( "or" logic_and )* ;
logic_and  → equality ( "and" equality )* ;
      

These operators typically use short-circuit evaluation:
  • or stops when a truthy value is found
  • and stops when a falsey value is found
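Short-circuiting can be sketched by passing the right operand as a deferred computation that is only run when needed. Two assumptions are made here that this documentation does not state: nil and false are the only falsey values, and the operators return the deciding operand itself instead of a boolean. The class ShortCircuit is hypothetical.

```java
import java.util.function.Supplier;

final class ShortCircuit {
    // Assumed truthiness rule: nil (null) and false are falsey, everything else is truthy.
    static boolean isTruthy(Object value) {
        if (value == null) return false;
        if (value instanceof Boolean) return (Boolean) value;
        return true;
    }

    // "or": a truthy left operand decides the result, so the right side is never evaluated.
    static Object or(Object left, Supplier<Object> right) {
        return isTruthy(left) ? left : right.get();
    }

    // "and": a falsey left operand decides the result, so the right side is never evaluated.
    static Object and(Object left, Supplier<Object> right) {
        return isTruthy(left) ? right.get() : left;
    }
}
```

In a tree-walking interpreter the deferred right side would be the unevaluated right-hand syntax node, which is why logical operators need their own node type instead of reusing the one for ordinary binary operators.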

Dev Notes / loops

While Loop

Design note

The while loop repeatedly executes a statement as long as its condition remains truthy.


statement  → exprStmt
           | ifStmt
           | printStmt
           | whileStmt
           | block ;

whileStmt  → "while" "(" expression ")" statement ;
      

Behavior
  • Evaluate condition
  • If truthy → execute body
  • Repeat until the condition becomes falsey

Dev Notes / for-loop

For Loop

Design note

The for loop provides a compact way to write iteration logic.


forStmt → "for" "(" ( varDecl | exprStmt | ";" )
          expression? ";"
          expression? ")" statement ;
      

A for loop consists of three parts:
  • Initializer: runs once before the loop starts
  • Condition: checked before each iteration
  • Increment: executed after each iteration

Example:
for (var i = 0; i < 10; i = i + 1) print i;

Internally, a for loop can be transformed into an equivalent while loop, making it a higher-level construct built on top of simpler control flow.
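The transformation can be sketched on a toy syntax tree. The classes below are hypothetical stand-ins, with source text used in place of real expression nodes, so this shows the shape of the rewrite and is not the Sun parser's actual code.

```java
import java.util.Arrays;
import java.util.List;

final class ForDesugar {
    interface Stmt {}

    // Source text stands in for real expression and statement nodes.
    static final class Simple implements Stmt {
        final String source;
        Simple(String source) { this.source = source; }
        @Override public String toString() { return source; }
    }

    static final class Block implements Stmt {
        final List<Stmt> statements;
        Block(Stmt... statements) { this.statements = Arrays.asList(statements); }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder("{ ");
            for (Stmt s : statements) sb.append(s).append(' ');
            return sb.append('}').toString();
        }
    }

    static final class While implements Stmt {
        final String condition;
        final Stmt body;
        While(String condition, Stmt body) { this.condition = condition; this.body = body; }
        @Override public String toString() { return "while (" + condition + ") " + body; }
    }

    // Any of initializer, condition, and increment may be null (omitted in the source).
    static Stmt desugar(Stmt initializer, String condition, Stmt increment, Stmt body) {
        if (increment != null) body = new Block(body, increment);     // increment runs after the body
        if (condition == null) condition = "true";                    // no condition: loop forever
        Stmt loop = new While(condition, body);
        if (initializer != null) loop = new Block(initializer, loop); // initializer runs once
        return loop;
    }
}
```

Applied to the example above, the result prints as `{ var i = 0; while (i < 10) { print i; i = i + 1; } }`. Wrapping the initializer and the loop in a block keeps the loop variable scoped to the loop.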

Dev Notes / functions

Function Calls

Design note

A function call evaluates a callee expression and invokes it with arguments.

callee(arguments);
The callee is not limited to identifiers—it can be any expression that evaluates to a callable object.

Grammar

unary → ( "!" | "-" ) unary | call ;
call  → primary ( "(" arguments? ")" )* ;
      
This allows chained calls such as:
fn(1)(2)(3);

Evaluation Process
  • Evaluate the callee expression
  • Evaluate each argument (left to right)
  • Invoke the callable with evaluated arguments

Callable objects implement a common interface (e.g., SunCallable) which defines how calls are executed.

Dev Notes / call-errors

Call Errors & Arity

Design note

Function calls must be validated before execution.

Invalid Call Targets
If the callee is not callable, a runtime error is raised:

"not a function"();

Arity Checking
Each function defines an arity (number of expected arguments).

fun add(a, b, c) {
  print a + b + c;
}
      
Calling with incorrect argument count results in an error:

add(1, 2);       // too few
add(1, 2, 3, 4); // too many
      

The interpreter validates argument count before invocation to ensure correctness.
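Both checks can be sketched in a few lines. The interface name SunCallable is mentioned in these notes, but its methods are not, so arity() and call(arguments) are assumptions, as are the class CallChecker and the error messages. A real interpreter would typically also pass itself to call.

```java
import java.util.List;

// Assumed shape of the callable interface; only the name comes from these notes.
interface SunCallable {
    int arity();
    Object call(List<Object> arguments);
}

final class CallChecker {
    // Validates the callee and the argument count, then performs the call.
    static Object invoke(Object callee, List<Object> arguments) {
        if (!(callee instanceof SunCallable)) {
            throw new RuntimeException("Can only call functions and classes.");
        }
        SunCallable function = (SunCallable) callee;
        if (arguments.size() != function.arity()) {
            throw new RuntimeException("Expected " + function.arity()
                + " arguments but got " + arguments.size() + ".");
        }
        return function.call(arguments);
    }
}
```

Doing the arity check in one place, before the call, means individual callables never have to defend against a wrong argument count themselves.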

Dev Notes / native

Native Functions

Design note

Native functions are implemented in the host language but exposed to user programs. They are useful for:

  • Accessing system features (time, IO, etc.)
  • Providing built-in functionality

Example:
clock();
This function returns the current time, allowing programs to measure execution duration.

Native functions are part of the runtime and are typically registered in the global environment.
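As a sketch of how such a registration might look, the example below puts a clock function into a map of globals. The interface NativeFunction, the class Globals, and the choice of seconds as the unit are all assumptions for illustration; the notes only say that clock returns the current time.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface for functions implemented in the host language.
interface NativeFunction {
    int arity();
    Object call(List<Object> arguments);
}

final class Globals {
    // Builds the global bindings that exist before any user code runs.
    static Map<String, Object> create() {
        Map<String, Object> globals = new HashMap<>();
        globals.put("clock", new NativeFunction() {
            @Override public int arity() { return 0; }
            @Override public Object call(List<Object> arguments) {
                // Assumed unit: seconds since the epoch, as a double.
                return System.currentTimeMillis() / 1000.0;
            }
            @Override public String toString() { return "<native fn>"; }
        });
        return globals;
    }
}
```

A program can then time a piece of code by calling clock() before and after it and subtracting the two results.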

Dev Notes / declaration

Function Declarations

Design note

Functions are declared using the fun keyword.


declaration → funDecl
            | varDecl
            | statement ;

funDecl     → "fun" function ;
function    → IDENTIFIER "(" parameters? ")" block ;
      

A function declaration:
  • Binds a name to a callable object
  • Defines parameters and a body

Dev Notes / return

Return Statements

Design note

A return statement exits a function and optionally provides a value.

return expression;
If no value is provided, the function returns nil.

Execution Behavior
Return statements may appear inside nested constructs, but they must immediately exit the entire function. To implement this, the interpreter uses a controlled mechanism (such as exceptions) to unwind execution until the function boundary is reached.

This ensures:
  • Immediate function exit
  • Correct value propagation
  • Consistent execution behavior
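The exception-based approach can be sketched as follows. The names ReturnSignal and FunctionRunner are hypothetical, and a Runnable stands in for executing a function body. The stack trace is disabled because the exception is used for control flow, not for reporting an error.

```java
// Carries the returned value up the Java call stack to the function boundary.
final class ReturnSignal extends RuntimeException {
    final Object value;

    ReturnSignal(Object value) {
        super(null, null, false, false); // no message, no cause, no suppression, no stack trace
        this.value = value;
    }
}

final class FunctionRunner {
    // Runs a function body; a ReturnSignal thrown anywhere inside it ends the call.
    static Object callFunction(Runnable body) {
        try {
            body.run();
        } catch (ReturnSignal signal) {
            return signal.value;
        }
        return null; // the body finished without `return`, so the result is nil
    }
}
```

For instance, a body that throws new ReturnSignal(6) from inside a nested loop makes callFunction return 6 immediately, no matter how many loops or blocks were active at that point.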

Dev Notes / evaluation

Evaluate expressions

Design note

To evaluate expressions, the interpreter needs executable logic associated with each type of syntax node. One possible design is to embed this logic directly into the syntax tree classes using a method like interpret(), allowing each node to evaluate itself. This approach is similar to how the AstPrinter works. That class traverses the syntax tree recursively and builds a string representation. An interpreter follows the same traversal pattern, but instead of producing strings, it computes and returns runtime values.

Evaluating Literals
Literals are the most basic building blocks of expressions. They represent fixed values written directly in the source code.

  • A literal is a piece of syntax that produces a value.
  • It always originates from the user’s source code.
  • It belongs to the parser’s domain, not the runtime.
While literals resemble values, the distinction is important: a literal is syntax, whereas a value exists at runtime. During evaluation, the interpreter converts literal syntax tree nodes into actual runtime values. This step is straightforward since literals directly map to their corresponding values.

Evaluating Parentheses (Grouping)
Grouping expressions—created using parentheses—are evaluated by first evaluating the enclosed expression and then returning its result. The grouping itself does not introduce new computation; it only controls evaluation order.

Evaluating Unary Expressions
Unary expressions operate on a single operand. The interpreter first evaluates the operand and then applies the operator. This evaluation follows a post-order traversal:
  • First evaluate child expressions
  • Then apply the operator at the current node
This recursive strategy ensures that all required values are computed before an operation is performed.
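The three cases above can be sketched with a small evaluator. The node classes and MiniEval are hypothetical, an instanceof chain is used to keep the example short (it is a simplification, not a claim about how Sun dispatches on node types), and the truthiness rule for `!` is an assumption.

```java
final class MiniEval {
    interface Expr {}

    static final class Literal implements Expr {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    static final class Grouping implements Expr {
        final Expr expression;
        Grouping(Expr expression) { this.expression = expression; }
    }

    static final class Unary implements Expr {
        final char operator;
        final Expr right;
        Unary(char operator, Expr right) { this.operator = operator; this.right = right; }
    }

    // Assumed rule: nil (null) and false are falsey, everything else is truthy.
    static boolean isTruthy(Object value) {
        if (value == null) return false;
        if (value instanceof Boolean) return (Boolean) value;
        return true;
    }

    static Object evaluate(Expr expr) {
        if (expr instanceof Literal) return ((Literal) expr).value;   // the parser already stored the value
        if (expr instanceof Grouping) return evaluate(((Grouping) expr).expression);
        if (expr instanceof Unary) {
            Unary unary = (Unary) expr;
            Object right = evaluate(unary.right);                     // child first (post-order)
            switch (unary.operator) {
                case '-': return -(Double) right;                     // unchecked cast, see runtime errors
                case '!': return !isTruthy(right);
                default: throw new IllegalArgumentException("Unknown operator " + unary.operator);
            }
        }
        throw new IllegalArgumentException("Unknown expression type");
    }
}
```

Note that the cast in the '-' case fails with a ClassCastException when the operand is not a number, which is exactly the situation that runtime error handling has to address.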

Dev Notes / Runtime error

Runtime error logic

Design note

Runtime Errors
During evaluation, expressions may produce values of unexpected types. For example, attempting to perform numeric operations on a string leads to invalid behavior. In a naive implementation, such mismatches result in runtime crashes—for example, a ClassCastException in Java—which terminates the interpreter and prints an internal stack trace. This is undesirable for a user-facing language. Consider the expression:

2 * (3 / -"muffin")
The unary - operator cannot be applied to a string. This error occurs deep within the expression, making the entire computation invalid. As a result:
  • The unary operation fails
  • The division cannot proceed
  • The multiplication also becomes invalid
Instead of terminating the entire interpreter, runtime errors should:
  • Stop evaluation of the current expression
  • Report a meaningful error to the user
  • Allow the interpreter to continue running

Detecting Runtime Errors
Since the interpreter evaluates expressions recursively, an error occurring deep in the evaluation stack must propagate outward. The preferred approach is to use controlled exception handling:
  • Throw a custom runtime error when an invalid operation is detected
  • Include source-level information (such as the token) in the error
  • Catch the error at a higher level to prevent interpreter termination
Unlike generic system exceptions, custom runtime errors provide clear and relevant feedback, helping users locate and fix issues in their code effectively.
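A minimal sketch of this throw-and-catch pattern, applied to the unary minus case. The names SunRuntimeError and SafeEval are hypothetical, a line number stands in for the offending token, and the message format is an assumption.

```java
// Custom error type carrying source-level information.
final class SunRuntimeError extends RuntimeException {
    final int line; // stands in for the offending token

    SunRuntimeError(int line, String message) {
        super(message);
        this.line = line;
    }
}

final class SafeEval {
    // Checks the operand type before applying the operator.
    static Object negate(Object operand, int line) {
        if (operand instanceof Double) return -(Double) operand;
        throw new SunRuntimeError(line, "Operand must be a number.");
    }

    // Top-level entry point: reports the error and returns instead of crashing.
    static String run(Object operand, int line) {
        try {
            return String.valueOf(negate(operand, line));
        } catch (SunRuntimeError error) {
            return error.getMessage() + " [line " + error.line + "]";
        }
    }
}
```

Here SafeEval.run("muffin", 1) produces the text "Operand must be a number. [line 1]" and the caller keeps running, whereas an unchecked cast would have surfaced as a Java stack trace.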

Dev Notes / Variables

Scope & Binding

Design note

Environment Chain

// TODO: pseudocode or real code showing your env structure
Dev Notes / Functions

Call Stack

Closures

Design note

Native Functions

Dev Notes / Classes

Instances & Fields

Inheritance

Design note

Dev Notes / Types & Values

Internal Representation

Coercion Rules

Dev Notes / Memory & GC

Allocation