How XandAI CLI Handles Code Encoding and Tags: A Deep Dive

XandAI CLI is a terminal-based AI assistant that combines natural language interaction with code execution capabilities. As a tool that lives in the terminal yet deals extensively with code, understanding how it handles code encoding and markup tags is crucial for effective usage.

What is XandAI CLI?

XandAI CLI is an interactive assistant that allows you to:

Execute terminal commands naturally
Create and edit files through conversational AI
Run code in multiple languages (Python, JavaScript, Bash, etc.)
Review code changes with AI-powered analysis
Handle complex multi-step tasks through the /agent command

It supports local LLM providers like Ollama and LM Studio, giving you offline AI capabilities directly in your terminal.

The Challenge: Code in a Text-Based Interface

Unlike web applications that can render HTML, terminals work with plain text. This creates an interesting challenge: how do you distinguish between:

Natural language instructions
Code that should be preserved literally
Commands that should be executed
File content that needs to be written

XandAI CLI solves this through a combination of detection heuristics, markup conventions, and intelligent parsing.

How `<code>` Tags and Markdown Work

Standard Markdown Conventions

XandAI CLI recognizes standard Markdown syntax for code:

Inline code using backticks:

Tell me about the `print()` function

Code blocks using triple backticks:

```python
def add(a, b):
    return a + b
```

HTML-Style Tags

When you include HTML-style tags like <code>, <pre>, or even custom tags in your prompts, they're treated as literal text rather than rendered markup. However, the AI model can understand their semantic meaning:

What does <code>x = 5</code> do?

The model recognizes that you're emphasizing code, even though the terminal doesn't render the <code> tag visually.

The `<code edit>` Tag: XandAI's Structured Format

Here's where it gets interesting: XandAI CLI actually uses a specialized <code edit> tag format internally for structured task processing. This is documented in the task_processor.py file.

The format:

<code edit filename="path/to/file.py">
# Complete file content goes here
def main():
    print("Hello, World!")
</code>

Key characteristics:

Structured wrapper: The <code edit> tag wraps complete file content
Filename attribute: Specifies the target file via filename="..."
Parseable format: XandAI uses regex to extract content programmatically
Complete content: Contains entire file, not snippets

How it's parsed:

# From task_processor.py
pattern = rf'=== STEP {step.step_number}:.*?===\s*\n<code edit filename="[^"]*">\s*\n(.*?)\n</code>'
match = re.search(pattern, content, re.DOTALL)
if match:
    step.content = match.group(1).strip()

Similar pattern for commands:

<commands>
pip install flask
python -m flask run
</commands>

Example in Task Mode (deprecated):

PROJECT: REST API with Authentication
TYPE: python
COMPLEXITY: medium
ESTIMATED TIME: 2-3 hours

STEPS:
1 - create src/app.py
2 - edit src/app.py
3 - command setup

STEP DETAILS:

=== STEP 1: create src/app.py ===
<code edit filename="src/app.py">
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello, World!'
</code>

=== STEP 3: command setup ===
<commands>
pip install flask
python -m flask run
</commands>

Why this matters:

Machine-parseable: Regex can reliably extract file content
Preserves structure: Clear boundaries between content and metadata
Multiple files: Can handle many files in one response
Encoding safety: Content inside tags is treated literally
Automation-friendly: Perfect for CI/CD or automated workflows

Important note: The /task command using this format is now deprecated. XandAI recommends natural conversation mode instead, which is more flexible and produces better results. However, understanding this tag structure reveals how XandAI internally structures code for parsing.

Encoding Flow: From Input to Output

Let's trace how code and tags flow through XandAI CLI:

1. User Input Stage

When you type a prompt:

xandai> create a calculator.py with basic arithmetic functions

The CLI captures this as raw text, preserving every character including backticks, angle brackets, and special characters.

2. Intent Detection

XandAI CLI analyzes the input to determine intent:

Command execution: Starts with typical shell commands (ls, cd, git, etc.)
File operations: Contains keywords like "create", "edit", "save" with filenames
AI conversation: Everything else goes to the LLM

3. Code Detection

When generating responses, XandAI uses pattern matching to detect complete code files:

# Pattern detection looks for:
# - Complete function definitions
# - Import statements
# - Proper indentation
# - File-like structure

This looks like a complete python file. Save it? (y/N):

4. Provider Stage (LLM Processing)

The prompt is sent to your chosen provider (Ollama or LM Studio). At this stage:

Text is tokenized - backticks, <, >, and tag names become tokens
No preprocessing strips tags (they're kept for context)
The model uses tags as semantic hints

5. Output Rendering

The LLM returns formatted text, which XandAI processes:

# Pseudo-code for rendering logic
if output.contains_code_fence():
    extract_code_block()
    apply_syntax_highlighting()
    
if output.matches_complete_file():
    prompt_user_to_save()
    
display_in_terminal(formatted_output)

Special Tag Behaviors

Inline Code: `code`

What it is: Markdown inline code syntax

How XandAI treats it:

Preserved literally in prompts
Passed to the LLM as-is
Helps the model distinguish code from prose
May be colorized in terminal output

Example:

xandai> How do I use `os.path.join()` in Python?

Fenced Code Blocks

What it is: Multi-line code with triple backticks

How XandAI treats it:

```python
def hello():
    print("Hello, World!")
```

Clearly delineates code boundaries
Language tag enables syntax-aware processing
Content is never executed unless explicitly requested
Whitespace and formatting preserved exactly

Best for: Sharing complete functions, showing examples, code review

`<code>` Tags

What it is: HTML-style inline code markup

How XandAI treats it:

<code>some_function()</code>

Passed literally as text to the LLM
Not rendered visually in terminal
Model may infer code emphasis
Less preferred than backticks

When to use: When discussing HTML/XML, or when you specifically want to show tag structure

File Operations: Where Encoding Matters Most

Creating Files

When you ask XandAI to create a file:

xandai> create tokens.py with authentication functions

What happens:

AI generates complete file content
System detects it looks like a complete file
Extracts filename from your request
Prompts you to save
Encoding preserved: Special characters, indentation, line endings all maintained

Editing Files

When editing existing files:

xandai> edit index.py adding a health endpoint

What happens:

System reads current file content (preserving encoding)
Sends both content and your request to LLM
AI generates updated version
Preserves original file encoding (UTF-8, etc.)
Maintains formatting and structure

Critical: Character Encoding

XandAI must handle:

UTF-8 characters: Emojis, international characters
Special characters: <, >, &, " in code strings
Escape sequences: \n, \t, etc.
Raw bytes: Binary data (avoided, but handled)

The `/agent` Command: Multi-Step Processing

The /agent command showcases sophisticated code handling:

xandai> /agent refactor this code into modular components

Pipeline stages:

Intent Analysis: Classifies task (refactoring)
Context Gathering: Reads relevant files
Task Execution: Generates refactored code
Validation: Checks syntax, structure
Refinement: Improves based on validation

Each stage preserves code encoding perfectly through multiple LLM calls.

Code Execution: From Text to Running Process

XandAI can detect and execute code:

xandai> python math.py 2 3

Execution flow:

User Input → Intent Detection → Shell Execution → Output Capture → Display

Encoding considerations:

stdin: Interactive inputs sent with proper encoding
stdout/stderr: Captured and displayed with UTF-8
Arguments: Shell-escaped to prevent injection
Environment: Inherits terminal encoding settings

Best Practices for Using Tags in XandAI

✅ DO:

Use triple backticks for code blocks - most reliable:
```
```javascript
const greeting = "Hello";
```
```
Use single backticks for inline code - clearer than tags:
```
The `useState` hook manages state
```
Specify language in code fences - enables better processing:
```
```python  # ← Specify language
def foo(): pass
```
```

Use natural language primarily - clearer than markup:

✅ "Create a function that adds two numbers"
❌ "<instruction>Create <code>add()</code> function</instruction>"

❌ DON'T:

Mix markup styles - confusing:
```
❌ `<code>mixed()</code>`
```

Use HTML tags for emphasis - terminal can't render:

❌ <strong>Important:</strong> This won't be bold

Nest tags unnecessarily - can confuse parsing:

❌ <pre><code><div>nested</div></code></pre>

Assume visual formatting - it's plain text:

❌ Expecting <em>italics</em> or **bold** to render

Under the Hood: Implementation Details

Thanks to the open-source nature of XandAI CLI, we can examine the actual implementation. Here's how the code tag parsing works:

The `<code edit>` Parser

From task_processor.py:

def _associate_step_content(self, steps: List[TaskStep], content: str):
    """Associates detailed content to steps"""
    for step in steps:
        if step.action in ["create", "edit"]:
            # Look for corresponding <code edit> block
            pattern = rf'=== STEP {step.step_number}:.*?===\s*\n<code edit filename="[^"]*">\s*\n(.*?)\n</code>'
            match = re.search(pattern, content, re.DOTALL)
            if match:
                step.content = match.group(1).strip()

        elif step.action == "command":
            # Look for corresponding <commands> block
            pattern = (
                rf"=== STEP {step.step_number}:.*?===\s*\n<commands>\s*\n(.*?)\n</commands>"
            )
            match = re.search(pattern, content, re.DOTALL)
            if match:
                commands = [
                    cmd.strip() for cmd in match.group(1).strip().split("\n") if cmd.strip()
                ]
                step.commands = commands

Key parsing insights:

re.DOTALL flag: Allows . to match newlines, capturing multi-line content
Non-greedy matching: .*? ensures it stops at the first closing tag
Whitespace handling: \s* allows flexible formatting
Extraction: match.group(1) gets only the content, not the tags

System Prompt Engineering

The Task Processor includes this in its system prompt:

self.system_prompt = """You are XandAI in TASK mode - an expert in breaking down complex projects into executable steps.

MANDATORY RESPONSE FORMAT:

PROJECT: [name/description] TYPE: [python|javascript|web|react|api|etc]

STEP DETAILS:

=== STEP 1: create src/app.py ===

complete file content

=== STEP 2: command setup ===

pip install flask


CRITICAL RULES:
1. ALWAYS use the exact format above
2. Include COMPLETE file content, not just snippets
3. Use <code edit> for files, <commands> for shell
"""

This shows prompt engineering at work: the LLM is explicitly instructed to use these tags, ensuring consistent, parseable output.

Detection Engine

# Based on XandAI's approach
def detect_code_type(text):
    if has_shebang(text):
        return 'executable_script'
    if has_imports_and_functions(text):
        return 'complete_file'
    if has_code_fence(text):
        return 'code_block'
    if has_backticks(text):
        return 'inline_code'
    return 'natural_language'

Data Structures for Structured Processing

XandAI uses dataclasses to represent parsed content:

from dataclasses import dataclass
from typing import Optional, List

@dataclass
class TaskStep:
    """Structured step of a task"""
    step_number: int
    action: str  # 'create', 'edit', 'command'
    description: str
    target: str  # file or command
    content: Optional[str] = None
    commands: Optional[List[str]] = None

@dataclass
class TaskResult:
    """Structured result of task processing"""
    description: str
    steps: List[TaskStep]
    project_type: str
    estimated_time: str
    complexity: str  # 'low', 'medium', 'high'
    dependencies: List[str]
    notes: List[str]

This structure allows XandAI to:

Track progress through numbered steps
Execute sequentially or skip steps
Validate each step's requirements
Store metadata (complexity, time estimates)

Encoding Preservation

# Based on XandAI's file operations approach
def save_file(filename, content):
    # Preserve original encoding
    encoding = detect_encoding(content) or 'utf-8'
    
    with open(filename, 'w', encoding=encoding) as f:
        # Write exactly as received from LLM
        # Content extracted from <code edit> tags
        f.write(content)

Safe Execution

# Pseudo-code for command execution
def execute_command(cmd):
    # Shell-escape arguments
    safe_cmd = shlex.quote(cmd)
    
    # Capture with proper encoding
    result = subprocess.run(
        safe_cmd,
        capture_output=True,
        text=True,
        encoding='utf-8'
    )
    
    return result.stdout, result.stderr

Comparison: Tags in Different Contexts

Context	`<code>` Behavior	Backticks Behavior	Best Choice
Terminal (XandAI)	Literal text, semantic hint	Preserved, syntax aware	Backticks
Web Browser (MDX)	Rendered as inline code	Parsed to `<code>` element	Either works
Markdown Files	Shown literally	Rendered as code	Backticks
LLM Prompts	Token sequence	Token sequence	Backticks (clearer)

Real-World Examples

Example 1: Creating a Python Script

xandai> create a calculator.py with add, subtract, multiply, divide functions

# AI Response:
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b

def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

This looks like a complete python file. Save it? (y/N): y
Filename: calculator.py
File 'calculator.py' created successfully!

Encoding preserved: Function signatures, docstrings, spacing all intact.

Example 2: Editing with Inline Code

xandai> edit calculator.py to add error handling to `divide()` function

# AI reads file, generates updated version with try-except blocks
# Preserves original formatting and encoding
Edit file 'calculator.py'? (y/N): y
File 'calculator.py' updated successfully!

Example 3: Code Review with Context

xandai> /review

# Automatically detects Git changes
# Preserves code formatting in analysis
# Displays diffs with proper encoding

Changes detected in calculator.py:
+ Added error handling
+ Improved docstrings
✓ Code quality improved

Why Task Mode Was Deprecated

The task_processor.py file includes a deprecation warning:

"""
⚠️  DEPRECATED: This processor and /task command are deprecated.
Use natural conversation mode for better results.
"""

Reasons for deprecation:

Too rigid: The structured format constrained the AI's flexibility
Over-engineered: Most tasks don't need multi-step parsing
Natural language wins: Modern LLMs are better at understanding intent directly
Maintenance burden: Complex regex parsing is fragile

What replaced it:

Natural conversation: Just describe what you want
/agent mode: Multi-step orchestration without rigid structure
Smart detection: Automatic file/command detection
Iterative approach: Better for complex, evolving tasks

The lesson: While <code edit> tags enabled structured parsing, conversational AI is often more effective than trying to force structured output formats.

However, the tag system remains valuable for:

API integrations: Machine-to-machine communication
Automation pipelines: CI/CD workflows
Batch processing: Multiple file operations
Educational purposes: Understanding how to structure LLM output

Why Encoding Matters

Proper code encoding affects:

Correctness: Wrong encoding = broken code
Portability: Files work across systems
Security: Prevents injection attacks
User Experience: No garbled characters
Reliability: Code executes as intended

Practical Example: Tag Format in Action

Here's how the <code edit> format would be used (from the deprecated Task Mode):

User request:

xandai> /task create a simple Flask API with health check

AI response (structured):

PROJECT: Simple Flask API
TYPE: python
COMPLEXITY: low
ESTIMATED TIME: 30 minutes

DEPENDENCIES:
- Flask

STEPS:
1 - create app.py
2 - create requirements.txt
3 - command install

STEP DETAILS:

=== STEP 1: create app.py ===
<code edit filename="app.py">
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/health')
def health_check():
    return jsonify({"status": "healthy"}), 200

if __name__ == '__main__':
    app.run(debug=True)
</code>

=== STEP 2: create requirements.txt ===
<code edit filename="requirements.txt">
Flask==2.3.0
</code>

=== STEP 3: command install ===
<commands>
pip install -r requirements.txt
python app.py
</commands>

NOTES:
- Health check endpoint available at /health
- Development server runs on port 5000

How XandAI processes this:

Parses structure: Extracts project metadata
Creates TaskResult: With 3 steps
Extracts file content: From <code edit> tags using regex
Extracts commands: From <commands> tags
Presents to user: Shows summary and prompts for execution

Conclusion

By examining XandAI CLI's actual source code, we've uncovered how it handles code and tags:

<code edit> tags: Structured format for machine-parseable file content
Regex parsing: Using re.DOTALL and non-greedy matching to extract content
Data structures: Typed dataclasses for validation and tracking
System prompts: Explicit instructions to LLM for consistent output format
Deprecation lesson: Sometimes natural conversation beats rigid structure

Key takeaways:

For users: Stick to natural conversation and standard Markdown (backticks, fenced blocks)
For developers: The <code edit> pattern shows how to structure LLM output for parsing
For automation: Tag-based formats enable reliable extraction but may limit flexibility
For encoding: XandAI preserves content exactly through all processing stages

The evolution from structured /task mode to natural conversation shows that the best interface often mimics human communication rather than forcing machine-readable formats.

Whether you're creating files, executing code, or having the AI explain concepts, XandAI CLI ensures your code stays intact from input to output—a critical feature for a terminal-based coding assistant.