Research

Basilisk: An Evolutionary AI Red-Teaming Framework for Systematic Security Evaluation of Large Language Models

Abstract

This paper presents Basilisk, an open-source evolutionary framework for automated security evaluation of large language models (LLMs). Unlike static prompt-injection databases, Basilisk applies genetic algorithms to natural language attack payloads — evolving adversarial prompts across generations using 15 mutation operators, 5 crossover strategies, and multi-signal fitness scoring. The Smart Prompt Evolution engine (SPE-NL) discovers novel jailbreaks, guardrail bypasses, and data exfiltration vectors that no human-curated list would contain.

Basilisk provides 32 attack modules mapped to the OWASP LLM Top 10, including multi-turn attack chains (prompt cultivation, authority escalation, sycophancy exploitation), differential multi-model testing, guardrail posture assessment, and forensic audit logging. The framework supports GPT-4, Claude, Gemini, Llama, and any custom LLM endpoint, with native C/Go extensions for 10-100x speedup on performance-critical operations.

Archived On

Key Contributions

🧬

Smart Prompt Evolution

First application of genetic algorithms to natural language attack payloads for LLM security testing, with 15 mutation operators and population diversity tracking.

🎯

Multi-Turn Attack Chains

Novel multi-phase conversational attacks including prompt cultivation (5-phase guardrail conditioning) and sycophancy exploitation leveraging LLM agreement bias.

📊

Systematic OWASP Coverage

32 attack modules providing complete OWASP LLM Top 10 coverage with automated differential testing and guardrail posture grading.

Native Performance

C and Go extensions providing 10-100x speedup for token analysis, pattern matching, and mutation operations via a ctypes bridge with Python fallbacks.

Citation

BibTeX
@misc{regaan2026basilisk,
  author       = {Regaan},
  title        = {Basilisk: An Evolutionary AI Red-Teaming
                  Framework for Systematic Security Evaluation
                  of Large Language Models},
  year         = {2026},
  version      = {1.0.8},
  publisher    = {ROT Independent Security Research Lab},
  doi          = {10.5281/zenodo.18909538},
  url          = {https://doi.org/10.5281/zenodo.18909538}
}

Related