Super Analyzer: Combining Reasoning and Coding Capabilities to Improve Code Performance
Introduction
As code bases grow in size and complexity, identifying bottleneck patterns becomes increasingly challenging. While coding LLMs and vibe coding have been in the spotlight, far less attention has been paid to using LLMs to identify performance bottlenecks in existing code bases.
Traditional linters and static analyzers catch many issues, but performance bottlenecks often hide in code structure: an O(n²) loop that only appears under certain data distributions, lock contention caused by I/O inside critical sections, or memory churn that quietly degrades throughput. Reasoning LLMs are surprisingly good at spotting these patterns. While coding LLMs can be used to fix the detected anti-patterns, we found that using the NVIDIA Nemotron 3 Super reasoning LLM in an actor-critic pattern often achieves comparable results. Nemotron 3 Super is a hybrid Mixture of Experts (MoE) model that combines high compute efficiency with leading accuracy for multi-agent applications and specialized agentic AI systems.
To leverage this capability of Nemotron 3 Super, we propose a system that analyzes existing C++, Python, Java, and Rust code to identify known anti-patterns and fix them.
The system uses a multi-agent actor-critic framework to analyze and fix a predetermined set of anti-patterns across the four languages. One challenge in analyzing large code files in this format is context size: as the actor-critic loop iterates, the accumulated context can become unwieldy. The large context window of Nemotron 3 Super helps us overcome this challenge.
The system also offers several interaction surfaces: a web UI, a Python API, and a REST API. The web UI exposes a chatbot-style interface for analyzing and fixing code. The framework handles multi-turn conversations and persists chats between sessions. Username/password authentication is required for all endpoints exposed by the app.
The overall system architecture is shown below.
Figure 1 Architectural diagram of the system
An architectural deep dive: A three-agent solution
The problem at hand involves three layers: a general understanding of what the user wants from the analysis, detection of the known anti-patterns, and fixing them. Depending on the user's need, we might run a simple analysis, a one-shot analyze-and-fix pass, or go deeper and produce a more robust fix with an actor-critic pattern. One-shot fixes are known to often produce code that strays from the scope and intent of the original, or that may not even fix the problem correctly. Still, we include this mode to support A/B testing of the proposed framework.
The multi-agent mechanism
The actor-critic pattern involves an actor (a coding agent) proposing a fix for the bottlenecks identified by the analyzer. The critic then validates the actor’s output and decides whether to accept the code or send it back to the actor with suggestions for improvement. Building on this pattern, we use a three-tiered agentic architecture where each class of agents has a distinct role.
- Primary Agent: Orchestrates high-level analysis, manages critics, and serves as the "Scope Guardian" to ensure fixes don't alter intended behavior.
- Fixer Agent: A specialist "Actor" that focuses on code generation, often utilizing Fill-In-the-Middle (FIM) techniques.
- Chat Agent: Handles user interaction and explains the rationale behind proposed changes.
The LLMs used by the different agent classes can all be the same model or all different; the choice depends on cost/accuracy trade-offs and on the specializations of the LLMs available.
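As an illustration, the per-role model choice can be captured in a small configuration map. The role keys and model identifier below are hypothetical placeholders, not the analyzer's actual settings:

```python
# Hypothetical role-to-model mapping (placeholder model names, for illustration).
AGENT_MODELS = {
    "primary": "nemotron-3-super",  # orchestration and scope guarding
    "fixer": "nemotron-3-super",    # code generation (the actor)
    "chat": "nemotron-3-super",     # user-facing explanations
}

def model_for(role: str) -> str:
    """Look up which model backs a given agent role."""
    return AGENT_MODELS[role]
```

Swapping a single entry is enough to A/B test a cheaper model for one role while keeping the others fixed.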
Multi-agent fix: The pragmatic design
Figure 2 End to End fix pipeline
Pipeline stages (what happens in each phase)
- Upload and Parse:
Web UI / CLI / API accepts a file (or ZIP). Language‑specific parsers extract functions (AST for Python; brace matching for C++/Java/Rust).
- Issue Detection / Analysis and Domain Categorization
The system scans for specific language-specific anti-patterns:
  - Python: detecting redundant I/O inside loops or inefficient data structure usage.
  - C++/Rust: identifying memory management overhead or missing compiler optimizations.
  - Java: spotting inefficient stream operations or excessive object allocation.
- Agent Routing / Specialist Agent Intervention
Rather than one agent doing everything, the system uses "Domain Isolation." An Algorithm Agent may restructure a loop, while an I/O Agent specifically targets side-effecting operations like print statements or database calls.
- Actor–critic loop
The actor proposes a patch; validators run; the critic approves or rejects with scope checks; retries occur as needed.
- The Scope Guardian and Validation
Final two‑layer validation (programmatic + LLM) with up to two repair attempts. This component ensures that the "intent" of the code remains unchanged. If a fix accidentally removes necessary logic, the Guardian repairs the scope or triggers a fallback to a safer version of the fix.
- Output
Reassemble file, add imports/includes, strip advisory comments, emit warnings + architectural suggestions.
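To make the parse and detection stages concrete, here is a minimal sketch for the Python path using the standard `ast` module. It is illustrative only: the function names are ours, and the real analyzer's checks are broader than this single print-in-loop detector.

```python
import ast

def extract_functions(source: str) -> dict:
    """Parse Python source and return {function_name: source_segment}."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }

def find_io_in_loops(source: str) -> list:
    """Flag line numbers where print() is called inside a for/while loop."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.For, ast.While)):
            for inner in ast.walk(node):
                if (isinstance(inner, ast.Call)
                        and isinstance(inner.func, ast.Name)
                        and inner.func.id == "print"):
                    hits.append(inner.lineno)
    return hits
```

A brace-matching pass plays the equivalent role for C++/Java/Rust, where a full AST is not as readily available from the standard library.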
Actor–critic mechanics (and why it works)
In Super Analyzer, “actor” and “critic” are explicit roles with separate prompts and—optionally—separate models. The actor is optimized for code generation. The critic is optimized for validation and is allowed to be strict.
The pre‑critic validation stage is essential: cheap, deterministic checks reject obviously broken output before spending tokens on the critic. This drives down cost and prevents the critic from becoming a garbage‑in/garbage‑out filter.
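A minimal sketch of this loop, with a cheap deterministic syntax check ahead of the critic. The actor and critic here are stand-in callables, not the analyzer's actual agents:

```python
def syntax_ok(code: str) -> bool:
    """Cheap deterministic pre-critic check: reject unparseable output."""
    try:
        compile(code, "<patch>", "exec")
        return True
    except SyntaxError:
        return False

def actor_critic_fix(actor, critic, issue: str, max_rounds: int = 3):
    """Run the actor-critic loop until the critic accepts or rounds run out."""
    feedback = ""
    for _ in range(max_rounds):
        patch = actor(issue, feedback)
        if not syntax_ok(patch):           # fail fast; no critic tokens spent
            feedback = "Output was not valid Python; regenerate."
            continue
        accepted, feedback = critic(patch) # critic returns (bool, feedback)
        if accepted:
            return patch
    return None  # exhausted retries without an accepted patch
```

The key point is the ordering: the free `compile()` check screens out garbage before the (expensive) critic call ever happens.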
Figure 3 Actor-Critic Loop
Surprisingly, implementing the actor-critic pattern can improve the overall coding performance of a model. Using the actor-critic pattern with the Nemotron 3 Super model, the analyzer was able to uncover and fix some complicated logic in C++ code, and it also provided an architectural suggestion to improve the code's performance.
Coding examples
The coding examples have been uploaded as a Hugging Face dataset: https://huggingface.co/datasets/gganesan74/sample_data_code_analyzer
The repo also contains a Python script for accessing the API and a usage guide.
Concrete example: O(n²) duplicates + logging in a loop
A common pattern in data pipelines is finding duplicates while emitting debug output. The analyzer can flag both an algorithmic bottleneck (nested loops) and an I/O bottleneck (print/log in the inner loop). Because agents run sequentially, the Algorithm Agent can first reduce complexity, and the I/O Agent can then remove or batch the hot‑loop logging.
Python (before)

```python
def find_duplicates(transactions):
    duplicates = []
    for i in range(len(transactions)):
        for j in range(i + 1, len(transactions)):
            if transactions[i]["transaction_id"] == transactions[j]["transaction_id"]:
                print("dup", transactions[i]["transaction_id"])
                duplicates.append(transactions[i])
    return duplicates
```
Python (after)

```python
def find_duplicates(transactions):
    seen = set()
    duplicates = []
    for tx in transactions:
        tid = tx["transaction_id"]
        if tid in seen:
            duplicates.append(tx)
        else:
            seen.add(tid)
    return duplicates
```
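As a sanity check, the two versions can be compared directly on synthetic data. The quadratic version below omits the hot-loop print so the timing measures only the algorithm; note that the data uses at most one repeat per ID, since the nested-loop version reports extra entries when an ID appears more than twice.

```python
import timeit

def find_duplicates_quadratic(transactions):
    # Nested-loop version from the "before" example, minus the print.
    duplicates = []
    for i in range(len(transactions)):
        for j in range(i + 1, len(transactions)):
            if transactions[i]["transaction_id"] == transactions[j]["transaction_id"]:
                duplicates.append(transactions[i])
    return duplicates

def find_duplicates_linear(transactions):
    # Set-based version from the "after" example.
    seen = set()
    duplicates = []
    for tx in transactions:
        tid = tx["transaction_id"]
        if tid in seen:
            duplicates.append(tx)
        else:
            seen.add(tid)
    return duplicates

# Synthetic data: 2,000 transactions; IDs 0-199 appear exactly twice.
txs = [{"transaction_id": i % 1800} for i in range(2000)]

t_quad = timeit.timeit(lambda: find_duplicates_quadratic(txs), number=3)
t_lin = timeit.timeit(lambda: find_duplicates_linear(txs), number=3)
```

On data of this size the quadratic version performs roughly two million comparisons per call versus two thousand membership tests for the set-based version, so the speedup is typically two to three orders of magnitude.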
Safety and validation behavior
The system uses a two-tier validation approach. Programmatic checks (syntax validation, hardcoded dimension detection, symbol reference checks) are authoritative: they can trigger retries with corrective feedback to the actor. An advisory LLM review runs after each fix is accepted and surfaces potential logic concerns to suggestions.md for developer review, but it never blocks or reverts a fix. The Scope Guardian similarly runs a programmatic check as its primary gate, with an advisory LLM pass for deeper analysis. If the programmatic scope-fix attempts are exhausted, the fix is accepted anyway. Each function reports a resolution status of success or partial.
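The retry-then-accept flow above can be sketched as follows. The check names, repair hook, and status strings mirror the description, but the code is illustrative rather than the analyzer's implementation:

```python
def validate_fix(fix: str, checks, max_retries: int = 2, repair=None):
    """Authoritative programmatic checks with retries; accept as 'partial' if exhausted.

    checks: list of (name, predicate) pairs; predicate(fix) -> bool.
    repair: callable(fix, failed_names) -> new fix (corrective feedback to the actor).
    """
    for attempt in range(max_retries + 1):
        failures = [name for name, check in checks if not check(fix)]
        if not failures:
            return fix, "success"
        if repair is None or attempt == max_retries:
            break
        fix = repair(fix, failures)
    return fix, "partial"  # accepted anyway, flagged for developer review
```

The advisory LLM review is deliberately absent from this gate: it runs afterward and can only annotate, never block.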
What the agents fix: a pragmatic anti‑pattern catalog
Observability
The Super Analyzer produces a trace of the LLM calls, tokens used, and related metadata, viewable via the Traces button in the UI. When using the CLI, there is an option to save the trace to a JSON file. The web UI also has a test function that checks whether all the LLM endpoints are reachable from the app.
Conclusion
Super Analyzer demonstrates a practical pattern for “safe refactoring agents”: specialize by issue category, generate patches with constrained actors (FIM where possible), filter aggressively with cheap validators, and require explicit approval gates (critic + scope guardian). This architecture makes it easier to A/B test agent behavior, plug in different models and endpoints, and scale the system from local experiments to production deployments.



