Automatically generated understanding of complex applications is the dream for many developers – especially those at organizations struggling with poorly understood legacy code. However, the rise of LLMs and AI tools for developers has introduced a new challenge: their tendency to “hallucinate,” producing highly convincing but incorrect information.

In this post, we’ll break down Swimm’s approach to addressing hallucinations head-on: combining static analysis with the controlled use of AI and LLMs.

The foundation of our approach

Our solution is a deterministic explanation methodology that prioritizes reliability over unconstrained AI generation. The process involves three steps:

  1. Code mapping: Using deterministic static analysis, we identify all relevant flows and logical components within the codebase. This creates a solid foundation for understanding the code’s structure and relationships.
  2. Retrieval: The system deterministically retrieves relevant context for specific topics or questions. This ensures that all documentation is grounded in actual code rather than AI-generated assumptions.
  3. Generation: Only in this step do we introduce LLMs, transforming the accurately retrieved context into coherent explanations and diagrams. The crucial distinction here is that every part of the model’s output is anchored to specific parts of your codebase, preventing hallucinations. 
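The three steps above can be sketched as a pipeline in which the first two stages are purely deterministic and a model only ever sees retrieved code. This is an illustrative sketch, not Swimm’s actual implementation: the function names, the regex-based “static analysis,” and the `llm` callback are all invented for the example.

```python
import re

def map_code(source: str) -> dict[str, str]:
    """Step 1 (code mapping): deterministically index functions by name.
    A real system would use a proper static-analysis parser; a regex
    stands in here so the sketch stays self-contained."""
    pattern = re.compile(r"def (\w+)\(.*?\):\n(?:    .+\n?)+")
    return {m.group(1): m.group(0) for m in pattern.finditer(source)}

def retrieve(index: dict[str, str], topic: str) -> list[str]:
    """Step 2 (retrieval): select snippets relevant to a topic, again
    deterministically -- no model is involved yet."""
    return [snippet for name, snippet in index.items() if topic in name]

def generate(snippets: list[str], llm=None) -> str:
    """Step 3 (generation): only here would an LLM be called, and only
    with the retrieved snippets as grounding context."""
    context = "\n".join(snippets)
    if llm is None:  # placeholder path when no model is wired in
        return f"Explanation grounded in:\n{context}"
    return llm(context)

source = "def parse_invoice(data):\n    return data\n"
index = map_code(source)
print(generate(retrieve(index, "invoice")))
```

The key property the sketch preserves: `generate` can only describe code that `retrieve` actually handed it, so every claim in the output traces back to an indexed snippet.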

Quality assurance through multiple channels 

Our commitment to quality extends beyond this process: we maintain a comprehensive set of evaluators, curated repositories, and knowledge topics representing diverse use cases. Every AI feature update or release undergoes rigorous testing against these evaluators.
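A release-gating evaluation of this kind can be sketched as a regression suite of curated cases scored against the generator. Everything here is hypothetical: the `EvalCase` fields and the “must cite these identifiers” scoring rule are invented stand-ins for real evaluators.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    repo: str                # curated repository under test
    topic: str               # knowledge topic to document
    must_mention: list[str]  # code identifiers the doc must cite

def run_suite(cases: list[EvalCase],
              generate_doc: Callable[[str, str], str]) -> float:
    """Return the fraction of cases whose generated doc cites every
    required identifier -- a crude stand-in for a real evaluator."""
    passed = 0
    for case in cases:
        doc = generate_doc(case.repo, case.topic)
        if all(ident in doc for ident in case.must_mention):
            passed += 1
    return passed / len(cases)

cases = [EvalCase("billing-service", "invoices", ["parse_invoice"])]
fake_generator = lambda repo, topic: f"{topic} flow starts at parse_invoice()"
print(run_suite(cases, fake_generator))  # all required identifiers cited -> 1.0
```

Running such a suite on every release turns documentation quality into a number that can block a regression before it ships.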

We also incorporate feedback mechanisms to continuously improve the quality of generated documentation:

  • Acceptance rate tracking: Monitors how often generated documents are committed to codebases
  • User satisfaction metrics: Implements a simple yet effective thumbs up/down system with optional feedback 
  • Collaborative improvement: Features like “ask a teammate” let users request explanations of unclear or internal logic from colleagues, with the added information incorporated back into the generation context
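The first two feedback signals reduce to simple ratios, sketched below. The event shape and field names are invented for illustration; a production system would pull these from telemetry.

```python
def acceptance_rate(events: list[dict]) -> float:
    """Share of generated documents that were ultimately committed."""
    generated = [e for e in events if e["kind"] == "generated"]
    committed = [e for e in events if e["kind"] == "committed"]
    return len(committed) / len(generated) if generated else 0.0

def satisfaction(votes: list[str]) -> float:
    """Thumbs-up share among explicit thumbs votes."""
    return votes.count("up") / len(votes) if votes else 0.0

events = [{"kind": "generated"}, {"kind": "generated"}, {"kind": "committed"}]
print(acceptance_rate(events))             # 1 committed of 2 generated -> 0.5
print(satisfaction(["up", "up", "down"]))  # 2 of 3 votes positive
```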

We’re LLM agnostic with broad language support

Our core technology works with various LLMs. This lets you use approved internal models and access the latest advancements, and lets us test and deploy different models as needed.
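Being LLM-agnostic usually means generation code targets a narrow interface that concrete providers plug into. The sketch below shows that pattern; the class and method names are illustrative, not Swimm’s API.

```python
from typing import Protocol

class LLM(Protocol):
    def complete(self, prompt: str) -> str: ...

class InternalModel:
    """Stand-in for an approved in-house model."""
    def complete(self, prompt: str) -> str:
        return f"[internal] {prompt}"

class VendorModel:
    """Stand-in for a hosted third-party model."""
    def complete(self, prompt: str) -> str:
        return f"[vendor] {prompt}"

def explain(context: str, llm: LLM) -> str:
    # The pipeline never depends on a specific provider, so models
    # can be swapped or A/B-tested without touching generation logic.
    return llm.complete(f"Explain this code:\n{context}")

print(explain("def f(): pass", InternalModel()))
```

Because `explain` only sees the `LLM` protocol, switching from an internal model to a vendor model is a one-argument change.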

Additionally, the Swimm platform features broad code language support and is designed to adapt to custom frameworks and new languages quickly. This is how we’re able to handle the large number of legacy language dialects effectively.

Why it matters

Traditional documentation methods are time-consuming, and the resulting docs are often outdated as soon as they’re written. Meanwhile, pure LLM-based solutions risk introducing incorrect information that could mislead developers.

At Swimm, we leverage the power of AI while maintaining the accuracy and reliability of traditional analysis methods. By grounding every piece of generated documentation in actual code through static analysis, we ensure that teams can trust the information they’re working with.