In the financial landscape of 2026, legacy system refactoring is no longer an optional choice, but an operational survival necessity. Banking institutions find themselves squeezed between the need to innovate rapidly (to compete with digital-native Fintechs) and the weight of monolithic codebases, often written in COBOL or old versions of Java, that have been handling critical transactions for decades. This technical guide explores how the integration of Generative Artificial Intelligence (GenAI) and deterministic static code analysis tools is revolutionizing the way we approach software modernization.
The main obstacle in refactoring banking systems is not writing new code, but understanding the old one. We are talking about millions of lines of code where business logic is intertwined with infrastructure management, and where documentation is often missing or obsolete. In this context, a manual approach is risky and unsustainably slow.
The modern solution lies in a hybrid approach: using static analysis to build a reliable map of dependencies, and LLMs (Large Language Models) specialized in code understanding to decipher the semantic intent of functions.
Before touching a single line of code, it is necessary to illuminate the shadow zones of the monolith. Here is how to structure the discovery phase:
Static analysis tools (like advanced SonarQube setups or proprietary mainframe analysis tools) must be configured to generate not just quality reports, but complete dependency graphs. The goal is to identify:

- hidden dependencies between modules and database tables;
- the logical flows that implement critical transactions;
- the places where business rules are buried inside infrastructure code.
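As a purely illustrative sketch, the dependency graph exported by such tooling can be modeled as an adjacency map from each module to the artifacts it touches; the module and table names below are invented, and real exports carry far richer metadata.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of a dependency graph: each node (program, copybook, table) maps to
// the nodes it depends on. All names here are invented for the example.
public class DependencyGraphQuery {

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
                "LOAN_CALC", List.of("INTEREST_UTIL", "DB_CUSTOMER_HISTORY"),
                "INTEREST_UTIL", List.of("DB_RATES"),
                "STATEMENTS", List.of("LOAN_CALC"),
                "DB_RATES", List.of(),
                "DB_CUSTOMER_HISTORY", List.of());

        // Every module that reaches DB_CUSTOMER_HISTORY, directly or transitively:
        // these are the first candidates for extraction into a bounded context.
        System.out.println(dependents(graph, "DB_CUSTOMER_HISTORY"));
    }

    static Set<String> dependents(Map<String, List<String>> graph, String target) {
        Set<String> result = new HashSet<>();
        for (String start : graph.keySet()) {
            if (!start.equals(target) && reaches(graph, start, target)) {
                result.add(start);
            }
        }
        return result;
    }

    // Iterative reachability check over the dependency edges.
    static boolean reaches(Map<String, List<String>> graph, String from, String target) {
        Deque<String> stack = new ArrayDeque<>(List.of(from));
        Set<String> seen = new HashSet<>();
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (!seen.add(node)) continue;
            if (node.equals(target)) return true;
            stack.addAll(graph.getOrDefault(node, List.of()));
        }
        return false;
    }
}
```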
Once the codebase is indexed, we can use a RAG (Retrieval-Augmented Generation) architecture. Instead of asking a generic LLM to “explain this file”, we store the entire codebase as vector embeddings in a database. This allows us to query the system with high-level questions:
“Show me all functions that calculate compound interest and have direct dependencies with the DB_CUSTOMER_HISTORY table.”
The AI returns not only the files but the logical flow connecting them, reducing analysis time from weeks to minutes.
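A minimal sketch of that query flow, assuming hypothetical VectorStore and ChatModel abstractions (the article does not prescribe a specific embedding database or LLM client):

```java
import java.util.List;

// Hypothetical abstractions for the RAG flow: a vector store holding embedded
// code chunks and a chat model that answers questions grounded in those chunks.
interface VectorStore {
    List<String> similaritySearch(String query, int topK);
}

interface ChatModel {
    String complete(String prompt);
}

public class CodebaseRagQuery {

    private final VectorStore store;
    private final ChatModel llm;

    public CodebaseRagQuery(VectorStore store, ChatModel llm) {
        this.store = store;
        this.llm = llm;
    }

    /** Answers a high-level question using only retrieved code fragments as context. */
    public String ask(String question) {
        // 1. Retrieve the indexed code chunks most relevant to the question.
        List<String> chunks = store.similaritySearch(question, 10);

        // 2. Ask the LLM to answer strictly from that retrieved context.
        String prompt = "Answer using ONLY the code excerpts below.\n\n"
                + String.join("\n---\n", chunks)
                + "\n\nQuestion: " + question;
        return llm.complete(prompt);
    }
}
```

With concrete implementations wired in, the compound-interest question quoted above becomes a single ask(...) call whose answer is grounded in the retrieved code rather than in the model's memory.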
Once the territory is mapped, the goal is legacy system refactoring towards a microservices or modular architecture. The reigning technique remains the Strangler Fig Pattern, powered by AI.
This is where the experience gained in developing the BOMA CRM comes into play. During the creation of BOMA, we faced the need to migrate complex customer management logic from an old VB6 management system. The common mistake is attempting to rewrite everything from scratch (the Big Bang Rewrite). Instead, we used AI to extract the pure business rules, separating them from the user interface and data access code.
The applied process:

1. Gradually isolate a bounded context within the monolith.
2. Use the LLM to extract its pure business logic and rewrite it in a modern language as an independent service.
3. Create a facade interface that routes each request either to the old system or to the new microservice.
4. Shift traffic progressively until the legacy module can be switched off, as the Strangler Fig Pattern prescribes.
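A minimal facade sketch under stated assumptions: the endpoint, the strangler.accounts.use-new-service flag, and the LegacyAccountGateway / AccountService types are invented for illustration and are not taken from the BOMA project.

```java
import java.math.BigDecimal;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical types: an adapter towards the monolith and the new microservice client.
interface LegacyAccountGateway { BalanceDto balance(String accountId); }
interface AccountService { BalanceDto balance(String accountId); }
record BalanceDto(String accountId, BigDecimal amount) {}

// Strangler Fig facade: the same route is served by either the legacy system or
// the extracted microservice, and a configuration flag decides which one answers.
@RestController
public class AccountFacadeController {

    private final LegacyAccountGateway legacy;
    private final AccountService modernService;

    @Value("${strangler.accounts.use-new-service:false}")
    private boolean useNewService;

    public AccountFacadeController(LegacyAccountGateway legacy, AccountService modernService) {
        this.legacy = legacy;
        this.modernService = modernService;
    }

    @GetMapping("/accounts/{id}/balance")
    public BalanceDto getBalance(@PathVariable("id") String id) {
        // Flip the flag per environment to shift traffic progressively,
        // "strangling" the legacy path without a big-bang cutover.
        return useNewService ? modernService.balance(id) : legacy.balance(id);
    }
}
```

The same routing can of course live in an API gateway instead of an in-process controller; the controller form simply keeps the sketch self-contained.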
In the banking sector, security is non-negotiable. Using AI to generate code introduces new risks (e.g., insecure code or hallucinations). It is imperative to integrate security controls into the refactoring pipeline.
When asking an LLM to refactor a function, the prompt must include explicit security constraints. Example of a structured prompt:
ROLE: Senior Security Architect
TASK: Refactoring of the function 'processTransaction' from COBOL to Java Spring Boot.
CONSTRAINTS:
1. Use Prepared Statements to prevent SQL Injection (OWASP A03:2021).
2. Implement rigorous input validation.
3. Ensure logs do not contain PII (Personally Identifiable Information).
4. Add Javadoc comments explaining the preserved business logic.
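For illustration, the kind of output such a prompt should produce could look like the snippet below; the ACCOUNTS table, its columns, and the debit rule are assumptions made for the example, not actual banking logic from the article.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.logging.Logger;

public class TransactionProcessor {

    private static final Logger LOG = Logger.getLogger(TransactionProcessor.class.getName());

    /**
     * Debits an account by the given amount.
     * Preserved business rule (from the legacy routine): the amount must be
     * strictly positive and the balance is updated in a single statement.
     */
    public void processTransaction(Connection conn, String accountId, BigDecimal amount)
            throws SQLException {
        // Constraint 2: rigorous input validation before any database access.
        if (accountId == null || accountId.isBlank()) {
            throw new IllegalArgumentException("accountId must not be empty");
        }
        if (amount == null || amount.signum() <= 0) {
            throw new IllegalArgumentException("amount must be strictly positive");
        }

        // Constraint 1: Prepared Statement, never string concatenation (OWASP A03:2021).
        String sql = "UPDATE ACCOUNTS SET BALANCE = BALANCE - ? WHERE ACCOUNT_ID = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setBigDecimal(1, amount);
            ps.setString(2, accountId);
            ps.executeUpdate();
        }

        // Constraint 3: the log line carries no PII, only the fact that a debit happened.
        LOG.info("Transaction processed successfully");
    }
}
```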
Code generated by AI must never go into production without validation. The CI/CD pipeline must include:

- SAST (Static Application Security Testing) scans that block builds containing known vulnerability patterns;
- a mandatory human-in-the-loop review for every change touching critical transaction code;
- automatically generated unit and regression tests that verify functional equivalence with the legacy behavior.
The experience with CRM BOMA was enlightening in defining this protocol. In that project, the challenge was not just technological, but semantic. The old system used obscure nomenclature (e.g., variables like var1 or x_temp). Using LLMs to analyze the data flow, we managed to rename and refactor variables with descriptive names based on their real usage context (e.g., customerLifetimeValue, lastInteractionDate).
This process of “semantic enrichment” during refactoring allowed us not only to update the technology stack, but also to make the code maintainable for future developers, reducing technical debt by 60% in the first six months post-migration.
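A tiny before/after illustration of that renaming step, transposed to Java for readability (the real legacy code was VB6, and the field meanings are assumed):

```java
import java.math.BigDecimal;
import java.time.LocalDate;

// BEFORE: names carry no meaning; intent has to be reverse-engineered from usage.
class CustomerMetricsLegacyNaming {
    BigDecimal var1;   // in practice: revenue accumulated per customer
    LocalDate x_temp;  // in practice: date of the last contact with the customer
}

// AFTER: the same fields, renamed from the usage context the LLM reconstructed.
class CustomerMetrics {
    BigDecimal customerLifetimeValue;
    LocalDate lastInteractionDate;
}
```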
Even in 2026, LLMs can “hallucinate”, inventing non-existent libraries or methods. To mitigate this risk:

- compile the suggested code immediately, directly in the IDE, so that phantom APIs fail fast;
- compare the solutions proposed by multiple models (Mixture of Experts) before accepting one;
- generate unit tests automatically to verify that the new code reproduces the behavior of the original system.
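A hypothetical characterization test of that kind, assuming JUnit 5: the expected value is recorded from the legacy system's own output, and compoundInterest stands in for the AI-generated replacement.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.math.BigDecimal;
import java.math.RoundingMode;
import org.junit.jupiter.api.Test;

// Characterization test: the expectation is captured from the legacy system, so any
// hallucinated or subtly wrong logic in the generated code fails the build instead
// of reaching production.
class CompoundInterestCharacterizationTest {

    // Stand-in for the AI-generated replacement of the legacy routine.
    static BigDecimal compoundInterest(BigDecimal principal, BigDecimal annualRate, int years) {
        BigDecimal factor = BigDecimal.ONE.add(annualRate).pow(years);
        return principal.multiply(factor).setScale(2, RoundingMode.HALF_UP);
    }

    @Test
    void matchesLegacyOutputForKnownCase() {
        // Expected value recorded by running the legacy system with the same inputs.
        BigDecimal expectedFromLegacy = new BigDecimal("1157.63");
        BigDecimal actual = compoundInterest(new BigDecimal("1000"), new BigDecimal("0.05"), 3);
        assertEquals(expectedFromLegacy, actual);
    }
}
```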
Legacy system refactoring in the banking sector is open-heart surgery. The adoption of static analysis tools combined with artificial intelligence allows for reducing operational risks and accelerating time-to-market. However, technology is only an accelerator: deep understanding of banking domains and software architecture, as demonstrated in the BOMA case, remains the irreplaceable foundation for project success.
The integration of Generative AI and static analysis allows for rapidly deciphering obsolete codebases, reducing discovery times from weeks to minutes. Thanks to the RAG architecture, it is possible to query the vectorized code to understand complex logical flows and hidden dependencies, facilitating the extraction of business rules without having to manually analyze millions of lines of code.
The recommended technique is the Strangler Fig Pattern powered by AI. This approach avoids immediate total rewriting, preferring the gradual isolation of bounded contexts. LLMs are used to extract pure logic from the old system and rewrite it in modern languages, while facade interfaces are created to route traffic to the new microservices progressively.
Security is achieved by imposing explicit constraints in prompts, such as the use of Prepared Statements to prevent SQL Injection according to OWASP standards, and by integrating automatic controls into the CI/CD pipeline. It is essential to maintain a human-in-the-loop approach for reviewing critical code and to use SAST tools to scan for vulnerabilities before the software goes into production.
A semantic enrichment approach is used via LLMs specialized in code understanding. These models analyze the data flow and suggest renaming obscure variables with terms based on the real context, as seen in the CRM BOMA case study. This process transforms the “black box” code into a readable and maintainable structure for future developers.
Hallucinations occur when the AI invents non-existent libraries or methods. To mitigate them, strategies such as immediate compilation of the suggested code directly in the IDE and the use of multiple models to compare solutions (Mixture of Experts) are adopted. Furthermore, the automatic generation of unit tests ensures that the new code strictly respects the functionalities of the original system.