🤖🩺 LOG DOCTOR: WHEN FULL-TEXT SEARCH MEETS SPRING AI

🔸 TL;DR

I built a small POC to analyze multiple log files and help identify the source of a user issue.

The key lesson:

 ▪️ The LLM should not scan all logs blindly

 ▪️ Full-text search should first retrieve relevant evidence

 ▪️ The LLM (called through Spring AI) should then analyze that evidence

 ▪️ The output should be: probable root cause + suggested fix + file/line evidence

In short:

User issue

→ time-window filter

→ full-text search

→ matching lines + surrounding context

→ LLM analysis
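To make the flow concrete, here is a minimal in-memory sketch of the first three steps in plain Java (17+). It is not the POC's code: the `LogLine` record, the `EvidenceRetriever` class and the naive `contains` matching are assumptions for illustration, and the keyword match is only a stand-in for real full-text search. It assumes `lines` holds the parsed lines of one log file, in order.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical model of one parsed log line.
record LogLine(String file, int lineNumber, LocalDateTime timestamp, String text) {}

final class EvidenceRetriever {

    // Returns the matching lines plus `context` lines before and after each match,
    // restricted to [incidentTime - window, incidentTime + window].
    static List<LogLine> retrieve(List<LogLine> lines, LocalDateTime incidentTime,
                                  Duration window, List<String> keywords, int context) {
        LocalDateTime from = incidentTime.minus(window);
        LocalDateTime to = incidentTime.plus(window);

        // Step 1: time-window filter.
        List<LogLine> inWindow = lines.stream()
                .filter(l -> !l.timestamp().isBefore(from) && !l.timestamp().isAfter(to))
                .toList();

        // Step 2: naive, case-insensitive keyword match (stand-in for full-text search).
        // Step 3: mark the surrounding context lines around each hit.
        boolean[] keep = new boolean[inWindow.size()];
        for (int i = 0; i < inWindow.size(); i++) {
            String text = inWindow.get(i).text().toLowerCase(Locale.ROOT);
            boolean hit = keywords.stream()
                    .anyMatch(k -> text.contains(k.toLowerCase(Locale.ROOT)));
            if (hit) {
                int start = Math.max(0, i - context);
                int end = Math.min(inWindow.size() - 1, i + context);
                for (int j = start; j <= end; j++) {
                    keep[j] = true;
                }
            }
        }

        // The kept lines, in original order, are the evidence handed to the LLM.
        List<LogLine> evidence = new ArrayList<>();
        for (int i = 0; i < keep.length; i++) {
            if (keep[i]) {
                evidence.add(inWindow.get(i));
            }
        }
        return evidence;
    }
}
```

Overlapping context windows are merged by the `keep` array, so a line is never sent to the LLM twice.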

🔸 THE IDEA

Imagine a user reports:

“I cannot validate my payment. Checkout fails after clicking pay.”

The app searches across multiple log files, retrieves relevant lines and context, then asks the LLM to explain what likely happened.

Example result:

 ▪️ order-service.log shows the order expired

 ▪️ payment-service.log shows payment failed because the order was already CANCELLED

 ▪️ The LLM suggests checking timeout configuration, payment timing, and defensive handling of terminal order states

That is where AI becomes useful: not as a magic grep replacement, but as an explanation layer on top of retrieved evidence.

🔸 WHY NOT VECTOR SEARCH FIRST?

Good feedback I received: for logs, full-text search is often the best first step.

And I agree.

Logs contain strong exact signals:

 ▪️ exception names

 ▪️ error codes

 ▪️ trace IDs

 ▪️ timestamps

 ▪️ endpoints

 ▪️ service names

 ▪️ user IDs

So for this V1, no pgvector. No embeddings. No semantic search.

Just:

 ▪️ PostgreSQL full-text search

 ▪️ time-window filtering

 ▪️ surrounding context extraction

 ▪️ Spring AI for the final analysis
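As a rough sketch of what the PostgreSQL side could look like, here is a plain JDBC query that combines the time window with full-text matching. The `log_lines` table, its columns and the `LogSearchRepository` class are hypothetical, not taken from the POC. `to_tsvector`, `websearch_to_tsquery` (PostgreSQL 11+) and the `@@` match operator are standard PostgreSQL; the `'simple'` configuration is my assumption, chosen because it skips stemming, which tends to suit exact tokens like error codes.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class LogSearchRepository {

    // Hypothetical schema: log_lines(file_name, line_no, logged_at, message).
    static final String SEARCH_SQL = """
            SELECT file_name, line_no, message
            FROM log_lines
            WHERE logged_at BETWEEN ? AND ?
              AND to_tsvector('simple', message) @@ websearch_to_tsquery('simple', ?)
            ORDER BY logged_at
            LIMIT ?
            """;

    record Hit(String fileName, int lineNo, String message) {}

    // Runs the time-window + full-text query and maps each row to a Hit.
    static List<Hit> search(Connection conn, Instant from, Instant to,
                            String query, int limit) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SEARCH_SQL)) {
            ps.setTimestamp(1, Timestamp.from(from));
            ps.setTimestamp(2, Timestamp.from(to));
            ps.setString(3, query);
            ps.setInt(4, limit);
            try (ResultSet rs = ps.executeQuery()) {
                List<Hit> hits = new ArrayList<>();
                while (rs.next()) {
                    hits.add(new Hit(rs.getString("file_name"),
                                     rs.getInt("line_no"),
                                     rs.getString("message")));
                }
                return hits;
            }
        }
    }
}
```

Surrounding context could then be fetched with a second query on `file_name` and a `line_no` range around each hit. On a large table, a GIN index on the same `to_tsvector('simple', message)` expression would likely be needed to keep the search fast.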

🔸 WHY SPRING AI THEN?

Spring AI is not needed to find the log line.

But it becomes useful after retrieval:

 ▪️ summarize the evidence

 ▪️ explain the probable root cause

 ▪️ suggest a fix

 ▪️ propose next investigation steps

 ▪️ keep the integration clean inside a Spring Boot app

 The search finds the evidence.

 The LLM explains the evidence.
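Here is a small sketch of how the retrieved evidence could be turned into a prompt. The `Evidence` record, the `AnalysisPrompt` class and the prompt wording are my own assumptions, not the POC's actual prompt. The code is plain Java so it runs without any Spring dependency.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical shape of one retrieved log line.
record Evidence(String file, int lineNumber, String text) {}

final class AnalysisPrompt {

    // System instructions: constrain the model to the supplied evidence.
    static final String SYSTEM = """
            You are a log analysis assistant.
            Reason only from the evidence provided by the user.
            If the evidence is insufficient, say so instead of guessing.
            Answer with: probable root cause, suggested fix, and the file:line evidence you relied on.
            """;

    // Formats the user issue and evidence; each line is tagged with file:line
    // so the model can cite it back.
    static String userMessage(String userIssue, List<Evidence> evidence) {
        String lines = evidence.stream()
                .map(e -> "[%s:%d] %s".formatted(e.file(), e.lineNumber(), e.text()))
                .collect(Collectors.joining("\n"));
        return "User issue:\n" + userIssue + "\n\nEvidence:\n" + lines;
    }
}
```

In a Spring Boot app, these two strings could be passed to Spring AI's `ChatClient`, along the lines of `chatClient.prompt().system(AnalysisPrompt.SYSTEM).user(message).call().content()`. Treat that call chain as an assumption to check against the Spring AI version you use.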

🔸 TAKEAWAYS

 ▪️ Don’t send thousands of log lines directly to an LLM

 ▪️ Start with deterministic retrieval

 ▪️ Add a time window to avoid mixing old incidents with current ones

 ▪️ Retrieve surrounding context, not only one line

 ▪️ Ask the LLM to reason only from the provided evidence

 ▪️ Keep pgvector as a possible V2, not a mandatory V1

 ▪️ ⚠️ For production, be careful about sending sensitive logs to external LLM providers #gdpr
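On the last point, one option is to mask obvious personal data after retrieval and before the evidence leaves your infrastructure. The sketch below is illustrative only: the `LogRedactor` class and both patterns are assumptions, they are far from exhaustive, and they are no substitute for a proper data protection review.

```java
import java.util.regex.Pattern;

final class LogRedactor {

    // Simple email pattern; deliberately loose, not RFC-complete.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Runs of 13 to 19 digits, optionally separated by single spaces or dashes
    // (card-like numbers). May also catch other long numeric identifiers.
    private static final Pattern CARD =
            Pattern.compile("\\b\\d(?:[ -]?\\d){12,18}\\b");

    // Replaces emails and card-like numbers with fixed placeholders.
    static String redact(String line) {
        String out = EMAIL.matcher(line).replaceAll("<email>");
        return CARD.matcher(out).replaceAll("<card>");
    }
}
```

Redacting after retrieval keeps exact values such as user IDs searchable in PostgreSQL while limiting what the external provider sees.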

🔸 FINAL THOUGHT

AI in observability should not be sold as magic.

A better framing is:

1️⃣ Search systems retrieve the facts.

2️⃣ LLMs help humans understand them faster.

That is much more realistic, and much more useful. 🧠

#SpringAI #Java #SpringBoot #PostgreSQL #FullTextSearch #Logs #Observability #SupportTeam #AI #LLM #DeveloperTools #SoftwareEngineering