End-to-End Trace to LLM Agent Pipeline
You’re on a solid track: AOP (AspectJ), servlet filters, and Log4j2 each shine at different layers. Here’s a compact, end-to-end way to use them together—first to harvest execution traces and HTTP I/O (so you can auto-mint Python tests), then to feed rich context to an LLM “agent” endpoint in Spring.
1) Capture what you need (AOP + Filter + Log4j2)
a) Method interception with AOP (timings, args, return values/throwables).
Use `@Around` advice to wrap service methods; store a `traceId` in the MDC so logs from all layers join up.
```java
import java.util.Arrays;
import java.util.UUID;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.ThreadContext;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class TraceAspect {

    @Around("execution(* com.example.service..*(..))")
    public Object around(ProceedingJoinPoint pjp) throws Throwable {
        String traceId = UUID.randomUUID().toString();
        ThreadContext.put("traceId", traceId);   // joins logs across layers via MDC
        Logger log = LogManager.getLogger(pjp.getTarget().getClass());
        long t0 = System.nanoTime();
        try {
            Object out = pjp.proceed();
            // "AOP" tag lives in the message itself; a bare leading string would be
            // swallowed as the whole log message and the placeholders never filled.
            log.info("AOP method={} args={} ret={} ms={} traceId={}",
                    pjp.getSignature().toShortString(),
                    Arrays.toString(pjp.getArgs()), out,
                    (System.nanoTime() - t0) / 1_000_000.0, traceId);
            return out;
        } catch (Throwable e) {
            log.error("AOP_ERR method={} args={} err={} traceId={}",
                    pjp.getSignature().toShortString(),
                    Arrays.toString(pjp.getArgs()), e.toString(), traceId);
            throw e;
        } finally {
            ThreadContext.remove("traceId");
        }
    }
}
```
(“Around advice” is the right tool when you need pre/post logic and control over the call. (Spring docs, DigitalOcean))
b) HTTP logging with a Filter (request/response body + status).
Wrap the request and response once per call and attach the same `traceId` (taken from an incoming header, or freshly generated). Spring's `OncePerRequestFilter` pattern is standard, and good guides show how to buffer the body safely and avoid double-consumption. (Baeldung, SigNoz)
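A minimal sketch of such a filter, assuming Spring Boot 3 (jakarta.servlet) with Jackson on the classpath. The `X-Trace-Id` header name and the `HTTP_IN` event shape are conventions chosen here to line up with the Python generator in section 2, not anything Spring prescribes:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

import com.fasterxml.jackson.databind.ObjectMapper;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.ThreadContext;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import org.springframework.web.util.ContentCachingRequestWrapper;
import org.springframework.web.util.ContentCachingResponseWrapper;

@Component
public class HttpTraceFilter extends OncePerRequestFilter {

    private static final Logger log = LogManager.getLogger(HttpTraceFilter.class);
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        // Reuse an incoming trace id if present, otherwise mint one.
        String traceId = request.getHeader("X-Trace-Id");
        if (traceId == null) traceId = UUID.randomUUID().toString();
        ThreadContext.put("traceId", traceId);

        // The wrappers buffer the bodies so we can read them after the handler
        // has run, without consuming the underlying stream twice.
        ContentCachingRequestWrapper req = new ContentCachingRequestWrapper(request);
        ContentCachingResponseWrapper res = new ContentCachingResponseWrapper(response);
        try {
            chain.doFilter(req, res);
        } finally {
            String body = new String(req.getContentAsByteArray(), StandardCharsets.UTF_8);
            Map<String, Object> event = new LinkedHashMap<>();
            event.put("method", req.getMethod());
            event.put("path", req.getRequestURI());
            event.put("status", res.getStatus());
            event.put("body", body.isEmpty() ? null : body);
            log.info("HTTP_IN {}", mapper.writeValueAsString(event));
            res.copyBodyToResponse();   // replay the cached body to the real response
            ThreadContext.remove("traceId");
        }
    }
}
```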
c) Route logs per user/tenant/test using Log4j2 RoutingAppender.
This lets you send matched logs (e.g., by `traceId` or `tenantId` in the MDC) into separate files you'll later convert into tests.
```xml
<Appenders>
  <Routing name="ByTrace">
    <!-- $${ctx:traceId} (double $) defers the lookup to log time, per event;
         a single $ would resolve once at configuration time -->
    <Routes pattern="$${ctx:traceId}">
      <!-- A key-less Route matches every resolved traceId; the appender below
           is instantiated lazily, once per distinct key -->
      <Route>
        <RollingFile name="Rolling-${ctx:traceId}"
                     fileName="logs/${ctx:traceId}.log"
                     filePattern="logs/${ctx:traceId}-%d{yyyy-MM-dd}.log.gz">
          <PatternLayout pattern="%d %p %c %X{traceId} - %m%n"/>
          <Policies><TimeBasedTriggeringPolicy/></Policies>
        </RollingFile>
      </Route>
    </Routes>
  </Routing>
</Appenders>
<Loggers>
  <Root level="info">
    <AppenderRef ref="ByTrace"/>
  </Root>
</Loggers>
```
(Log4j2’s RoutingAppender + MDC pattern is the canonical way to split logs by key. (Apache Logging, Roy Tutorials, Stack Overflow))
2) Turn those logs into Python tests (pytest + requests)
Once logs are JSON-ish or parseable lines, a tiny generator can emit deterministic tests:
```python
# gen_tests_from_logs.py
import json
import pathlib
import re

LOG_DIR = pathlib.Path("logs")
OUT_FILE = pathlib.Path("tests/test_replay_generated.py")

def extract_calls(log_text):
    """Pull HTTP_IN events (one JSON object per line) out of raw log text."""
    calls = []
    for line in log_text.splitlines():
        if "HTTP_IN" not in line:
            continue
        match = re.search(r"(\{.*\})", line)
        if match is None:  # line mentions HTTP_IN but carries no JSON payload
            continue
        d = json.loads(match.group(1))
        calls.append({
            "method": d["method"],
            "url": d["path"],
            "headers": d.get("headers", {}),
            "body": d.get("body"),
            "expect_status": d.get("status", 200),
        })
    return calls

def emit_pytest(calls):
    lines = [
        "import pytest",
        "import requests",
        "",
        "@pytest.mark.parametrize('call', [",
    ]
    for c in calls:
        # repr() rather than json.dumps(): null/true/false are not valid Python
        lines.append(f"    {c!r},")
    lines += [
        "])",
        "def test_replay(call):",
        "    resp = requests.request(call['method'], 'http://localhost:8080' + call['url'],",
        "                            headers=call.get('headers'),",
        "                            json=call.get('body'))",
        "    assert resp.status_code == call['expect_status']",
        "    # optionally assert on structured fields from the response JSON",
    ]
    return "\n".join(lines) + "\n"

def main():
    all_calls = []
    for p in LOG_DIR.glob("*.log"):
        all_calls += extract_calls(p.read_text(encoding="utf-8"))
    OUT_FILE.parent.mkdir(parents=True, exist_ok=True)
    OUT_FILE.write_text(emit_pytest(all_calls), encoding="utf-8")

if __name__ == "__main__":
    main()
```
Tips:
- Prefer JSON logs for requests/responses; it makes test generation trivial.
- Keep sensitive headers out (e.g., Authorization).
- If bodies are large, store a hash in the logs and assert hash equality rather than the full string (see the sketch below).
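For that last tip, a hedged sketch of the hashing helper on the Java logging side (`BodyDigest` and `sha256Hex` are names invented here; `HexFormat` needs Java 17+):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public final class BodyDigest {

    private BodyDigest() {}

    /** SHA-256 of the body, hex-encoded: logs stay small, assertions stay exact. */
    public static String sha256Hex(String body) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(body.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required by the JDK spec", e);
        }
    }
}
```

The generated pytest can then compare `hashlib.sha256(resp.content).hexdigest()` against the logged value instead of the raw body.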
3) Use Spring to provide context to LLM “agents”
If you’re on Spring Boot today, the shortest path is Spring AI:
a) Set up a `ChatClient` and prompt templates.
Spring AI gives you `ChatClient`/`PromptTemplate` abstractions and tool-calling, so the model can ask your app to fetch data. (Spring AI docs)
```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;   // note the .model package
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AiConfig {
    @Bean ChatClient chatClient(ChatModel model) {
        return ChatClient.builder(model).defaultSystem("You are a helpful banking agent.").build();
    }
}
```
b) Provide context via Spring beans (services) and “tools.” Expose domain lookups as tools so the model can call them during a chat turn. (Spring AI supports tool calling: the model decides when to invoke, and the result flows back as extra context.) (Spring AI docs)
```java
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.stereotype.Component;

@Component
public class AccountTools {

    @Tool(description = "Fetch the current account balance for a user")
    public String fetchBalance(String userId) {
        // query the DB or a downstream service here
        return "{\"balance\": 1234.56, \"currency\": \"HKD\"}";
    }
}
```
Note the bean still has to be handed to the `ChatClient` — via `defaultTools(...)` at build time or `.tools(...)` per prompt; the controller below uses the latter.
c) Add retrieval (RAG) for documents/code.
Wire a `VectorStore` (PGVector, etc.) and fill it with embeddings of your knowledge. At runtime, retrieve the top-k chunks and attach them to the prompt. There's a current, practical tutorial building a full RAG stack with Spring Boot + PGVector. (sohamkamani.com)
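A sketch of the `Retriever` the controller below depends on, assuming a `VectorStore` bean is already auto-configured (e.g., by the PGVector starter); the builder-style `SearchRequest` shown is Spring AI 1.0's API, and earlier milestones differ:

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Component;

@Component
public class Retriever {

    private final VectorStore vectorStore;

    public Retriever(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    /** Return the top-k chunks as plain text, ready to splice into the prompt. */
    public String findRelevant(String query) {
        List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.builder().query(query).topK(5).build());
        return docs.stream().map(Document::getText).collect(Collectors.joining("\n---\n"));
    }
}
```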
d) Thread user/session context through Interceptors.
Use a `HandlerInterceptor` (or your Filter) to resolve `userId`, `tenantId`, roles, locale, and the last N actions, then put them into (a sketch follows the list):
- request attributes,
- a scoped bean, or
- the system and user parts of your prompt.
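A minimal sketch of the request-attribute variant; the header names and the `UserCtx` record are placeholders invented here:

```java
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;

@Component
public class UserContextInterceptor implements HandlerInterceptor {

    /** Minimal carrier type; swap in your real context object. */
    public record UserCtx(String userId, String tenantId, java.util.Locale locale) {}

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response,
                             Object handler) {
        // Resolve identity however your app does it; these header names are placeholders.
        String userId = request.getHeader("X-User-Id");
        String tenantId = request.getHeader("X-Tenant-Id");
        request.setAttribute("userCtx", new UserCtx(userId, tenantId, request.getLocale()));
        return true;   // continue down the chain
    }
}
```

Register it through a `WebMvcConfigurer#addInterceptors` override so it runs on the `/agent` routes.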
e) One HTTP endpoint to rule them all.
```java
import java.security.Principal;
import java.util.Map;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/agent")
public class AgentController {

    private final ChatClient chat;
    private final UserContextProvider ctx;   // your own bean: maps Principal to ids/roles/preferences
    private final Retriever retriever;       // the RAG helper from section 3c
    private final AccountTools tools;        // domain tools from section 3b

    public AgentController(ChatClient chat, UserContextProvider ctx,
                           Retriever retriever, AccountTools tools) {
        this.chat = chat; this.ctx = ctx; this.retriever = retriever; this.tools = tools;
    }

    @PostMapping
    public String chat(@RequestBody Map<String, Object> req, Principal principal) {
        var userCtx = ctx.fromPrincipal(principal);                     // ids, roles, preferences
        var docs = retriever.findRelevant((String) req.get("message")); // RAG
        return chat.prompt()
                .system("Use tools if needed. Respect tenant=" + userCtx.tenant()
                        + " and role=" + userCtx.role() + ".")
                .user("Context:\n" + docs + "\n\nUser says: " + req.get("message"))
                .tools(tools)   // register the tools so the model can invoke them this turn
                .call()
                .content();
    }
}
```
f) Optional: Deploy “skills” with Spring Cloud Function.
If you want each agent skill as a single function (great for serverless and for routing by headers), Spring Cloud Function lets you write a pure `Function<T,R>` and route dynamically; nice for plugging into AWS Lambda/Azure and for internal “agent tool” endpoints. (cloud.spring.io, Spring docs, Baeldung)
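A hedged sketch of one such skill: with `spring-cloud-starter-function-web` on the classpath, each function bean is exposed over HTTP at `/<bean-name>` (the `summarize` logic here is a placeholder):

```java
import java.util.function.Function;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SkillFunctions {

    // Exposed at POST /summarize by spring-cloud-starter-function-web.
    @Bean
    public Function<String, String> summarize() {
        return text -> "summary of: " + text;   // placeholder skill logic
    }
}
```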
4) Putting it all together (flow)
- Filter adds `traceId` and logs HTTP I/O.
- AOP wraps service methods and emits structured events.
- Log4j2 RoutingAppender writes per-trace files.
- Python generator reads those files → emits pytest that replays real calls.
- Spring AI endpoint uses:
  - request/tenant/user context from the Interceptor/Filter,
  - domain “tools” for live data,
  - the vector store (RAG) for doc/code context,
  - `ChatClient` for chat orchestration.
That gives you reproducible tests and a robust, contextual agent interface—both driven by the same telemetry.
If you want, I can sketch a minimal repo structure with the Spring Boot pieces, a Log4j2 config, and the Python generator script wired into a `make testgen && pytest` workflow.