No Keys, No LLM: Building a Wikidata Definition API with Embabel

· spring

TL;DR

  • I built a Spring Boot 4 API that defines terms via Wikidata.
  • The app is fully reproducible: no API keys and no model installation needed.
  • Embabel orchestrates the pipeline as a sequence of actions that produces the goal type, DefinitionResult.
  • The logs show planning, execution, and typed object binding, which is the most useful part for teaching agentic flows.

I wanted a demo that is simple, reproducible, and still shows agentic orchestration in a way that’s easy to explain on video.

So I built a small Spring Boot 4 app that exposes a single endpoint:

  • GET /api/wiki/define?term=...

It returns a compact JSON “definition” fetched from Wikidata (no authentication, no API keys).
The important part: I used Embabel to orchestrate the workflow, even though the workflow is deterministic and does not need an LLM.

Part I — Concepts

I.1 Embabel

Embabel is an agent framework for the JVM. I like to think of it as a way to model a workflow as:

  • Actions: steps the agent can execute
  • Goals: what the workflow should produce
  • State / facts: typed objects available at each moment
  • Planning: decide which actions to run and in which order to achieve the goal

In practice, that means I don’t call methods in a fixed chain. I provide an initial input (a domain object), tell Embabel what type I want as the result, and Embabel plans and runs the required actions.
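To make "planning against types" concrete, here is a toy sketch in plain Java. It is not Embabel's API and not its actual planner: the `Step` record, the placeholder types, and the `plan` method are all hypothetical names invented for this illustration. It only shows the underlying idea that the order of steps can be derived from what each step consumes and produces.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: NOT Embabel's API and NOT its planner.
final class TypedPlanningSketch {

    // Hypothetical step: needs an object of type `input`, produces one of type `output`.
    record Step(String name, Class<?> input, Class<?> output) {}

    // Placeholder types standing in for real domain objects.
    record Request() {}
    record EntityId() {}
    record Details() {}
    record Result() {}

    // Greedy forward chaining: run any step whose input type is available,
    // until the goal type is available or no step can make progress.
    static List<String> plan(List<Step> steps, Class<?> start, Class<?> goal) {
        Set<Class<?>> available = new HashSet<>();
        available.add(start);
        List<String> plan = new ArrayList<>();
        boolean progressed = true;
        while (!available.contains(goal) && progressed) {
            progressed = false;
            for (Step step : steps) {
                if (available.contains(step.input()) && !available.contains(step.output())) {
                    available.add(step.output());
                    plan.add(step.name());
                    progressed = true;
                    break;
                }
            }
        }
        if (!available.contains(goal)) {
            throw new IllegalStateException("No plan reaches " + goal.getSimpleName());
        }
        return plan;
    }

    public static void main(String[] args) {
        // Steps are declared out of order on purpose; the types determine the sequence.
        List<Step> steps = List.of(
                new Step("buildResult", Details.class, Result.class),
                new Step("fetchDetails", EntityId.class, Details.class),
                new Step("resolveId", Request.class, EntityId.class));
        System.out.println(plan(steps, Request.class, Result.class));
    }
}
```

The steps are registered in the "wrong" order, yet the computed plan still starts from the request and ends at the result, because only the types decide what can run next.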

I.2 Spring AI (even in a “no LLM” demo)

Spring AI provides an abstraction layer for interacting with chat models (and other AI components) using Spring-friendly APIs.

In this project, I implemented a tiny NOOP chat model. It’s not used to generate anything. It exists because the Embabel starter expects a default model entry to be configured at startup.

This kept the demo:

  • fully runnable without credentials,
  • focused on orchestration,
  • and easy to extend later with a real model.

I.3 Role of Embabel in this application

A reasonable question is: “What’s the point of using Embabel just to query a REST API?”

The REST call is not the point. The point is to demonstrate a workflow that:

  1. starts from a DefinitionRequest(term)
  2. resolves a Wikidata entity ID (Q-id)
  3. fetches entity details
  4. builds a typed DefinitionResult

Embabel makes these steps explicit, typed, and observable, and it can re-plan as the state evolves. That’s a much better foundation than packing everything into one big service method, especially as the demo grows.

I.4 Wikidata: definition and why it’s ideal for demos

Wikidata is a public, open knowledge base. It’s perfect for demos because:

  • it’s publicly hosted, so there is nothing to install,
  • reads need no account or API key,
  • and the APIs are easy to call from a small Java project.

I used two endpoints:

  • wbsearchentities (a module of the MediaWiki Action API) to search for a term and retrieve the most relevant Q-id
  • Special:EntityData/{QID}.json to fetch structured entity data (labels, descriptions, and Wikipedia sitelinks)

This gives a nice “definition API” in a few lines of code, with zero setup for viewers.
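As a rough sketch, the two Wikidata URLs can be built with nothing but the JDK. The class and method names below (`WikidataUrls`, `searchUri`, `entityDataUri`) are hypothetical and not taken from the project; the query parameters (`action`, `search`, `language`, `limit`, `format`) are standard wbsearchentities parameters.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the article's repository code.
final class WikidataUrls {

    static final String BASE = "https://www.wikidata.org";

    // Builds the wbsearchentities URL for a free-text term.
    static URI searchUri(String term, String language, int limit) {
        // URLEncoder produces form encoding (spaces become '+'), which is fine in a query string.
        String encodedTerm = URLEncoder.encode(term, StandardCharsets.UTF_8);
        return URI.create(BASE + "/w/api.php?action=wbsearchentities&format=json"
                + "&language=" + language
                + "&limit=" + limit
                + "&search=" + encodedTerm);
    }

    // Builds the Special:EntityData URL for a Q-id such as "Q42".
    static URI entityDataUri(String qid) {
        if (qid == null || !qid.matches("Q[1-9][0-9]*")) {
            throw new IllegalArgumentException("Not a Wikidata item id: " + qid);
        }
        return URI.create(BASE + "/wiki/Special:EntityData/" + qid + ".json");
    }

    public static void main(String[] args) {
        System.out.println(searchUri("Spring Boot", "en", 1));
        System.out.println(entityDataUri("Q42"));
    }
}
```

Validating the Q-id before building the second URL keeps malformed search results from turning into confusing 404s further down the pipeline.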

Part II — App building (code + explanations)

II.1 Maven setup (pom.xml)

I used Spring Boot 4.0.3 with Java 25, and Embabel 0.3.4.

Because this is Boot 4, I added spring-boot-starter-restclient so RestClient.Builder is auto-configured.

I also forced Jackson 2 compatibility (spring-boot-jackson2) and excluded spring-boot-starter-json, because the Embabel starter wiring in this setup expects Jackson2ObjectMapperBuilder.

II.2 Configuration (application.yml)

I set the server port and configured the default Embabel model name to noop.

II.3 App launcher + Embabel enablement + NOOP LLM registration

The application entrypoint enables agent scanning using @EnableAgents, then registers a “noop” model so the platform boots without external dependencies.

II.4 The NOOP ChatModel (Spring AI)

This is intentionally minimal. If Embabel ever calls it, it returns a predictable message.

II.5 Domain model (Java records)

I used records for the request, intermediate agent objects, and final result.

The key idea is that Embabel “stores” and “reuses” these typed objects during execution. They become the agent’s working memory.
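A minimal sketch of what such records could look like follows. Only the type names (DefinitionRequest, WikidataEntityId, WikidataEntityDetails, DefinitionResult) and the `term` component come from this article; every other field is an assumption chosen to match the "label + description + links" shape of the response.

```java
import java.util.Objects;

// Sketch only: field lists are assumptions, not the project's actual records.
final class DefinitionDomainSketch {

    // Input to the workflow; the compact constructor normalizes and validates the term.
    record DefinitionRequest(String term) {
        DefinitionRequest {
            Objects.requireNonNull(term, "term");
            term = term.strip();
            if (term.isEmpty()) {
                throw new IllegalArgumentException("term must not be blank");
            }
        }
    }

    // Intermediate object: the resolved Wikidata Q-id.
    record WikidataEntityId(String qid) {}

    // Intermediate object: the subset of entity data the demo cares about.
    record WikidataEntityDetails(String qid, String label, String description, String wikipediaUrl) {}

    // Final output returned to the API consumer.
    record DefinitionResult(String term, String qid, String label, String description,
                            String wikidataUrl, String wikipediaUrl) {}

    public static void main(String[] args) {
        System.out.println(new DefinitionRequest("  Java  "));
    }
}
```

Giving each stage its own type (rather than passing raw strings around) is what lets a type-driven orchestrator tell the stages apart.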

II.6 Repository: Wikidata calls with RestClient

The repository is responsible for the data access logic only:

  • search for the best match (Q-id)
  • fetch details for that Q-id
  • build stable URLs

I kept DTO mappings minimal and resilient with @JsonIgnoreProperties(ignoreUnknown = true).

II.7 The Embabel agent (actions + goal)

The agent defines the workflow. Each method is a step (@Action). The final step is tagged as a goal (@AchievesGoal) because it produces the desired output type DefinitionResult.

I like this structure because it stays small and readable. More importantly, it becomes easy to extend later:

  • add a disambiguation action,

  • add a caching action,
  • add alternative paths,
  • add optional post-processing.
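The shape of such an agent can be sketched as follows. To keep the example compilable without any dependencies, it declares local stand-in annotations named after Embabel's; in a real project the stand-ins would be deleted and Embabel's own annotations imported instead. The method names, record fields, and the `WikidataRepository` interface are assumptions, not the project's code.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

final class DefinitionAgentSketch {

    // Local stand-ins so this file compiles without Embabel on the classpath.
    // In a real project, remove these and import Embabel's annotations instead.
    @Retention(RetentionPolicy.RUNTIME) @interface Agent { String description(); }
    @Retention(RetentionPolicy.RUNTIME) @interface Action {}
    @Retention(RetentionPolicy.RUNTIME) @interface AchievesGoal { String description(); }

    // Type names follow the article; the fields are assumptions.
    record DefinitionRequest(String term) {}
    record WikidataEntityId(String qid) {}
    record WikidataEntityDetails(String qid, String label, String description) {}
    record DefinitionResult(String term, String qid, String label, String description) {}

    // Hypothetical repository contract for the two Wikidata calls.
    interface WikidataRepository {
        String searchBestQid(String term);
        WikidataEntityDetails fetchDetails(String qid);
    }

    @Agent(description = "Defines a term using Wikidata")
    static final class DefinitionAgent {
        private final WikidataRepository repository;

        DefinitionAgent(WikidataRepository repository) {
            this.repository = repository;
        }

        // Step 1: DefinitionRequest -> WikidataEntityId
        @Action
        WikidataEntityId resolveEntityId(DefinitionRequest request) {
            return new WikidataEntityId(repository.searchBestQid(request.term()));
        }

        // Step 2: WikidataEntityId -> WikidataEntityDetails
        @Action
        WikidataEntityDetails fetchDetails(WikidataEntityId id) {
            return repository.fetchDetails(id.qid());
        }

        // Step 3: produces the goal type, so it is also marked as achieving the goal.
        @Action
        @AchievesGoal(description = "Produce a definition for the requested term")
        DefinitionResult buildResult(DefinitionRequest request, WikidataEntityDetails details) {
            return new DefinitionResult(request.term(), details.qid(), details.label(), details.description());
        }
    }
}
```

Nothing in the class calls the three methods in sequence. Each one declares what it needs as parameters and what it produces as a return type, which is the information an orchestrator needs to chain them.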

II.8 Service: running the agent via AgentInvocation

The service is the bridge between the web layer and Embabel. It creates an AgentInvocation and calls it with a DefinitionRequest.

II.9 Controller: a single endpoint

The controller stays boring on purpose. All the interesting logic is in the agent and repository.

Part III — Demo

III.1 Curl request

III.2 Response

This is intentionally “small JSON”: label + description + canonical links.

III.3 Logs: the agentic part

These logs are the best part to show on screen, because they reveal Embabel’s planning and execution.

What stands out:

  • Embabel starts with DefinitionRequest
  • It formulates a plan (sequence of actions)
  • It executes each action
  • It binds the produced objects (WikidataEntityId, WikidataEntityDetails, then DefinitionResult)
  • It declares the goal achieved

This is the “agentic” angle: Embabel is not just calling methods; it’s planning against typed state.

Part IV — Conclusion and extensions

This application intentionally starts simple. It’s a demo designed to be reproduced in minutes.

Even so, the Embabel structure already pays off, because each step is a separate action run by an orchestrator. Extending the system becomes a matter of adding actions and (optionally) conditions, not rewriting a monolithic service method.

Here are extensions that make the demo evolve naturally:

1) Disambiguation

Instead of limit=1, fetch the top N hits and add an action to pick the best match. For example:

  • exact label match
  • description keyword match
  • “instance of” filtering (person vs concept vs product)
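A minimal sketch of the first rule (exact label match, falling back to the top-ranked hit) might look like this. The `Hit` record and `pickBest` method are hypothetical names, and the Q-ids used in the example are placeholders rather than real Wikidata identifiers.

```java
import java.util.List;
import java.util.Optional;

// Sketch of one possible disambiguation rule; not the project's code.
final class DisambiguationSketch {

    // Hypothetical search hit carrying only the fields this rule needs.
    record Hit(String qid, String label, String description) {}

    // Prefers a case-insensitive exact label match; otherwise keeps the search ranking.
    static Optional<Hit> pickBest(String term, List<Hit> hits) {
        String wanted = term.strip();
        return hits.stream()
                .filter(hit -> hit.label() != null && hit.label().equalsIgnoreCase(wanted))
                .findFirst()
                .or(() -> hits.stream().findFirst());
    }

    public static void main(String[] args) {
        // Placeholder ids and texts, invented for the example.
        List<Hit> hits = List.of(
                new Hit("Q-A", "Java (island)", "placeholder description"),
                new Hit("Q-B", "Java", "placeholder description"));
        System.out.println(pickBest("java", hits));
    }
}
```

The description-keyword and "instance of" rules could be added as further filters ahead of the fallback, each in its own small method.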

2) Multi-language

Add lang to DefinitionRequest and propagate it into:

  • wbsearchentities&language=...
  • selecting labels/descriptions by language
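Selecting a label by language usually needs a fallback, since not every entity has text in every language. The sketch below assumes the labels have already been flattened into a simple language-code-to-text map; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Sketch only: assumes labels were already flattened to languageCode -> text.
final class LabelByLanguageSketch {

    // Returns the text in the requested language, else in the fallback language, else empty.
    static Optional<String> pick(Map<String, String> textByLanguage, String language, String fallbackLanguage) {
        return Optional.ofNullable(textByLanguage.get(language))
                .or(() -> Optional.ofNullable(textByLanguage.get(fallbackLanguage)));
    }

    public static void main(String[] args) {
        Map<String, String> labels = Map.of("en", "house", "fr", "maison");
        System.out.println(pick(labels, "fr", "en")); // requested language is present
        System.out.println(pick(labels, "de", "en")); // falls back to English
    }
}
```

Returning an `Optional` forces the caller to decide what to do when neither language is available, instead of silently emitting a null label.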

3) Confidence score

Add a ConfidenceScore record and an action that computes a score based on:

  • match quality
  • label similarity
  • number of aliases
  • presence of sitelinks

Return it to consumers so they can decide whether a match is reliable enough to use.
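One way such a score could be computed is sketched below. The weights (60 for an exact label match, 30 for a partial one, up to 20 for aliases, 20 for sitelinks) are arbitrary assumptions chosen for illustration, and the integer percentage representation is also an assumption.

```java
import java.util.Locale;

// Sketch only: the weights are arbitrary and would need tuning on real data.
final class ConfidenceSketch {

    // Score expressed as an integer percentage to avoid floating-point surprises.
    record ConfidenceScore(int percent) {
        ConfidenceScore {
            if (percent < 0 || percent > 100) {
                throw new IllegalArgumentException("percent must be between 0 and 100");
            }
        }
    }

    static ConfidenceScore score(String term, String label, int aliasCount, boolean hasSitelinks) {
        String wanted = term.strip().toLowerCase(Locale.ROOT);
        String actual = label == null ? "" : label.strip().toLowerCase(Locale.ROOT);
        int points = 0;
        if (!actual.isEmpty() && actual.equals(wanted)) {
            points += 60;                                    // exact label match
        } else if (!wanted.isEmpty() && actual.contains(wanted)) {
            points += 30;                                    // term appears inside the label
        }
        points += Math.min(Math.max(aliasCount, 0), 5) * 4;  // up to 20 points for aliases
        if (hasSitelinks) {
            points += 20;                                    // entity is linked from a Wikipedia
        }
        return new ConfidenceScore(points);
    }

    public static void main(String[] args) {
        System.out.println(score("Java", "java", 10, true));
    }
}
```

The maximum is 60 + 20 + 20 = 100, so the result always satisfies the record's range check.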

4) Caching and rate limiting

Add an action that checks a cache before querying Wikidata. This is a classic production step and it fits nicely as an independent action.
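The cache check itself can be very small. This is a minimal in-memory sketch using `ConcurrentHashMap.computeIfAbsent`; the class name is hypothetical, and it deliberately leaves out expiry, size limits, and rate limiting, all of which a production version would need.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch only: unbounded, never expires, and does no rate limiting.
final class CachingLookupSketch<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    // `loader` stands for the expensive call, for example a Wikidata request.
    CachingLookupSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only on the first request for a key.
    // If the loader returns null, nothing is stored and the next call tries again.
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        CachingLookupSketch<String, String> lookup =
                new CachingLookupSketch<>(term -> "definition of " + term);
        System.out.println(lookup.get("java"));
        System.out.println(lookup.get("java")); // served from the cache
    }
}
```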

5) Multi-source enrichment

Add an alternative source for definitions:

  • DBpedia
  • Wikipedia summary API
  • internal enterprise knowledge base

Embabel becomes more valuable as the number of sources increases, because orchestration becomes a first-class concept.
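In its simplest form, multi-source lookup is an ordered fallback: ask each source in turn and keep the first answer. The sketch below shows that idea in plain Java, outside any framework; `Source`, `Sourced`, and `firstDefinition` are hypothetical names, and the lookups are stubs rather than real DBpedia or Wikipedia clients.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Sketch only: sources are stubs, not real HTTP clients.
final class MultiSourceSketch {

    // A named source that may or may not know the term.
    record Source(String name, Function<String, Optional<String>> lookup) {}

    // A definition together with the source that provided it.
    record Sourced(String source, String definition) {}

    // Tries the sources in order and returns the first definition found.
    static Optional<Sourced> firstDefinition(String term, List<Source> sources) {
        for (Source source : sources) {
            Optional<String> definition = source.lookup().apply(term);
            if (definition.isPresent()) {
                return Optional.of(new Sourced(source.name(), definition.get()));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Source> sources = List.of(
                new Source("primary", term -> Optional.empty()),
                new Source("secondary", term -> Optional.of("stub definition of " + term)));
        System.out.println(firstDefinition("java", sources));
    }
}
```

Recording which source answered makes it possible to report provenance to the consumer or to weight it in a confidence score.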

6) Optional LLM post-processing (when needed)

A good, minimal LLM use case is last-mile text rewriting:

  • convert the Wikidata description into a more “dictionary-like” sentence
  • add examples
  • translate to French
  • generate a short TL;DR

This keeps the retrieval deterministic and makes the LLM optional, which is often a safer architecture.

Study Guide For Spring: https://spring-book.mystrikingly.com