Extracting receipt values with clj-llm

2026-03-23 Mon 12:05 article clojure llm publish

Intro

After reading "25 Years of Eggs", where a careful multi-stage pipeline of SAM3, PaddleOCR, and Claude handled 11,345 receipts, I was curious how a smaller vision-enabled model would fare going directly from image to structured data, with no preprocessing at all.

Using clj-llm with Gemini Flash

I put together a quick example babashka script for receipt extraction using clj-llm:

(def ai (openrouter/backend
         {:defaults {:model "google/gemini-3-flash-preview"}}))

(def ReceiptSchema
  [:map
   [:currency :string]
   [:items [:vector
            [:map
             [:name {:description "expand all abbreviations, convert to english"} :string]
             [:category {:description "eg. dairy, produce, meat etc"} :string]
             [:value [:map
                      [:unit {:description "eg. kg, per-unit etc"} :string]
                      [:price :double]]]
             [:quantity-of-value :double]
             [:total-price :double]]]]
   [:total :double]
   [:date :string]
   [:store :string]])

(def system-prompt "Extract structured data from the receipt image and return it in the specified schema format.")

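;; Send the receipt image (path given as the first command-line argument)
;; and request output that matches ReceiptSchema.
(llm/generate ai
 {:schema ReceiptSchema
  :system-prompt system-prompt}
 [(content/image (first *command-line-args*)) {:max-width 512}])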

I was able to get clean, structured output from a variety of test receipts. Obviously this is just a simple muck-around and getting it up to the level the original author had would take far more structured test input and iteration, but I think it's a promising start.
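One cheap way to spot-check output like this while iterating is to compare the sum of the line items with the receipt total. The following is a minimal sketch in plain Clojure, not part of the original script: the function names `items-sum` and `total-gap` are made up for illustration, and the data is a hand-copied subset of the Target output shown later in this post, using the `:items`, `:total-price` and `:total` keys from the schema.

```clojure
;; Hand-copied subset of the model output for the Target receipt;
;; only the keys this check needs are included.
(def target-receipt
  {:total 29.04
   :items [{:name "KISSES"       :total-price 1.00}
           {:name "BUTTERFINGER" :total-price 9.00}
           {:name "FOOD COLOR"   :total-price 2.99}
           {:name "UP UP"        :total-price 3.59}
           {:name "CONAIR HEADB" :total-price 3.99}
           {:name "CONAIR CLIPS" :total-price 4.79}
           {:name "CHAPSTICK 2"  :total-price 1.98}]})

(defn items-sum
  "Sum of :total-price over every line item (hypothetical helper)."
  [receipt]
  (reduce + 0.0 (map :total-price (:items receipt))))

(defn total-gap
  "Receipt :total minus the summed line items (hypothetical helper)."
  [receipt]
  (- (:total receipt) (items-sum receipt)))

(println (format "items: %.2f, gap to total: %.2f"
                 (items-sum target-receipt)
                 (total-gap target-receipt)))
```

For the Target data the items add up to 27.34 against a total of 29.04. The gap is presumably tax or some other line that is not itemised, which is exactly the kind of thing a check like this surfaces. The Lidl items, by contrast, sum to the 43.79 total exactly.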

Example input/outputs

Lidl, no date

lidl.jpg

  • Store: Lidl Limerick, Caherdavin
  • Date: not-found
  • Total: 43.79 EUR
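
| Name                  | Category      | Qty  | Unit     | Price | Total |
|-----------------------+---------------+------+----------+-------+-------|
| Fresh Orange Juice    | beverage      | 2.00 | per-unit |  1.45 |  2.90 |
| Goat Cheese           | dairy         | 0.23 | kg       | 11.99 |  2.72 |
| Greek Style Yoghurt   | dairy         | 1.00 | per-unit |  1.99 |  1.99 |
| Christmas CD's 23359  | entertainment | 1.00 | per-unit |  3.99 |  3.99 |
| Non Bio Washing       | household     | 1.00 | per-unit |  6.99 |  6.99 |
| Lobster               | meat/fish     | 1.00 | per-unit |  4.99 |  4.99 |
| Bio Instant Coffee    | pantry        | 1.00 | per-unit |  2.99 |  2.99 |
| Mustard with Guinness | pantry        | 1.00 | per-unit |  1.69 |  1.69 |
| Passata               | pantry        | 2.00 | per-unit |  0.59 |  1.18 |
| Bananas               | produce       | 0.56 | kg       |  1.15 |  0.64 |
| Organic Bananas       | produce       | 1.00 | per-unit |  1.49 |  1.49 |
| Turnip                | produce       | 1.00 | per-unit |  0.89 |  0.89 |
| Onions                | produce       | 1.00 | per-unit |  0.79 |  0.79 |
| Mushrooms Chestnut    | produce       | 1.00 | per-unit |  0.99 |  0.99 |
| Garlic                | produce       | 1.00 | per-unit |  1.19 |  1.19 |
| Stem Cherry Tomatoes  | produce       | 1.00 | per-unit |  2.49 |  2.49 |
| Chocolate Bar 74%     | snacks        | 2.00 | per-unit |  0.85 |  1.70 |
| Chocolate Lollies     | snacks        | 1.00 | per-unit |  1.29 |  1.29 |
| Nestle Blue Riband    | snacks        | 1.00 | per-unit |  1.69 |  1.69 |
| Gingerbread Rounds    | snacks        | 1.00 | per-unit |  1.19 |  1.19 |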

Target

target.jpg

  • Store: TARGET
  • Date: 12/09/2017
  • Total: 29.04 USD
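
| Name         | Category                | Qty  | Unit     | Price | Total |
|--------------+-------------------------+------+----------+-------+-------|
| KISSES       | GROCERY                 | 1.00 | per-unit |  1.00 |  1.00 |
| BUTTERFINGER | GROCERY                 | 3.00 | per-unit |  3.00 |  9.00 |
| FOOD COLOR   | GROCERY                 | 1.00 | per-unit |  2.99 |  2.99 |
| UP UP        | HEALTH-BEAUTY-COSMETICS | 1.00 | per-unit |  3.59 |  3.59 |
| CONAIR HEADB | HEALTH-BEAUTY-COSMETICS | 1.00 | per-unit |  3.99 |  3.99 |
| CONAIR CLIPS | HEALTH-BEAUTY-COSMETICS | 1.00 | per-unit |  4.79 |  4.79 |
| CHAPSTICK 2  | HEALTH-BEAUTY-COSMETICS | 1.00 | per-unit |  1.98 |  1.98 |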

Japanese supermarket

japanese.jpg

  • Store: Fresh Daimaru L. Takaraya Store
  • Date: 2023-05-14
  • Total: 3169.0 JPY

| Name                                           | Category                       | Qty  | Unit     | Price  | Total  |
|------------------------------------------------+--------------------------------+------+----------+--------+--------|
| Iwasaki Food Hand-cut Kishimen Noodles         | Bakery/Pasta                   | 1.00 | per-unit |  69.00 |  69.00 |
| Whole Grain Granola Morning                    | Cereal/Grains                  | 1.00 | per-unit | 499.00 | 499.00 |
| S&B Grated Raw Garlic                          | Condiments                     | 1.00 | per-unit |  99.00 |  99.00 |
| Ebara Sukiyaki Sauce 500g                      | Condiments/Sauces              | 1.00 | per-unit | 259.00 | 259.00 |
| Hokkaido Special Selection Tokachi Milk 1000ml | Dairy/Beverage                 | 1.00 | per-unit | 185.00 | 185.00 |
| Beef Shoulder Slices (Medium)                  | Meat                           | 1.00 | per-unit | 524.00 | 524.00 |
| Beef Shoulder Slices (Medium)                  | Meat                           | 1.00 | per-unit | 509.00 | 509.00 |
| Lopia Bag 5 Yen                                | Miscellaneous                  | 1.00 | per-unit |   5.00 |   5.00 |
| Yokoo Aku-nuki Country Konjac                  | Prepared Food/Konjac           | 1.00 | per-unit |  99.00 |  99.00 |
| Takato Chikuwabu 1pc                           | Processed Food/Seafood Product | 1.00 | per-unit |  99.00 |  99.00 |
| Chinese Cabbage                                | Produce                        | 1.00 | per-unit | 128.00 | 128.00 |
| Narita Bean Sprouts                            | Produce                        | 1.00 | per-unit |  35.00 |  35.00 |
| Niigata Enoki Mushroom                         | Produce                        | 1.00 | per-unit |  78.00 |  78.00 |
| Takano Foods Addictive Pickled Vegetables      | Produce/Prepared Food          | 1.00 | per-unit |  89.00 |  89.00 |
| Sanriku Capelin with Roe 3P                    | Seafood/Prepared Food          | 1.00 | per-unit | 199.00 | 199.00 |
| Hagiwara Hokkaido Soybean Momen Tofu           | Soy Products                   | 1.00 | per-unit |  59.00 |  59.00 |

Note how we get English translations "for free" because of the LLM.

Cost considerations

Even more promising is the difference in cost: using the small Gemini model, a single receipt costs only around $0.002. Here is the usage data returned for one of the test receipts:

{:cost-details
  {:upstream-inference-cost 0.00214,
   :upstream-inference-prompt-cost 6.01E-4,
   :upstream-inference-completions-cost 0.001539},
  :finish-reason "tool_calls",
  :completion-tokens 513,
  :prompt-tokens 1202,
  :prompt-tokens-details
  {:cached-tokens 0, :cache-write-tokens 0, :audio-tokens 0, :video-tokens 0},
  :cost 0.00214,
  :is-byok false,
  :total-tokens 1715,
  :completion-tokens-details
  {:reasoning-tokens 0, :image-tokens 0, :audio-tokens 0},
  :model "google/gemini-3-flash-preview"}

At roughly $0.002 per receipt, the total for the original article's 11,345 receipts would come out to about $22.69, a fair reduction from the original $1,591.
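The projection is simple multiplication, sketched below in plain Clojure. The receipt count and the $1,591 figure are the ones quoted from the original article, the two per-receipt rates are the rounded $0.002 and the `:cost` value from the usage map above, and the names (`projected-total` and friends) are made up for illustration. It assumes every receipt costs about the same as this single sample, which is a rough guess at best.

```clojure
(def receipt-count 11345)     ; receipts in the original article
(def original-total 1591.0)   ; USD, as quoted from the original article
(def rounded-cost 0.002)      ; USD per receipt, rounded
(def measured-cost 0.00214)   ; USD, :cost from the single sample above

(defn projected-total
  "Projected spend if every receipt cost `per-receipt` USD."
  [per-receipt]
  (* receipt-count per-receipt))

(doseq [[label per-receipt] [["rounded" rounded-cost]
                             ["measured" measured-cost]]]
  (let [total (projected-total per-receipt)]
    (println (format "%-8s $%.2f total, about %.0fx cheaper"
                     label total (/ original-total total)))))
```

The rounded rate gives the $22.69 quoted here, while the unrounded sample cost comes to about $24.28. Either way the projected spend is well under 2% of the original.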

Other benefits

We get some other nice benefits too:

  • Translation is an LLM's bread and butter. We can normalize to English easily.
  • Abbreviated terms can be expanded by the LLM without too much fuss.

Making the script more robust

The script would benefit from testing against a larger sample of receipts, and from using an enum to keep the categories consistent across different stores. Notice how Target's on-receipt split into GROCERY and HEALTH-BEAUTY-COSMETICS has a large influence on the categories that come out. Adjusting the schema with a better description can help, but a more reliable fix is to sample your common receipts and use an enum to rein in the categories to the ones you need.
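As a sketch of the enum idea, the item map from ReceiptSchema could swap `:string` for a Malli-style `[:enum ...]` on `:category`. The category list below is hypothetical and should come from your own receipts, as should the reworded description. Whether clj-llm forwards enum values to the model the same way it forwards `:string` fields is an assumption I have not verified.

```clojure
;; Hypothetical fixed category set; derive the real one from a sample
;; of your own receipts.
(def categories
  ["dairy" "produce" "meat-fish" "bakery" "pantry" "beverage"
   "snacks" "household" "personal-care" "other"])

;; Malli-style enum built from the list: [:enum "dairy" "produce" ...]
(def Category (into [:enum] categories))

;; The item map from ReceiptSchema, with :category changed from :string
;; to the enum. All other entries are unchanged.
(def ItemSchema
  [:map
   [:name {:description "expand all abbreviations, convert to english"} :string]
   [:category {:description "closest match for the individual item; ignore on-receipt groupings"}
    Category]
   [:value [:map
            [:unit {:description "eg. kg, per-unit etc"} :string]
            [:price :double]]]
   [:quantity-of-value :double]
   [:total-price :double]])
```

In ReceiptSchema the items entry would then read `[:items [:vector ItemSchema]]`. Keeping an "other" bucket gives the model somewhere to put items that fit none of the listed categories, rather than forcing a bad match.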

Example with adjusted description field

Updating the description for the :category field in the schema to:

eg. sweets, dairy, produce, meat etc. ON AN INDIVIDUAL ITEM BASIS. Ignore on-receipt categories/groupings. Be precise, not general.

Gave slightly more informative output for the Target receipt:

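
| Name         | Category         | Qty  | Unit     | Price | Total |
|--------------+------------------+------+----------+-------+-------|
| FOOD COLOR   | baking supplies  | 1.00 | per-unit |  2.99 |  2.99 |
| CONAIR HEADB | hair accessories | 1.00 | per-unit |  3.99 |  3.99 |
| CONAIR CLIPS | hair accessories | 1.00 | per-unit |  4.79 |  4.79 |
| CHAPSTICK 2  | lip care         | 1.00 | per-unit |  1.98 |  1.98 |
| UP UP        | personal care    | 1.00 | per-unit |  3.59 |  3.59 |
| KISSES       | sweets           | 1.00 | per-unit |  1.00 |  1.00 |
| BUTTERFINGER | sweets           | 3.00 | per-unit |  3.00 |  9.00 |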