Technology · 18-Jul-2023 · 7 min read

Using Gemini to clean RFQs and quotations.

Industrial RFQs arrive as PDFs, photographs, voice notes, and WhatsApp messages. Gemini converts them to structured line items in seconds, at under one rupee per request.

By Mohammad Jamnagarwala · Simply Five Studio

A buyer at a refinery in Gujarat sends an RFQ to a flange manufacturer in Mumbai. The RFQ arrives as a four-page PDF, with two pages of standardised drawings, one page of a typed specification table, and one page of handwritten notes the buyer's engineer added on a recent visit. The manufacturer's estimator opens the PDF and begins the work that used to take fifty minutes per request: reading the document, identifying the line items, matching them to internal SKU codes, checking the material grade against current inventory, and producing a structured quote.

This same arrival pattern repeats across industrial manufacturers in India: PDFs of varying quality, photographs of existing parts, voice notes describing a custom requirement, WhatsApp messages with mixed text and image content. The format is heterogeneous. The structural translation work, until recently, was entirely human. Gemini changed the math on that translation work, and the change is large enough that any industrial manufacturer not already using it for RFQ cleanup is leaving operational margin on the table.

What Gemini does well in this workflow

Gemini's vision and document-understanding capabilities are well-suited to the industrial RFQ pattern for three specific reasons.

First, the model handles mixed-format documents natively. A PDF with embedded drawings, a typed table, and handwritten annotations can be passed to Gemini as a single input. The model extracts text from the typed sections, OCRs the handwriting, identifies the drawing as a reference rather than a data source, and returns a structured output that captures the line items.

Second, the model handles the long-context shape of industrial requests. A complete RFQ with attachments, prior correspondence, and the buyer's standard terms can exceed several thousand tokens. Gemini's long-context window accepts the entire bundle in a single call, which avoids the chunking gymnastics that earlier models required.

Third, the cost-per-call is low enough that the workflow is economically viable at volume. A typical RFQ cleanup call against Gemini 1.5 Flash, with a moderately sized PDF input and a structured JSON output, costs under one rupee. A manufacturer processing two hundred RFQs a month therefore spends under two hundred rupees on AI translation, against an estimator-time saving of dozens of hours.

What the prompt pattern actually looks like

The prompt that produces reliable structured output is not exotic. The shape that has worked across the AI-augmented ERP for Amaan Enterprises and the spec-translation layer in the CFX sales app is consistent.

The system prompt establishes the role: "You are an estimator for an industrial hardware manufacturer. Extract the request for quotation from the attached document into the JSON schema provided. Do not invent specifications that are not present in the source. If a field is ambiguous or missing, return null for that field and add a note in the clarification array."

The user prompt provides the document and the schema. The schema is fixed across all RFQs for a given manufacturer: line items with material grade, dimensions, surface treatment, quantity, delivery window, and any reference to attached drawings.

The output is JSON. The estimator reviews the JSON inside the ERP, corrects what needs correcting, and commits. The commit creates the structured opportunity record with the line items pre-populated. The work that previously took fifty minutes now takes ten, and most of those ten minutes are review rather than data entry.
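Taking the figures quoted in this essay at face value (two hundred RFQs a month, a ceiling of one rupee per call, fifty minutes falling to ten), the monthly arithmetic can be sketched as follows. The function and parameter names are illustrative and are not taken from any production system.

```python
def monthly_impact(rfqs_per_month: int,
                   cost_per_call_inr: float,
                   minutes_before: float,
                   minutes_after: float) -> dict:
    """Return the monthly AI spend ceiling and the estimator hours saved."""
    spend_inr = rfqs_per_month * cost_per_call_inr
    minutes_saved = rfqs_per_month * (minutes_before - minutes_after)
    return {
        "spend_inr": spend_inr,
        "hours_saved": minutes_saved / 60,
    }


# Figures from the text: 200 RFQs, at most Rs 1 per call, 50 minutes down to 10.
impact = monthly_impact(200, 1.0, 50, 10)
print(f"Spend ceiling: Rs {impact['spend_inr']:.0f} per month")
print(f"Estimator time saved: about {impact['hours_saved']:.0f} hours per month")
```

The hours figure is an upper bound. It assumes every request previously took the full fifty minutes and now takes ten, which will not hold for every document.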

The discipline that produces accuracy is the schema. A loose prompt that asks Gemini to "extract the order" produces variable output that the estimator cannot easily audit. A tight schema with explicit fields and a clarification array for ambiguous inputs produces output that fits cleanly into the downstream workflow.
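A minimal sketch of what a tight schema buys in practice: a validator that rejects output the estimator could not audit. The field names below are hypothetical, chosen to mirror the line-item attributes listed above, and the top-level keys `line_items` and `clarifications` are likewise assumptions rather than the schema used in any of the deployments described.

```python
import json

# Hypothetical field names mirroring the line-item attributes listed in the text.
LINE_ITEM_FIELDS = (
    "material_grade",
    "dimensions",
    "surface_treatment",
    "quantity",
    "delivery_window",
    "drawing_reference",
)


def validate_extraction(raw_json: str) -> list[str]:
    """Return the problems found in the model output; an empty list means it fits."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top level must be an object"]

    problems = []
    items = payload.get("line_items")
    notes = payload.get("clarifications")
    if not isinstance(items, list):
        problems.append("line_items must be a list")
        items = []
    if not isinstance(notes, list):
        problems.append("clarifications must be a list")
        notes = []

    has_null = False
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"line_items[{index}] must be an object")
            continue
        missing = [f for f in LINE_ITEM_FIELDS if f not in item]
        extra = [k for k in item if k not in LINE_ITEM_FIELDS]
        if missing:
            problems.append(f"line_items[{index}] missing fields: {missing}")
        if extra:
            problems.append(f"line_items[{index}] unexpected fields: {extra}")
        if any(item.get(f) is None for f in LINE_ITEM_FIELDS if f in item):
            has_null = True

    # The system prompt asks for a note whenever a field comes back as null.
    if has_null and not notes:
        problems.append("null fields present but clarifications is empty")
    return problems


# Illustrative payload only; the values are made up for the example.
sample = json.dumps({
    "line_items": [{
        "material_grade": "ASTM A105",
        "dimensions": None,
        "surface_treatment": None,
        "quantity": 40,
        "delivery_window": None,
        "drawing_reference": "sheet 2",
    }],
    "clarifications": ["Dimensions on the handwritten page are not legible."],
})
print(validate_extraction(sample))
```

Returning a list of problems instead of raising on the first one lets the review screen show the estimator everything that is wrong with an extraction in a single pass.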

Accuracy numbers from real workflows

The accuracy of Gemini extraction on industrial RFQ documents, measured by the share of extracted line items that pass review without correction, sits in the 85 to 92 percent range across the firms we have deployed this against. The remaining 8 to 15 percent require a correction step, which is materially less work than transcribing the entire document.
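The metric is simple to compute once the review step records whether each line item was corrected. The sketch below assumes a hypothetical review log in which every extracted line item carries a boolean `corrected` flag; the log itself is invented for illustration and is not measured data.

```python
def pass_rate(reviewed_items: list[dict]) -> float:
    """Share of extracted line items the estimator committed without correction."""
    if not reviewed_items:
        raise ValueError("no reviewed line items to score")
    passed = sum(1 for item in reviewed_items if not item["corrected"])
    return passed / len(reviewed_items)


# Illustrative review log: 20 line items, 2 of which the estimator corrected.
review_log = [{"corrected": False}] * 18 + [{"corrected": True}] * 2
print(f"{pass_rate(review_log):.0%}")
```

Scoring at the line-item level rather than the document level is what makes the number comparable across RFQs of very different lengths.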

The pattern of errors is also informative. Most corrections involve domain-specific terminology that the model interprets generically. "ASTM A105" is occasionally returned as a partial extraction missing the surface-treatment code that follows. Handwritten dimensions get misread when the handwriting is genuinely difficult. These are not failure modes that disqualify the workflow. They are friction points that the review step is designed to handle.

The accuracy on the easier inputs (clean PDFs with typed specifications) is near 100 percent. The accuracy on the harder inputs (photographs of handwritten requests, low-quality scans, voice notes) is lower but still well above the threshold where the workflow saves time. The estimator's role shifts from "data entry" to "review the AI extraction and correct the edge cases", which is a more leveraged use of an experienced estimator's attention.

When Gemini is the right model versus Claude or GPT

The choice between Gemini, Claude, and GPT for RFQ cleanup is not a religious question. It is a workflow-fit question.

Gemini is the right model when the inputs are vision-heavy (PDFs with drawings, photographs of parts, scanned documents) and the cost-per-call needs to be low. Gemini 1.5 Flash, in particular, sits at a price point where high-volume document processing is economically viable.

Claude is the right model when the workflow requires multi-step reasoning over the extracted data. If the estimator's task is not just "extract the line items" but also "identify the discrepancies between this RFQ and the buyer's previous order" or "flag the items that exceed the standard delivery window", Claude's reasoning depth produces better output. The cost-per-call is higher, but the workflow that justifies it is correspondingly higher-leverage.

GPT sits between the two on most dimensions. For RFQ cleanup specifically, Gemini's vision performance and price point have made it the default in the work we have shipped over the last year. The decision is reviewed every quarter, because the model landscape is moving, and the right answer on 12-Mar-2024 may not be the right answer on 12-Mar-2025. The essay on when AI earns its presence discusses the broader question of where AI should and should not be integrated into operational software, which applies directly to this choice.

What integrating Gemini actually involves

The integration is direct, not through middleware. The application makes an authenticated call to the Google AI API with the document attached and the schema specified. The response is parsed and rendered inside the ERP for the estimator to review. The audit trail records the model version, the input hash, the raw output, and the estimator's corrections.
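The four items in that audit trail map naturally onto one record per extraction. What follows is a sketch under assumptions, not the schema used in any of the engagements mentioned: the class name, the field names, and the shape of a correction entry are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExtractionAudit:
    model_version: str  # model identifier the API call was made with
    input_sha256: str   # hash of the exact document bytes that were sent
    raw_output: str     # model response, stored before any parsing
    corrections: list   # estimator edits, here as {"path", "before", "after"} dicts


def build_audit(model_version: str, document_bytes: bytes,
                raw_output: str, corrections: list) -> ExtractionAudit:
    """Assemble one audit row for a single extraction call."""
    return ExtractionAudit(
        model_version=model_version,
        input_sha256=hashlib.sha256(document_bytes).hexdigest(),
        raw_output=raw_output,
        corrections=list(corrections),
    )


# Illustrative values only; the bytes stand in for a real PDF.
record = build_audit(
    model_version="gemini-1.5-flash",
    document_bytes=b"placeholder for the RFQ PDF bytes",
    raw_output='{"line_items": [], "clarifications": []}',
    corrections=[{"path": "line_items[0].quantity", "before": 4, "after": 40}],
)
print(json.dumps(asdict(record), indent=2))
```

Storing the raw output as an unparsed string means the record still holds what the model actually returned even when the response fails to parse.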

The architectural decision worth flagging is direct integration over a third-party middleware layer. Middleware services that aggregate multiple AI providers behind a single API are convenient for prototyping. They are expensive at scale and they obscure the audit trail. The work for CFX and other engagements goes directly to the provider. The firm controls the relationship with Google. The data captured by the AI workflow stays in the firm's database. The latency and cost are predictable.

The other piece worth noting is the prompt versioning discipline. The prompt that produces reliable extraction is not static. It evolves as the firm encounters new document patterns and as the model itself is updated. The prompt lives in the codebase, versioned alongside the rest of the application. Prompt changes go through review the same way code changes do. This sounds procedural. It is, and it is the discipline that keeps the workflow reliable for years rather than weeks.
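One way to keep the prompt in the codebase is as a small immutable object with a version label and a content fingerprint, so that an audit row can record exactly which text was sent. This is a sketch only: the class, the constant name, and the version label are hypothetical, and the prompt text is the system prompt quoted earlier in this essay.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    version: str
    system_prompt: str

    @property
    def fingerprint(self) -> str:
        """Short content hash that changes whenever the prompt text changes."""
        digest = hashlib.sha256(self.system_prompt.encode("utf-8")).hexdigest()
        return digest[:12]


# Hypothetical version label; a change to the text should come with a new label.
RFQ_EXTRACTION_PROMPT = PromptVersion(
    version="rfq-extract-v1",
    system_prompt=(
        "You are an estimator for an industrial hardware manufacturer. "
        "Extract the request for quotation from the attached document into "
        "the JSON schema provided. Do not invent specifications that are not "
        "present in the source. If a field is ambiguous or missing, return "
        "null for that field and add a note in the clarification array."
    ),
)
print(RFQ_EXTRACTION_PROMPT.version, RFQ_EXTRACTION_PROMPT.fingerprint)
```

Because the object lives in an ordinary source file, any edit to the prompt shows up as a diff in code review, and the fingerprint catches the case where the text changed but the label did not.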

Decision infrastructure for an industrial manufacturer increasingly includes an AI layer at the document-translation boundary. Gemini does not write the quote. It removes the slow manual translation between heterogeneous input and structured ERP record, which is the work that consumed the estimator's day and rewarded none of the estimator's actual expertise. The shift returns the estimator's attention to judgement, where the firm's commercial outcomes are actually decided.

If your estimator is spending hours per day translating documents into your ERP, the conversation begins with a workflow audit of the actual document patterns coming in. The first phase maps the AI integration that fits the firm's RFQ shape, scoped through the internal systems engagement model.

