OCR (Tesseract)

integration_ocr · action · SaaS Integrations · Available · v1.0.0

Description

Extracts text from images and scanned PDFs using Tesseract.js (pure JS, no native binary, no API key). Languages: Italian and English by default, configurable. The output includes bounding boxes and a confidence score per block, with an optional threshold filter.

⚙️ Configuration parameters

Fields shown in the editor when configuring the node. Generated directly from the NodeDef configFields.

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| action (Action) | enum: extract_text | yes | extract_text | |
| source (Source type) | enum: file_path, base64 | yes | | file_path = path in the workspace volume. base64 = inline content. |
| content (Path or base64) | string (multiline) | yes | | |
| languages (Languages, comma-separated) | string | no | ita,eng | Tesseract codes: ita, eng, fra, deu, spa, ... |
| confidenceThreshold (Confidence threshold, 0-100) | number | no | 30 | Filters out blocks below this threshold. |
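
As a rough illustration of how these fields and their defaults fit together, the sketch below validates a config object and fills in the documented defaults (ita,eng and 30). The type and function names (OcrNodeConfig, resolveOcrConfig, toTesseractLangString) are hypothetical and not part of FlowForge; the only library fact relied on is that Tesseract.js accepts several languages joined with "+".

```typescript
// Hypothetical types mirroring the documented fields; not FlowForge's actual NodeDef.
type OcrSource = "file_path" | "base64";

interface OcrNodeConfig {
  action: string;
  source: string;
  content: string;
  languages?: string;           // comma-separated Tesseract codes
  confidenceThreshold?: number; // 0-100
}

interface ResolvedOcrConfig {
  action: "extract_text";
  source: OcrSource;
  content: string;
  languages: string[];
  confidenceThreshold: number;
}

const DEFAULT_LANGUAGES = "ita,eng";
const DEFAULT_THRESHOLD = 30;

function isOcrSource(value: string): value is OcrSource {
  return value === "file_path" || value === "base64";
}

// Split "ita, eng" into ["ita", "eng"], dropping empty entries.
function parseLanguages(raw: string): string[] {
  return raw.split(",").map((code) => code.trim()).filter((code) => code.length > 0);
}

function resolveOcrConfig(config: OcrNodeConfig): ResolvedOcrConfig {
  if (config.action !== "extract_text") {
    throw new Error("Unsupported action: " + config.action);
  }
  const source = config.source;
  if (!isOcrSource(source)) {
    throw new Error("Unsupported source: " + source);
  }
  if (config.content.trim() === "") {
    throw new Error("content is required");
  }

  // Fall back to the documented default when the field is missing or empty.
  let languages = parseLanguages(config.languages ?? DEFAULT_LANGUAGES);
  if (languages.length === 0) {
    languages = parseLanguages(DEFAULT_LANGUAGES);
  }

  const confidenceThreshold = config.confidenceThreshold ?? DEFAULT_THRESHOLD;
  if (!Number.isFinite(confidenceThreshold) || confidenceThreshold < 0 || confidenceThreshold > 100) {
    throw new Error("confidenceThreshold must be between 0 and 100");
  }

  return { action: "extract_text", source, content: config.content, languages, confidenceThreshold };
}

// Tesseract.js takes multiple languages as one "+"-joined string, e.g. "ita+eng".
function toTesseractLangString(languages: string[]): string {
  return languages.join("+");
}
```

Validating up front keeps a typo in source or an out-of-range threshold from surfacing later as a confusing OCR failure.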

⬆️ Node output

Fields available to downstream nodes via $node.<alias>.json.<field>:

  • text
  • confidence
  • blocks
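
The three output fields can be pictured with the sketch below, which applies the confidence threshold to a list of blocks and derives text and confidence from what remains. The block shape is an assumption: the x0/y0/x1/y1 bounding box follows the convention Tesseract.js uses, but the exact fields this node emits are not documented here. Computing the overall confidence as the mean of the kept blocks is also an illustrative choice, and buildOcrOutput is a hypothetical helper.

```typescript
// Assumed block shape; field names are illustrative, not the node's confirmed schema.
interface BoundingBox {
  x0: number;
  y0: number;
  x1: number;
  y1: number;
}

interface OcrBlock {
  text: string;
  confidence: number; // 0-100
  bbox: BoundingBox;
}

interface OcrOutput {
  text: string;
  confidence: number;
  blocks: OcrBlock[];
}

function buildOcrOutput(blocks: OcrBlock[], confidenceThreshold: number = 30): OcrOutput {
  // Drop blocks whose confidence is below the threshold.
  const kept = blocks.filter((block) => block.confidence >= confidenceThreshold);

  // Join the surviving block texts, one block per line.
  const text = kept.map((block) => block.text.trim()).join("\n");

  // Assumption: overall confidence is the mean of the kept blocks (0 when none survive).
  const confidence =
    kept.length === 0 ? 0 : kept.reduce((sum, block) => sum + block.confidence, 0) / kept.length;

  return { text, confidence, blocks: kept };
}
```

A downstream node would then read these values through $node.<alias>.json.text, $node.<alias>.json.confidence, and $node.<alias>.json.blocks.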

💡 Configuration example

JSON snippet of the node as it appears in the workflow. Values are derived from defaultValue and from required parameters.

{
  "id": "node-integration_ocr-1",
  "defId": "integration_ocr",
  "label": "OCR (Tesseract)",
  "config": {
    "action": "extract_text",
    "source": "file_path",
    "content": "<content>",
    "languages": "ita,eng",
    "confidenceThreshold": 30
  }
}
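
In the snippet above, content is a placeholder whose meaning depends on source. The sketch below shows one way the two cases could be told apart before OCR runs. It is a guess at reasonable behavior, not the node's actual implementation: describeInput and OcrInput are hypothetical names, and both the rejection of ".." path segments and the handling of a data-URL prefix are assumptions.

```typescript
// Hypothetical result type describing what the node would hand to the OCR step.
type OcrInput =
  | { kind: "path"; path: string }
  | { kind: "base64"; data: string };

function describeInput(source: "file_path" | "base64", content: string): OcrInput {
  if (source === "file_path") {
    const path = content.trim();
    if (path === "") {
      throw new Error("content must contain a file path");
    }
    // Assumption: paths must stay inside the workspace volume, so ".." is rejected.
    const segments = path.split(/[\\\/]/);
    if (segments.some((segment) => segment === "..")) {
      throw new Error("path must not contain '..' segments");
    }
    return { kind: "path", path };
  }

  // Assumption: tolerate an optional data-URL prefix such as "data:image/png;base64,"
  // and any whitespace introduced by the multiline editor field.
  const data = content.replace(/^data:[^,]*;base64,/, "").replace(/\s+/g, "");
  const looksLikeBase64 = data.length > 0 && data.length % 4 === 0 && /^[A-Za-z0-9+\/]+={0,2}$/.test(data);
  if (!looksLikeBase64) {
    throw new Error("content is not valid base64");
  }
  return { kind: "base64", data };
}
```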

🔒 Security notes

OCR runs locally, with no cloud upload. Data stays in the tenant container and is never sent to third parties. A higher-quality cloud option based on Google Vision OCR is planned.
