Difference between revisions of "OCR service"

From 1Archive help
(Web-service)
Line 1: Line 1:
 +
The OCR service is used for recognizing and validating documents coming from BilltoBox.
  
 +
==BilltoBox - Onea integration==
  
==Webservice==
+
<br/>{{note|This is only applicable for companies for which the Value Added Service OCR & Scanning is set to Onea in BilltoBox.}}<br/>
<br/>
 
'''URL''': https://ocr.onea.be/WS
 
  
<br/>{{info|For each request to the webservice, user credentials should be passed.}}<br/>
+
PDF documents added in BilltoBox will automatically be sent to Onea. They pass through our recognition server and then await validation by a manual validator.
  
===GetUserInfo===
+
<br/>{{info|Manual validation lets us raise the recognition percentage as high as possible.}}<br/>
  
Fetches the user information, such as:
+
Schematic overview of the flow:
 +
<br/>[[image:billtobox_onea_integration.png|link=]]<br/>
  
* Number of OCR actions per OCR type
+
# BilltoBox sends a PDF document to Onea
* Number of available credits per OCR type
+
# Onea sets the status to ''PENDING'' and returns a UUID
* Allowed OCR types
+
# Onea sets the status to ''PROCESSING'' once the document has been received successfully and is ready for processing
  
'''Parameters''': None.
+
==Manual validator manual==
 
 
'''Returns''': XML schema with the user information.
 
 
 
 
 
===PostDocument===
 
 
 
Post a new document to the OCR service, which should be sent to the OCR engine.
 
 
 
'''Parameters''': Base64 encoded document or upload id.
 
 
 
'''Returns''': XML schema with the unique key of the document in the OCR service.
 
 
 
 
 
===GetDocumentStatus===
 
 
 
Check the status of a certain document in the OCR service.
 
 
 
'''Parameters''': Unique key of a document in the OCR service.
 
 
 
'''Returns''': XML schema with the status of the document in the OCR service.
 
 
 
 
 
===GetDocumentResult===
 
 
 
Get the resulting data of the OCR process, based on the document id in the OCR service.
 
 
 
'''Parameters''': Unique key of a document in the OCR service.
 
 
 
'''Returns''': File with the OCR result (PDF with embedded data, XML, TXT, …).
 
<br/>
 
<br/>
 
<code>
<OCRRESULT xmlns:abbyy="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
    <TEXTS>
        <TEXT l="2140" t="15" r="2246" b="40">000529</TEXT>
        <TEXT l="273" t="294" r="635" b="337">ES FINANCE</TEXT>
        <TEXT l="273" t="349" r="723" b="377">BNP PARIBAS GROUP</TEXT>
        <TEXT l="1405" t="524" r="1805" b="541">1225/1307-1/1-8162660001 -1277-001458</TEXT>
        <TEXT l="1476" t="679" r="1664" b="710">ONEA NV</TEXT>
        <TEXT l="1476" t="725" r="2198" b="756">OTTERGEMSESTEENWEG-ZUID 731</TEXT>
        <TEXT l="1475" t="772" r="1695" b="804">9000 </TEXT>
        …
        <TEXT l="1407" t="2187" r="1559" b="2216">Papier van</TEXT>
        <TEXT l="1313" t="2220" r="1653" b="2243">verantwoorde herkomst</TEXT>
        <TEXT l="1343" t="2274" r="1624" b="2305">FSC® C011145</TEXT>
    </TEXTS>
    <BARCODES/>
</OCRRESULT>
</code>
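
The XML result shown above can be read with any standard XML parser. The following is a minimal Python sketch, not part of the OCR service itself. It assumes that the attributes <code>l</code>, <code>t</code>, <code>r</code> and <code>b</code> are the left, top, right and bottom edges of each text block (the page does not define them), and the function name <code>parse_ocr_result</code> is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample result, used only to make the sketch runnable.
SAMPLE = """<OCRRESULT xmlns:abbyy="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
    <TEXTS>
        <TEXT l="2140" t="15" r="2246" b="40">000529</TEXT>
        <TEXT l="1476" t="679" r="1664" b="710">ONEA NV</TEXT>
    </TEXTS>
    <BARCODES/>
</OCRRESULT>"""


def parse_ocr_result(xml_text):
    """Return one dict per TEXT element: its text plus the four box attributes.

    Assumption: l/t/r/b are left/top/right/bottom coordinates of the block.
    """
    root = ET.fromstring(xml_text)
    blocks = []
    for node in root.iter("TEXT"):
        # The attributes are stored as strings; convert them to integers.
        box = {key: int(node.attrib[key]) for key in ("l", "t", "r", "b")}
        blocks.append({"text": (node.text or "").strip(), **box})
    return blocks


if __name__ == "__main__":
    for block in parse_ocr_result(SAMPLE):
        print(block)
```

Each returned dict holds the recognized text and its four coordinates, so a caller can, for example, sort the blocks by <code>t</code> to read the page from top to bottom.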
 

Revision as of 11:43, 6 June 2018

The OCR service is used for recognizing and validating documents coming from BilltoBox.

1 BilltoBox - Onea integration


Note: This is only applicable for companies for which the Value Added Service OCR & Scanning is set to Onea in BilltoBox.

PDF documents added in BilltoBox will automatically be sent to Onea. They pass through our recognition server and then await validation by a manual validator.


Info: Manual validation lets us raise the recognition percentage as high as possible.

Schematic overview of the flow:
[Image: billtobox_onea_integration.png]

  1. BilltoBox sends a PDF document to Onea
  2. Onea sets the status to PENDING and returns a UUID
  3. Onea sets the status to PROCESSING once the document has been received successfully and is ready for processing
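
The three steps above can be modelled as a small status tracker. This is only an illustrative in-memory sketch of the described flow, not Onea's implementation: the class <code>DocumentTracker</code> and its methods are hypothetical names, and only the statuses PENDING and PROCESSING and the returned UUID come from the description above.

```python
import uuid
from enum import Enum


class Status(Enum):
    PENDING = "PENDING"
    PROCESSING = "PROCESSING"


class DocumentTracker:
    """Hypothetical in-memory model of the receive-and-process flow."""

    def __init__(self):
        self._status = {}

    def receive(self, pdf_bytes):
        # Steps 1 and 2: a PDF arrives, is marked PENDING, and a UUID is returned.
        doc_id = str(uuid.uuid4())
        self._status[doc_id] = Status.PENDING
        return doc_id

    def mark_ready(self, doc_id):
        # Step 3: the document was received successfully and is ready to process.
        if self._status[doc_id] is not Status.PENDING:
            raise ValueError("only a PENDING document can move to PROCESSING")
        self._status[doc_id] = Status.PROCESSING

    def status(self, doc_id):
        return self._status[doc_id]


if __name__ == "__main__":
    tracker = DocumentTracker()
    doc_id = tracker.receive(b"%PDF-1.4 example")
    print(doc_id, tracker.status(doc_id).value)
    tracker.mark_ready(doc_id)
    print(doc_id, tracker.status(doc_id).value)
```

The sender keeps the returned UUID as its only handle on the document, which is why the sketch keys every status lookup on it.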

2 Manual validator manual