Online Web OCR API SDK

OCR API SDK info

OCR-IT Conversion Services have been designed specifically for litigation support needs. Trials with millions of pages of legal documents, often in multiple languages, put stringent requirements on the accuracy and speed of image-to-text conversion systems. Our customers asked, and we responded with a unique image-to-text conversion system that exceeds the requirements of the legal community for accuracy and processing speed, as well as ease of integration with existing platforms like Opticon, IPRO, and Summation. We can accommodate multiple load file formats and match virtually any project requirements.

Text Recognition As A Service (SaaS)

Developers can access the OCR Cloud 2.0 via an HTTP(S) POST interface. Whenever and wherever text recognition is required, images are transmitted to the OCR Cloud 2.0. Instant load-balanced processing in the Cloud performs OCR. The recognition result is returned via the same interface, either by querying for the result or by URL notification. OCR-IT OCR Cloud 2.0 is a robust and scalable platform.

OCR-IT Conversion: the Simpler Solution

OCR API Cloud software helps users add character recognition to their software products on platforms such as desktop computers, online portals and mobile devices. Trials have been successfully conducted on legal document files written in various languages. This Optical Character Recognition software doesn’t involve complex procedures like software licensing. Software developers can access it with a few lines of code through web services after they submit their images. The developer is assured of accurate recognition results in multiple languages, according to their preference.

Developers can easily access the OCR Software Development Cloud through an HTTP interface whenever they need text recognition from image files. The files are sent to the OCR API, which processes them immediately, performs optical character recognition, and returns the result through the same HTTP interface.

This is an extremely scalable platform and is virus free too. OCR is designed to deliver highly accurate results. OCR Cloud is a fully independent web program, powerful enough to be used from mobile devices and other web-based applications to convert image files into usable text through optical character recognition. The OCR library provides complete solutions for those involved in litigation support and other corporate environments.

Mobile Friendly

OCR Cloud 2.0 is a powerful Web-based API which allows developers of mobile and small footprint applications to integrate highly accurate Optical Character Recognition technologies that convert images and photographs into manageable, usable and searchable text. With advanced binarization, image pre-processing and filtering algorithms, OCR Cloud 2.0 produces quality results even from less than perfect pictures.

SaaS

Developers can access the OCR Cloud 2.0 via an HTTP or HTTPS interface. Whenever and wherever text recognition is required, images are transmitted to the OCR Cloud 2.0 using this interface. Once transmitted, the images are processed in the Cloud where OCR is performed. The recognition result is returned via the same interface, either by querying for the result or by notification. The OCR-IT OCR Cloud 2.0 is a robust platform designed to operate at scale. It is commercially deployed in numerous distributed or mobile environments.

Free Development Account

One of the most powerful automated ways to reach OCR Cloud 2.0 is through its Web-based API. This capability gives software application developers flexible, powerful access to award-winning OCR on demand and without any initial investment. Free trial and production subscriptions are available online through an automated portal.

OCR CLOUD 2.0 API

To meet growing needs in distributed applications and mobile markets, OCR-IT LLC created OCR Cloud 2.0 – the next generation document conversion platform – a flexible, efficient, powerful and scalable platform that can handle high volumes of pages and large numbers of requests. By combining best-of-breed OCR engines and industry-leading system integration expertise, OCR Cloud 2.0 now offers the highest accuracy document conversion at an unbeatable price.

CAPABILITIES

OCR Cloud 2.0 platform can convert virtually any image (TIF, JPG, PNG, BMP) or PDF to any standard text-based document type (TXT, DOC, RTF, XLS, PPT, XML, HTML) or searchable PDF.

Auto-language detection and support for over 200 languages, including Latin-based languages, Cyrillic-based languages, Chinese, Japanese, Korean, Thai, and Hebrew.

ARCHITECTURE

Accurate and 100% automated to ensure privacy

The OCR Cloud 2.0 is built on high-accuracy automated text recognition technology and a modern, state-of-the-art platform. How accurate? Benchmark tests show that recognition from OCR-IT LLC delivers accuracy that is virtually on par with leading OCR software alternatives. A free development account offers full access to the API for your own evaluation.

Privacy and Security of data

OCR Cloud 2.0 is a fully automated service without any human intervention. This is important, since you as developers, as well as your users, want to make certain that your images are secure and private. OCR-IT LLC understands that and treats security and privacy among its top priorities. Our security mechanisms provide a variety of controls at your fingertips to access and delete your image once the process completes.

CLOUD BASED OCR API

OCR Cloud 2.0 is a cloud-based state of the art hosted OCR platform designed to convert millions of pages ACCURATELY and EFFICIENTLY at unbeatable prices. It requires no complex software licensing or purchases, and provides easy access to best OCR technology for integrations within minutes.

Image Formats

PDF
TIF
JPG
JPEG2000
PNG
BMP

OCR Languages

Latin-based languages
Cyrillic-based languages
Chinese, Japanese, Korean, Thai
Hebrew
Arabic

Auto-language/multi-language detection and support for over 190 languages.

Output Formats

Cleaned JPG, TIF, PDF Image export
Searchable PDF (Text under/over image, PDF/A, PDF Compressed)
TXT (standard or Unicode)
DOC / DOCX / RTF
XLS / XLSX
XML
HTML
ODBC Compatible Databases

Multiple simultaneous output streams available.

Data Transfer Methods

POST HTTP via Web API

Swagger URL

SIGN UP AND PRICING

Overview

1. SUBMITTING A JOB

OVERVIEW

INPUT PARAMETERS

ApiKey (required):

InputFiles (required):

Name (optional)

Password (optional)

InputUrl (required if inputBlob not provided)

InputType (required in a few cases):

InputBlob (optional)

NotifyURL (optional):

CleanupSettings (optional):

OCRSettings (optional):

OutputSettings (optional):

Error Element On Failed Job Submission

Examples

2. HANDLING JOB STATUS

2.1 Status for jobs in progress

2.2 Status for successful jobs

2.3 Status for expired jobs

2.4 Status for failed jobs

3. CLEANUP JOB

4. RETRIEVING JOB RESULTS

PER-PAGE CHARGES

LIST OF SUPPORTED LANGUAGES

Languages with full dictionary support

Languages without dictionary support

Artificial languages

Formal languages

Note:

Questions?

SWAGGER URL

Please visit the Swagger page for the latest description of the service methods. This documentation may become outdated, but the Swagger page always reflects the current API:

http://ocrapi.datacapture.cloud/swagger/ui/index.html

SIGN UP AND PRICING

GET PRICING INFORMATION AND SIGN UP FOR AN API KEY:

https://portal.datacapture.cloud/#/login

OVERVIEW

The DataCapture.cloud OCR Web API lets you submit OCR requests (images in PDF / TIFF / PNG / JPG / BMP / PCX / DCX formats) and get back textual results (in TXT / PDF / RTF / Word / Excel / XML / CSV / others, with full Unicode support). Multilingual OCR in a variety of languages (listed at the end of this document) is supported.

Key Features:

Support of Common Image Formats
Variety of Print Types
Image Cleanup: Deskew, Despeckle, Remove Texture, Automatic Rotation Detection
Over 180 OCR Languages
Mixed Languages Auto-detection
Barcode Recognition
Two Speeds of Text Recognition: Quality, Speed
Specialized Text Extraction Algorithms
All Popular Output Formats
Enhanced Error Handling

By using the various API settings, you can optimize the OCR process to a variety of sources (scans, digital camera images, etc) and a variety of purposes (full-text indexing of articles, invoice scanning, etc). Barcode scanning is also supported. For assistance in optimizing the API for your particular task, please contact our Support Team.

Using the API consists of the following stages:

1. Submit a Job
2. Handle the Job Status – one or both of the following:
   - Check Job Status manually
   - Get notified about Job Status automatically
3. Get Results of a Job

1. SUBMITTING A JOB

OVERVIEW

Submit a job by sending an HTTP POST request to the following URL:

http://BASEURL/api/jobs

The request message body should contain JSON of the following format (explained in detail below):

{
  "apiKey": "",
  "profile": "",
  "notifyUrl": "",
  "inputFiles": [],
  "cleanupSettings": {},
  "ocrSettings": {},
  "outputSettings": {}
}

The Content-Type of the request should be “application/json”.

In case of success, the response will be an HTTP 200 (Success) response code, and the following JSON (explained in detail below):

{
  "jobUrl": "string",
  "status": "Submitted"
}

In case of an error, an HTTP error code is returned along with JSON explaining the error (see “Error Element on Failed Job Submission” below).

You will be charged for 1 page of OCR upon successful job submission, and for the rest of the pages (in case of multi-page document) upon successful job completion. Please note that certain errors (such as a corrupt input file) can only be detected once you’ve already been charged for the 1st page.

INPUT PARAMETERS

APIKEY (REQUIRED)

This is your API key, which is issued to you when you subscribe to the DataCapture.cloud API (see Sign Up and Pricing above).

PROFILE (OPTIONAL):

If this parameter is specified, the API uses the settings of the named profile for file processing. If it is not specified, the “Default” profile is used.
Currently only the “Default” profile is supported.

Please note that the OCR API does not currently share profiles with the other DataCapture.cloud services.

INPUTFILES (REQUIRED):

An array of JSON objects, one object per file:

"inputFiles": [{
  "name": "string",
  "password": "string",
  "inputUrl": "string",
  "inputBlob": "string",
  "inputType": "string"
}]

NAME (OPTIONAL)

Optional parameter to provide output filename. Currently it is not supported.

PASSWORD (OPTIONAL)

Required if the file is protected by a password.

INPUTURL (REQUIRED IF INPUTBLOB NOT PROVIDED)

The URL of the image on which you want to perform OCR (must be http:// or https://)

NOTE 1: Make sure that the InputURL is properly encoded. This is especially a concern if the URL contains query parameters. For example, if your image is at:
http://example.com/images?id=565&size=large
the job request should be:
{"inputFiles": [{"inputUrl": "http://example.com/images?id=565&amp;size=large"}]}
Note that the “&” in the original URL has turned into “&amp;”, as required by encoding rules.

Normally, if you use a standard library for dealing with JSON, this would be done for you automatically. However, if you are constructing JSON manually from strings, you may need to do this manually.

NOTE 2: Do not URL-encode (percent-encode) the InputURL. For example, if your image is at:
http://example.com/My%20Picture.jpg
the job request should be:
{"inputFiles": [{"inputUrl": "http://example.com/My Picture.jpg"}]}
Note that a real space is used instead of the “%20” percent-encoded version.

The image cannot exceed 200MB in size and cannot take more than 15 minutes to download.

The image must be in a supported format (see table below). If the image URL path (not counting the query string, if any) does not end in a dot followed by a supported extension (case-insensitive, see table below), the InputType parameter must be provided. E.g.:

http://example.com/scan001.tif – InputType not required (TIF auto-detected)
http://example.com/scan001.tif?resolution=high – InputType not required (TIF auto-detected)
http://example.com/scan001 – InputType required
http://example.com/scan001?format=.tif – InputType required

Supported formats and extensions are:

FORMAT	EXTENSIONS	SUPPORTED FORMAT DETAILS
PDF	pdf	Version 1.6 or earlier
BMP	bmp	2-bit: uncompressed Black & White; 4- and 8-bit: uncompressed Palette; 16-bit: uncompressed Mask; 24-bit: uncompressed Palette and TrueColor; 32-bit: uncompressed Mask
PCX	pcx	2-bit Black & White; 4- and 8-bit Gray
DCX	dcx	2-bit Black & White; 4- and 8-bit Gray
JPG	jpg, jpeg	JPEG: Gray, Color; JPEG 2000: Gray Part 1, Color Part 1
TIF	tif, tiff	Black & White: uncompressed, CCITT3, CCITT3FAX, CCITT4, PackBits, ZIP, LZW; Gray: uncompressed, PackBits, JPEG, ZIP, LZW; TrueColor: uncompressed, JPEG, ZIP, LZW; Palette: uncompressed, PackBits, ZIP; multi-image TIFF
PNG	png	Black & White, Gray, Color
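
The auto-detection rule described above can be sketched on the client side, to decide whether InputType has to be sent. This is only an illustration of the rule as documented here, not the service's own implementation; the names EXTENSION_TO_TYPE and detect_input_type are hypothetical.

```python
from urllib.parse import urlsplit

# Extensions from the supported-formats table, mapped to the InputType value
# in the leftmost column of that table.
EXTENSION_TO_TYPE = {
    "pdf": "PDF", "bmp": "BMP", "pcx": "PCX", "dcx": "DCX",
    "jpg": "JPG", "jpeg": "JPG", "tif": "TIF", "tiff": "TIF", "png": "PNG",
}


def detect_input_type(url):
    """Return the format implied by the URL path, or None if InputType must be sent."""
    path = urlsplit(url).path  # the query string is not part of the path
    _, dot, extension = path.rpartition(".")
    if not dot:
        return None  # no dot in the path at all
    return EXTENSION_TO_TYPE.get(extension.lower())  # case-insensitive match


for example in (
    "http://example.com/scan001.tif",
    "http://example.com/scan001.tif?resolution=high",
    "http://example.com/scan001",
    "http://example.com/scan001?format=.tif",
):
    print(example, "->", detect_input_type(example) or "InputType required")
```

The four URLs are the ones listed above; the first two resolve to TIF and the last two report that InputType is required.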

INPUTTYPE (REQUIRED IN A FEW CASES):

Specifies the input type. Must be one of the Supported Formats (leftmost column in the table above).

The InputType parameter is required if the image/file is provided through inputBlob, or if the type cannot be auto-detected from the URL (see InputURL above). In all other cases the parameter is optional.

INPUTBLOB (OPTIONAL)

The image/file can be posted as a base64 string in this parameter. In that case the inputType parameter must be specified (see above).

Please note that only one blob file can be posted per request. It is also forbidden to combine a blob with a regular file specified through inputUrl in the same request.
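
As a minimal sketch of the blob option, the helper below builds one inputFiles entry from raw file bytes. The function name blob_input_file is hypothetical, and the exact base64 variant the service expects is assumed to be the standard alphabet.

```python
import base64


def blob_input_file(data, input_type, password=""):
    """Build one inputFiles entry that carries the file as a base64 string."""
    entry = {
        # Standard base64 alphabet (assumption), decoded to str so it fits in JSON.
        "inputBlob": base64.b64encode(data).decode("ascii"),
        # inputType is mandatory when a blob is posted.
        "inputType": input_type,
    }
    if password:
        entry["password"] = password  # only needed for password-protected files
    return entry


# Only one blob per request is allowed, so the array holds a single entry.
input_files = [blob_input_file(b"abc", "PNG")]
print(input_files)
```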

NOTIFYURL (OPTIONAL):

The URL to which a notification should be sent when the job succeeds or fails (see section 2 on notifications). Must be http:// or https://.

NOTE: The NotifyURL must not be URL-encoded (i.e. should use “ “ and not “%20”), and must be encoded (i.e. should use “&amp;” and not “&”), just like the InputURL. See the InputURL section above for more details and examples.

OCR API will send either a successful (see 2.2) or failed (2.4) status report to the webhook.

CLEANUPSETTINGS (OPTIONAL):

Settings that control image cleanup, in the following form (every element is optional):

"cleanupSettings": {
  "deskew": true,
  "removeGarbage": true,
  "removeTexture": true,
  "splitDualPage": true,
  "rotationType": "NoRotation"
}

The settings are explained below:

Deskew	(Boolean) Specifies whether the skew angle for an image should be corrected during preprocessing. This mode is recommended if you want to automatically correct skew for images you work with. The default value is ‘true’.
RemoveGarbage	(Boolean) Specifies whether garbage (excess dots that are smaller than a certain size) should be removed from the image during preprocessing. The default value is ‘true’.
RemoveTexture	(Boolean) Specifies whether background noise should be cleared before the recognition process starts. The default value is ‘true’.
SplitDualPage	(Boolean) Specifies whether the API should try to split the image vertically into 2 separate pages. The default value is ‘false’.
RotationType	(String) Specifies what type of rotation will be performed upon the image during preprocessing. The default value is “Automatic”, which means that rotation will be detected automatically. Allowed values: NoRotation (no rotation); Automatic (auto-detect rotation); Clockwise (rotate by 90 degrees clockwise); Counterclockwise (rotate by 90 degrees counterclockwise); Upsidedown (rotate by 180 degrees).
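
A small builder can catch an invalid RotationType before the request is sent, since the values are case-sensitive. This is a sketch using the defaults documented above; the function name cleanup_settings is hypothetical.

```python
# Allowed RotationType values, exactly as spelled in the table above.
ROTATION_TYPES = {"NoRotation", "Automatic", "Clockwise", "Counterclockwise", "Upsidedown"}


def cleanup_settings(deskew=True, remove_garbage=True, remove_texture=True,
                     split_dual_page=False, rotation_type="Automatic"):
    """Return a cleanupSettings object; defaults mirror the documented defaults."""
    if rotation_type not in ROTATION_TYPES:
        # The comparison is case-sensitive, like the API's own check.
        raise ValueError(f"unknown rotationType: {rotation_type!r}")
    return {
        "deskew": deskew,
        "removeGarbage": remove_garbage,
        "removeTexture": remove_texture,
        "splitDualPage": split_dual_page,
        "rotationType": rotation_type,
    }


print(cleanup_settings(rotation_type="NoRotation"))
```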

OCRSETTINGS (OPTIONAL):

Settings that control image recognition, in the following form (every element is optional):

"ocrSettings": {
  "speedOcr": false,
  "lookForBarcodes": true,
  "analysisMode": "MixedDocument",
  "printType": "Normal",
  "ocrLanguage": "English"
}

The settings are described below:

PrintType	(Semicolon-delimited list of strings) Specifies the types of printed text in the image. The default value is “Normal”, which corresponds to common typographic text equivalent to laser printer output. Allowed values: Normal (modern text); Typewriter; Matrix; OCR_A (OCR-A text); OCR_B (OCR-B text); MICR_E13B. If you would like to recognize more than one text type in the same document, separate the types with semicolons without spaces. For example, “Normal;Typewriter”.
OCRLanguage	(Semicolon-delimited list of strings) This property allows you to specify which of over 200 supported languages should be used for OCR, including mixed languages within the same document. See the list of supported languages at the end of this document (section 6). The default value is “English”. To specify more than one language, separate languages with semicolons (without spaces) – for example: “English;Dutch (Belgium);Danish”.
SpeedOCR	(Boolean) This property provides faster recognition speed (by as much as 2-2.5 times, depending on server load) at the cost of a moderately increased error rate (1.5-2 times more errors). On good, print-quality texts, OCR makes an average of 1-2 errors per page more in this mode, which in some cases is a small sacrifice for the substantial increase in speed. Such moderate increase in error rate can be easily tolerated in many cases, such as full text indexing with “fuzzy” searches, preliminary recognition, etc. The default value is ‘false’.
AnalysisMode	(String) Specifies how aggressively the text should be extracted. The default value is “FullPageDocument”. Allowed values:
FullPageDocument – This mode is useful if you export your text to document archives: the full page layout is retained and full-text search is available if you save in this mode. This mode will look for images and text within an image.
FullTextIndexing – This mode is used to extract data from a document, including text in pictures. Note that the OCR retains both the picture and the text in it. Text extracted from a picture block can only be exported to TXT, PDF and XML formats (XML export support is coming soon). The data can then be used for subsequent full-text indexing and search. The program retains the logical reading order, pictures, and tables.
InvoicePreprocessing – This mode is used to pre-process invoices. Usually they are noisy, low-quality images. This mode extracts all text from the image, including tables, pictures, small text areas, and noise. The result is plain text without table blocks and picture blocks.
ExtractBarcodes – This mode is used to extract barcodes only.
NOTE: Barcode values are extracted in all modes as long as LookForBarcodes is true.
LookForBarcodes	(Boolean) Specifies whether barcodes should be recognized. Default is ‘true’.
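
The semicolon-delimited values for PrintType and OCRLanguage are easy to get wrong when built by hand. The sketch below joins lists without adding spaces around the separators; the function names join_values and ocr_settings are hypothetical, and the defaults follow the table above.

```python
def join_values(values):
    """Join list items with ';' and no surrounding spaces, as the API expects."""
    cleaned = [value.strip() for value in values]
    if any(not value or ";" in value for value in cleaned):
        raise ValueError("values must be non-empty and must not contain ';'")
    return ";".join(cleaned)


def ocr_settings(languages=("English",), print_types=("Normal",),
                 speed_ocr=False, look_for_barcodes=True, analysis_mode=None):
    """Return an ocrSettings object; analysisMode is omitted unless given."""
    settings = {
        "ocrLanguage": join_values(languages),
        "printType": join_values(print_types),
        "speedOcr": speed_ocr,
        "lookForBarcodes": look_for_barcodes,
    }
    if analysis_mode is not None:
        settings["analysisMode"] = analysis_mode  # left to the server default otherwise
    return settings


print(ocr_settings(languages=["English", "Dutch (Belgium)", "Danish"],
                   print_types=["Normal", "Typewriter"]))
```

Language names keep their internal spaces (“Dutch (Belgium)”); only whitespace around each item is trimmed.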

OUTPUTSETTINGS (OPTIONAL)

Settings that control text result output, in the following form (every element is optional):

"outputSettings": {
  "exportFormat": "Text;PDF"
}

The settings are explained below:

ExportFormat

(Semicolon-delimited list of strings) Specifies the desired formats for text output.

The default value is “Text;PDF”, which corresponds to both Text and PDF output.

RTF – export to *.RTF (rich-text) format. Retains full page layout and preserves pictures. The program will automatically select the most suitable paper size when saving the recognized text and pictures.

MSWord – export to *.DOC (Microsoft Word) format. Retains full page layout and preserves pictures. The program will automatically select the most suitable paper size when saving the recognized text and pictures.

MSExcel – export to *.XLS (Microsoft Excel) format.

PDF – export to *.PDF format

DBF – export to *.DBF format

Text – export to *.TXT common formatted ASCII text-only output

CSV – export to *.CSV format

PPT – export to *.PPT format

XML – export to *.XML format

UnicodeText_UTF8 – export to *.UTF8.TXT format

UnicodeText_UTF16 – export to *.UTF16.TXT format

UnicodeCSV_UTF8 – export to *.UTF8.CSV format

UnicodeCSV_UTF16 – export to *.UTF16.CSV format

If you would like to produce more than one output format from the same image request, separate your desired output formats with semicolons without spaces. For example, “PDF;Text;UnicodeText_UTF8”.

NOTE: You will need to know the file extension of the desired format (specified above) to retrieve the job results (see section 2.2 of this document).
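
Because the file extension is needed later to pick out each result, it can help to keep the format names and extensions in one table. The mapping below is transcribed from the list above; the lowercase spelling of the extensions is an assumption, and the names EXPORT_EXTENSIONS, output_settings and expected_extensions are hypothetical.

```python
# ExportFormat value -> file extension, transcribed from the list above.
# Lowercase spelling is an assumption; check it against an actual status report.
EXPORT_EXTENSIONS = {
    "RTF": "rtf", "MSWord": "doc", "MSExcel": "xls", "PDF": "pdf",
    "DBF": "dbf", "Text": "txt", "CSV": "csv", "PPT": "ppt", "XML": "xml",
    "UnicodeText_UTF8": "utf8.txt", "UnicodeText_UTF16": "utf16.txt",
    "UnicodeCSV_UTF8": "utf8.csv", "UnicodeCSV_UTF16": "utf16.csv",
}


def output_settings(formats=("Text", "PDF")):
    """Return an outputSettings object for the requested export formats."""
    unknown = [name for name in formats if name not in EXPORT_EXTENSIONS]
    if unknown:
        raise ValueError(f"unknown export formats: {unknown}")  # names are case-sensitive
    return {"exportFormat": ";".join(formats)}


def expected_extensions(formats):
    """List the extensions to look for when retrieving the results."""
    return [EXPORT_EXTENSIONS[name] for name in formats]


wanted = ["PDF", "Text", "UnicodeText_UTF8"]
print(output_settings(wanted), expected_extensions(wanted))
```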

ERROR ELEMENT ON FAILED JOB SUBMISSION

If the job submission fails, you will receive an appropriate HTTP error code, as well as an <Error>Code</Error> response. The possible values of ‘Code’ are:

Code	HTTP Error Code	Description
BadInputURL	400	InputURL is invalid or missing, or is not an HTTP/HTTPS URL
BadNotifyURL	400	NotifyURL is invalid or missing, or is not an HTTP/HTTPS URL
BadInputType	400	The specified InputType is invalid, OR InputType is missing, and auto-detected file type is not valid, OR InputType is missing, and auto-detection of file type has failed
BadRotationType	400	Rotation specified in CleanupSettings is invalid. Please note that it is case-sensitive.
BadAnalysisType	400	AnalysisMode specified in OCRSettings is invalid. Please note that it is case-sensitive.
BadPrintType	400	PrintType specified in OCRSettings is invalid. Please note that it is case-sensitive.
BadExportFormat	400	ExportFormat specified in OutputSettings is invalid. Please note that it is case-sensitive.
OCRSettingsTooComplex	400	OCRSettings are too complex. Try reducing the number of OCRLanguages and PrintTypes you are recognizing.
InternalError:ErrorNumber	500	Internal error has occurred. Contact support@wisetrend.com

EXAMPLES

URL Example:

HTTP POST to http://BASEURL/api/jobs

Message body example (simple):

{
  "inputFiles": [
    {
      "inputUrl": "http://www.example.com/images/scan001.tif"
    }
  ]
}

Message body example (with full parameters):

{
  "notifyUrl": "http://example.com/notify",
  "inputFiles": [
    {
      "inputUrl": "http://www.example.com/getScans.php?DocumentID=569",
      "inputType": "TIF"
    }
  ],
  "cleanupSettings": {
    "deskew": true,
    "removeGarbage": true,
    "removeTexture": true,
    "splitDualPage": true,
    "rotationType": "NoRotation",
    "outputFormat": "pdf",
    "resolution": "high",
    "jpegQuality": "string"
  },
  "ocrSettings": {
    "speedOcr": true,
    "lookForBarcodes": true,
    "analysisMode": "MixedDocument",
    "printType": "Print",
    "ocrLanguage": "French"
  },
  "outputSettings": {
    "exportFormat": "Text;PDF"
  }
}

Response example (status “Submitted”):

{
  "jobUrl": "http://BASEURL/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "Submitted"
}

See next section for different available Status responses.
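
The submission described in this section can be sketched with the Python standard library. This is a minimal illustration, not an official client: the function names are hypothetical, and base_url stands in for the BASEURL placeholder used throughout this document.

```python
import json
import urllib.request


def build_submit_request(base_url, api_key, input_url, **sections):
    """Build the POST request for /api/jobs without sending it."""
    body = {"apiKey": api_key, "inputFiles": [{"inputUrl": input_url}]}
    # Optional sections, e.g. ocrSettings={...} or outputSettings={...}.
    body.update(sections)
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_job(request):
    """Send the request and return the parsed JSON response (jobUrl, status)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_submit_request(
    "http://example.com",  # replace with the real base URL
    "YOUR-API-KEY",
    "http://www.example.com/images/scan001.tif",
    outputSettings={"exportFormat": "Text;PDF"},
)
print(request.get_method(), request.full_url)
```

Calling submit_job(request) performs the actual HTTP call; urllib raises urllib.error.HTTPError for the error codes listed above, and the JSON error details can be read from that exception's body.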

2. HANDLING JOB STATUS

There are two ways to handle job status:

You can manually check the status of any job by sending an HTTP GET request to the JobURL that you received when you submitted the job.
You can automatically get notified when the job succeeds or fails if you provide a NotifyURL when you submit a job. There will only be one attempt to notify you. It will be made when the job fully succeeds or fails (you will not get any intermediate status notifications). The notification will consist of an HTTP POST containing JSON status information (see 2.2 and 2.4).

Regardless of which method you use, the status report is in the same format, as described below.

2.1 STATUS FOR JOBS IN PROGRESS

For jobs that are not yet complete, the status report looks as follows:

{
  "jobUrl": "http://BASEURL/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "[status]"
}

“Status” can either be:
“Submitted” – the job has been submitted but the image to be OCRed has not yet been downloaded
“Processing” – the image has been downloaded and is in the process of being OCRed
“Finished” – successful/expired/failed jobs.

“JobURL” repeats the URL where updated job status may be obtained.
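
Manual status checking usually ends up as a polling loop. The sketch below keeps polling while the status is “Submitted” or “Processing” and returns the first other report. The names get_json and wait_for_job, the 5-second interval and the 10-minute timeout are all assumptions; this document does not say whether the GET request needs any authentication.

```python
import json
import time
import urllib.request


def get_json(url):
    """GET a URL and parse the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_job(job_url, fetch_status=get_json, interval=5.0, timeout=600.0,
                 sleep=time.sleep):
    """Poll job_url until the job leaves the in-progress states."""
    deadline = time.monotonic() + timeout
    while True:
        report = fetch_status(job_url)
        if report.get("status") not in ("Submitted", "Processing"):
            return report  # finished, failed or expired
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in progress after {timeout} seconds")
        sleep(interval)
```

The fetch_status and sleep parameters exist so the loop can be exercised without network access or real delays.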

2.2 STATUS FOR SUCCESSFUL JOBS

For jobs that have completed successfully, the status report looks as follows:

{
  "jobUrl": "http://BASEURL/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "Finished",
  "download": [
    {
      "uri": "http://ocrapi.datacapture.cloud/api/Files?JobId=00000000-0000-0000-0000-000000000000&outputFormat=pdf",
      "outputFormat": "pdf",
      "creationDateUTC": "2017-04-01T06:52:08.839Z"
    }
  ],
  "statistics": {
    "files": [
      {
        "fileName": "readme",
        "downloadDateUTC": "2017-04-01T06:52:08.839Z",
        "warning": "string",
        "totalCharacters": 5594,
        "uncertainCharacters": 123,
        "pagesArea": 3
      }
    ],
    "creationDateUTC": "2017-04-01T06:52:08.839Z",
    "totalCharacters": 5594,
    "uncertainCharacters": 123,
    "pagesArea": 3
  }
}

There will be one entry in the “download” array for each requested output format – by default, there will be one for TXT (plaintext) and another for PDF. The entries may appear in any order. Each contains an “outputFormat” indicating the output type (file extension), and a “uri” containing the address where the output may be downloaded.

As usual, “JobURL” repeats the URL where updated job status may be obtained.
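
Since the download entries may come in any order, it is convenient to index them by output format. A minimal sketch, with the hypothetical name download_uris, based on the field names in the sample report above:

```python
def download_uris(report):
    """Map each outputFormat in a status report to its download URI."""
    # A report without a "download" array (in progress, expired) yields {}.
    return {item["outputFormat"]: item["uri"] for item in report.get("download", [])}


sample_report = {
    "status": "Finished",
    "download": [
        {
            "uri": "http://ocrapi.datacapture.cloud/api/Files"
                   "?JobId=00000000-0000-0000-0000-000000000000&outputFormat=pdf",
            "outputFormat": "pdf",
            "creationDateUTC": "2017-04-01T06:52:08.839Z",
        }
    ],
}
print(download_uris(sample_report))
```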

2.3 STATUS FOR EXPIRED JOBS

Job results are not guaranteed to be kept for more than 24 hours. If a job has expired, its status report will not have a “download” element, and the “status” will be “Expired”.

2.4 STATUS FOR FAILED JOBS

For jobs that have failed, the status report looks as follows:

{
  "jobUrl": "http://ocrapi.datacapture.cloud/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "Failed",
  "errors": [
    {
      "code": "string",
      "message": "string"
    }
  ]
}

The “status” may be one of the following:

FailedDownload	Could not download the image to be OCRed
FailedConversion	Could not perform OCR
FailedNoFunds	Insufficient funds for the number of pages you are attempting to OCR
FailedInternalError	Internal error, please contact support@wisetrend.com

The “errors” element may or may not be present. If it is present, it may contain one or more entries with “code” and “message” fields that can help you debug the problem. Here are some common “code” values:

ConvertFailed	The ABBYY OCR engine reported an error during conversion. Make sure that the input file is not corrupt and is not password-protected.
SubmitFailed	Could not submit the OCR job. Possibly an internal error, contact support@wisetrend.com
DownloadRejected	Could not download the input image. Ensure that it does not exceed maximum size and that the server with the image responds promptly.
DownloadFailed	Could not download the input image. Ensure that the image URL exists and does not require authentication.

As usual, “JobURL” repeats the URL where updated job status may be obtained.
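
For logging or user-facing messages, a failed report can be condensed to one line. The sketch below treats any status beginning with “Failed” as a failure and tolerates a missing “errors” element, as described above; the name failure_summary is hypothetical.

```python
def failure_summary(report):
    """Return a one-line description of a failed job, or None if it did not fail."""
    status = report.get("status", "")
    if not status.startswith("Failed"):
        return None
    # "errors" is optional, so fall back to an empty list.
    details = [
        f"{error.get('code')}: {error.get('message')}"
        for error in report.get("errors") or []
    ]
    return status + (f" ({'; '.join(details)})" if details else "")


print(failure_summary({
    "status": "FailedDownload",
    "errors": [{"code": "DownloadFailed", "message": "image URL not found"}],
}))
```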

3. CLEANUP JOB

Send an HTTP POST request to http://BASEURL/api/[jobId]/cleanup. The apiKey should be passed in the request body.

This method sets the job status to “Expired”. After that, no more information about the job, including files and statistics, can be retrieved through the API.
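
A cleanup call might be built as follows. The shape of the request body is not specified here, so sending the key as a JSON object {"apiKey": ...} is an assumption to verify against the Swagger page; the function name build_cleanup_request is hypothetical.

```python
import json
import urllib.request


def build_cleanup_request(base_url, job_id, api_key):
    """Build the POST request for the cleanup method without sending it."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/{job_id}/cleanup",
        # Assumed body format: a JSON object carrying the API key.
        data=json.dumps({"apiKey": api_key}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


cleanup = build_cleanup_request(
    "http://example.com",  # replace with the real base URL
    "00000000-0000-0000-0000-000000000000",
    "YOUR-API-KEY",
)
print(cleanup.get_method(), cleanup.full_url)
```

Passing the request to urllib.request.urlopen sends it.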

4. RETRIEVING JOB RESULTS

To get the results of the job, use the URLs from the successful job status reports (see section 2.2 above). Results will be returned with the correct Content-Type header. Note that results may be deleted after 7 days.
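
Retrieving the results amounts to fetching each URI from the status report and saving the bytes. In this sketch the output file names are built from a caller-chosen base name plus the reported outputFormat; the names fetch_bytes and save_results are hypothetical.

```python
import os
import urllib.request


def fetch_bytes(uri):
    """Download a URI and return the raw response body."""
    with urllib.request.urlopen(uri) as response:
        return response.read()


def save_results(report, out_dir, basename="result", fetch=fetch_bytes):
    """Save every download listed in a status report; return the written paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for item in report.get("download", []):
        # e.g. result.pdf, result.txt
        path = os.path.join(out_dir, f"{basename}.{item['outputFormat']}")
        with open(path, "wb") as handle:
            handle.write(fetch(item["uri"]))
        saved.append(path)
    return saved
```

Because results may be deleted after the retention period, it is safest to call save_results as soon as the job reports success.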

5. PER-PAGE CHARGES

You will be charged for 1 page at the time the OCR request is submitted (regardless of whether the job fails or succeeds) – this is the minimum charge to attempt a job. You will be charged for the rest of the pages only when the job succeeds.
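
The charging rule can be written out as a small function for budgeting purposes. It only restates the rule above (one page on submission, the remaining pages on success) and says nothing about the price per page; the name pages_charged is hypothetical.

```python
def pages_charged(total_pages, succeeded):
    """Pages billed for one job: 1 on submission, the rest only on success."""
    if total_pages < 1:
        raise ValueError("a job has at least one page")
    return total_pages if succeeded else 1


print(pages_charged(10, succeeded=True))   # all 10 pages
print(pages_charged(10, succeeded=False))  # only the submission page
```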

6. LIST OF SUPPORTED LANGUAGES

LANGUAGES WITH FULL DICTIONARY SUPPORT

Armenian (Eastern)
Armenian (Grabar)
Armenian (Western)
Bashkir
Bulgarian
Catalan
Chinese Simplified*
Chinese Traditional*
Croatian
Czech
Danish
Dutch (Belgium)
Dutch (Netherlands)
English
Estonian
Finnish
French
German
German (new spelling)
Greek
Hebrew*
Hungarian
Indonesian
Italian
Japanese*
Korean*
Latvian
Lithuanian
Norwegian (Group of Norwegian (Nynorsk) and Norwegian (Bokmal) languages.)
Norwegian (Bokmal)
Norwegian (Nynorsk)
Old English
Old French
Old German
Old Italian
Old Spanish
Polish
Portuguese (Brazil)
Portuguese (Portugal)
Romanian
Russian
Slovak
Slovenian
Spanish
Swedish
Tatar
Turkish
Ukrainian

LANGUAGES WITHOUT DICTIONARY SUPPORT

Abkhaz
Adyghe
Afrikaans
Agul
Albanian
Altaic
Avar
Aymara
Azerbaijani (Cyrillic)
Azerbaijani (Latin)
Basque
Belarussian
Bemba
Blackfoot
Breton
Bugotu
Buryat
Cebuano
Chamorro
Chechen
Chukchee
Chuvash
Corsican
Crimean Tatar
Crow
Dakota
Dargwa
Dungan
Eskimo (Cyrillic)
Eskimo (Latin)
Even
Evenki
Faroese
Fijian
Frisian
Friulian
Gagauz
Galician
Ganda
German (Luxembourg)
Guarani
Hani
Hausa
Hawaiian
Icelandic
Ingush
Irish
Jingpo
Kabardian
Kalmyk
Karachay-Balkar
Karakalpak
Kasub
Kawa
Kazakh
Khakas
Khanty
Kikuyu
Kirghiz
Kongo
Koryak
Kpelle
Kumyk
Kurdish
Lak
Latin
Lezgin
Luba
Macedonian
Malagasy
Malay
Malinke
Maltese
Mansi
Maori
Mari
Maya
Miao
Minangkabau
Mohawk
Mongol
Mordvin
Nahuatl
Nenets
Nivkh
Nogay
Nyanja
Ojibway
Ossetian
Papiamento
Provencal
Quechua
Rhaeto-Romanic
Romanian (Moldavia)
Romany
Ruanda
Rundi
Russian (old spelling)
Sami (Lappish)
Samoan
Scottish Gaelic
Selkup
Serbian (Cyrillic)
Serbian (Latin)
Shona
Somali
Sorbian
Sotho
Sunda
Swahili
Swazi
Tabassaran
Tagalog
Tahitian
Tajik
Tok Pisin
Tongan
Tswana
Tun
Turkmen
Tuvan
Udmurt
Uighur (Cyrillic)
Uighur (Latin)
Uzbek (Cyrillic)
Uzbek (Latin)
Welsh
Wolof
Xhosa
Yakut
Zapotec
Zulu

ARTIFICIAL LANGUAGES

Esperanto
Ido
Interlingua
Occidental

FORMAL LANGUAGES

Basic
C/C++
COBOL
Fortran
Java
Pascal
Simple chemical formulas
MICR (E-13B) – recognition language for MICR (E-13B) text type
Numbers Only

NOTE

Languages marked with “*” are not available in this API release. They may be available by special request. Limited export formats and combinations of languages are available. Consult additional documentation or contact the DataCapture.cloud team for assistance.

QUESTIONS

Contact support@wisetrend.com
