Online Web Ocr Api Sdk
OCR API SDK info
OCR-IT Conversion Services are designed specifically for litigation support. Trials involving millions of pages of legal documents, often in multiple languages, place stringent requirements on the accuracy and speed of image-to-text conversion systems. In response to customer requests, we built an image-to-text conversion system that exceeds the legal community's requirements for accuracy and processing speed, and that integrates easily with existing platforms such as Opticon, IPRO, and Summation. We can accommodate multiple load file formats and match virtually any project requirements.
Text Recognition As A Service (SaaS)
Developers can access the OCR Cloud 2.0 via an HTTP(S) POST interface. Whenever and wherever text recognition is required, images are transmitted to the OCR Cloud 2.0. Load-balanced processing in the Cloud performs OCR immediately. The recognition result is returned via the same interface, either by querying for the result or by URL notification. OCR-IT OCR Cloud 2.0 is a robust and scalable platform.
OCR-IT Conversion: the Simpler Solution
OCR API Cloud software helps users add character recognition to their software products for computer hardware, online portals, and mobile devices. Trials have been successfully conducted on legal documents written in various languages. This Optical Character Recognition software does not involve complex procedures such as software licensing. Developers submit images and access the service through web services with a few lines of code. Recognition is available in multiple languages, according to the developer's preference.
Developers can access the OCR Software Development Cloud through an HTTP interface whenever they need text recognition from image files. The images are sent to the OCR API, which processes them immediately, performs optical character recognition, and returns the result through the same HTTP interface.
This is an extremely scalable platform and is virus free. The OCR aims to provide 100% accuracy in results. OCR Cloud is a fully independent web program, powerful enough to be used from mobile devices and other web-based applications to convert image files into usable text through optical character recognition. The OCR library provides complete solutions for those involved in litigation support and other corporate environments.
OCR CLOUD 2.0 API
OCR Cloud 2.0 is a cloud-based state of the art hosted OCR platform designed to convert millions of pages ACCURATELY and EFFICIENTLY at unbeatable prices. It requires no complex software licensing or purchases, and provides easy access to best OCR technology for integrations within minutes.
To meet growing needs in distributed applications and mobile markets, OCR-IT LLC created OCR Cloud 2.0 – the next generation document conversion platform – a flexible, efficient, powerful and scalable platform that can handle high volumes of pages and large numbers of requests. By combining best-of-breed OCR engines and industry-leading system integration expertise, OCR Cloud 2.0 now offers the highest accuracy document conversion at an unbeatable price.
CAPABILITIES
OCR Cloud 2.0 platform can convert virtually any image (TIF, JPG, PNG, BMP) or PDF to any standard text-based document type (TXT, DOC, RTF, XLS, PPT, XML, HTML) or searchable PDF.
Auto-language detection and support for over 200 languages, including Latin based languages, Cyrillic based languages, Chinese, Japanese, Korean, Thai, and Hebrew.
FREE TRIAL
One of the most powerful automated ways to reach OCR Cloud 2.0 is through the Web-based API. It gives software application developers flexible, on-demand access to award-winning OCR without any initial investment. Free Trial and production subscriptions are available online through an automated portal.
ARCHITECTURE
Accurate and 100% automated to ensure privacy
OCR Cloud 2.0 is built on high-accuracy automated text recognition technology and a modern, state-of-the-art platform. How accurate? Benchmark tests show that recognition from OCR-IT LLC delivers accuracy that is virtually on par with leading OCR software alternatives. A free development account offers full access to the API for your own evaluation.
Privacy and Security of data
OCR Cloud 2.0 is a fully automated service without any human intervention. This is important, since you as developers, as well as your users, want to be certain that your images are secure and private. OCR-IT LLC understands that and treats security and privacy among its top priorities. Our security mechanisms provide a variety of controls to access and delete your image once the process completes.
OCR on Mobile devices
OCR Cloud 2.0 is a powerful Web-based API which allows developers of mobile and small footprint applications to integrate highly accurate Optical Character Recognition technologies that convert images and photographs into manageable, usable and searchable text. With advanced binarization, image pre-processing and filtering algorithms, OCR Cloud 2.0 produces quality results even from less than perfect pictures.
Text Recognition as a Service (SaaS)
Developers can access the OCR Cloud 2.0 via an HTTP(S) interface. Whenever and wherever text recognition is required, images are transmitted to the OCR Cloud 2.0 using this interface. Once transmitted, the images are processed in the Cloud where OCR is performed. The recognition result is returned via the same interface, either by querying for result or notification. The OCR-It OCR Cloud 2.0 is a robust platform designed to operate at scale. It is commercially deployed in numerous distributed or mobile environments.
CLOUD BASED OCR API
Image Formats
- TIF
- JPG
- JPEG2000
- PNG
- BMP
OCR Languages
- Latin based languages
- Cyrillic based languages
- Chinese, Japanese, Korean, Thai
- Hebrew
- Arabic
Auto-language/multi-language detection and support for over 190 languages.
Output Formats
- Cleaned JPG, TIF, PDF Image export
- Searchable PDF (Text under/over image, PDF/A, PDF Compressed)
- TXT (standard or Unicode)
- DOC / DOCX / RTF
- XLS / XLSX
- XML
- HTML
- ODBC Compatible Databases
Multiple simultaneous output streams available.
Data Transfer Methods
- POST HTTP via Web API
TABLE OF CONTENTS
Swagger URL
SIGN UP AND PRICING
Overview
1. SUBMITTING A JOB
OVERVIEW
INPUT PARAMETERS
ApiKey (required)
InputFiles (required)
Name (optional)
Password (optional)
InputUrl (required if inputBlob not provided)
InputType (required in a few cases)
InputBlob (optional)
NotifyURL (optional)
CleanupSettings (optional)
OCRSettings (optional)
OutputSettings (optional)
Error Element On Failed Job Submission
Examples
2. HANDLING JOB STATUS
2.1 Status for jobs in progress
2.2 Status for successful jobs
2.3 Status for expired jobs
2.4 Status for failed jobs
3. CLEANUP JOB
4. RETRIEVING JOB RESULTS
5. PER-PAGE CHARGES
6. LIST OF SUPPORTED LANGUAGES
Languages with full dictionary support
Languages without dictionary support
Artificial languages
Formal languages
Note
Questions?
SWAGGER URL
Please visit the Swagger page to check the latest description of the service methods. This documentation may become outdated, but the Swagger page always reflects the current state of the API:
SIGN UP AND PRICING
GET PRICING INFORMATION AND SIGN UP FOR AN API KEY:
OVERVIEW
The DataCapture.cloud OCR Web API lets you submit OCR requests (images in PDF / TIFF / PNG / JPG / BMP / PCX / DCX formats) and get back textual results (in TXT / PDF / RTF / Word / Excel / XML / CSV / others, with full Unicode support). Multilingual OCR in a variety of languages (listed at the end of this document) is supported.
Key Features:
- Support of Common Image Formats
- Variety of Print Types
- Image Cleanup: Deskew, Despeckle, Remove Texture, Automatic Rotation Detection
- Over 180 OCR Languages
- Mixed Languages Auto-detection
- Barcode Recognition
- Two Speeds of Text Recognition: Quality, Speed
- Specialized Text Extraction Algorithms
- All Popular Output Formats
- Enhanced Error Handling
By using the various API settings, you can optimize the OCR process to a variety of sources (scans, digital camera images, etc) and a variety of purposes (full-text indexing of articles, invoice scanning, etc). Barcode scanning is also supported. For assistance in optimizing the API for your particular task, please contact our Support Team.
Using the API consists of the following stages:
- Submit a Job
- Handle the Job Status – one or both of the following:
- Check Job Status manually
- Get notified about Job Status automatically
- Get Results of a Job
1. SUBMITTING A JOB
OVERVIEW
Submit a job by sending an HTTP POST request to the following URL:
http://BASEURL/api/jobs
The request message body should contain JSON of the following format (explained in detail below):
{
  "apiKey": "",
  "profile": "",
  "notifyUrl": "",
  "inputFiles": [],
  "cleanupSettings": {},
  "ocrSettings": {},
  "outputSettings": {}
}
The Content-Type of the request should be “application/json”.
In case of success, the response will be an HTTP 200 (Success) response code, and the following JSON (explained in detail below):
{
  "jobUrl": "string",
  "status": "Submitted"
}
In case of an error, an HTTP error code is returned along with JSON explaining the error (see section on Error Elements at the end of this document).
You will be charged for 1 page of OCR upon successful job submission, and for the rest of the pages (in case of multi-page document) upon successful job completion. Please note that certain errors (such as a corrupt input file) can only be detected once you’ve already been charged for the 1st page.
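The submission step described above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: `BASE_URL` is the document's own placeholder, and the helper names `build_job_request` and `submit_job` are hypothetical.

```python
import json
import urllib.request

# Placeholder: the document only gives the service address as "BASEURL".
BASE_URL = "http://BASEURL"


def build_job_request(api_key, input_url, notify_url=None):
    """Build (but do not send) the HTTP POST request for /api/jobs."""
    body = {"apiKey": api_key, "inputFiles": [{"inputUrl": input_url}]}
    if notify_url is not None:
        body["notifyUrl"] = notify_url
    return urllib.request.Request(
        BASE_URL + "/api/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_job(request):
    """Send the request and return the decoded JSON response.

    On an HTTP error code, urlopen raises urllib.error.HTTPError; its body
    should carry the JSON error description mentioned above.
    """
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the JSON body before a page is charged.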
INPUT PARAMETERS
APIKEY (REQUIRED)
This is your API key, which is issued to you when you subscribe to the DataCapture.cloud API.
PROFILE (OPTIONAL):
If this parameter is specified, the API uses the settings of the named profile for file processing. If it is not specified, the "Default" profile is used.
Currently only the "Default" profile is supported.
Please note that the OCR API does not currently share profiles with the other DataCapture.cloud services.
INPUTFILES (REQUIRED):
An array of JSON objects, one object per file:
"inputFiles": [{
  "name": "string",
  "password": "string",
  "inputUrl": "string",
  "inputBlob": "string",
  "inputType": "string"
}]
NAME (OPTIONAL)
Optional parameter that provides the output filename. It is not currently supported.
PASSWORD (OPTIONAL)
Required if the file is protected by a password.
INPUTURL (REQUIRED IF INPUTBLOB NOT PROVIDED)
The URL of the image on which you want to perform OCR (must be http:// or https://)
NOTE 1: Make sure that the InputURL is properly encoded. This is especially a concern if the URL contains query parameters. For example, if your image is at:
http://example.com/images?id=565&size=large,
the job request should be:
{"inputFiles": [{"inputUrl": "http://example.com/images?id=565&amp;size=large"}]}
Note that the "&" in the original URL has turned into "&amp;", as required by encoding rules.
Normally, if you use a standard library for dealing with JSON, this would be done for you automatically. However, if you are constructing JSON manually from strings, you may need to do this manually.
NOTE 2: Do not URL-encode (percent-encode) the InputURL. For example, if your image is at:
http://example.com/My%20Picture.jpg,
the job request should be:
{"inputFiles": [{"inputUrl": "http://example.com/My Picture.jpg"}]}
Note that a real space is used instead of the "%20" percent-encoded version.
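As an illustration of NOTE 2, the sketch below removes percent-encoding from a URL before placing it in an "inputFiles" entry. The helper name `input_file_entry` is hypothetical, and the approach assumes the URL contains no percent-encoded characters whose decoding would change its meaning (for example an encoded "&" or "?" inside a query value).

```python
import json
from urllib.parse import unquote


def input_file_entry(percent_encoded_url):
    """Return an inputFiles entry with percent-encoding removed from the URL."""
    return {"inputUrl": unquote(percent_encoded_url)}


# The example URL from NOTE 2, decoded and serialized as a request body.
entry = input_file_entry("http://example.com/My%20Picture.jpg")
print(json.dumps({"inputFiles": [entry]}))
```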
The image cannot exceed 200MB in size and cannot take more than 15 minutes to download.
The image must be in a supported format (see table below). If the image URL path (not counting the query string, if any) does not end in a dot followed by a supported extension (case-insensitive, see table below), the InputType parameter must be provided. E.g.:
http://example.com/scan001.tif – InputType not required (TIF auto-detected)
http://example.com/scan001.tif?resolution=high – InputType not required (TIF auto-detected)
http://example.com/scan001 – InputType required
http://example.com/scan001?format=.tif – InputType required
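The auto-detection rule above can be mirrored on the client side to decide whether InputType has to be sent. The following is a sketch under assumptions: the function name `detect_input_type` is hypothetical, and the extension map covers only the rows of the supported-formats table that list an extension, so any other extension falls back to "InputType required".

```python
import posixpath
from urllib.parse import urlsplit

# Extension -> InputType, taken from the supported-formats table in this document.
EXTENSION_TO_TYPE = {
    "bmp": "BMP", "pcx": "PCX", "dcx": "DCX",
    "jpg": "JPG", "jpeg": "JPG",
    "tif": "TIF", "tiff": "TIF",
    "png": "PNG",
}


def detect_input_type(url):
    """Return the type implied by the URL path, or None if InputType is required."""
    path = urlsplit(url).path  # the query string is ignored, as the rule states
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return EXTENSION_TO_TYPE.get(extension)


for url in (
    "http://example.com/scan001.tif",
    "http://example.com/scan001.tif?resolution=high",
    "http://example.com/scan001",
    "http://example.com/scan001?format=.tif",
):
    print(url, "->", detect_input_type(url) or "InputType required")
```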
Supported formats and extensions are:
FORMAT | EXTENSIONS | SUPPORTED FORMAT DETAILS |
Version 1.6 or earlier | ||
BMP | bmp | 2-bit – Uncompressed Black & White; 4- and 8-bit – Uncompressed Palette; 16-bit – Uncompressed Mask; 24-bit – Uncompressed Palette and TrueColor; 32-bit – Uncompressed Mask |
PCX | pcx | 2-bit Black & White; 4- and 8-bit Gray |
DCX | dcx | 2-bit Black & White; 4- and 8-bit Gray |
JPG | jpg, jpeg | JPEG: Gray, Color; JPEG 2000: Gray Part 1, Color Part 1 |
TIF | tif, tiff | Black & White: uncompressed, CCITT3, CCITT3FAX, CCITT4, PackBits, ZIP, LZW; Gray: uncompressed, PackBits, JPEG, ZIP, LZW; TrueColor: uncompressed, JPEG, ZIP, LZW; Palette: uncompressed, PackBits, ZIP; Multi-image TIFF |
PNG | png | Black & White, Gray, Color |
INPUTTYPE (REQUIRED IN A FEW CASES):
Specifies the input type. Must be one of the Supported Formats (leftmost column in the table above).
The InputType parameter is required if the image or file is provided through inputBlob, or if the type cannot be auto-detected from the URL (see InputURL above). In all other cases the parameter is optional.
INPUTBLOB (OPTIONAL)
The image or file can be posted as a base64 string in this parameter. In that case the inputType parameter must also be specified (see above).
Please note that only one blob file can be posted per request. It is also not allowed to post both a file specified by inputUrl and a blob in the same request.
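A minimal sketch of building an "inputFiles" entry from raw bytes, assuming the service expects standard (RFC 4648) base64 text; the document does not say which base64 variant is used. The helper name `blob_entry` is hypothetical.

```python
import base64


def blob_entry(file_bytes, input_type):
    """Return an inputFiles entry carrying the file as a base64 string.

    inputType is always set because it is required with inputBlob.
    """
    return {
        "inputBlob": base64.b64encode(file_bytes).decode("ascii"),
        "inputType": input_type,
    }


# Only one blob may be posted per request, so the array has a single entry.
input_files = [blob_entry(b"raw image bytes go here", "PNG")]
```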
NOTIFYURL (OPTIONAL):
The URL to which a notification should be sent when the job succeeds or fails (see section 2 on notifications). Must be http:// or https://.
NOTE: The NotifyURL must not be URL-encoded (i.e. should use " " and not "%20"), and must be encoded (i.e. should use "&amp;" and not "&"), just like the InputURL. See the InputURL section above for more details and examples.
OCR API will send either a successful (see 2.2) or failed (2.4) status report to the webhook.
CLEANUPSETTINGS (OPTIONAL):
Settings that control image cleanup, in the following form (every element is optional):
"cleanupSettings": {
  "deskew": true,
  "removeGarbage": true,
  "removeTexture": true,
  "splitDualPage": true,
  "rotationType": "NoRotation"
}
The settings are explained below:
Deskew | (Boolean) Specifies whether the skew angle of an image should be corrected during preprocessing. This mode is recommended if you want to automatically correct skew for the images you work with. The default value is 'true'. |
RemoveGarbage | (Boolean) Specifies whether garbage (excess dots that are smaller than a certain size) should be removed from the image during preprocessing. The default value is 'true'. |
RemoveTexture | (Boolean) Specifies whether background noise should be cleared before the recognition process starts. The default value is 'true'. |
SplitDualPage | (Boolean) Specifies whether the API should try to split the image vertically into 2 separate pages. The default value is 'false'. |
RotationType | (String) Specifies what type of rotation will be performed on the image during preprocessing. The default value is "Automatic", which means that rotation will be detected automatically. Allowed values: NoRotation – no rotation; Automatic – auto-detect rotation; Clockwise – rotate by 90 degrees clockwise; Counterclockwise – rotate by 90 degrees counterclockwise; Upsidedown – rotate by 180 degrees. |
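Since RotationType is case-sensitive and an invalid value is rejected with an error, it can help to validate the settings before submitting. The sketch below uses the defaults and allowed values from the table above; the function name `cleanup_settings` is hypothetical.

```python
# Allowed RotationType values, as listed in the table above (case-sensitive).
ROTATION_TYPES = {
    "NoRotation", "Automatic", "Clockwise", "Counterclockwise", "Upsidedown",
}


def cleanup_settings(deskew=True, remove_garbage=True, remove_texture=True,
                     split_dual_page=False, rotation_type="Automatic"):
    """Build a cleanupSettings object; argument defaults mirror the documented ones."""
    if rotation_type not in ROTATION_TYPES:
        raise ValueError("unsupported rotationType: %r" % (rotation_type,))
    return {
        "deskew": deskew,
        "removeGarbage": remove_garbage,
        "removeTexture": remove_texture,
        "splitDualPage": split_dual_page,
        "rotationType": rotation_type,
    }
```

For example, `cleanup_settings(rotation_type="NoRotation")` yields an object that can be placed under the "cleanupSettings" key of the job request.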
OCRSETTINGS (OPTIONAL):
Settings that control image recognition, in the following form (every element is optional):
"ocrSettings": {
  "speedOcr": false,
  "lookForBarcodes": true,
  "analysisMode": "MixedDocument",
  "printType": "Normal",
  "ocrLanguage": "English"
}
The settings are described below:
PrintType | (Semicolon-delimited list of strings) Specifies the types of printed text in the image. The default value is "Normal", which corresponds to common typographic text equivalent to laser printer output. Allowed values: Normal (modern text); Typewriter; Matrix; OCR_A (OCR-A text); OCR_B (OCR-B text); MICR_E13B. To recognize more than one text type in the same document, separate the types with semicolons without spaces. For example, "Normal;Typewriter". |
OCRLanguage | (Semicolon-delimited list of strings) Specifies which of the over 200 supported languages should be used for OCR, including mixed languages within the same document. See the list of supported languages at the end of this document (section 6). The default value is "English". To specify more than one language, separate the languages with semicolons (without spaces), for example: "English;Dutch (Belgium);Danish". |
SpeedOCR | (Boolean) Provides faster recognition (by as much as 2-2.5 times, depending on server load) at the cost of a moderately increased error rate (1.5-2 times more errors). On good, print-quality texts, OCR makes an average of 1-2 more errors per page in this mode, which in some cases is a small sacrifice for the substantial increase in speed. Such a moderate increase in error rate can be tolerated in many cases, such as full-text indexing with "fuzzy" searches, preliminary recognition, etc. The default value is 'false'. |
AnalysisMode | (String) Specifies how aggressively the text should be extracted. The default value is "FullPageDocument". Allowed values: FullPageDocument – useful if you export your text to document archives: the full page layout is retained and full-text search is available. This mode looks for images and text within an image. FullTextIndexing – used to extract data from a document, including text in pictures. Note that the OCR retains both the picture and the text in it. Text extracted from a picture block can only be exported to TXT, PDF and XML formats (XML export support is coming soon). The data can then be used for subsequent full-text indexing and search. The program retains the logical reading order, pictures, and tables. InvoicePreprocessing – used to pre-process invoices, which are usually noisy, low-quality images. This mode extracts all text from the image, including tables, pictures, small text areas, and noise. The result is plain text without table blocks and picture blocks. ExtractBarcodes – used to extract barcodes only. NOTE: Barcode values are extracted in all modes as long as LookForBarcodes is true. |
LookForBarcodes | (Boolean) Specifies whether barcodes should be recognized. The default value is 'true'. |
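PrintType and OCRLanguage are semicolon-delimited lists without spaces around the separators, which is easy to get wrong when concatenating strings by hand. A small sketch, with the hypothetical helper name `ocr_settings` and defaults copied from the table above:

```python
def ocr_settings(languages=("English",), print_types=("Normal",),
                 speed_ocr=False, look_for_barcodes=True,
                 analysis_mode="FullPageDocument"):
    """Build an ocrSettings object, joining list values with ';' and no spaces."""
    return {
        "speedOcr": speed_ocr,
        "lookForBarcodes": look_for_barcodes,
        "analysisMode": analysis_mode,
        "printType": ";".join(print_types),
        "ocrLanguage": ";".join(languages),
    }


settings = ocr_settings(
    languages=["English", "Dutch (Belgium)", "Danish"],
    print_types=["Normal", "Typewriter"],
)
print(settings["ocrLanguage"])
print(settings["printType"])
```

Language names such as "Dutch (Belgium)" contain spaces of their own; only the separators must be free of spaces.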
OUTPUTSETTINGS (OPTIONAL)
Settings that control text result output, in the following form (every element is optional):
"outputSettings": {
  "exportFormat": "Text;PDF"
}
The settings are explained below:
ExportFormat | (Semicolon-delimited list of strings) Specifies the desired formats for text output. The default value is "Text;PDF", which corresponds to both Text and PDF output. Allowed values: RTF – export to *.RTF (rich-text) format; retains full page layout and preserves pictures; the most suitable paper size is selected automatically when saving the recognized text and pictures. MSWord – export to *.DOC (Microsoft Word) format; retains full page layout and preserves pictures; the most suitable paper size is selected automatically. MSExcel – export to *.XLS (Microsoft Excel) format. PDF – export to *.PDF format. DBF – export to *.DBF format. Text – export to *.TXT common formatted ASCII text-only output. CSV – export to *.CSV format. PPT – export to *.PPT format. XML – export to *.XML format. UnicodeText_UTF8 – export to *.UTF8.TXT format. UnicodeText_UTF16 – export to *.UTF16.TXT format. UnicodeCSV_UTF8 – export to *.UTF8.CSV format. UnicodeCSV_UTF16 – export to *.UTF16.CSV format. To produce more than one output format from the same image request, separate the desired output formats with semicolons without spaces. For example, "PDF;Text;UnicodeText_UTF8". NOTE: You will need to know the file extension of the desired format (specified above) to retrieve the job results (see section 2.2 of this document). |
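Because results are later identified by file extension, it can be convenient to keep the format names and their extensions in one place. The sketch below is illustrative only: the helper names are hypothetical, and the lowercase spelling of the extensions is an assumption based on the "outputFormat": "pdf" value in the status example of section 2.2.

```python
# Export format name -> file extension, from the ExportFormat description above.
# Lowercase spelling is an assumption (see the status example in section 2.2).
EXPORT_EXTENSIONS = {
    "RTF": "rtf", "MSWord": "doc", "MSExcel": "xls", "PDF": "pdf",
    "DBF": "dbf", "Text": "txt", "CSV": "csv", "PPT": "ppt", "XML": "xml",
    "UnicodeText_UTF8": "utf8.txt", "UnicodeText_UTF16": "utf16.txt",
    "UnicodeCSV_UTF8": "utf8.csv", "UnicodeCSV_UTF16": "utf16.csv",
}


def export_format_value(formats):
    """Join format names with ';' after checking them (names are case-sensitive)."""
    unknown = [name for name in formats if name not in EXPORT_EXTENSIONS]
    if unknown:
        raise ValueError("unsupported export formats: %s" % ", ".join(unknown))
    return ";".join(formats)


def expected_extensions(export_format):
    """Return the file extensions to look for, in the order requested."""
    return [EXPORT_EXTENSIONS[name] for name in export_format.split(";")]
```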
ERROR ELEMENT ON FAILED JOB SUBMISSION
If the job submission fails, you will receive an appropriate HTTP error code, as well as an <Error>Code</Error> response. The possible values of ‘Code’ are:
Code | HTTP Error Code | Description |
BadInputURL | 400 | InputURL is invalid or missing, or is not an HTTP/HTTPS URL |
BadNotifyURL | 400 | NotifyURL is invalid or missing, or is not an HTTP/HTTPS URL |
BadInputType | 400 | The specified InputType is invalid, OR InputType is missing, and auto-detected file type is not valid, OR InputType is missing, and auto-detection of file type has failed |
BadRotationType | 400 | Rotation specified in CleanupSettings is invalid. Please note that it is case-sensitive. |
BadAnalysisType | 400 | AnalysisMode specified in OCRSettings is invalid. Please note that it is case-sensitive. |
BadPrintType | 400 | PrintType specified in OCRSettings is invalid. Please note that it is case-sensitive. |
BadExportFormat | 400 | ExportFormat specified in OutputSettings is invalid. Please note that it is case-sensitive. |
OCRSettingsTooComplex | 400 | OCRSettings are too complex. Try reducing the number of OCRLanguages and PrintTypes you are recognizing. |
InternalError:ErrorNumber | 500 | Internal error has occurred. Contact support@wisetrend.com |
EXAMPLES
URL Example:
HTTP POST to http://BASEURL/api/jobs
Message body example (simple):
{
  "inputFiles": [
    {
      "inputUrl": "http://www.example.com/images/scan001.tif"
    }
  ]
}
Message body example (with full parameters):
{
  "notifyUrl": "http://example.com/notify",
  "inputFiles": [
    {
      "inputUrl": "http://www.example.com/getScans.php?DocumentID=569",
      "inputType": "TIF"
    }
  ],
  "cleanupSettings": {
    "deskew": true,
    "removeGarbage": true,
    "removeTexture": true,
    "splitDualPage": true,
    "rotationType": "NoRotation",
    "outputFormat": "pdf",
    "resolution": "high",
    "jpegQuality": "string"
  },
  "ocrSettings": {
    "speedOcr": true,
    "lookForBarcodes": true,
    "analysisMode": "MixedDocument",
    "printType": "Print",
    "ocrLanguage": "French"
  },
  "outputSettings": {
    "exportFormat": "Text;PDF"
  }
}
Response example (status “Submitted”):
{
  "jobUrl": "http://BASEURL/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "Submitted"
}
See next section for different available Status responses.
2. HANDLING JOB STATUS
There are two ways to handle job status:
- You can manually check the status of any job by sending an HTTP GET request to the JobURL that you received when you submitted the job.
- You can automatically get notified when the job succeeds or fails if you provide a NotifyURL when you submit a job. There will only be one attempt to notify you. It will be made when the job fully succeeds or fails (you will not get any intermediate status notifications). The notification will consist of an HTTP POST containing JSON status information (see 2.2 and 2.4).
Regardless of which method you use, the status report is in the same format, as described below.
2.1 STATUS FOR JOBS IN PROGRESS
For jobs that are not yet complete, the status report looks as follows:
{
  "jobUrl": "http://BASEURL/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "[status]"
}
"status" can be one of:
"Submitted" – the job has been submitted but the image to be OCRed has not yet been downloaded
"Processing" – the image has been downloaded and is in the process of being OCRed
"Finished" – successful/expired/failed jobs.
"jobUrl" repeats the URL where updated job status may be obtained.
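Manual status checking can be sketched as a polling loop that stops as soon as the status is no longer "Submitted" or "Processing". The names `fetch_status` and `wait_for_job`, the polling interval, and the attempt limit are assumptions for illustration; the document does not state whether the status request needs any extra parameters or how often polling is acceptable.

```python
import json
import time
import urllib.request


def fetch_status(job_url):
    """GET the job URL and return the decoded JSON status report."""
    with urllib.request.urlopen(job_url) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_job(job_url, fetch=fetch_status, interval_seconds=5.0,
                 max_attempts=60):
    """Poll until the job leaves the in-progress states, then return the report."""
    for _ in range(max_attempts):
        report = fetch(job_url)
        if report.get("status") not in ("Submitted", "Processing"):
            return report
        time.sleep(interval_seconds)
    raise TimeoutError("job still in progress after %d checks" % max_attempts)
```

The `fetch` parameter exists so the loop can be exercised with canned reports instead of network calls.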
2.2 STATUS FOR SUCCESSFUL JOBS
For jobs that have completed successfully, the status report looks as follows:
{
  "jobUrl": "http://BASEURL/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "Finished",
  "download": [
    {
      "uri": "http://ocrapi.datacapture.cloud/api/Files?JobId=00000000-0000-0000-0000-000000000000&outputFormat=pdf",
      "outputFormat": "pdf",
      "creationDateUTC": "2017-04-01T06:52:08.839Z"
    }
  ],
  "statistics": {
    "files": [
      {
        "fileName": "readme",
        "downloadDateUTC": "2017-04-01T06:52:08.839Z",
        "warning": "string",
        "totalCharacters": 5594,
        "uncertainCharacters": 123,
        "pagesArea": 3
      }
    ],
    "creationDateUTC": "2017-04-01T06:52:08.839Z",
    "totalCharacters": 5594,
    "uncertainCharacters": 123,
    "pagesArea": 3
  }
}
There will be one entry in the "download" array for each requested output format; by default, there will be one for TXT (plain text) and one for PDF. The entries may appear in any order. Each contains an "outputFormat" indicating the output type (file extension), and a "uri" containing the address where the output may be downloaded.
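Since the download entries may appear in any order, it is safer to look them up by format than by position. A minimal sketch, using the field names from the JSON example above; the helper name `download_links` is hypothetical.

```python
def download_links(status_report):
    """Map each output format to its download address.

    Returns an empty dict when the report has no "download" array,
    for example for a job that is still in progress.
    """
    return {
        item["outputFormat"]: item["uri"]
        for item in status_report.get("download", [])
    }


report = {
    "status": "Finished",
    "download": [
        {"uri": "http://BASEURL/api/Files?JobId=0&outputFormat=pdf",
         "outputFormat": "pdf"},
        {"uri": "http://BASEURL/api/Files?JobId=0&outputFormat=txt",
         "outputFormat": "txt"},
    ],
}
links = download_links(report)
print(links["txt"])
```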
As usual, “JobURL” repeats the URL where updated job status may be obtained.
2.3 STATUS FOR EXPIRED JOBS
Job results are not guaranteed to be kept for more than 24 hours. If a job has expired, its status report will not have a "download" array, and the "status" will be "Expired".
2.4 STATUS FOR FAILED JOBS
For jobs that have failed, the status report looks as follows:
{
  "jobUrl": "http://ocrapi.datacapture.cloud/api/Jobs?JobId=00000000-0000-0000-0000-000000000000",
  "status": "Failed",
  "errors": [
    {
      "code": "string",
      "message": "string"
    }
  ]
}
The "status" may be one of the following:
FailedDownload | Could not download the image to be OCRed |
FailedConversion | Could not perform OCR |
FailedNoFunds | Insufficient funds for the number of pages you are attempting to OCR |
FailedInternalError | Internal error, please contact support@wisetrend.com |
The "errors" array may or may not be present. If it is present, it contains one or more objects with "code" and "message" fields that can help you debug the problem. Here are some common "code" values:
ConvertFailed | The ABBYY OCR engine reported an error during conversion. Make sure that the input file is not corrupt and is not password-protected. |
SubmitFailed | Could not submit the OCR job. Possibly an internal error, contact support@wisetrend.com |
DownloadRejected | Could not download the input image. Ensure that it does not exceed maximum size and that the server with the image responds promptly. |
DownloadFailed | Could not download the input image. Ensure that the image URL exists and does not require authentication. |
As usual, "jobUrl" repeats the URL where updated job status may be obtained.
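Because the error list is optional, code that reports failures should not assume it exists. The following sketch turns a failed status report into a list of printable lines; the helper name `describe_failure` is hypothetical and the field names follow the JSON example above.

```python
def describe_failure(status_report):
    """Return human-readable lines for a failed job, tolerating missing errors."""
    lines = ["status: %s" % status_report.get("status", "unknown")]
    for error in status_report.get("errors") or []:
        lines.append("%s: %s" % (error.get("code", "?"), error.get("message", "")))
    return lines


failed = {
    "status": "FailedDownload",
    "errors": [{"code": "DownloadFailed", "message": "image URL not found"}],
}
for line in describe_failure(failed):
    print(line)
```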
3. CLEANUP JOB
To clean up a job, send an HTTP POST request to http://BASEURL/api/[jobId]/cleanup, passing your apiKey in the request body.
This method sets the job status to EXPIRED. After that, no further information about the job, including files and statistics, can be retrieved through the API.
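A sketch of the cleanup call follows. The document only says that apiKey goes in the body, so sending it as a JSON object of the form {"apiKey": "..."} with an application/json content type is an assumption, as are the names `BASE_URL` and `build_cleanup_request`; check the Swagger page for the actual body format.

```python
import json
import urllib.request

# Placeholder: the document only gives the service address as "BASEURL".
BASE_URL = "http://BASEURL"


def build_cleanup_request(job_id, api_key):
    """Build (but do not send) the POST request for /api/[jobId]/cleanup."""
    return urllib.request.Request(
        "%s/api/%s/cleanup" % (BASE_URL, job_id),
        # Assumed body shape: a JSON object holding only the API key.
        data=json.dumps({"apiKey": api_key}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` once the job's files are no longer needed.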
4. RETRIEVING JOB RESULTS
To get the results of the job, use the URLs from the successful job status reports (see section 2.2 above). Results will be returned with the correct Content-Type header. Note that results may be deleted after 7 days.
5. PER-PAGE CHARGES
You will be charged for 1 page at the time the OCR request is submitted (regardless of whether the job fails or succeeds) – this is the minimum charge to attempt a job. You will be charged for the rest of the pages only when the job succeeds.
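The charging rule can be written as a short function, which may help when estimating costs for a batch. The function name `pages_charged` is hypothetical; the rule itself is the one stated above.

```python
def pages_charged(total_pages, job_succeeded):
    """Number of pages billed for one job.

    One page is charged at submission regardless of the outcome; the
    remaining pages are charged only if the job succeeds.
    """
    if total_pages < 1:
        raise ValueError("a job has at least one page")
    return total_pages if job_succeeded else 1


print(pages_charged(10, job_succeeded=True))   # 10
print(pages_charged(10, job_succeeded=False))  # 1
```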
6. LIST OF SUPPORTED LANGUAGES
LANGUAGES WITH FULL DICTIONARY SUPPORT
- Armenian (Eastern)
- Armenian (Grabar)
- Armenian (Western)
- Bashkir
- Bulgarian
- Catalan
- Chinese Simplified*
- Chinese Traditional*
- Croatian
- Czech
- Danish
- Dutch (Belgium)
- Dutch (Netherlands)
- English
- Estonian
- Finnish
- French
- German
- German (new spelling)
- Greek
- Hebrew*
- Hungarian
- Indonesian
- Italian
- Japanese*
- Korean*
- Latvian
- Lithuanian
- Norwegian (Group of Norwegian (Nynorsk) and Norwegian (Bokmal) languages.)
- Norwegian (Bokmal)
- Norwegian (Nynorsk)
- Old English
- Old French
- Old German
- Old Italian
- Old Spanish
- Polish
- Portuguese (Brazil)
- Portuguese (Portugal)
- Romanian
- Russian
- Slovak
- Slovenian
- Spanish
- Swedish
- Tatar
- Turkish
- Ukrainian
LANGUAGES WITHOUT DICTIONARY SUPPORT
- Abkhaz
- Adyghe
- Afrikaans
- Agul
- Albanian
- Altaic
- Avar
- Aymara
- Azerbaijani (Cyrillic)
- Azerbaijani (Latin)
- Basque
- Belarussian
- Bemba
- Blackfoot
- Icelandic
- Ingush
- Irish
- Jingpo
- Kabardian
- Kalmyk
- Karachay-Balkar
- Karakalpak
- Kasub
- Kawa
- Kazakh
- Khakas
- Khanty
- Kikuyu
- Kirghiz
- Kongo
- Koryak
- Kpelle
- Kumyk
- Kurdish
- Lak
- Latin
- Lezgin
- Luba
- Macedonian
- Malagasy
- Malay
- Malinke
- Maltese
- Mansi
- Maori
- Breton
- Bugotu
- Buryat
- Cebuano
- Chamorro
- Chechen
- Chukchee
- Chuvash
- Corsican
- Crimean Tatar
- Crow
- Dakota
- Dargwa
- Dungan
- Eskimo (Cyrillic)
- Mari
- Maya
- Miao
- Minangkabau
- Mohawk
- Mongol
- Mordvin
- Nahuatl
- Nenets
- Nivkh
- Nogay
- Nyanja
- Ojibway
- Ossetian
- Papiamento
- Provencal
- Quechua
- Rhaeto-Romanic
- Romanian (Moldavia)
- Romany
- Ruanda
- Rundi
- Russian (old spelling)
- Sami (Lappish)
- Samoan
- Scottish Gaelic
- Selkup
- Serbian (Cyrillic)
- Serbian (Latin)
- Shona
- Somali
- Eskimo (Latin)
- Even
- Evenki
- Faroese
- Fijian
- Frisian
- Friulian
- Gagauz
- Galician
- Ganda
- German (Luxembourg)
- Guarani
- Hani
- Hausa
- Hawaiian
- Sorbian
- Sotho
- Sunda
- Swahili
- Swazi
- Tabassaran
- Tagalog
- Tahitian
- Tajik
- Tok Pisin
- Tongan
- Tswana
- Tun
- Turkmen
- Tuvan
- Udmurt
- Uighur (Cyrillic)
- Uighur (Latin)
- Uzbek (Cyrillic)
- Uzbek (Latin)
- Welsh
- Wolof
- Xhosa
- Yakut
- Zapotec
- Zulu
ARTIFICIAL LANGUAGES
- Esperanto
- Ido
- Interlingua
- Occidental
FORMAL LANGUAGES
- Basic
- C/C++
- COBOL
- Fortran
- Java
- Pascal
- Simple chemical formulas
- MICR (E-13B) – recognition language for MICR (E-13B) text type
- Numbers Only
NOTE
Languages marked with "*" are not available in this API release, but may be available by special request. Limited export formats and combinations of languages are available. Consult additional documentation or contact the DataCapture.cloud team for assistance.
QUESTIONS
Contact support@wisetrend.com