OcrSrv API: Using the OCR API

To use the OCR API, an application first needs to link against the OcrSrv library and then create an OCR engine instance according to the recognition type.

In general, using the OCR API involves the following steps:

  1. Link the OcrSrv library (ocrsrv.lib) to the project file of the application.
  2. Provide an observer class which implements the MOCREngineObserver interface (a sketch is shown after this list).
  3. Call the OCREngineFactory::CreateOCREngineL method and pass an OCREngineFactory::TEngineType value to specify the recognition type.
  4. Cast the pointer returned from OCREngineFactory::CreateOCREngineL to either MOCREngineLayoutRecognize or MOCREngineRecognizeBlock according to the recognition type.
  5. Use the recognition interface to process images.
  6. Receive progress information during the recognition through the observer class.
  7. Receive the recognition result through the observer class.
  8. Release the OCR engine through the OCREngineFactory::ReleaseOCREngine method.
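
The following is a minimal sketch of the observer class mentioned in step 2. The callback names are the ones used in this document; their exact signatures, the class name COcrClientObserver, and the result type TOCRTextRgnInfo are assumptions made here for illustration, so consult the OcrSrv headers for the authoritative declarations. The interface may also declare further callbacks for the region recognition variants.

class COcrClientObserver : public CBase, public MOCREngineObserver
    {
public: // From MOCREngineObserver (signatures assumed, see note above)

    // Called when the layout analysis has completed; aBlocks lists the detected text areas
    void LayoutComplete(TInt aError, const TOCRBlockInfo* aBlocks, const TInt aBlockCount)
        {
        // Store the block information and let the user select the blocks to recognize
        }

    // Reports the recognition progress, for example as a percentage
    void RecognizeProcess(const TUint aPercent)
        {
        // Update a progress note in the UI
        }

    // Called when the recognition has completed; aError is KErrNone on success
    void RecognizeComplete(TInt aError, const TOCRTextRgnInfo* aBlocks, const TInt aBlockCount)
        {
        // Read the recognized text from the result blocks
        }
    };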

See the following sections for detailed instructions on how to use the interfaces.

OCR API initialization

To create the recognition instance, the client application first needs to provide three parameters to OCREngineFactory::CreateOCREngineL: a reference to the observer class inherited from MOCREngineObserver, a TOcrEngineEnv object that sets the recognition thread's priority and maximum heap size, and an OCREngineFactory::TEngineType enumeration value that specifies the recognition type. After creating the recognition instance, the client application receives a pointer to MOCREngineInterface and needs to cast it to either MOCREngineLayoutRecognize or MOCREngineRecognizeBlock according to the selected type.

Figure ‘Initialize OCR service’ shows the process of the OCR API initialization.

Figure 4: Initialize OCR service

The following code snippet demonstrates how to create a recognition instance that supports document layout analysis. Creating an instance that supports region recognition is quite similar and is therefore not presented here.

The OCR service performs the recognition in a worker thread whose priority must be set through TOcrEngineEnv.

TOcrEngineEnv env;

env.iPriority = EPriorityLess;          // Set thread's priority
env.iMaxHeapSize = 1200*KMinHeapGrowBy; // Set thread's heap maximum size

// Create the OCR engine instance. Note that "observer" is an object that implements the MOCREngineObserver interface.
MOCREngineInterface* myEngine = OCREngineFactory::CreateOCREngineL(observer, 
                                                                   env, 
                                                                   OCREngineFactory::EEngineLayoutRecognize);

// Convert the instance from a MOCREngineInterface pointer to MOCREngineLayoutRecognize
MOCREngineLayoutRecognize* layoutEngine = static_cast<MOCREngineLayoutRecognize*>(myEngine);

Recognition with layout analysis

The recognition with layout analysis consists of two steps. The first step is to analyze the entire image and get information about the areas where the text is located. The second step is to recognize some or all of those areas according to the user's selection, passed as an array of block indices. The layout analysis and recognition results are delivered asynchronously through the callback functions MOCREngineObserver::LayoutComplete and MOCREngineObserver::RecognizeComplete in the observer class.

Figure ‘Recognize with layout’ shows the process of recognition with layout analysis.

Figure 5: Recognize with layout

The following code snippet demonstrates how to use the layout recognition interface. After the layout analysis, the MOCREngineObserver::LayoutComplete function is called to inform the client application about the text block information. During the recognition process, the client application continuously receives progress information through the MOCREngineObserver::RecognizeProcess function. After the recognition, the result is provided through the MOCREngineObserver::RecognizeComplete function.

One or two supported languages have to be set active through the MOCREngineBase::SetActiveLanguageL method of the base interface. The purpose of setting two languages is to recognize an image that contains both of them. For example, there may be English words in a Chinese document; in that case the user should set English and Chinese as the active languages.

Note that no more than two languages can be mixed, and only a western language and an eastern language can be mixed together. It is not possible to set, for example, both Chinese and Japanese as active languages.

/**
* Set active languages
*/
RArray<TLanguage> languages;
languages.Append(ELangEnglish);    // A western language
languages.Append(ELangPrcChinese); // An eastern language

TRAPD(err, myEngine->Base()->SetActiveLanguageL(languages)); 

/**
* Layout analysis
*/
TOCRLayoutSetting layoutSettings;
layoutSettings.iBrightness = TOCRLayoutSetting::ENormal;
layoutSettings.iSkew = ETrue;      // Setting this to ETrue triggers the geometrical (skew) adjustment

_LIT(KFileName, "C:\\image.mbm");
CFbsBitmap image;
image.Load(KFileName);
const TInt handle = image.Handle();      // Get the handle from the font & bitmap server

// Type of myEngine is MOCREngineLayoutRecognize
TRAP(err, myEngine->LayoutAnalysisL(handle, layoutSettings));

The function MOCREngineObserver::LayoutComplete is called after the layout analysis has completed. Its aError parameter indicates whether the analysis was successful, and aBlockCount gives the number of text areas identified. The aBlocks parameter is a TOCRBlockInfo array that stores the position and extent of every identified text area. The user can then select which areas need to be recognized.

RArray<TInt> blockIndex; // Indices of the blocks to be recognized

for (TInt i = 0; i < blockCount; i++) // blockCount comes from the callback parameter aBlockCount
    {
    // Assume the block count is four and blocks 0 and 1 should not be recognized
    if (i == 0 || i == 1)
        {
        continue;
        }
    blockIndex.Append(i);
    }

// Recognize blocks 2 and 3; iRecogSettings is the client's recognition settings
// object, configured elsewhere
TRAPD(err, myEngine->RecognizeL(iRecogSettings, blockIndex));

Region recognition

Region recognition functions are declared in MOCREngineRecognizeBlock. To use this type of recognition, EEngineRecognizeBlock must be passed to OCREngineFactory::CreateOCREngineL as the type of the OCR engine. There are two types of region recognition:

Text area recognition

Figure ‘Recognize block’ shows the process of recognizing a specified text area.

Figure 6: Recognize block

The following code snippet demonstrates how to start a typical region recognition.

TOCRLayoutBlockInfo layoutInfo;

layoutInfo.iLayout = EOcrLayoutTypeH; // Set when the text lines are horizontal
layoutInfo.iText = EOcrTextMultiLine; // Set when there is more than one line inside this area
layoutInfo.iBackgroundColor = EOcrBackgroundLight; // Set when the text color is darker than the background
layoutInfo.iRect.SetRect(0, 0, 100, 100); // Set the recognition area

_LIT(KFileName, "C:\\image.mbm");
CFbsBitmap image;
image.Load(KFileName);
const TInt handle = image.Handle(); // Get the handle of the image

// Type of myEngine is MOCREngineRecognizeBlock
TRAPD(err, myEngine->RecognizeBlockL(handle, layoutInfo));

Special content recognition

Figure ‘Recognize special region’ shows the process of special content recognition.

Figure 7: Recognize special region

The following code snippet demonstrates how to start a typical special region recognition. The user can specify the text content to be either e-mail addresses, phone numbers, or web addresses.

TRegionInfo regionInfo;

regionInfo.iBackgroundColor = EOcrBackgroundLight;
regionInfo.iType = TRegionInfo::EEmailAddress;
regionInfo.iRect.SetRect(0, 0, 100, 100);

_LIT(KFileName, "C:\\image.mbm");
CFbsBitmap image;
image.Load(KFileName);
const TInt handle = image.Handle();

// Type of myEngine is MOCREngineRecognizeBlock
TRAPD(err, myEngine->RecognizeSpecialRegionL(handle, regionInfo));

Cancel recognition

During the recognition, the client application can cancel the recognition process. A cancel request is also handled asynchronously; the observer functions in MOCREngineObserver report KErrCancel through their aError parameter. Both recognition with layout analysis and region recognition can be canceled.

Figure ‘Cancel recognition’ shows the process of canceling the recognition.

Figure 8: Cancel recognition

The following code snippet demonstrates how to issue a typical Cancel request.

myEngine->Base()->CancelOperation(); 

Releasing the OCR API

To release the OCR engine instance, call the OCREngineFactory::ReleaseOCREngine function.

Figure ‘Release the OCR API’ shows how to release the OCR API.

Figure 9: Release the OCR API

The following code snippet demonstrates how to use OCREngineFactory::ReleaseOCREngine to release the layout recognition engine.

Note that only one type of recognition engine can exist at a time. The instance must be released before creating another instance with OCREngineFactory::CreateOCREngineL.

OCREngineFactory::ReleaseOCREngine(myEngine);

Error handling

All exceptions are reported through the Symbian OS leave mechanism. The tables below list the error codes; a sketch of trapping them follows the tables.

Table 1: General error messages
Exception         Description
KErrNoMemory      Reported when there is not enough memory for the layout analysis or the recognition.
KErrServerBusy    Reported when a new recognition request arrives while the OCR engine is busy.
KErrAbort         The child thread does not exist or the operation is aborted.
KErrArgument      Bad parameters.
KErrNotSupported  Some functionality is not supported.
KErrGeneral       General system-level error.
KErrNotFound      No engine or database found.

Table 2: Layout analysis specific error messages
Exception         Description
KErrOcrBadImage   Bad image or unsupported image format (only 24-bit colored or 8-bit gray scale images in bitmap format are supported).

Table 3: Recognition specific error messages
Exception              Description
KErrOcrBadRegion       Bad layout region.
KErrOcrNotSetLanguage  No active language set; one or two active languages must be set before layout analysis or recognition.

Table 4: Language specific error messages
Exception           Description
KErrOcrBadLanguage  Unsupported language.
KErrOcrBadDictFile  Bad database file.
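
The following sketch shows one way to handle these codes after a call that may leave, here MOCREngineLayoutRecognize::LayoutAnalysisL. It reuses the layoutEngine, handle, and layoutSettings variables from the earlier snippets; the recovery actions in the comments are suggestions rather than requirements of the API.

// Trap a leave from the layout analysis call and react to the error codes listed above
TRAPD(err, layoutEngine->LayoutAnalysisL(handle, layoutSettings));
switch (err)
    {
    case KErrNone:
        break;                       // Analysis started; wait for LayoutComplete
    case KErrOcrNotSetLanguage:
        // Set one or two active languages with SetActiveLanguageL and retry
        break;
    case KErrOcrBadImage:
        // Provide a 24-bit colored or 8-bit gray scale bitmap instead
        break;
    case KErrServerBusy:
        // A previous request is still running; cancel it or wait for completion
        break;
    default:
        // Handle or propagate the remaining system errors
        break;
    }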

Memory overhead

The dynamic memory consumption mostly comes from the OCR engine itself. Heap consumption is currently around 900-1000 KB, depending on the image size and the language variant.

Extensions to the API

The OCR API does not explicitly support any extensions.


Copyright © Nokia Corporation 2001-2008