Image processing involves document classification (recognizing document types), zoning of required fields, and recognition of text characters and picture or logo objects. Fundamental to all three functions is the ability of an ML/AI model to recognize objects.
The usual approach is to train models on manually labelled data for every class to be recognized (the ASCII set used for English alone has 95 printable characters). In addition, there are many printed fonts and unlimited ways of handwriting each symbol. Each specific form of a symbol is effectively a sub-class of the main character class, and each sub-class needs thousands of manually labelled samples. Recognition is strong where samples are plentiful and weak and error-prone where they are few; this sparse sample problem causes recognition errors. Errors also occur when a model encounters a new way of writing a character during processing. ML models therefore have to be retrained on failure data, and previous versions replaced with the retrained models. This cycle extends over several months, increasing the Time To Value (TTV) of automation implementations.
Skill Based Approach: We wanted to replace this ML approach with a skill-based one, in which a skill more fundamental than recognizing a text character is taught to the model.
Text characters, printed and handwritten, are composed of image features. These features can be lines (of different relative sizes, orientations and thicknesses), arcs, circles, dots, and so on. A character form can be described by listing these features together with their relative attributes. If the descriptions include allowances for the possible variations, they become the generic knowledge required to recognize text characters.
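To make the idea of a feature description with allowances concrete, here is a minimal Python sketch. It is not PDL syntax, which is not shown in this article; the names `Line`, `LineSpec` and `matches`, the tolerance values, and the choice to model only straight lines are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Line:
    """An observed straight stroke (hypothetical feature type)."""
    angle_deg: float   # orientation: 0 = horizontal, 90 = vertical
    rel_length: float  # stroke length divided by character height


@dataclass(frozen=True)
class LineSpec:
    """Allowed range for one line feature: the 'allowance for variation'."""
    angle_deg: float
    angle_tol: float
    min_rel_length: float
    max_rel_length: float

    def accepts(self, line: Line) -> bool:
        # Compare orientation modulo 180 degrees: a stroke has no direction.
        diff = abs(line.angle_deg - self.angle_deg) % 180.0
        diff = min(diff, 180.0 - diff)
        in_length = self.min_rel_length <= line.rel_length <= self.max_rel_length
        return diff <= self.angle_tol and in_length


def matches(specs, lines) -> bool:
    """True if the observed lines can be paired one-to-one with the specs."""
    if len(specs) != len(lines):
        return False
    # Brute force over orderings; fine for the handful of strokes in a character.
    return any(
        all(spec.accepts(line) for spec, line in zip(specs, perm))
        for perm in permutations(lines)
    )


# Assumed description of a capital "T": one horizontal bar, one vertical stem.
T_SPEC = [
    LineSpec(angle_deg=0, angle_tol=15, min_rel_length=0.4, max_rel_length=1.0),
    LineSpec(angle_deg=90, angle_tol=15, min_rel_length=0.7, max_rel_length=1.1),
]

upright_t = [Line(88, 0.95), Line(5, 0.6)]   # slightly wobbly handwriting
slanted = [Line(45, 0.9), Line(5, 0.6)]      # stem is too far from vertical

print(matches(T_SPEC, upright_t))
print(matches(T_SPEC, slanted))
```

The point of the sketch is that one description with tolerance ranges covers many fonts and handwriting styles, so no per-style labelled sample set is needed. A real description would also cover arcs, circles, dots and the relative positions of features.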
PDL: As the culmination of over five years of research, we developed an image feature description language that describes image object features for any text symbol of any language, or for any image or picture object, including logos and icons. The Patterns team called this language the Patterns Description Language (PDL).
The Patterns team went on to develop a Teacher Bot Worker that generates a PDL description for a given text character or object image, provided along with the language and character ID (Unicode), or an object ID for non-Unicode objects. The text recognizer models were also extended with a skill-based model that compares the features of an object image to be recognized against the PDL descriptions available to the model.
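The teach-then-compare workflow described above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the Patterns team's implementation: `teach`, `score` and `Recognizer` are hypothetical names, features are reduced to simple named counts, and the range-with-margin description merely stands in for a real PDL description.

```python
from typing import Dict, List, Optional, Tuple

Features = Dict[str, float]                  # e.g. {"vertical_lines": 1, "dots": 1}
Description = Dict[str, Tuple[float, float]]  # feature name -> allowed (low, high)


def teach(samples: List[Features], margin: float = 0.1) -> Description:
    """Teacher step (assumed): derive an allowed range per feature from examples."""
    if not samples:
        raise ValueError("at least one sample is required")
    names = set().union(*samples)
    desc: Description = {}
    for name in sorted(names):
        values = [s.get(name, 0.0) for s in samples]
        desc[name] = (min(values) - margin, max(values) + margin)
    return desc


def score(desc: Description, feats: Features) -> float:
    """Fraction of features (described or observed) that fall in the allowed range.

    A feature that is observed but absent from the description is expected to be 0.
    """
    names = set(desc) | set(feats)
    hits = 0
    for name in names:
        low, high = desc.get(name, (0.0, 0.0))
        if low <= feats.get(name, 0.0) <= high:
            hits += 1
    return hits / len(names) if names else 0.0


class Recognizer:
    """Compares observed features against a library of descriptions keyed by ID."""

    def __init__(self) -> None:
        self.library: Dict[str, List[Description]] = {}

    def learn(self, char_id: str, samples: List[Features]) -> None:
        # Adding a description extends the library; no model retraining is involved.
        self.library.setdefault(char_id, []).append(teach(samples))

    def recognize(self, feats: Features, threshold: float = 1.0) -> Optional[str]:
        best_id, best = None, 0.0
        for char_id, descs in self.library.items():
            for desc in descs:
                s = score(desc, feats)
                if s > best:
                    best_id, best = char_id, s
        return best_id if best >= threshold else None


recognizer = Recognizer()
recognizer.learn("U+0069", [{"vertical_lines": 1, "dots": 1}])  # "i"
recognizer.learn("U+006C", [{"vertical_lines": 1}])              # "l"

print(recognizer.recognize({"vertical_lines": 1, "dots": 1}))
print(recognizer.recognize({"circles": 1}))   # unknown until taught
recognizer.learn("U+006F", [{"circles": 1}])  # "o", learned on the job
print(recognizer.recognize({"circles": 1}))
```

In this sketch a newly encountered form is handled by adding one more description to the library, which is the property the article attributes to the combination of PDL and the Teacher Bot Worker.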
The combination of PDL and the Teacher Bot Worker overcame the sparse data problem, eliminated frequent retraining and replacement of models, and raised recognition success rates to new levels. This gives Bot Workers the ability to learn continuously, on the job.
Nov 19



