OCR in Alfresco 7.2

anoop
Member II

OCR in Alfresco 7.2

Hi all,

We installed version 7.2 (not Docker) and it runs very well, but we can't integrate the OCR action into it. We followed the guide at https://github.com/keensoft/alfresco-simple-ocr with no luck. Is there a way? Kindly guide.

Thanks in anticipation

regards

ANOOP

8 Replies
fedorow
Senior Member II

Re: OCR in Alfresco 7.2

That module uses the old transformer.

Try this project https://github.com/aborroy/alf-tengine-ocr

mitpatoliya
Moderator

Re: OCR in Alfresco 7.2

It's probably because you have not set up the supporting OCR software. That module only sets up connectivity between the repository and the OCR tool. You have to ensure that the OCR software is set up properly and that the related module configurations are correct.

anoop
Member II

Re: OCR in Alfresco 7.2

No, we successfully implemented the OCR feature in version 6. In version 7, following the same method, the OCR action does not appear in the drop-down menu of folder rules.

anoop
Member II

Re: OCR in Alfresco 7.2

Hi,

I am not quite able to follow the instructions. Do they deal with Docker? Can you elaborate a little more? We don't want it in Docker. By the way, we succeeded in running OCR in 6.

Thanks and regards

anoop

atultalhar
Member II

Re: OCR in Alfresco 7.2

Hi @anoop, I am facing the same problem. Did you find a solution for this? If yes, could you please provide the steps to configure this action in the drop-down list?

fedorow
Senior Member II

Re: OCR in Alfresco 7.2

@anoop in the rule drop-down list, look for something like 'embed-metadata' or something containing the word 'embedded'. Sorry, I do not remember exactly.

You can create a working example of OCR with https://github.com/Alfresco/alfresco-docker-installer. Look inside and implement it without Docker, if you like.

anoop
Member II

Re: OCR in Alfresco 7.2

Hi, 

Since we have a working 6.2 install, we were kind of disappointed with 7.1. We have now installed 23.1 and tried to follow your directions. We managed to integrate the .jar file into Alfresco, which allows us to create the rule, but apart from that nothing happens when we upload a .pdf. Can you be a little more specific? We don't have much experience with Docker either.

Any help is very much appreciated.

Regards.

fedorow
Senior Member II

Re: OCR in Alfresco 7.2

The OCR for the 7.x version works fine with 23.1.

I can only give you a baseline for where to go. First, build the ats-transformer-ocr-1.0.0.jar file from this repository:

https://github.com/aborroy/alf-tengine-ocr/blob/master/ats-transformer-ocr/README.md

Next, prepare your host. Docker's declarative approach uses a Dockerfile to prepare the working environment and install tools and applications. Here is the Dockerfile for the OCR container:

https://github.com/aborroy/alf-tengine-ocr/blob/master/ats-transformer-ocr/Dockerfile

It is a list of instructions for Ubuntu. Read it and make the corresponding configurations and installations on your host. Of course, you can't just run the commands from the Dockerfile as they are. Make sure each step is necessary for your setup.

The logic of the OCR process is as follows:

  • you run the Java application ats-transformer-ocr-1.0.0.jar, which listens on port 8090
  • the Alfresco OCR module calls localTransform.ocr.url=http://localhost:8090/ (add this property to the Alfresco repository)
  • ats-transformer-ocr-1.0.0.jar gets the file from the module and runs tesseract
  • the OCR-ed file is returned to Alfresco as a new version of the file.
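
If the rule exists but nothing happens on upload, it may help to first confirm the two host-side pieces from the list above: that tesseract is installed and that something is actually listening on port 8090. Below is a minimal Python sketch for that check. The function names are my own, and it assumes the transformer runs on the same host as Alfresco with the default port mentioned above; it only tests a TCP connection, so it does not prove the transformer itself is healthy.

```python
import shutil
import socket


def tool_on_path(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unknown host, etc.
        return False


if __name__ == "__main__":
    # Assumption: tesseract is the OCR binary the transformer shells out to.
    print("tesseract on PATH:", tool_on_path("tesseract"))
    # Assumption: the transformer jar was started on this host, port 8090.
    print("port 8090 listening:", port_is_listening("localhost", 8090))
```

If the port check fails, the jar is probably not running or is bound to another port. If both checks pass, the next thing to look at is whether localTransform.ocr.url was added to the repository properties and Alfresco was restarted afterwards.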

Good luck,

Serge