convert scanned pdf into searchable pdf - Alfresco Community 5.2

anuradha1
Active Member II

Hi,

I have been using Alfresco for a long time. I scanned around 100,000 documents and uploaded them into Alfresco. But I have now run into a problem: the documents cannot be found by searching on their content. If I scan a document with a scanner that has OCR, then it can be. But I don't have OCR on every scanner, so I need to integrate an OCR module into Alfresco. I tried tesseract-ocr and simple-ocr, but neither worked.

  1. If anyone knows, please tell me another way to do this, or the correct way to integrate tesseract-ocr or simple-ocr.
  2. I also need to convert all the already uploaded documents into searchable PDFs.

Thanks

5 Replies
sercama
Active Member II

Re: convert scanned pdf into searchable pdf - Alfresco Community 5.2

Hi anuradha madhushani

Do you know this addon?

GitHub - keensoft/alfresco-simple-ocr: Simple OCR action for Alfresco 

You can configure a rule that executes the OCR action from this addon.

Regards.

anuradha1
Active Member II

Re: convert scanned pdf into searchable pdf - Alfresco Community 5.2

I tried it, but it is not working. I have a CentOS 7 repository, and I think pdfsandwich is not supported for it. OCRmyPDF is not installed either, and I can't find out whether it is supported on CentOS 7 or not. What can I do?

Thank you.
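
A quick way to narrow this down is to check which OCR executables are actually visible on the server. The following is only a minimal sketch: the tool names are the ones mentioned in this thread, and it assumes Python 3 is available on the machine. Run it as the same user that runs Alfresco, since that user's PATH may differ from your login shell's.

```python
import shutil

# Tool names mentioned in this thread; adjust to what your setup uses.
OCR_TOOLS = ["pdfsandwich", "tesseract", "ocrmypdf"]


def find_ocr_tools(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Print one line per tool so missing ones are easy to spot.
for name, path in find_ocr_tools(OCR_TOOLS).items():
    print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If a tool shows as not found, the OCR action has nothing to call, whatever the addon configuration says.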

angelborroy
Expert

Re: convert scanned pdf into searchable pdf - Alfresco Community 5.2

You can take this as a base:

https://github.com/keensoft/alfresco-simple-ocr/blob/master/docker/pdfsandwich-1.6-centos-7/Dockerfi...

Software Engineer in Alfresco Search Team.
anuradha1
Active Member II

Re: convert scanned pdf into searchable pdf - Alfresco Community 5.2

OK, I have now installed pdfsandwich successfully. After restarting Alfresco, it displays the OCR button. Once I clicked it, the message below loaded, but nothing happened.

Please help me. I am in big trouble now.
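
When the OCR button appears to do nothing, it can help to run pdfsandwich by hand on one scanned file, outside Alfresco, and look at its error output. The sketch below is an assumption-laden example, not how the addon itself invokes the tool: the file names are hypothetical, and the `-lang` and `-o` options are used as I understand pdfsandwich's command line, so check them against the tool's own help output for your version.

```python
import subprocess


def build_pdfsandwich_command(src, dst, lang="eng"):
    """Build the argument list for one pdfsandwich run (flags are assumptions, see above)."""
    return ["pdfsandwich", "-lang", lang, "-o", dst, src]


def run_ocr(src, dst, lang="eng"):
    """Run pdfsandwich once and return (exit code, stderr text).

    Returns (None, message) if the executable cannot be found at all.
    """
    cmd = build_pdfsandwich_command(src, dst, lang)
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return None, "pdfsandwich executable not found on PATH"
    return result.returncode, result.stderr


# Example call with hypothetical file names:
# code, errors = run_ocr("scan.pdf", "scan_ocr.pdf")
# print(code, errors)
```

A non-zero exit code or text in the error output here usually points at the OCR tool chain rather than at Alfresco. The same `run_ocr` function could be called in a loop over a folder of PDFs to convert documents in bulk outside the repository.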

anuradha1
Active Member II

Re: convert scanned pdf into searchable pdf - Alfresco Community 5.2

Thank you to everyone who helped me. I got it working with simple-ocr on my Linux installation. But now I need to do the same on an Alfresco Windows installation. Does anyone know a way to do that?

Any help will be appreciated.

Thank you.