How to migrate data from different Ftps as datasource

alex90
Member II


Good afternoon sir, my name is Alejandro and I'm a software engineering student from Cuba. I'm working on my thesis, and right now I have a problem I can't find a proper solution for. Sorry to disturb you; I know you are a very busy man, but hopefully you can spare a few minutes of your time.

The situation: I'm going to implement a digital repository using Alfresco Community 5.1 to manage our university's digital content (software installers, books, theses), which is currently stored on different FTP servers. I intend to use Alfresco as the backend and Orchard CMS as our intranet frontend (which is a non-functional requirement), with the two communicating via CMIS. The general idea is a social-networking approach in which every user can modify metadata and add tags in order to improve search. That is the overall objective of my work: to allow searching and downloading the digital content of our intranet, because right now it takes a lot of time to find anything, since it is stored on FTP servers without good cataloging.
I have already successfully created a custom data model, but when I decided to migrate the content from these FTP servers, I couldn't find any documentation about it. I read about the bulk import tool, but it turns out that it needs the data locally on the same computer that runs Alfresco and, as I said, the data sources are different FTP servers.
So how can I migrate data from different FTP servers as data sources into Alfresco? Is it necessary to physically import the files into Alfresco, or can I work with an index pointing to the FTP files (keep the files on the FTP servers and have in Alfresco a reference to each object)? I only have search and download functional requirements.
Please, I need your help as guidance, because here in Cuba we don't have experience working with Alfresco and it is very difficult to get internet access. So if you can point out a way of solving this, or have any recommendation, I will be forever grateful. Thank you, and again, sorry to disturb you.

1 Reply
mehe
Senior Member II

Re: How to migrate data from different Ftps as datasource

Hi Alejandro,

Normally you just put documents (PDF, Word, ...) into a DMS like Alfresco, because its indexing and previewing capabilities are mainly designed for text-centric content. Storing software installers in a DMS is not a good idea.

You could import all content into your DMS by transferring (copying) the files to your Alfresco server (or mounting the file systems on your Alfresco server) and importing them via the bulk importer.
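
As a rough sketch of the copy step, something like the following Python could pull a directory tree from one FTP server into a local staging folder that the bulk importer can then read. The function name `mirror_ftp_dir`, the host `ftp01.example.org` and the paths in the usage comment are made up for illustration, and it assumes the server supports the MLSD listing command (`ftplib.FTP.mlsd`), which older servers may not.

```python
import os


def mirror_ftp_dir(ftp, remote_dir, local_dir):
    """Recursively copy remote_dir from an open FTP connection into local_dir.

    Returns the list of local file paths that were written.
    """
    os.makedirs(local_dir, exist_ok=True)
    downloaded = []
    # mlsd() yields (name, facts) pairs; facts["type"] is "file", "dir", etc.
    for name, facts in ftp.mlsd(remote_dir):
        if name in (".", ".."):
            continue
        remote_path = remote_dir.rstrip("/") + "/" + name
        local_path = os.path.join(local_dir, name)
        if facts.get("type") == "dir":
            downloaded += mirror_ftp_dir(ftp, remote_path, local_path)
        elif facts.get("type") == "file":
            with open(local_path, "wb") as fh:
                ftp.retrbinary("RETR " + remote_path, fh.write)
            downloaded.append(local_path)
    return downloaded


# Example usage (hypothetical host and paths):
# from ftplib import FTP
# with FTP("ftp01.example.org") as ftp:
#     ftp.login()  # anonymous login
#     mirror_ftp_dir(ftp, "/books", "/srv/alfresco-staging/ftp01/books")
```

Running this once per FTP server, each into its own staging subfolder, would keep the sources apart before the bulk import.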

But you could also create links to your FTP files in the Alfresco repo and bind the desired metadata to those links.

You would then have a kind of registry for your FTP content. Detecting/monitoring dead links, file movements and access rights in such a setup would be different and more complicated, especially if you have "living" content.
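
To give an idea of what a dead-link check could look like, here is a minimal sketch in Python. The function name `find_dead_links` is hypothetical, and it assumes the server answers the SIZE command with a permanent error (5xx) for missing files, which is common but not guaranteed. It does not detect moved files or changed access rights.

```python
from ftplib import error_perm


def find_dead_links(ftp, remote_paths):
    """Return the paths from remote_paths that the FTP server no longer serves."""
    # Many servers only allow SIZE in binary mode, so switch to it first.
    ftp.voidcmd("TYPE I")
    dead = []
    for path in remote_paths:
        try:
            ftp.size(path)
        except error_perm:
            # Permanent error (for example 550): treat the link as dead.
            dead.append(path)
    return dead
```

A scheduled job could run this over the links stored in Alfresco and flag the entries that come back.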

If you also store the index information (or description) on the FTP servers, you could automate an "indexing" process for your FTP content in your Alfresco system.

For example: you have an FTP server ftp01 with a file /path/to/file/TheFile.pdf that has a companion FTP file /path/to/TheFile.pdf.description in which you keep a YAML, XML or JSON structure with the metadata you are interested in. Import the description and an FTP/HTML link to the original file into Alfresco; then you can use Alfresco functions on the link/description of the linked file. But you won't have a full-text index of your files unless you also transfer the text to Alfresco...
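
A small sketch of that idea in Python, assuming the description file contains JSON (YAML or XML would need a different parser). The function name `build_link_record` and the metadata keys in the usage comment are invented for illustration. Actually creating the link object in Alfresco, for example over CMIS, is left out here.

```python
import json
from urllib.parse import quote


def build_link_record(host, remote_path, description_text):
    """Combine an FTP file location and its JSON description into one record."""
    metadata = json.loads(description_text)
    return {
        "name": remote_path.rsplit("/", 1)[-1],
        # quote() escapes spaces and other unsafe characters but keeps "/".
        "url": "ftp://" + host + quote(remote_path),
        "metadata": metadata,
    }


# Example usage (hypothetical metadata keys):
# record = build_link_record(
#     "ftp01",
#     "/path/to/file/TheFile.pdf",
#     '{"title": "The File", "tags": ["thesis"]}',
# )
```

The resulting record holds everything needed to create one link entry with its metadata; the file itself stays on the FTP server.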

This is only one possibility... Depending on what you want to achieve, importing the content directly into Alfresco could be the better move.

By the way - how many documents/files are you talking about?