Hi everyone -
I am new to cloud computing (and am doing on-the-job learning), but my company has instructed us to use the Azure platform for a new data lake / data warehousing program. They specifically want us to leverage Azure Data Factory as much as possible.
I'm reading the documentation, but am overwhelmed by the vast number of services, how they connect, what sequence they should be in, etc. As a newbie, I can conceptualise one step at a time, but am struggling to see it as a whole.
A business area asks me to ingest a dataset and provision it to a data mart.
Okay, first I must verify that it's not already being ingested to our Data Lake.
Then I need to create an ADF Pipeline to connect / extract / load / transform it for them. That could potentially involve creating multiple Linked Services, an Integration Runtime, VMs, other stuff?
Maybe I need to create a Data Flow somewhere in there? Maybe there are lots of other obvious milestones I'm not thinking of.
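To make this concrete for myself, here's my rough mental model of the pieces as JSON, based on my reading of the docs so far (all the names here are made up, and I may well have details wrong): a Linked Service holds the connection, a Dataset points at the data, and a Pipeline contains a Copy activity that moves it into the lake.

```
// Hypothetical Linked Service: the connection to the source system
{
  "name": "SourceSqlLinkedService",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": { "connectionString": "<placeholder>" }
  }
}

// Hypothetical Pipeline: one Copy activity from a source Dataset
// ("SourceSalesTable") to a Data Lake Dataset ("RawSalesParquet")
{
  "name": "IngestSalesDataset",
  "properties": {
    "activities": [
      {
        "name": "CopyToDataLake",
        "type": "Copy",
        "inputs": [ { "referenceName": "SourceSalesTable", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "RawSalesParquet", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": { "type": "ParquetSink" }
        }
      }
    ]
  }
}
```

If that sketch is roughly right, then the "big picture" I'm after is basically the ordering and decision points around those objects. Please correct me if I've got the relationships wrong.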
My plan is to break this down into very granular step-by-step instructions for myself (a checklist to guide me as I learn), but before I can get to the granular details I need to conceptualise the big picture. That's important not only for my own sanity, but so I can coherently explain to my managers what I'm doing and why I can't just magically "press a button" to have everything done.
I'm hoping there's a flowchart / process map of some description that elegantly depicts these major milestones and decision trees? Ideally something specifically about the ADF process (focussing on ETL/ELT).
I need some kind of baseline to get my head in order, but haven't been able to find anything thus far.
Big thanks to anyone who knows of such a thing and can help!