I don't know why you would consider customizing the NodeService to help with the "test against production data" problem.
Everyone faces the "how can my test data look as close to production as possible" problem. You can either approximate the production data with a test set that you reload into a fresh repository when you are ready to test, or you can snapshot your database and content store volumes, spin up a clone of the production server, create volumes from the snapshots, attach them, and then run your tests.
Of course, the larger your production data gets, the more time it takes to snapshot and clone it, but that's one of the costs you pay to test against actual production data.
One thing you might think about is this: if the actual content (the binary files and the metadata) affects the functioning of the application so much that testing with anything but real production data is not sufficient, perhaps you are asking Alfresco to do too much and the architecture should be revisited.
Integration tests using the Alfresco SDK work quite well, and they can be written to import a consistent, reusable test data set on each run. To me, this kind of setup is far preferable.
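
To make the "reload the same test set into a fresh repository on each run" idea concrete, here is a minimal sketch of the pattern in Python. It is not Alfresco SDK code: InMemoryRepository, load_test_data, TEST_DATA, and the "status" property are all hypothetical stand-ins I made up for illustration. In a real project the fixture would be imported into a freshly bootstrapped Alfresco repository instead.

```python
import unittest

# Hypothetical fixed test data set: path -> (content bytes, properties).
# The paths and the "status" property are made up for this sketch.
TEST_DATA = {
    "/contracts/acme.txt": (b"Acme master agreement",
                            {"cm:title": "Acme", "status": "active"}),
    "/contracts/globex.txt": (b"Globex renewal",
                              {"cm:title": "Globex", "status": "expired"}),
}


class InMemoryRepository:
    """Stand-in for a fresh, empty repository (not an Alfresco API)."""

    def __init__(self):
        self.nodes = {}

    def create_node(self, path, content, properties):
        if path in self.nodes:
            raise ValueError(f"node already exists: {path}")
        # Copy the properties so tests cannot mutate the shared fixture.
        self.nodes[path] = {"content": content, "properties": dict(properties)}

    def find_by_property(self, name, value):
        # Return the sorted paths of nodes whose property matches the value.
        return sorted(path for path, node in self.nodes.items()
                      if node["properties"].get(name) == value)


def load_test_data(repo, data=TEST_DATA):
    """Import the same known data set into the given repository."""
    for path, (content, properties) in data.items():
        repo.create_node(path, content, properties)
    return repo


class ContractQueryTest(unittest.TestCase):
    def setUp(self):
        # Every test starts from a fresh repository with identical content.
        self.repo = load_test_data(InMemoryRepository())

    def test_active_contracts(self):
        self.assertEqual(self.repo.find_by_property("status", "active"),
                         ["/contracts/acme.txt"])

    def test_changes_do_not_leak_between_tests(self):
        # Wiping this repository has no effect on any other test,
        # because setUp rebuilds it from TEST_DATA each time.
        self.repo.nodes.clear()
        self.assertEqual(self.repo.find_by_property("status", "active"), [])
```

Saved as a module, this can be run with python -m unittest. Because the data set is small and fixed, a failure points at the code under test rather than at whatever happened to be in production that day.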