At the end of my last post I alluded to an improved ingestion pipeline using Step Functions, this post looks at how we can use the recently announced AWS service Comprehend to analyse text files.
The updated architecture is shown below (click to enlarge).
The Lambda function that fetches the content sets a flag to indicate whether the content is an image, this is used by the Step Function definition to decide whether to call the ProcessImage or ProcessText Lambda function as shown in the Step Function definition below:
We covered the behaviour of the ProcessImage in the last post, the new ProcessText Lambda function takes the text and sends it Comprehend to detect entities and to perform semantic analysis. The function then looks for Person, Location and Date entities in the text and compares the positive and negative values from the sentiment analysis to determine the values for the properties on the acme:insuranceClaimReport custom type.