AWS DeepLens TechConnect IoT Rule

AWS DeepLens: Creating an IoT Rule (Part 2 of 3)

This post is the second in a series on getting started with the AWS DeepLens. In Part 1, we introduced a program that could detect faces and crop them by extending the boilerplate Greengrass Lambda and pre-built model provided by AWS. That post focused on the local capabilities of the device, but the DeepLens is much more than that. At its core, DeepLens is a fully fledged IoT device, covering the first of the three pillars of IoT: devices, cloud and intelligence.

All code and templates mentioned can be found here. This can be deployed using AWS SAM, which helps reduce the complexity of creating event-based AWS Lambda functions.

Sending faces to IoT Message Broker

AWS DeepLens Device Page

AWS DeepLens Device Console Page

When registering a DeepLens device, AWS creates everything associated with the IoT cloud pillar. If you have a look for yourself in the IoT Core AWS console page, you will see the existing IoT groups, devices, certificates, etc. This all simplifies the process of interacting with the middle-man: the MQTT topic that is displayed on the main DeepLens device console page. The DeepLens (and anything else given authorization) has the right to publish messages to the IoT topic within certain limits.

Previously, the AWS Lambda function responsible for detecting faces only showed them on the output streams and published just the detection probabilities of faces above the threshold to the MQTT topic. We can modify this by including cropped face images as part of the messages that are sent to the topic.

The Greengrass function below extends the original version by publishing a message for each detected face. The cropped face image is encoded in Base64 and set in the “image_string” key of the payload. IoT messages have a size limit of 128 KB, but the encoded images will be well within that limit.


# File "src/greengrassHelloWorld.py" in code repository
from threading import Thread, Event
import os
import json
import base64
import numpy as np
import awscam
import cv2
import greengrasssdk

class LocalDisplay(Thread):
    def __init__(self, resolution):
    ...
    def run(self):
    ...
    def set_frame_data(self, frame):
    ....

    def set_frame_data_padded(self, frame):
        """
        Set the stream frame and return the rendered cropped face
        """
        ....
        return outputImage

def greengrass_infinite_infer_run():
    ...
    # Create a local display instance that will dump the image bytes
    # to a FIFO file that the image can be rendered locally.
    local_display = LocalDisplay('480p')
    local_display.start()
    # The sample projects come with optimized artifacts,
    # hence only the artifact path is required.
    model_path = '/opt/awscam/artifacts/mxnet_deploy_ssd_FP16_FUSED.xml'
    ...
    while True:
        # Get a frame from the video stream
        ret, frame = awscam.getLastFrame()
        # Resize frame to the same size as the training set.
        frame_resize = cv2.resize(frame, (input_height, input_width))
        ...
        model = awscam.Model(model_path, {'GPU': 1})
        # Process the frame
        ...
        # Set the next frame in the local display stream.
        local_display.set_frame_data(frame)

        # Get the detected faces and probabilities
        for obj in parsed_inference_results[model_type]:
            if obj['prob'] > detection_threshold:
                # Add bounding boxes to full resolution frame
                xmin = int(xscale * obj['xmin']) \
                       + int((obj['xmin'] - input_width / 2) + input_width / 2)
                ymin = int(yscale * obj['ymin'])
                xmax = int(xscale * obj['xmax']) \
                       + int((obj['xmax'] - input_width / 2) + input_width / 2)
                ymax = int(yscale * obj['ymax'])

                # Add face detection to iot topic payload
                cloud_output[output_map[obj['label']]] = obj['prob']

                # Zoom in on Face
                crop_img = frame[ymin - 45:ymax + 45, xmin - 30:xmax + 30]
                output_image = local_display.set_frame_data_padded(crop_img)

                # Encode cropped face image and add to IoT message
                frame_string_raw = cv2.imencode('.jpg', output_image)[1]
                frame_string = base64.b64encode(frame_string_raw)
                cloud_output['image_string'] = frame_string

                # Send results to the cloud
                client.publish(topic=iot_topic, payload=json.dumps(cloud_output))
        ...

greengrass_infinite_infer_run()
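
As a quick sanity check on the 128 KB limit mentioned above, the short sketch below (illustrative only, not part of the project code) encodes a crop the same way the Greengrass function does and refuses to build a payload that would exceed the broker limit. The helper name and the dummy frame are assumptions for demonstration.


# Illustrative sketch only (not part of the DeepLens project code): check that a
# Base64-encoded crop stays under the 128 KB AWS IoT message limit before publishing.
import base64
import json

import cv2
import numpy as np

IOT_MESSAGE_LIMIT_BYTES = 128 * 1024

def build_payload(crop_img, probability):
    """Encode a cropped face as a Base64 JPEG and return a JSON payload, or None if too large."""
    ok, jpeg = cv2.imencode('.jpg', crop_img)
    if not ok:
        return None
    image_string = base64.b64encode(jpeg).decode('ascii')
    payload = json.dumps({'face': probability, 'image_string': image_string})
    return payload if len(payload) <= IOT_MESSAGE_LIMIT_BYTES else None

# A dummy 480p frame stands in for a real crop here.
dummy_crop = np.zeros((480, 640, 3), dtype=np.uint8)
print(build_payload(dummy_crop, 0.94) is not None)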

Save faces to S3 with an IoT Rule

The third IoT pillar, intelligence, interacts with the cloud pillar, using insights to perform actions on other AWS and/or external services. Our goal is to have every detected face saved to an S3 bucket in its original JPEG format, before it was encoded to Base64. To achieve this, we need to create an IoT rule that will launch an action to do so.

IoT Rules listen for incoming MQTT messages on a topic and, when a certain condition is met, launch an action. The incoming messages are analysed and transformed using a provided SQL statement. We want to act on all messages, passing on the data captured by the DeepLens device and also injecting a “unix_time” property. The IoT Rule Engine lets us construct statements that do just that, calling the timestamp() function within the SQL statement to add it to the result, as seen below.


# MQTT message
{
    "image_string": "/9j/4AAQ...",
    "face": 0.94287109375
}

# SQL Statement 
SELECT *, timestamp() as unix_time FROM '$aws/things/deeplens_topic_name/infer'

# IoT Rule Action event
{
    "image_string": "/9j/4AAQ...",
    "unix_time": 1540710101060,
    "face": 0.94287109375
}

The action is an AWS Lambda function (seen below) that is given an S3 bucket name and an event. At a minimum, the event must contain two properties: “image_string”, the encoded image, and “unix_time”, which is used as the name of the file. The latter is not provided when the IoT message is published to the MQTT topic; instead, it is added by the IoT rule that calls the action.


# File "src/process_queue.py" in code repository
import os
import boto3
import json
import base64

def handler(event, context):
    """
    Decode a Base64 encoded JPEG image and save to an S3 Bucket with an IoT Rule
    """
    # Convert image back to binary
    jpg_original = base64.b64decode(event['image_string'])

    # Save image to S3 with the timestamp as the name
    s3_client = boto3.client('s3')
    s3_client.put_object(
        Body=jpg_original,
        Bucket=os.environ["DETECTED_FACES_BUCKET"],
        Key='{}.jpg'.format(event['unix_time']),
    )
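
A quick way to exercise the action locally is to call the handler with an event shaped like the one the rule produces. The snippet below is a hypothetical test harness, not part of the repository: the bucket name and face.jpg file are placeholders, AWS credentials are assumed to be configured, and it assumes you run it from the src directory so process_queue is importable.


# Hypothetical local test of the rule action; bucket name and image file are placeholders.
import base64
import os

os.environ.setdefault('DETECTED_FACES_BUCKET', 'my-detected-faces-bucket')  # placeholder bucket

from process_queue import handler

with open('face.jpg', 'rb') as image_file:
    event = {
        'image_string': base64.b64encode(image_file.read()).decode('ascii'),
        'unix_time': 1540710101060,
    }

handler(event, context=None)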

Deploying an IoT Rule with AWS SAM

AWS SAM makes it incredibly easy to deploy an IoT Rule, as it is a supported event type for Serverless function resources, a high-level wrapper for AWS Lambda. By providing only the DeepLens topic name as a parameter for the template below, a fully event-driven, least-privilege AWS architecture is deployed.


# File "template.yaml" in code repository
AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'

Parameters:
  DeepLensTopic:
    Type: String
    Description: Topic path for DeepLens device "$aws/things/deeplens_..."

Resources:
  ProcessDeepLensQueue:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python2.7
      Timeout: 30
      MemorySize: 256
      Handler: process_queue.handler
      CodeUri: ./src
      Environment:
        Variables:
          DETECTED_FACES_BUCKET: !Ref DetectedFaces

      Policies:
        - S3CrudPolicy:
            BucketName: !Ref DetectedFaces

      Events:
        DeepLensRule:
          Type: IoTRule
          Properties:
            Sql: !Sub "SELECT *, timestamp() as unix_time FROM '${DeepLensTopic}'"

  DetectedFaces:
    Type: AWS::S3::Bucket

AWS DeepLens TechConnect

AWS DeepLens: Getting Hands-on (Part 1 of 3)

TechConnect recently acquired two AWS DeepLens devices to play around with. Announced at re:Invent 2017, the AWS DeepLens is a small, Intel Atom powered, deep learning focused device with an embedded high-definition video camera. The DeepLens runs AWS Greengrass, allowing quick compute for local events without having to send large amounts of data for processing in the cloud. This can substantially help businesses reduce costs, sensitive information transfer, and response latency for local events.

Zero to Hero (pick a sample project)

AWS DeepLens Face Detection Project

Creating a face detection AWS DeepLens project

What I think makes the DeepLens special is how easy it is to get started using computer vision models to process visual surroundings on the device itself. TechConnect has strong capabilities in Machine Learning, but I myself haven’t had much of a chance to play around with deep learning frameworks like MXNet or TensorFlow. Thankfully, AWS provides a collection of pre-trained models and projects to help anyone get started. But if you are already quite savvy in those frameworks, you can train and use your own models too.

Face Detection Lambda Function

AWS DeepLens Face Detection Lambda Function


An AWS DeepLens project consists of a trained model and an AWS Lambda function (written in Python) at its core. These are deployed to the device to run via AWS Greengrass, where the AWS Lambda function continually processes each frame coming in from the video feed using the awscam module.

The function can access the model, which is downloaded to the device as an artifact at a local path. This location and others (for example, “/tmp”) have permissions granted to the function by the AWS Greengrass group associated with the DeepLens project. I chose the face detection sample project, which detects faces in a video frame captured from the DeepLens camera and draws a rectangle around them.


from threading import Thread, Event
import os
import json
import numpy as np
import awscam
import cv2
import greengrasssdk

class LocalDisplay(Thread):
    def __init__(self, resolution):
    ...
    def run(self):
    ...
    def set_frame_data(self, frame):
    ....

def greengrass_infinite_infer_run():
    ...
    # Create a local display instance that will dump the image bytes
    # to a FIFO file that the image can be rendered locally.
    local_display = LocalDisplay('480p')
    local_display.start()
    # The sample projects come with optimized artifacts,
    # hence only the artifact path is required.
    model_path = '/opt/awscam/artifacts/mxnet_deploy_ssd_FP16_FUSED.xml'
    ...
    while True:
        # Get a frame from the video stream
        ret, frame = awscam.getLastFrame()
        # Resize frame to the same size as the training set.
        frame_resize = cv2.resize(frame, (input_height, input_width))
        ...
        model = awscam.Model(model_path, {'GPU': 1})
        # Process the frame
        ...
        # Set the next frame in the local display stream.
        local_display.set_frame_data(frame)
        ...

greengrass_infinite_infer_run()

Extending the original functionality: AWS DeepLens Zoom Enhance!

AWS DeepLens Face Detection Enhance

I decided to have a bit of fun and extend the original application’s functionality by cropping and enhancing a detected face. The DeepLens project video output is set to 480p, but the camera frames from the device are at a much higher resolution than this! So, reusing the code from the original sample that drew a rectangle around each detected face, I was able to capture a face and display it on the big screen. The only difficult part was centring the captured face and adding padding, bringing back bad memories of how hard centring an image in CSS used to be!


from threading import Thread, Event
import os
import json
import numpy as np
import awscam
import cv2
import greengrasssdk

class LocalDisplay(Thread):
    def __init__(self, resolution):
    ...
    def run(self):
    ...
    def set_frame_data(self, frame):
        # Get image dimensions
        image_height, image_width, image_channels = frame.shape

        # only shrink if image is bigger than required
        if self.resolution[0] < image_height or self.resolution[1] < image_width:
            # get scaling factor
            scaling_factor = self.resolution[0] / float(image_height)
            if self.resolution[1] / float(image_width) < scaling_factor:
                scaling_factor = self.resolution[1] / float(image_width)

            # resize image
            frame = cv2.resize(frame, None, fx=scaling_factor, fy=scaling_factor, interpolation=cv2.INTER_AREA)

        # Get image dimensions and padding after scaling
        image_height, image_width, image_channels = frame.shape

        x_padding = self.resolution[0] - image_width
        y_padding = self.resolution[1] - image_height

        if x_padding <= 0:
            x_padding_left, x_padding_right = 0, 0
        else:
            x_padding_left = int(np.floor(x_padding / 2))
            x_padding_right = int(np.ceil(x_padding / 2))

        if y_padding <= 0:
            y_padding_top, y_padding_bottom = 0, 0
        else:
            y_padding_top = int(np.floor(y_padding / 2))
            y_padding_bottom = int(np.ceil(y_padding / 2))

        # Pad the resized frame so the face sits centred within the display resolution
        frame = cv2.copyMakeBorder(frame, y_padding_top, y_padding_bottom,
                                   x_padding_left, x_padding_right,
                                   cv2.BORDER_CONSTANT, value=[0, 0, 0])
        ...

def greengrass_infinite_infer_run():
    ...
    while True:
        # Get a frame from the video stream
        ret, frame = awscam.getLastFrame()
        # Resize frame to the same size as the training set.
        frame_resize = cv2.resize(frame, (input_height, input_width))
        ...
        model = awscam.Model(model_path, {'GPU': 1})
        # Process the frame
        ...
        # Set the non-cropped frame in the local display stream.
        local_display.set_frame_data(frame)
        
        # Get the detected faces and probabilities
        for obj in parsed_inference_results[model_type]:
           if obj['prob'] > detection_threshold:
               # Add bounding boxes to full resolution frame
               xmin = int(xscale * obj['xmin']) \
                      + int((obj['xmin'] - input_width / 2) + input_width / 2)
               ymin = int(yscale * obj['ymin'])
               xmax = int(xscale * obj['xmax']) \
                      + int((obj['xmax'] - input_width / 2) + input_width / 2)
               ymax = int(yscale * obj['ymax'])

               # Add face detection to iot topic payload
               cloud_output[output_map[obj['label']]] = obj['prob']

               # Zoom in on Face
               crop_img = frame[ymin - 45:ymax + 45, xmin - 30:xmax + 30]
               local_display.set_frame_data(crop_img)

        # Send results to the cloud
        client.publish(topic=iot_topic, payload=json.dumps(cloud_output))

greengrass_infinite_infer_run()

Machine Learning using Convolutional Neural Networks

Machine Learning with Amazon SageMaker

Computers are generally programmed to do what the developer dictates and will only behave predictably under the specified scenarios.

In recent years, people have increasingly turned to computers to perform tasks that cannot be achieved with traditional programming and previously had to be done manually by humans. Machine Learning gives computers the ability to ‘learn’ and act on information based on observations, without being explicitly programmed.

TechConnect entered the recent Get 2 the Core challenge on Unearthed’s crowdsourcing platform. This is TechConnect’s story, as part of the crowdsourcing approach, and does not imply or assert in any way that Newcrest Mining endorses Amazon Web Services or the work TechConnect has performed in this challenge.

Business problem

Currently, a team at Newcrest Mining manually crops photographs of drill core samples before the photos can be fed into a system which detects the material type. This is extremely time-consuming due to the large number of photos, which is why Newcrest Mining turned to crowdsourcing via the Unearthed platform, a platform bringing data scientists, start-ups and the energy & natural resources industry together.

Being able to automatically identify bounding box co-ordinates of the samples within an image would save 80-90% of the time spent preparing the photos.

Input Image

Expected Output Image

Before we can begin implementing an object-detection process, we first need to address a variety of issues with the photographs themselves, being:

  • Not all photos are straight
  • Not all core trays are in a fixed position relative to the camera
  • Not all photos are taken perpendicular to the core trays introducing a perspective distortion
  • Not all photos are high-resolution

In addition to the object detection, we need an image-classification process to classify each image into a group based on the factors above. The groups are defined as:

Group 0 – Core trays are positioned correctly in the images with no distortion. This is the ideal case
Group 1 – Core trays are misaligned in the image
Group 2 – Core trays have perspective distortion
Group 3 – Core trays are misaligned and have perspective distortion
Group 4 – The photo has a low aspect ratio
Group 5 – The photo has a low aspect ratio and is misaligned

CNN Image Detection with Amazon SageMaker

Solution

We tried to solve this problem using Machine Learning; in particular, supervised learning. When conducting supervised learning, the system is provided with the input data and the desired output (classification/label) for each data point. The system learns a model that, when given a previously seen input, will reliably output the correct label, and when given an unseen input, will output the most likely label.

This differs from unsupervised learning. When utilising unsupervised techniques, the target label is unknown and the system must group or derive the label from the inherent properties within the data set itself.

The Supervised Machine Learning process works by:

  1. Obtain, prepare & label the input data
  2. Create a model
  3. Train the model
  4. Test the model
  5. Deploy & use the model

There are many specific algorithms for supervised learning that are appropriate for different learning tasks. The object detection and classification problem of identifying core samples in images is particularly suited to a technique known as convolutional neural networks. The model ‘learns’ by assigning and constantly adjusting internal weights and biases for each input of the training data to produce the specified output. The weights and biases become more accurate with more training data.

Amazon SageMaker provides a hosted platform that enabled us to quickly build, train, test and deploy our model.

Newcrest Mining provided a large collection of their photographs which contain core samples. A large subset of the photos also contained the expected output, which we used to train our model.

The expected output is a set of four (X, Y) coordinates per core sample in the photograph. The coordinates represent the corners of the bounding box that surrounds the core sample. Multiple sets of coordinates are expected for photos that contain multiple core samples.

The Process

We uploaded the supplied data to an AWS S3 bucket, using a separate prefix to separate images which we were provided the expected output for, and those with no output. S3 is an ideal store for the raw images with high durability, infinite capacity and direct integration with many other AWS products.

We further randomly split the photos with the expected output into a training dataset (70%) and a testing dataset (30%).
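
A minimal sketch of that split is shown below; the photo file names are hypothetical and only illustrate the 70/30 shuffle-and-slice approach.


# A minimal sketch of the 70/30 split described above; photo names are hypothetical.
import random

labelled_photos = ['core_tray_{:04d}.jpg'.format(i) for i in range(1000)]
random.seed(42)                      # make the split reproducible
random.shuffle(labelled_photos)

split_index = int(len(labelled_photos) * 0.7)
training_set = labelled_photos[:split_index]
testing_set = labelled_photos[split_index:]
print(len(training_set), len(testing_set))  # 700 300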

We created a Jupyter notebook on an Amazon SageMaker notebook instance to host and execute our code. By default, the Jupyter notebook instance provides access to a wide variety of common data science tools such as NumPy, TensorFlow and matplotlib, in addition to the Amazon SageMaker and AWS Python SDKs. This allowed us to immediately focus on our particular problem of creating SageMaker-compatible datasets with which we could build and test our models.

We trained our model by feeding the training dataset, along with the expected output, into an existing SageMaker built-in object detection model to fine-tune it to our specific problem. SageMaker has a collection of hyperparameters which influence how the model ‘learns’. Adjusting the hyperparameter values affects the overall accuracy of the model and how long the training takes. As the training proceeded, we were able to monitor the changes to the primary accuracy metric and pre-emptively cancel any training configurations that did not perform well. This saved us considerable time and money by allowing us to abort poor configurations early.
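
To give a feel for what this looks like in code, here is a hedged sketch (using the SageMaker Python SDK v2 style) of fine-tuning the built-in object detection algorithm. The role ARN, S3 paths and hyperparameter values are placeholders, not our exact configuration.


# Hedged sketch of fine-tuning the SageMaker built-in object detection algorithm.
# Role ARN, bucket paths and hyperparameter values are placeholders.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = 'arn:aws:iam::123456789012:role/SageMakerExecutionRole'  # placeholder role

# Region-specific training image for the built-in object detection algorithm.
image_uri = sagemaker.image_uris.retrieve('object-detection', session.boto_region_name)

estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    output_path='s3://my-bucket/core-detection/output',  # placeholder bucket
    sagemaker_session=session,
)

# Hyperparameters influence how the model 'learns'; adjusting them trades off
# accuracy against training time.
estimator.set_hyperparameters(
    num_classes=1,              # a single 'core sample' class
    num_training_samples=700,   # size of the training split
    epochs=30,
    learning_rate=0.004,
    mini_batch_size=16,
    use_pretrained_model=1,     # fine-tune rather than train from scratch
)

estimator.fit({
    'train': TrainingInput('s3://my-bucket/core-detection/train',
                           content_type='application/x-recordio'),
    'validation': TrainingInput('s3://my-bucket/core-detection/validation',
                                content_type='application/x-recordio'),
})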

We then tested the accuracy of our model by feeding testing data – data it has never seen – without the output, then comparing the model’s output to the expected output.

After the first round of training we had our benchmark for accuracy. From there we were able to tune the model by iteratively adjusting the hyperparameters and model parameters, and by augmenting the dataset with additional examples, then retraining and retesting. Setting hyperparameter values is more of an art form than a science; trial and error is often the best way.

We used a technique which dynamically assigned values to the learning rate after each epoch, similar to a harmonic progression:

Harmonic Progression

This technique allowed us to start with large values so the model could converge quickly at first, then reduce the learning rate by an increasingly smaller amount after each epoch as the model got closer to an optimal solution. After many iterations of tuning, training and testing we had improved the overall accuracy of the model compared with our benchmark, and with our project deadline fast approaching, we decided that it was as accurate as possible in the timeframe that we had.
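
A minimal sketch of the idea is shown below; the base rate is hypothetical and this mirrors the harmonic shape described above rather than our exact schedule. The rate shrinks after each epoch, so early epochs take large steps and later epochs take increasingly smaller ones.


# Harmonic-style learning rate schedule; the base rate here is hypothetical.
def harmonic_learning_rate(base_lr, epoch):
    """Return the learning rate for a given epoch (epochs start at 0)."""
    return base_lr / (1.0 + epoch)

base_lr = 0.004
schedule = [round(harmonic_learning_rate(base_lr, epoch), 6) for epoch in range(5)]
print(schedule)  # [0.004, 0.002, 0.001333, 0.001, 0.0008]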

We then used our model to classify and detect the objects in the remaining photographs that didn’t exist in the training set.  The following images show the bounding boxes around the cores that our model predicted:

CNN Bounding
CNN Bounding

Lessons Learned

Before we began, we had an extremely high expectation of how accurate our model would be. In reality, it did not quite live up to those expectations.
We discussed things that could have made the model more accurate, train faster, or both, including:

  • Tuning the hyperparameters using SageMaker’s automated hyperparameter tuning tooling
  • Copying the data across multiple regions to gain better access to the specific machine types we required for training
  • Increasing the size of the training dataset by:
    • Requesting more photographs
    • Duplicating the provided photographs and modifying them slightly. This included:
      • including duplicate copies of images and labels
      • including copies after converting the images to greyscale
      • including copies after changing the aspect ratio of the images
      • including copies after mirroring the images
  • Splitting the problem into separate, simpler machine learnable stages
  • Strategies for identifying the corners of the cores when they are not a rectangle in the image

During these discussions we realised we hadn’t defined a cut-off for when we would consider our model to be ‘accurate enough’.

As a general rule, the accuracy of the models you build improves most rapidly in the first few iterations; after that, the rate of improvement slows significantly. Each subsequent improvement requires lengthier training, more sophisticated algorithms and models, more sophisticated feature engineering, or substantial changes to the approach entirely. This trend is depicted in the following chart:

Learning accuracy over time

Depending on the use case, a model with an accuracy of 90% often requires significantly less training time, engineering effort and sophistication than a model with an accuracy of 93%. The acceptance criteria for a model needs to carefully balance these considerations to maximise the overall return on investment for the project.

In our case time was the factor that dictated when we stopped training and started using the model to produce the outputs for unseen photographs.

 

Thank you to the team at TechConnect that volunteered to try Amazon SageMaker to address the Get 2 the Core challenge posted by Newcrest Mining on the Unearthed portal. Also, a big thanks for sharing lessons learned and putting this blog together!

How to Motivate your Team to get AWS Certified

How to Motivate your Team to get AWS Certification

In a recent Team Survey by Wattsnext HR Consultants, TechConnect scored off the charts for the Training and Development opportunities it offers Team Members. We put heaps of effort into this and are proud of the result. In light of this, we thought we would share some of our methodologies on how to motivate your team to get AWS certification.

TechConnect is a cloud-based consultancy firm and we require our team members to always be a few steps ahead of the customers when it comes to tech ability. We have found that studying and completing the AWS certification exams really do assist with fantastic project outcomes and delighted customers, so we have invested heavily in this area.

If you have a group of over-worked techies who crawl out the door after a 10-hour slog most days, how do you motivate and invigorate them to further their qualifications? Letting your techies take a moment to sharpen their axe, rather than keeping them at it with a blunt one, greatly benefits both your employees and your business. Our proven program is outlined below with reference to the AWS exams, but it can be adapted for other tech-based exams.

 

TechConnect AWS Certification training and exam ideas:

  • Speak to AWS about understanding their training programs and look at setting up a training plan for your business. The training they have is extremely relevant and the labs are gold.
  • Subscribe to the AWS online training portal.
  • Offer your team instructor-led training. These are nearly always 3-day courses with some very relevant Labs.
  • If you are not located near a training centre or have people that prefer their own space, give the team member 3 days off to complete the online study. We use A Cloud Guru and it has been great.
  • Wherever possible, give the person relevant work to give them experience on the particular tools they will be assessed on. Giving a person a lot of devops work when they are studying for the AWS Certified Big Data exam does not help.
  • Dependent on the exam, we give the person a day off prior to the exam to help them study and prepare.
  • Pay for the cost of the exam for the team member – on the proviso that they pass!
  • Our big clincher is that we have a menu of bonuses the team member gets for passing. We have heaps of guys completing the exams in the lead up to Christmas as they want a little extra cash for the Christmas period. By way of example:
    • Associate Exam – $500
    • Big Data Exam – $2,500
    • Professional Exam – $2,500
  • Speak to your AWS account manager to see if they can assist, as AWS are usually very excited about partners getting people AWS certified.
  • Set-up Friday afternoon tech sessions where the guys can ask the more experienced team members the questions they have.
  • Ensure the leaders set good examples for the team. Two of our directors are AWS Certified Professional Architects.

At TechConnect we strongly believe that we are a learning organisation and all team members buy into this culture. Our team members give up their own time to ensure the others are well equipped to study for and take the exams. We’re all here to help support each other!

Outcomes of our program:

  • We have a highly qualified workforce that is ready to delight customers with their expertise.
  • We are looking to achieve several AWS competencies. TechConnect now have our sights on the Big Data competency and are nearly there. The certifications enable the competencies.
  • A knowledgeable workforce works faster which allows us to be competitive in an already competitive market
  • Retention – We have great employee retention due to the investment we make in the team members.  This reminds me of the age-old joke:

Two managers are talking about training their employees. The first asks, “Yeah, but what if we train them, and they just leave?” The second responds, “What if we don’t train them, and they stay?”

  • Lazy team members don’t like to get qualified whereas energetic team members do. We scare the lazy ones and attract the energetic ones who become the high performing team members. This in turn makes Tech Connect a high performing organisation.

At TechConnect, we have had remarkable results with investing in training and hope this article helps in some way in steering your ship. If you are an employee, share this with your Leaders so they can assist in getting you skilled. If they don’t and you believe you are one of the energetic ones, jump on our website and drop us an email 😊

Intensive Care Unit - Data Collection

Precision Medicine Data Platform

Recently TechConnect and IntelliHQ attended the eHealth Expo 2018. IntelliHQ are specialists in Machine Learning in the health space, and are the innovators behind the development of a cloud-based precision medicine data platform. TechConnect are IntelliHQ’s cloud technology partners, and our strong relationship with Amazon Web Services and the AWS life sciences team has enabled us to deliver the first steps towards building out the precision medicine data platform.

This video certainly sums up the goals of IntelliHQ and how TechConnect are partnering to deliver solutions in life sciences on the Amazon Web Services cloud platform.

Achieving this level of integration with the General Electric Carescape High Speed Data Interface is a first in Australia and potentially a first outside of America. TechConnect have designed a lightweight service to connect to the GE Carescape and push the high-fidelity data to Amazon Kinesis Firehose, and then on to persisted, cost-effective storage on Amazon S3.
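
The producer side of that pattern is straightforward. The sketch below is a hedged illustration of the flow only, not the Panacea source code: the delivery stream name and record shape are hypothetical. A waveform sample is serialised as newline-delimited JSON and pushed to a Kinesis Firehose delivery stream, which buffers and delivers it to S3.


# Hedged illustration only (not the Panacea source); stream name and record shape are hypothetical.
import json

import boto3

firehose = boto3.client('firehose')

def push_sample(stream_name, patient_id, timestamp_ms, waveform):
    """Send one high-fidelity sample to a Kinesis Firehose delivery stream bound for S3."""
    record = {
        'patient_id': patient_id,
        'timestamp': timestamp_ms,
        'waveform': waveform,
    }
    firehose.put_record(
        DeliveryStreamName=stream_name,
        Record={'Data': json.dumps(record) + '\n'},  # newline-delimited JSON suits S3 delivery
    )

push_sample('carescape-waveforms', 'patient-001', 1540710101060, [0.12, 0.15, 0.11])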

With the raw data stored on Amazon S3, data lake principles can be applied to enrich and process data for research and ultimately help save more lives in a proactive way. The diagram below shows a high level architecture that supports the data collection and machine learning capability inside the precision medicine data platform.

 

GE Carescape HSDI to Cloud Connector

This software, named Panacea, will be made available as an open source project.

Be sure to explore the following two sources of further information:

Check out Dr Brent Richards’ presentation at the recent eHealth Expo 2018 as well as a selection of other speakers located here.

AIkademi seeks to develop the capabilities of individuals, organisations and communities to embrace the opportunities emerging from machine learning.

Training graduates for the cloud

How to Train your Grads

How to Train your Graduates

TechConnect is an AWS Advanced Partner and does a large amount of work in the cloud, specifically around making sense of people’s complex data problems. Because of this, the technologies we work with are all relatively new, and finding new team members with the necessary skill set is like finding a needle in a haystack. We have searched far and wide to fill positions with people who have skills in Elastic MapReduce, Redshift, Kinesis etc., with not much luck.

As we could not find them, we decided to make our own. The recipe was quite simple – train a graduate:

  • Write a study program for getting graduates skilled fast. Amazon Web Services offers a lot of free courses for graduates and there are several excellent value online courses.
  • Create an attractive year long roadmap for graduates to go from entry level salary to a very competitive salary if they hit certain milestones.
  • Ensure your company has a strong culture of sharing knowledge.
  • Attend Ribit University days where students can be introduced to the business.
  • Advertise on Ribit for free.
  • Interview graduates and assess whether their personality is an ideal fit, as well as whether they have competent language ability.
  • Create a test to challenge graduates that want to work for TechConnect.
  • Get graduates to complete the tests.
  • Take on the best performing graduates.
  • Accelerate them through the program.
  • Make sure you have excellent support structures for the graduates. You need to ensure your ratio of graduates to seniors is appropriate for the work you do.

We have found this has worked excellently for us with definite advantages:

  • It’s really great to give back to the community. The whole team enjoys watching the graduates develop and revel in their success. Our team is very passionate in helping people grow their skillsets.
  • The graduates’ brains are like sponges and they take every opportunity to soak up knowledge.
  • They are surprisingly very commercially astute.
  • They are loyal as they know you provided them with this excellent opportunity.
  • TechConnect acquires cost effective people that are trained in the exact technologies the company uses.
  • Our graduates have excellent customer service and do not need to be hidden away.
  • No one has 15 years’ experience in Redshift or Kinesis.
  • The graduates we have are generally excellent mathematicians which is gold in the data space.

Some tips with training your graduates:

  • When something new comes out in the cloud that you know you will need (e.g. Amazon Athena), get the graduates to research the new product and then present a report during a brown bag session. As soon as the product is required, the seniors can call on that graduate to help and suddenly, they are subject matter experts.
  • The graduates generally do some really funky projects at Uni that make commercial projects seem quite dull. Let the graduates know this before they come onboard! We have had no complaints from the graduates as we always manage expectations. They are moving from doing intriguing AI projects with limited time pressures to crunching boring financial numbers and presenting this in some visualisation product. We do have interesting projects, but they are not all drones and underwater camera kind of projects.
  • Ensure you know what your graduates have studied. If you get a data project in the health space and you have a Biomedical graduate, they will be of enormous value.
  • Employ seniors who have a passion for upskilling and mentoring others. This is paramount to the success of the program.
  • They can be a bit nervous at times, so make sure you explain company strategy and plans. For example, we do quarterly one-on-ones and had an excellent graduate who was very nervous that we were letting them go.
  • Graduates don’t know older or more enterprise technologies like SAP or Oracle and to be honest they won’t be too interested in these.
  • Don’t be afraid of taking on PhDs. They are phenomenally clever, and it sounds really cool when introducing them to your customers. We have two PhDs and two in the final stages of getting their PhDs.

Everything said, we have had 9 graduates pass through our doors and have retained 8. Some have been with us for 4 years now and therefore are pretty senior when it comes to cloud technologies.

I started my career off as a Physics/Chem teacher, so I am very passionate in regard to watching people grow. If you have any questions, please give us a hoy.

Banner image © DreamWorks Animation

AWS SAM Project

Using AWS SAM for a CORS Enabled Serverless API

Over the past two years TechConnect has had an increasing demand for creating ‘Serverless’ API backends, either from scratch or by converting existing services running on expensive virtual machines in AWS. This has been an iterative learning process for us and, I feel, for many others in the industry. However, it feels like each month pioneers in the field answer our cries for help by creating or extending open-source projects to make our ‘serverless’ lives a little easier.

There are quite a few options for creating serverless applications in AWS (Serverless Framework, Zappa, etc.). However, in this blog post we will discuss using AWS SAM (Serverless Application Model, previously known as Project Flourish) to create a CORS enabled API. All templates and source code mentioned can be found in this GitHub repository. I highly recommend having it open in another tab, along with the AWS SAM project.

AWS SAM Project

API Design First with Swagger

Code or design first? One approach is not necessarily better than the other, but at TechConnect we’ve been focusing on a design-first mentality when it comes to building APIs for our clients. We aren’t the users of the APIs we build, and we aren’t the front-end developers who might build a website on top of them. Instead, our goal when creating an external API is to create a logical and human-readable API contract specification. To achieve this we use Swagger, the Open API specification, to build and document our RESTful backends.

In the image below, we have started to design a simple movie ratings API in YAML using the Open API specification. In its current state, it is just an API contract showing the requests and responses. However, it will be further modified to become an AWS API Gateway compatible, AWS Lambda integrated document in later steps.

Code Structure

Our API is a simple CRUD service that will make use of Amazon DynamoDB to create, list and delete movie ratings for a given year. This could all easily reside in a single Python file, but instead we will split it up to make it a little more realistic for larger projects. As this is a small demo, we’ll be missing a few resources that would usually be included in a real project (tests, task runners, etc.), but have a look at The Hitchhiker’s Guide to Python for a nice Python structure for your own future APIs.


- template.yaml
- swagger.yaml
- requirements.txt
- movies
  - api
    - __init__.py
    - ratings.py

  - core
    - __init__.py
    - web.py
  - __init__.py

Our Python project movies contains two sub-packages: api and core. Our AWS Lambda handlers are located in api.ratings.py, where each handler will process the request from API Gateway, interact with DynamoDB (using a table name set by an environment variable) and return an object to API Gateway.

movies.api.ratings.py

...
from movies.core import web

def get_ratings(event, context):
    ...
    return web.cors_web_response(200, ratings_list)
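
For context, a fuller (but hedged) sketch of what a handler like get_ratings could look like is shown below. The table key schema and the Decimal conversion are assumptions for illustration, not the repository’s exact code.


# Hedged sketch of a full handler; the table's key schema ('year') is an assumption.
import os
from decimal import Decimal

import boto3
from boto3.dynamodb.conditions import Key

from movies.core import web

def get_ratings(event, context):
    """Return all ratings for the year given as a path parameter."""
    year = int(event['pathParameters']['year'])
    table = boto3.resource('dynamodb').Table(os.environ['RATINGS_TABLE'])
    response = table.query(KeyConditionExpression=Key('year').eq(year))

    # DynamoDB returns numbers as Decimal, which json.dumps cannot serialise directly.
    ratings_list = [
        {key: (float(value) if isinstance(value, Decimal) else value)
         for key, value in item.items()}
        for item in response.get('Items', [])
    ]
    return web.cors_web_response(200, ratings_list)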

CORS in Lambda Responses

In the previous step you might have noticed we were using a function to build an integration response. The object body is serialized into a JSON string, and the headers Access-Control-Allow-Headers, Access-Control-Allow-Methods and Access-Control-Allow-Origin are set to enable Cross-Origin Resource Sharing (CORS).

movies.core.web.py

def cors_web_response(status_code, body):
    return {
        'statusCode': status_code,
        "headers": {
            "Access-Control-Allow-Headers": 
                "Content-Type,Authorization,X-Amz-Date,X-Api-Key,X-Amz-Security-Token",
            "Access-Control-Allow-Methods": 
                "DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT",
            "Access-Control-Allow-Origin": 
                "*"
        },
        'body': json.dumps(body)
    }

CORS in Swagger

Previously, in our Lambda code, we built CORS headers into our responses. However, this is only one half of the solution. Annoyingly, we must also add an OPTIONS HTTP method to every path level of our API. This satisfies the preflight request made by the client to check whether CORS requests are enabled. Although it uses x-amazon-apigateway-integration, it is a mocked response returned by API Gateway; AWS Lambda is not needed to implement it.

swagger.yaml

paths:
  /ratings/{year}:
    options:
      tags:
      - "CORS"
      consumes:
      - application/json
      produces:
      - application/json
      responses:
        200:
          description: 200 response
          schema:
            $ref: "#/definitions/Empty"
          headers:
            Access-Control-Allow-Origin:
              type: string
            Access-Control-Allow-Methods:
              type: string
            Access-Control-Allow-Headers:
              type: string
      x-amazon-apigateway-integration:
        responses:
          default:
            statusCode: 200
            responseParameters:
              method.response.header.Access-Control-Allow-Methods: "'DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT'"
              method.response.header.Access-Control-Allow-Headers: "'Content-Type,Authorization,X-Amz-Date,X-Api-Key,X-Amz-Security-Token'"
              method.response.header.Access-Control-Allow-Origin: "'*'"
        passthroughBehavior: when_no_match
        requestTemplates:
          application/json: "{\"statusCode\": 200}"
        type: mock

Integrating with SAM

Since AWS SAM is an extension of CloudFormation, the syntax is almost identical. The snippets below show the integration between template.yaml and swagger.yaml. The name of the GetRatings AWS Lambda function is passed into the API via a stage variable, and swagger.yaml integrates the Lambda proxy using x-amazon-apigateway-integration. One important thing to note is that a Swagger document is not required to create an API Gateway resource in AWS SAM. However, we are using one due to our design-first mentality and because it is required for the CORS preflight responses. The AWS SAM team are currently looking to reduce the need for this in CORS applications; keep an eye on the ongoing topic being discussed on GitHub.

template.yaml

AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Resources:
  ApiGatewayApi:
    Type: AWS::Serverless::Api
    Properties:
      DefinitionUri: swagger.yaml
      StageName: v1
      Variables:
        GetRatings: !Ref GetRatings
...
  GetRatings:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: ./build
      Handler: movies.api.ratings.get_ratings
      Role: !GetAtt CrudLambdaIAMRole.Arn
      Environment:
        Variables:
          RATINGS_TABLE: !Ref RatingsTable
      Events:
        GetRaidHandle:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGatewayApi
            Path: /ratings/{year}
            Method: GET
...
swagger.yaml

paths:
  /ratings/{year}:
    get:
      ...
      x-amazon-apigateway-integration:
        responses:
          default:
            statusCode: 200
            responseParameters:
              method.response.header.Access-Control-Allow-Origin: "'*'"
        uri: arn:aws:apigateway:REGION:lambda:path/2015-03-31/functions/arn:aws:lambda:REGION:ACCOUNT_ID:function:${stageVariables.GetRatings}/invocations
        passthroughBehavior: when_no_match
        httpMethod: POST
        type: aws_proxy

Deploying SAM API

Now that all the resources are ready, the final step is to package and deploy the SAM application. You may have noticed in template.yaml that the source of the Lambda function was listed as ./build. Any AWS Lambda function that uses non-standard Python libraries requires them to be included in the deployment package. To demonstrate this, we’ll copy our code to a build folder and install the dependencies alongside it.


$ mkdir ./build
$ cp -p -r ./movies ./build/movies
$ pip install -r requirements.txt -t ./build

Finally, you will need to package your SAM deployment to convert it to a traditional AWS CloudFormation template. First, you will need to make sure your own account ID and desired region are used (using sed). You will also need to provide an existing S3 bucket to store the packaged code. If you inspect template-out.yaml, you will notice that the source of each AWS Lambda function is an object in S3; this is what is used by aws cloudformation deploy. One final tip: remember to include --capabilities CAPABILITY_IAM in your deploy command if you are creating any roles during your deployment.


$ sed -i "s/account_placeholder/AWS_ACCOUNT_ID/g" 'swagger.yaml'
$ sed -i "s/region_placeholder/AWS_REGION/g" 'swagger.yaml'
$ aws cloudformation package --template-file ./template.yaml --output-template-file ./template-out.yaml --s3-bucket YOUR_S3_BUCKET_NAME
$ aws cloudformation deploy --template-file template-out.yaml --stack-name MoviesAPI --capabilities CAPABILITY_IAM

AWS Lambda Specialty - Australia Partners

AWS Service Delivery Program for AWS Lambda

30 November 2016 – TechConnect IT Solutions, Making Your Cloud Journey a Success, announced today that it has achieved AWS Service Delivery Partner status for AWS Lambda.

The AWS Service Delivery Program is designed to highlight AWS Partner Network (APN) Partners who have a track record of delivering verified customer success for specific Amazon Web Services (AWS) products.

The AWS Service Delivery Program was recently launched to help AWS customers find qualified APN Partners that provide expertise in a specific service or skill area. To qualify, partners must pass service-specific verification of customer references and a technical review, meaning customers can be confident they are working with partners that provide recent and relevant experience.

AWS Lambda Partners provide services and tools that help customers build or migrate their solutions to a micro-services based serverless architecture, without the need to worry about provisioning or managing servers.

“TechConnect, an Amazon Web Services Advanced Consulting Partner, is proud to participate in the AWS Service Delivery Program for AWS Lambda,” said Mike Cunningham, CEO. “Our dynamic team assists organisations to deliver applications in the cloud using elastic serverless architectures. Applications built with no servers mean a truly elastic and resilient architecture that grows with you.”

TechConnect build robust and secure serverless architectures with Amazon S3, Amazon CloudFront (content distribution network), Amazon Route 53, AWS Certificate Manager, Amazon API Gateway, AWS Lambda, Amazon RDS and/or Amazon DynamoDB.

DevOps in the Amazon Web Services Cloud

Are you game?

Gaming Solutions built on the Amazon Cloud

The time is now for Online Gaming

Not so long ago we spoke about The Future of Cloud Services for Online Gaming and pointed out some of the fear, uncertainty and doubt (“FUD”) surrounding online gaming software providers and gaming operators deploying gaming solutions in the Amazon Web Services cloud. Much of the concern came from ambiguous text surrounding illegal content and questionable activities, shown below as an excerpt from the previous version of the Amazon Acceptable Use Policy, circa February this year.

“…

No Illegal, Harmful, or Offensive Use or Content

You may not use, or encourage, promote, facilitate or instruct others to use, the Services or AWS Site for any illegal, harmful or offensive use, or to transmit, store, display, distribute or otherwise make available content that is illegal, harmful, or offensive. Prohibited activities or content include:

  • Illegal Activities. Any illegal activities, including advertising, transmitting, or otherwise making available gambling sites or services or disseminating, promoting or facilitating child pornography.


…”

Today it is a very different matter due to the update on 16 September 2016; note the omission of any statement with regard to gaming.

“…

You may not use, or encourage, promote, facilitate or instruct others to use, the Services or AWS Site for any illegal, harmful, fraudulent, infringing or offensive use, or to transmit, store, display, distribute or otherwise make available content that is illegal, harmful, fraudulent, infringing or offensive. Prohibited activities or content include:

  • Illegal, Harmful or Fraudulent Activities. Any activities that are illegal, that violate the rights of others, or that may be harmful to others, our operations or reputation, including disseminating, promoting or facilitating child pornography, offering or disseminating fraudulent goods, services, schemes, or promotions, make-money-fast schemes, ponzi and pyramid schemes, phishing, or pharming.
  • Infringing Content. Content that infringes or misappropriates the intellectual property or proprietary rights of others.
  • Offensive Content. Content that is defamatory, obscene, abusive, invasive of privacy, or otherwise objectionable, including content that constitutes child pornography, relates to bestiality, or depicts non-consensual sex acts.
  • Harmful Content. Content or other computer technology that may damage, interfere with, surreptitiously intercept, or expropriate any system, program, or data, including viruses, Trojan horses, worms, time bombs, or cancelbots.

…”

As far as we can tell, the FUD is now a thing of the past.  Embrace the future and the powerful elasticity of the cloud.

Gaming Solutions for AWS Cloud

Online Gaming in the Cloud

Gaming Solutions built on the Amazon Cloud

The Future of Cloud Services for Online Gaming

The Ultimate Marriage

Cloud computing and online gambling are like the series How I Met Your Mother: you know the outcome, but the journey to the end of the series is filled with failed relationships, trials, tribulations and the most beautiful moments! We all live for those moments, and there is one on the horizon waiting for us.

Cloud and online gaming/gambling are a match made in heaven! There are certain key aspects that align well with the industry; these include:

  • Peak Load scalability, why buy hardware for a peak load that occurs once a year when you can scale on demand?
  • Distributed Denial of Service (“DDoS”) mitigation comes as part of the service.
  • Agility, move into new markets in days not weeks or months.
  • Test new markets: fail early, fail fast allows experiments to occur more often with minimal financial impact.
  • Perfect for disaster recovery solutions.
  • Access capability that your budget would typically never allow.
  • Your challenge here

“BUT UM”

In the words of Robin Scherbatsky, “BUT UM”, is this legal and allowed?

The online gambling industry has always been at the front of the technical curve but when it comes to the adoption of cloud based technologies they have historically been constrained by the highly regulated environment that they operate in.

Traditional enterprises, faced with highly competitive economic forces, have not been constrained in the same way and have adopted cloud technologies in pursuit of competitive advantage. They have effectively paved the way in formulating an approach that balances risk and reward, as can clearly be seen in the graphic featured later in this post.

For the gambling industry, the first step into the cloud does not have to be the critical production workload, in fact there are a number of auxiliary workloads that are probably better suited. Examples of these types of workloads are Content, Data Feeds, Big Data Analytics, Development and Test Environments.

An excerpt below from the Amazon Web Services’ Acceptable Use Policy clearly addresses the issue of illegal activities. In today’s highly regulated markets and legal framework for online gambling operators, they are clearly operating in a LEGAL environment no matter what your personal bias may be.

“…

No Illegal, Harmful, or Offensive Use or Content

You may not use, or encourage, promote, facilitate or instruct others to use, the Services or AWS Site for any illegal, harmful or offensive use, or to transmit, store, display, distribute or otherwise make available content that is illegal, harmful, or offensive. Prohibited activities or content include:

 

  • Illegal Activities. Any illegal activities, including advertising, transmitting, or otherwise making available gambling sites or services or disseminating, promoting or facilitating child pornography.


…”

 

Let us move on then.

Help I’m in a Jurisdiction!

Over time we’ve seen regulators soften to the ideas of modern computing architecture. We’ve progressed from dedicated systems to virtualisation, content distribution from outside the jurisdiction rather than inside, shared platforms for auxiliary systems and a much more relaxed grip on back-office operations (IT specifically, not the processes!).

It is only a matter of time before regulators warm to the idea of cloud infrastructures, granted through persistence from the smart licensees. Amazon Web Services now have [enter ever-changing number here] data centres throughout the world; take a look at the existing and planned presence of the Amazon Web Services Global Infrastructure. There should be less and less reason why regulators must enforce in-country or in-region jurisdictions, albeit job creation is a nice excuse.

A great recent example of jurisdictional risk is the Australian Lottoland crash, read at your leisure.

Carpe Diem

Your journey is yours to control.

Not having a strategy to deal with cloud computing will be a risk to your business.

Some applications are better suited to the cloud, are quicker to migrate, are of less value and of less risk to the business. Selecting the right applications to move is an important step in a successful cloud migration.

Why not start with something easy such as content distribution networks (Amazon CloudFront), archiving and cold storage (Amazon S3 and Glacier), or web services protected by DDoS-mitigating services like Elastic Load Balancing and Amazon CloudFront? Content distribution is a great way to start experimenting with cloud services, as the gaming industry has been doing this for years already.

Amazon Web Services is not simply an alternative to servers or storage, it is a best-of-breed Infrastructure as a Service and Platform as a Service provider, well ahead of the rest. Independent reports show that Amazon is clearly the leader in the market.

Amazon Web Services Adoption Strategy

Enterprise Migration Path Example

Source Migrating Enterprise Applications to AWS: Best Practices & Techniques (ENT303) | AWS re:Invent 2013

Is it Secure?

Let’s be frank: Amazon Web Services wouldn’t have the rate of adoption it has without good security. Security is paramount and interwoven into every aspect of systems design, deployment and management. There are indeed operators that haven’t invested in advanced security because of the price and cost of ownership; cloud services greatly reduce the barriers to adopting the best security tools and practices available. Just pay as you go: it becomes really affordable, and you will improve your security and availability if correctly planned. Satisfy yourself by visiting the Amazon Web Services Security page or take a look at the accreditations achieved.

The Future of Amazon Web Services for the Gaming Industry

I encourage you to read further as Amazon are investing heavily in architectures to support gaming, much of the same applies to gambling architectures:

Amazon Lumberyard

Amazon Lumberyard is a free, cross-platform, 3D game engine for you to create the highest-quality games, connect your games to the vast compute and storage of the AWS Cloud, and engage fans on Twitch.

By starting game projects with Lumberyard, you can spend more of your time creating great gameplay and building communities of fans, and less time on the undifferentiated heavy lifting of building a game engine and managing server infrastructure.

Amazon GameLift

Amazon GameLift, a managed service for deploying, operating, and scaling session-based multiplayer games, reduces the time required to build a multiplayer backend from thousands of hours to just minutes. Available for developers using Amazon Lumberyard, Amazon GameLift is built on AWS’s highly available cloud infrastructure and allows you to quickly scale high-performance game servers up and down to meet player demand, without any additional engineering effort or upfront costs.

Gaming Case Studies

Amazon Web Services offers a comprehensive suite of products and services for video game developers across every major platform: mobile, console, PC and online. From AAA console and PC games, to educational and serious games, AWS provides the back end servers and hosting services for your game studio.

Build, deploy, distribute, analyze and monetize with AWS. Pay as you go, and only pay for what you use. Focus on your game, not your infrastructure.

GO EXPERIMENT and Barney Stinson says “BE LEGEN-wait-for-it-DARY!”

 

TechConnect will be happy to Make YOUR Cloud Journey a Success!