Saving cats with IoT : Part Two

The basic solution we’re going to build here is as follows (see part one for more information):

1) Build the project into a box using an ESP8266 and a VL53L0X time of flight sensor.

2) Send an array of 320 distance measurements for any 16 second period where someone may have crossed the laser. Why 320? It’s arbitrary: I was plotting the moving average on a TFT screen in version 1 and it had 320 pixels across …

3) Data is sent to AWS IoT Core and then a Rule Engine Action sends to AWS IoT Analytics.

4) Every hour, run a query on IoT Analytics to pull the last 14 days of measurement data. Doesn’t have to be 14 days, or run every hour, but that’s what I’m using.

5) Trigger a container dataset that executes the custom analysis when the SQL dataset has completed, this is the Jupyter notebook that will parse the data extracted in 4) and determine if we need to alert anyone about the cats or not.
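The decision in step 5 boils down to "how long has it been since anyone last crossed the doorway?". Here is a minimal sketch of that check in plain Python. It is not the actual notebook: the 24 hour threshold, the function name and the sample timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: alert if nobody has crossed the doorway for this long.
MAX_QUIET_PERIOD = timedelta(hours=24)

def needs_alert(crossing_epochs, now, max_quiet=MAX_QUIET_PERIOD):
    """Return True when the most recent crossing is older than max_quiet.

    crossing_epochs: iterable of epoch seconds, one per detected crossing.
    now: timezone-aware datetime to compare against.
    """
    crossings = list(crossing_epochs)
    if not crossings:
        # No crossings at all in the query window is the worst case.
        return True
    last_seen = datetime.fromtimestamp(max(crossings), tz=timezone.utc)
    return now - last_seen > max_quiet

# Made-up example data: one crossing 3 hours ago, another 30 hours ago.
now = datetime(2018, 6, 1, 12, 0, tzinfo=timezone.utc)
recent = [(now - timedelta(hours=3)).timestamp()]
stale = [(now - timedelta(hours=30)).timestamp()]
print(needs_alert(recent, now))  # False
print(needs_alert(stale, now))   # True
```

Treating an empty result as an alert is a deliberate choice in this sketch: if the device stops reporting altogether, silence should not be mistaken for "all is well".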

The SQL query for (4) looks like this;

The output from the notebook, which shows a heat map of activity along with the most recent door crossing looks like this;

Using IoT to save Cats – what the internet was made for!

My wife had a cheery thought the other week, “How do we provide for our cats when we’re gone?” This turned into a discussion of what would happen if we were both run over by a bus or died in a plane crash – who would feed the cats before it was too late? Morbid as it may be, there is a real problem to solve here as cats can only survive a matter of days without water and hepatic lipidosis can kill a starving cat in just a few days too – so there’s actually quite a narrow window for help to arrive, and as we both live thousands of miles from our families, there are several scenarios that could mean help wouldn’t arrive until after it was too late.

So what to do?

Well, turn to IoT of course! I now have a challenge – how do we alert friends and family to the urgent cat peril should the worst happen? After ruling out various camera related options, pressure sensors under food bowls and water level sensors, I arrived at the simple conclusion that provided we could detect humans crossing into the kitchen, where the cat food is, then we could assume that all was well (when we’re away on vacation, we have cat sitters that look after the house and the cats).

Now that we’ve turned the problem into one of how to detect humans entering the kitchen, it’s much more fun. At first I was tempted to use an ultrasonic range detector to determine if people were walking through the kitchen door, but in theory cats can hear the ultrasonic frequencies concerned and whilst our 2 cats didn’t seem to notice it at all, rather than cause them any long term stress I went looking for another option.

Enter the tiny VL53L0X time of flight sensor, which can accurately measure distances up to about 1200mm and is perfect for detecting whether someone is coming through a doorway or not.

Using an ESP8266 based MCU from Adafruit, I soon had the sensor all packaged up in a small project box and secured to the kitchen door. As you can see, I put the power connector on the wrong side of the box (or the window for the TOF sensor, depending on how you look at it) – but it’s connected up and works a treat. Mette the cat is most impressed!

The software on the Micro-controller calculates and stores the moving average of the reported distance in front of the sensor every 50ms for 320 samples (so 16 seconds of time). If it detects that something has happened, it sends all 320 samples to AWS IoT for additional analysis, otherwise it sends nothing to save on data that isn’t useful.
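To make the buffering idea concrete, here is a minimal sketch in Python. The real firmware is C running on the ESP8266 and is not reproduced here; the 50ms interval and the 320 sample buffer come from the description above, while the averaging window, the trigger threshold and the class name are assumptions for illustration only.

```python
from collections import deque

SAMPLE_INTERVAL_MS = 50   # from the post: one stored sample every 50 ms
BUFFER_SIZE = 320         # from the post: 320 * 50 ms = 16 seconds
WINDOW = 4                # assumption: raw readings averaged per stored sample
TRIGGER_MM = 200          # assumption: deviation from baseline that counts as an event

class DoorMonitor:
    """Keeps the last BUFFER_SIZE moving-average samples of the measured distance."""

    def __init__(self):
        self.recent = deque(maxlen=WINDOW)
        self.samples = deque(maxlen=BUFFER_SIZE)

    def add_reading(self, distance_mm):
        # Store the moving average of the most recent raw readings.
        self.recent.append(distance_mm)
        self.samples.append(sum(self.recent) / len(self.recent))

    def buffer_full(self):
        return len(self.samples) == BUFFER_SIZE

    def event_detected(self, baseline_mm):
        # Something "happened" if any stored sample strays far from the baseline.
        return any(abs(s - baseline_mm) > TRIGGER_MM for s in self.samples)

# Simulated readings: a clear doorway at 900 mm with a half-second pulse in the middle.
monitor = DoorMonitor()
for reading in [900] * 150 + [400] * 10 + [900] * 160:
    monitor.add_reading(reading)

if monitor.buffer_full() and monitor.event_detected(baseline_mm=900):
    payload = list(monitor.samples)   # all 320 samples would be published
else:
    payload = None                    # quiet period: send nothing
print(len(payload) if payload else 0)  # 320
```

Using `deque(maxlen=...)` keeps memory use fixed, which mirrors the constraint a small micro-controller has to live with.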

Plotting the data that arrives when something has been detected results in graphs a bit like this;

Notice how the observed distance is typically around 900mm (the distance across the door) but when someone walks in front, there is an easily recognisable pulse that we can use to state that someone has walked in or out of the kitchen – and since that is where the cat food lives, we can make the further assumption that someone is feeding the cats.

Interestingly, not all spikes are what you might think. Here’s one that happened when we were out of the house.

Notice that this time, the observed distance jumped UP and I think what happened here was that a glint of sun caught the sensor – so I’ll be tweaking my algorithm to ignore spikes that go in the wrong direction like this for the next iteration!
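One way to ignore spikes in the wrong direction is to test only for readings that fall below the baseline, since a person in the doorway can only make the distance shorter. The sketch below shows the idea in Python; the 200mm minimum drop and the sample values are assumptions, not numbers taken from the device.

```python
def is_crossing(samples_mm, baseline_mm, min_drop_mm=200):
    """True if any sample falls at least min_drop_mm below the baseline.

    Readings above the baseline (a sun glint, for example) are ignored
    because an obstruction can only shorten the measured distance.
    """
    return any(baseline_mm - s >= min_drop_mm for s in samples_mm)

# Made-up traces: one person walking through, one upward glitch.
person = [900, 905, 610, 420, 450, 880, 900]
sun_glint = [900, 905, 1150, 1200, 1180, 910, 900]

print(is_crossing(person, baseline_mm=900))     # True
print(is_crossing(sun_glint, baseline_mm=900))  # False
```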

So, we have the basic technology in place, every time we think we’ve seen something cross in front of the sensor, we’ll send the data to AWS IoT – but how do we use this to save our cats? In part two we’ll cover how we used AWS IoT Analytics to build the workflow that keeps track of when the cats might have been last fed and alerts key people if something is not as it should be.

Connecting the ESP8266 to AWS IoT Core over MQTT

Securely sending IoT data to the cloud is an important consideration, especially if you can receive messages from the cloud and then activate equipment. It might be annoying if my house lights are turned on or off by someone else, but if my garage or front door can be opened by a malicious person, that’s much more serious.

AWS IoT Core is a secure platform for sending IoT device data, but this in turn presents challenges for developers using some of the popular micro-controllers like the ESP8266 which has very little RAM and a relatively slow processor. AWS IoT Core uses X.509 client certificates to identify devices and you’ll need to be able to negotiate a TLS 1.2 connection – which can be quite a challenge for a constrained device. Prior to the end of 2017 this was a real issue for the ESP8266 but thanks to work on the SSL libraries, it is now possible to easily make a secure connection – with one caveat.

While the ESP8266 can now make a TLS 1.2 negotiated connection to AWS IoT Core and identify itself using an X.509 client certificate, for a secure connection the client also needs to verify that the server really is who it claims to be. This is done by verifying the certificate authority that signed the server certificate and currently, this is beyond the memory capabilities of the ESP8266.

So caveat aside, what does the code look like for making a secure connection to AWS IoT Core over MQTT?

There are a few ways of handling the certificate encoding, and there is a nice example of how to do this over on github written by one of the contributors to the ESP8266 Arduino project.

While the risk of a compromise here is low, you should be cautious about any data you send or receive without verifying the identity of the server. Sending temperature readings or receiving commands to turn on some small projects is low risk, but I wouldn’t be sending my credit card details over MQTT from the ESP8266 for example (although this would be a stupid thing to do in any event). Remember this is not a security issue with AWS IoT Core, it’s with the Arduino library running on the ESP8266 which currently doesn’t have the capability of verifying the certificate chain. This may change in the future.

The good news is, there is an easy solution, upgrade your projects to the more recent and more powerful ESP32 – the big brother of the ESP8266 from the same manufacturer.

The code for the ESP32 is similar but simpler;

Copy and paste the certificate and key files you get when you create your device Thing in AWS IoT Core and tweak the formatting so you can use them like this;

const char*  certificatePemCrt = \
"-----BEGIN CERTIFICATE-----\n" \
"MIIDWjDCAkKgAwIBAlIVAO4oCOcEtp6ex+nzUkv1+Nd4ZcgEMA0GCFqGSIL3DQ3B" \
"---------------- redacted for clarity and privacy --------------" \
"AcVdn0SlXDZ2eqEIXs79tsOuw7awrkWvMRyZ8A4lQlin53dA77jXEzwbAOp6dp==" \
"-----END CERTIFICATE-----\n";

const char*  privatePemKey = \
"-----BEGIN RSA PRIVATE KEY-----\n" \
"MIIZpAIBFFKCAQEArKDwRPmAnkF0lomDj6i8I8qDRyTuJOLmCbn8CtPl12QlT7Yc" \
"---------------- redacted for clarity and privacy --------------" \
"Neawrz1V983PPKSrXeim6f6/gZq92ut5mCZZFwkN+muQtlLDixpFjL==" \
"-----END RSA PRIVATE KEY-----\n";

const String AmazonCACert = \
"MIIE0zCCA7ugAwIBAgIQGNrRniZ96LtKIVjNzGs7SjANBgkqhkiG9w0BAQUFADCB" \
"yjELMAkGA1UEBhMCVVMxFzAVBgNVBAoTDlZlcmlTaWduLCBJbmMuMR8wHQYDVQQL" \
"ExZWZXJpU2lnbiBUcnVzdCBOZXR3b3JrMTowOAYDVQQLEzEoYykgMjAwNiBWZXJp" \
"U2lnbiwgSW5jLiAtIEZvciBhdXRob3JpemVkIHVzZSBvbmx5MUUwQwYDVQQDEzxW" \
"ZXJpU2lnbiBDbGFzcyAzIFB1YmxpYyBQcmltYXJ5IENlcnRpZmljYXRpb24gQXV0" \
"aG9yaXR5IC0gRzUwHhcNMDYxMTA4MDAwMDAwWhcNMzYwNzE2MjM1OTU5WjCByjEL" \
"MAkGA1UEBhMCVVMxFzAVBgNVBAoTDlZlcmlTaWduLCBJbmMuMR8wHQYDVQQLExZW" \
"ZXJpU2lnbiBUcnVzdCBOZXR3b3JrMTowOAYDVQQLEzEoYykgMjAwNiBWZXJpU2ln" \
"biwgSW5jLiAtIEZvciBhdXRob3JpemVkIHVzZSBvbmx5MUUwQwYDVQQDEzxWZXJp" \
"U2lnbiBDbGFzcyAzIFB1YmxpYyBQcmltYXJ5IENlcnRpZmljYXRpb24gQXV0aG9y" \
"aXR5IC0gRzUwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCvJAgIKXo1" \
"nmAMqudLO07cfLw8RRy7K+D+KQL5VwijZIUVJ/XxrcgxiV0i6CqqpkKzj/i5Vbex" \
"t0uz/o9+B1fs70PbZmIVYc9gDaTY3vjgw2IIPVQT60nKWVSFJuUrjxuf6/WhkcIz" \
"SdhDY2pSS9KP6HBRTdGJaXvHcPaz3BJ023tdS1bTlr8Vd6Gw9KIl8q8ckmcY5fQG" \
"BO+QueQA5N06tRn/Arr0PO7gi+s3i+z016zy9vA9r911kTMZHRxAy3QkGSGT2RT+" \
"rCpSx4/VBEnkjWNHiDxpg8v+R70rfk/Fla4OndTRQ8Bnc+MUCH7lP59zuDMKz10/" \
"NIeWiu5T6CUVAgMBAAGjgbIwga8wDwYDVR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8E" \
"BAMCAQYwbQYIKwYBBQUHAQwEYTBfoV2gWzBZMFcwVRYJaW1hZ2UvZ2lmMCEwHzAH" \
"BgUrDgMCGgQUj+XTGoasjY5rw8+AatRIGCx7GS4wJRYjaHR0cDovL2xvZ28udmVy" \
"aXNpZ24uY29tL3ZzbG9nby5naWYwHQYDVR0OBBYEFH/TZafC3ey78DAJ80M5+gKv" \
"MzEzMA0GCSqGSIb3DQEBBQUAA4IBAQCTJEowX2LP2BqYLz3q3JktvXf2pXkiOOzE" \
"p6B4Eq1iDkVwZMXnl2YtmAl+X6/WzChl8gGqCBpH3vn5fJJaCGkgDdk+bW48DW7Y" \
"5gaRQBi5+MHt39tBquCWIMnNZBU4gcmU7qKEKQsTb47bDN0lAtukixlE0kF6BWlK" \
"WE9gyn6CagsCqiUXObXbf+eEZSqVir2G3l6BFoMtEMze/aiCKm0oHw0LxOXnGiYZ" \
"4fQRbxC1lfznQgUy286dUV4otp6F01vvpX1FQHKOtw5rDgb7MzVIcbidJ4vEZV8N" \
"hnacRHr2lVz2XTIIM6RUthg/aFzyQkqFOFSDX9HoLPKsEdao7WNq";

And now you can configure your wiFiClient to use the certificate and verify the CA certificate with;

wiFiClient.setCertificate(certificatePemCrt);
wiFiClient.setPrivateKey(privatePemKey);
// setCACert expects a const char*, so convert the String declared above
wiFiClient.setCACert(AmazonCACert.c_str());

You could use similar code to the github project linked earlier – this ESP32 example just shows another way of encoding the certificate in your Arduino sketch.

SECURITY CAUTION

Anyone who can get physical access to your device can access the private key if you don’t take additional steps to protect it – and you’d also be surprised how many people upload the code containing the private keys to github, you may not want to do that either! There are steps you can take to make your system more secure, such as storing the private key in a dedicated crypto store like the ATECC508A microchip or, if using the ESP32, consider using the flash encryption features to add further layers of protection. Another great option is to make sure you don’t reuse certificates and keys on multiple devices and ensure that you only grant the minimum necessary permissions so that even if your private key was compromised, you could revoke the certificate paired with the key and only one single device would be impacted.
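As an illustration of "minimum necessary permissions", the sketch below builds an AWS IoT policy document that lets one device connect under its own client id and publish to a single topic, and nothing else. The region, account id, Thing name and topic are placeholders invented for this example; substitute your own values before attaching anything like this to a certificate.

```python
import json

# Placeholders for illustration only.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"
THING_NAME = "cat-door-sensor"              # hypothetical Thing name
TOPIC = "sensor/cat-door-sensor/distance"   # hypothetical topic

def arn(resource):
    """Build an AWS IoT resource ARN for the placeholder region and account."""
    return f"arn:aws:iot:{REGION}:{ACCOUNT_ID}:{resource}"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Only allow connecting with this device's own client id.
            "Effect": "Allow",
            "Action": "iot:Connect",
            "Resource": arn(f"client/{THING_NAME}"),
        },
        {
            # Only allow publishing to this device's own topic.
            "Effect": "Allow",
            "Action": "iot:Publish",
            "Resource": arn(f"topic/{TOPIC}"),
        },
    ],
}

print(json.dumps(policy, indent=2))
```

With a policy scoped like this, a leaked key lets an attacker impersonate that one sensor and little more, and revoking its certificate affects only that device.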

For more information on the AWS IoT security model, this blog post has more detail.

Vibration analysis with the ESP8266 & MPU6050

Our furnace blower motor began making an awful noise recently and despite best efforts to persuade it to run smoothly by adjusting the belt tension, there was an annoying rhythmical thump-thump-thump noise coming from it. Although detecting this degraded operation was super easy after the fact, I wondered how easy it would be to detect the early signs of a problem like this, where essentially I would want to look for unusual vibration patterns to spot them well in advance of being able to hear that anything was wrong.

While looking at vibration sensors, I came across various small gyros and accelerometers and figured that they might be just the thing, so I ordered a few different types and prototyped a small project using the MPU6050 6 axis gyro / accelerometer package.

I used an ESP8266 micro-controller to gather the data and send it to an MQTT topic using AWS IoT Core and the 8×8 display segment is used to tell me when the device is capturing and when it is sending.

The accelerometer package is on the small board with the long plastic stick attached. I decided to use this so I could clip it into a photo hook that I could stick on the furnace motor. I know the physics of this are distinctly questionable, but I was interested if I could make any sense of the accelerometer readings.

Here it is all hooked up and capturing data – hence the large ‘C’ on the display.

The code for the ESP8266 was written using the Arduino IDE and makes use of the MIT licensed i2cdevlib for code to handle the MPU6050 accelerometer which is a remarkably competent sensor in a small package that can do a lot more than this simple project demonstrates.

Hopefully if you’ve been reading previous blogs, you’ll recall that we can use our standard pattern here of;

  1. Send data to AWS IoT Core MQTT topic
  2. Use a Rule to route the message to an AWS IoT Analytics Channel
  3. Connect the Channel to a Pipeline to a Data Store for collecting all the data
  4. Use data sets to perform the analysis

For sending the data to AWS IoT Core, I use the well established Arduino pubsubclient library and my publication method looks like this, with much of the code being for debugging purposes and helping me see what the device is doing.

int publish_mqtt(JsonObject &root,char const *topic) {

    int written = 0;
    if (root.success()) {    
       written = root.printTo(msg);
    }

    sprintf(outTopic,"sensor/%s/%s",macAddrNC,topic);
  
    Serial.print(F("INFO: "));    
    Serial.print(outTopic);
    Serial.print("->");
    Serial.print(msg);
    Serial.print("=");
    
    int published = (pubSubClient.publish(outTopic, msg))? 1:0;
    Serial.println(published);    
    return published;
}

The Rule simply routes all the sensor data to the appropriate topic like this;

But let’s take a look at the dataset – what information are we actually recording from this sensor?

Of course we can look at the C code running on the micro-controller to see what I send, and that looks like this;

void publish_data(int index) {
 
    if (!pubSubClient.connected()) { return; }
 
    DynamicJsonBuffer jsonBuffer(256);
    JsonObject &root = jsonBuffer.createObject();
   
    VectorInt16 datapoint = capture[index];
    root["seq"]=sequence;
    root["i"]= index;
    root["x"]=datapoint.x;
    root["y"]=datapoint.y;
    root["z"]=datapoint.z;
    publish_mqtt(root,"vibration/mpu6050");   
    jsonBuffer.clear();
           
}
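To show the shape of what arrives in the cloud, here is a small Python sketch that builds the same topic and JSON payload as the C code above. The MAC address and the x/y/z values are made up, and reading `macAddrNC` as "MAC address with no colons" is an assumption based on the variable name.

```python
import json

def build_topic(mac_address, suffix="vibration/mpu6050"):
    """Mirror the sprintf format "sensor/%s/%s" used by the firmware."""
    return f"sensor/{mac_address}/{suffix}"

def build_message(sequence, index, x, y, z):
    """Mirror the fields the firmware puts in each JSON message."""
    return {"seq": sequence, "i": index, "x": x, "y": y, "z": z}

topic = build_topic("5ccf7f000000")   # hypothetical MAC address
payload = json.dumps(build_message(1518074892, 0, -112, 48, 16))

print(topic)    # sensor/5ccf7f000000/vibration/mpu6050
print(payload)  # {"seq": 1518074892, "i": 0, "x": -112, "y": 48, "z": 16}
```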

And when we extract that data with a simple SQL query to get all of the data, we see a preview like this;

The x/y/z readings are the accelerometer readings for each of the x/y/z axes.  These aren’t quite raw sensor readings, they are the acceleration with the effect of gravity removed, and while this isn’t directly important for this example, the code that does that with the MPU6050 in my C code looks like this;

mpu.dmpGetQuaternion(&q, fifoBuffer);
mpu.dmpGetAccel(&aa, fifoBuffer);
mpu.dmpGetGravity(&gravity, &q);
mpu.dmpGetLinearAccel(&aaReal, &aa, &gravity);
VectorInt16 datapoint = VectorInt16(aaReal.x,aaReal.y,aaReal.z);

What about the sequence number and the i value?

My example code samples data for a few seconds from the sensor at 200Hz and then stops sampling and switches to sending mode, then it repeats this cycle. To help me make sense of it all, the sequence number is the epoch time for the start of each capture run and the i value is simply an index that counts from 0 up through n where n is the number of samples. This helps me analyse each chunk of data separately if I want to.
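The sketch below shows how the seq and i fields can be used to split the data into capture runs and to reconstruct an approximate timestamp for each sample. It uses plain Python rather than pandas so it runs anywhere; the rows are made up, and the timestamp calculation assumes the samples within a run are evenly spaced at 200Hz.

```python
from collections import defaultdict

SAMPLE_RATE_HZ = 200  # from the post

# Made-up rows in the same shape as the dataset (only one axis shown).
rows = [
    {"seq": 1518074892, "i": 1, "y": 12},
    {"seq": 1518074892, "i": 0, "y": 10},
    {"seq": 1518074950, "i": 0, "y": -3},
    {"seq": 1518074950, "i": 1, "y": -5},
]

def split_into_runs(rows):
    """Group rows by capture run (seq) and order each run by sample index (i)."""
    runs = defaultdict(list)
    for row in rows:
        runs[row["seq"]].append(row)
    return {seq: sorted(chunk, key=lambda r: r["i"]) for seq, chunk in runs.items()}

def sample_time(row):
    """Approximate epoch time of a sample, assuming evenly spaced samples."""
    return row["seq"] + row["i"] / SAMPLE_RATE_HZ

runs = split_into_runs(rows)
for seq, chunk in sorted(runs.items()):
    print(seq, [r["y"] for r in chunk])
```

Sorting by i matters because MQTT messages from one run are not guaranteed to land in the dataset in the order they were captured.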

I was quite excited to see what this data looked like, so I created a Notebook in AWS IoT Analytics and did a simple graph of one of the samples. Hopefully the pattern of reading a dataset and plotting a graph is becoming familiar now so I won’t include all the setup code, but here’s the relevant extract from the Jupyter Notebook;

# Read the dataset

client = boto3.client('iotanalytics')
dataset = "vibration"
dataset_url = client.get_dataset_content(datasetName = dataset)['entries'][0]['dataURI']
df = pd.read_csv(dataset_url)

# Extract 1000 sample points from the sequence that began at 1518074892

analysis = df[((df['seq'] == 1518074892) & (df['i'] < 1000))].sort_values(by='i', ascending=True, inplace=False)

# Graph the accelerometer X axis readings

analysis.plot(title='Vibration Analysis x', \
                         kind='line',x='i',y='x',figsize=(20,8), \
                         color='red',linewidth=1,grid=True)

I was really pretty excited when I saw this first result. The data is clearly cyclical and it looks like the sample rate of 200Hz might have been fast enough to get something usable.

Let’s check this isn’t a fluke and look at the y-axis data as well. It’s worth saying that because I just randomly stuck the sensor onto the motor, my vibration data will be spread across the x,y,z axes and I was interested to see if this rendered the data unusable or whether something as simple as this could work.

This looks slightly cleaner than the x-axis data, so I chose to use that for the next steps.

Now for some basic data science

I have the raw data and what I want to know is – what are the key vibration energies of this motor. This helps answer the question is it running smoothly or is there a problem? How do I turn the waveform above into an energy plot of the main vibration frequencies? This is a job for a fast Fourier transform which “is an algorithm that samples a signal over a period of time and divides it into its frequency components”. Just what I need.

Well, almost perfect. I now know I want to use an FFT to analyse the data, but how do I do that? This is where the standard data science libraries available with Amazon Sagemaker Jupyter Notebooks come to the rescue and I can use scipy and fftpack with a quick import like this;

import numpy as np      # needed for np.abs and np.arange in the cells that follow
import scipy.fftpack

This lets me do the FFT analysis with just a few lines of code;

sig = analysis['y']
sig_fft = scipy.fftpack.fft(sig)

# Why 0.005? The data is being sampled at 200Hz
time_step = 0.005

# And the power (sig_fft is of complex dtype)
power = np.abs(sig_fft)

# The corresponding frequencies
sample_freq = scipy.fftpack.fftfreq(sig.size, d=time_step)

# Only interested in the positive frequencies, the negative just mirror these. 
# Also drop the first data point for 0Hz

sample_freq = sample_freq[1:int(len(sample_freq)/2)]
power = power[1:int(len(power)/2)]
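The numbers behind that frequency axis are worth spelling out. The short sketch below works them out by hand for 1000 samples at 200Hz, without scipy, to show what the slice above keeps; the variable names are chosen for this example only.

```python
SAMPLE_RATE_HZ = 200   # from the post
N_SAMPLES = 1000       # the extract uses samples with i < 1000

time_step = 1 / SAMPLE_RATE_HZ               # seconds between samples
bin_width_hz = SAMPLE_RATE_HZ / N_SAMPLES    # spacing between FFT frequency bins
nyquist_hz = SAMPLE_RATE_HZ / 2              # highest frequency that can be resolved

# The slice [1 : N/2] drops the 0 Hz bin and keeps the positive frequencies.
kept_freqs = [k * bin_width_hz for k in range(1, N_SAMPLES // 2)]

print(time_step)        # 0.005
print(bin_width_hz)     # 0.2
print(nyquist_hz)       # 100.0
print(len(kept_freqs))  # 499
```

In other words, the spectrum has a resolution of 0.2Hz and tops out just below 100Hz, so any vibration faster than that cannot be seen at this sample rate.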

For the moment of truth, let’s plot this on a graph and see if we have a clear signal we can interpret from the data.

plt.figure(figsize=(20, 8))
plt.xlabel('Frequency [Hz]')
plt.ylabel('Power')
plt.title("FFT Spectrum for single axis")
plt.xticks(np.arange(0, max(sample_freq)+1, 2.0))
plt.plot(sample_freq, power, color='blue')

I was pretty excited when I saw this as the plot of power against frequency made sense. The large spike at around 11Hz aligned with the thump-thump-thump noise I could hear and the smaller, but still significant, spike at 30Hz could well be the ‘normal’ operating vibration since the mains frequency is 60Hz. I’m guessing a bit at this since I’m neither a data scientist, a motor expert nor an electrician, but it made sense to me. The important thing is that we have extracted a clear signal from the data that can be used to provide an insight.
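A useful sanity check for this kind of analysis is to feed in a synthetic signal with known frequencies and confirm that they come back out as the peaks. The sketch below does that with an 11Hz tone and a weaker 30Hz tone. To stay dependency free it uses a naive discrete Fourier transform written with the standard library rather than scipy's FFT, which is far slower but gives the same magnitudes; the amplitudes and the one second duration are arbitrary choices.

```python
import cmath
import math

SAMPLE_RATE_HZ = 200
N = 200  # one second of samples, so each DFT bin is exactly 1 Hz wide

# Synthetic signal: a strong 11 Hz tone plus a weaker 30 Hz tone.
signal = [
    3.0 * math.sin(2 * math.pi * 11 * n / SAMPLE_RATE_HZ)
    + 1.0 * math.sin(2 * math.pi * 30 * n / SAMPLE_RATE_HZ)
    for n in range(N)
]

def dft_magnitudes(samples):
    """Naive DFT: magnitude of each non-negative frequency bin below Nyquist."""
    n_samples = len(samples)
    magnitudes = []
    for k in range(n_samples // 2):
        total = sum(
            s * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, s in enumerate(samples)
        )
        magnitudes.append(abs(total))
    return magnitudes

power = dft_magnitudes(signal)
freqs = [k * SAMPLE_RATE_HZ / N for k in range(N // 2)]

# Pick the two strongest bins, skipping the 0 Hz bin.
top_two = sorted(range(1, len(power)), key=lambda k: power[k], reverse=True)[:2]
peaks = sorted(freqs[k] for k in top_two)
print(peaks)  # [11.0, 30.0]
```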

Detecting clouds and clear skies (part two)

Last time we covered how to route data from a cloud sensor to IoT Analytics and how to create a SQL data set that would be executed every 15 minutes containing the most recent data. Now that we have that data, what sort of analysis can we do on it to find out if the sky is cloudy or clear?

AWS IoT Analytics is integrated with a powerful data science tool, Amazon Sagemaker, which has easy to use data exploration and visualization capabilities that you can run from your browser using Jupyter Notebooks. Sounds scary, but actually it’s really straight forward and there are plenty of web based resources to help you learn and explore increasingly advanced capabilities.

Let’s begin by drawing a simple graph of our cloud sensor data, as visualizing the data is often the first step towards deciding how to do some analysis. From the IoT Analytics console, tap Analyze and then Notebooks from the left menu. Tap Create Notebook to reach the screen below.

There are a number of pre-built templates you can explore, but for our project, we’re going to start from a Blank Notebook so tap on that.

To create your Jupyter notebook (and the instance on which it will run), follow the official documentation Explore your Data section and get yourself to the stage where you have a blank notebook in your browser.

Let’s start writing some code. We’ll be using Python for writing our analysis in this example.

Enter the following code in the first empty cell of the notebook. This code loads the boto3 AWS SDK, the pandas library which is great for slicing and dicing your data, and matplotlib which we will use for drawing our graph. The final statement allows the graph output to appear inline in the notebook when executed.

import boto3
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline

Your notebook should start looking like the image below – we’ll explain the rest of the code shortly.

client = boto3.client('iotanalytics')
dataset = "cloudy"
dataset_url = client.get_dataset_content(datasetName = dataset)['entries'][0]['dataURI']
df = pd.read_csv(dataset_url)

This code reads the dataset produced by our SQL query into a pandas data frame. One way of thinking about a data frame is that it’s like an Excel spreadsheet of your data with rows and columns and this is a great fit for our data set from IoT Analytics which is already in tabular format as a CSV – so we can use the read_csv function as above.

Finally, to draw a graph of the data, we can write this code in another cell.

df['datetime'] = pd.to_datetime(df["received"]/1000, unit='s')
ax1 = df.plot(kind='line',x='datetime',y='object',color='blue',linewidth=4)

df.plot(title='Is it cloudy?',ax=ax1, \
                         kind='line',x='datetime',y='ambient',figsize=(20,8), \
                         color='cyan',linewidth=4,grid=True)

When you run this cell, you will see the output like this for example

Here’s all the code in one place to give a sense of how little code you need to write to achieve this.

import boto3
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline

client = boto3.client('iotanalytics')
dataset = "cloudy"
dataset_url = client.get_dataset_content(datasetName = dataset)['entries'][0]['dataURI']
df = pd.read_csv(dataset_url)
df['datetime'] = pd.to_datetime(df["received"]/1000, unit='s')

ax1 = df.plot(kind='line',x='datetime',y='object',color='blue',linewidth=4)
df.plot(title='Is it cloudy?',ax=ax1, \
                         kind='line',x='datetime',y='ambient',figsize=(20,8), \
                         color='cyan',linewidth=4,grid=True)

Of course, what would be really nice would be to run analysis like this automatically every 15 minutes and be notified when conditions change. That will be the topic of a future post that harnesses a recently released feature of IoT Analytics for automating your workflow. In the meantime, you can read more about it in the official documentation.

 

 

Detecting clouds and clear skies (part one)

As a keen, yet lazy, amateur astronomer, my quest for a fully automated observatory continues. My ideal morning would start with a lovely cup of coffee and an email from my observatory telling me what it was able to image overnight along with some nice photos. To achieve this, one of the pieces of information I need the computer system to know is whether the sky is clear or not. If it’s clear, then we can open the observatory roof, if it’s cloudy, we should stop the observation session – that sort of thing.

Unsurprisingly, finding sensors to detect clouds isn’t that straight forward, but it turns out that a possible solution comes from a neat little infra-red temperature sensor. Point one of these straight up to the sky, and you’ll get quite different readings when it’s cloudy or clear, so a bit of data analysis can easily determine if it’s likely to be worth rolling back the observatory roof or not.

For my project, I used some gorilla glue to fix the sensor inside a cable gland and then mounted it on the top of a small project box like this.

Inside the box, all we need is a trusty ESP8266 Micro-controller, a power connector and a few resistors – total project cost around $30. Commercial cloud sensors (yes, you can buy such a thing) start at several hundred $ and up, so if we can get this to work, it will be a very frugal option.

As you can see, I’ve left the USB cable connected to the device so that I can easily re-program the MCU later if required. I could of course do this with an OTA (over the air) update, but for this project the cable is fine.

Here it is, screwed onto the fence in the garden.

So what does the data look like? The upper cyan line is the ‘ambient’ or local temperature at sensor level whereas the dark blue line is the ‘object’ or remote temperature. The larger the difference, the clearer the skies, and when they are reading the same, that typically means there is rain or snow directly on the sensor window.
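That rule of thumb translates into a very small classifier. The sketch below is only an illustration of the idea: the two thresholds are assumptions that would need tuning against real readings, and it assumes both temperatures are reported in degrees Celsius.

```python
# Assumed thresholds in degrees Celsius; tune these against your own data.
CLEAR_SKY_DELTA_C = 20.0    # ambient this much warmer than the sky reading: clear
WET_SENSOR_DELTA_C = 1.0    # readings this close together: rain or snow on the window

def classify_sky(ambient_c, object_c):
    """Classify sky conditions from ambient and remote (object) temperature."""
    delta = ambient_c - object_c
    if abs(delta) <= WET_SENSOR_DELTA_C:
        return "rain or snow on sensor"
    if delta >= CLEAR_SKY_DELTA_C:
        return "clear"
    return "cloudy"

print(classify_sky(ambient_c=8.0, object_c=-22.0))  # clear
print(classify_sky(ambient_c=8.0, object_c=2.0))    # cloudy
print(classify_sky(ambient_c=8.0, object_c=7.6))    # rain or snow on sensor
```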

The software running on the MCU is written in C using the Arduino IDE and the ESP8266 SDK. It doesn’t do anything complex, it connects to the local WiFi network, establishes a secure MQTT connection with AWS IoT Core, and then every 30 seconds or so it reads the temperature sensor and then publishes the data to an MQTT topic. It really is a ‘dumb’ data collector since it makes no attempt to infer the state of the sky locally on the box.

So how do we pick up the MQTT data and analyze it? I’d like to be able to infer the state of the sky now, but also to have a historic record of my data for later analysis, and perhaps to use for training a machine learning model against other sources of data (images of the sky for example). For scenarios where you want to store the connected device data, AWS IoT Analytics is often a good fit and so what I’m going to do is as follows;

  1. Create a Data Store in AWS IoT Analytics to collect all my data
  2. Create a Channel to receive the data from the MQTT Topic
  3. Create a Pipeline to join the Channel to the Data Store, and perhaps send some real-time data to CloudWatch at the same time.
  4. Create a Rule in AWS IoT Core to route data from the MQTT topic to my channel
  5. Schedule a dataset to analyze the data every 15 minutes
  6. Publish to an SNS topic when it’s both dark and the sky seems clear

I covered steps 1 to 4 in an earlier introductory blog with part one and part two, and the principle is the same for any project like this. Let’s turn our attention to the analysis part of the project.

Head back to the IoT Analytics console and from the Analyze sub-menu, select Data sets and then tap Create

SQL Data sets are used when you want to execute a query against your data store and this is the common use case and what we will want to start with. Container Data sets are more advanced and let you trigger the execution of arbitrary Python (or indeed a custom container) once the SQL Data set is ready. Container Data sets are both powerful and flexible as we will see a bit later on.

So let’s start by creating the SQL Data set, tap on Create SQL and pick a suitable name and select the Data Store that you want to execute the query against.

Tap Next and now we get the SQL editing screen where we can enter our query that will run every 15 minutes.

The query I’m using in more detail is;

SELECT ambient,object,status.uptime,status.rssi,status.heap,epoch,received FROM cloudy_skies 
WHERE __dt >= current_date - interval '5' day 
AND full_topic like '%infrared/temperature'

An important note here is the __dt WHERE clause. IoT Analytics stores your messages partitioned by ingest date to make query performance faster and lower your costs. Without this line, the whole data store would be scanned and depending on how much data you have, this could take a very long time to complete. In this case, I’m choosing to pull out the most recent 5 days of data, which is more than I actually need to know if it is currently cloudy or not, but gives me flexibility in the next stage when I author a Jupyter Notebook to do the analysis.

Once you have your query, tap Next to configure the data selection window.

I’m going to use the default ‘None’ option here. The other option, delta windows, is a powerful option that enables you to perform analysis on only the new data that has arrived since you last queried the data. I’ll cover this more advanced topic in a future post, but for now just tap on Next to move on to the scheduling page.

Setting a schedule is entirely optional, but in this case we want to check on sky conditions every 15 minutes, so we can choose that option from the drop-down menu and tap Next to move to the final step, setting the retention policy.

Retention policies are useful when you might have large data sets that are incurring storage costs you’d prefer to avoid and you don’t need the data to be available for long periods. For this project, my data sets are small and I don’t need to take any special action, so just tap on the final Create data set button and we’re done.

Let’s review what we’ve done

We’ve created a Channel connected to a Pipeline feeding a Data store where all the IoT device data will be collected.

We’ve created a rule in IoT Core to route data from the appropriate MQTT topic into the Channel.

We’ve created a Data set that will execute a SQL query every 15 minutes to gather the most recent data.

How do we do some analysis on this data to see if the sky is clear though? I’ll cover that in part two.

 

Why I love the ESP8266

Underneath the silver square on the larger, rectangular breakout board sits my favourite micro-controller, the ESP8266 by Espressif – the perfect* starting point for all your IoT projects.

Not only are these little super-stars crazily cheap, less than $10 each, they can run from batteries, have a deep sleep mode to conserve power and yet have WiFi built in – along with digital I/O pins, an Analog to Digital Converter and more …

For programming, I’m a fan of the Arduino framework which fully supports these micro-controllers and is not only easy to use but has a wealth of libraries and support available on the internet. You can get the Arduino IDE from arduino.cc and  the ESP8266 Arduino SDK is maintained on github.com

There is an abundance of web help for getting going with this combination, and I’m not going to repeat or dive deep with that here. You could do worse than begin with this README.md from the link above.

UPDATE

Since I began my IoT projects, Amazon has taken stewardship of FreeRTOS, a different (non-Arduino) operating system for micro-controllers that is more suitable for robust commercial applications than my little hobby projects. Amazon FreeRTOS doesn’t yet run on the ESP8266 but does run on the more powerful ESP32 and I’m planning to migrate my projects to that combination in future as time permits.

*Well, almost perfect.

One of the challenges with memory and CPU constrained micro-controllers is negotiating a secure connection to the cloud service where you may be sending and receiving messages. In particular, performing validation of the certificate chain is (or at least was when I began) beyond the capabilities of this little chip.

It is still capable of negotiating a TLS 1.2 connection and using a client X.509 certificate to authenticate, which is good, but it’s important to note that without being able to validate the server certificate, a man in the middle attack is technically possible. I elected to accept this risk for my tinkering, however I have now begun to migrate to the big brother to the ESP8266, the ESP32, which is capable of making a more secure connection by validating the server provided certificate chain.