IOT Data needs to be SMART, not Big
Now mix IOT in with this. I have seen and read so much rubbish claiming that the veritable explosion of devices^ is going to create so much data that we will be swamped. I don’t know what this means: will Western civilisation collapse, or if we yawn for long enough, will it just go away? Or will we need gum boots to push our way through?
But here is the problem, and yes, IOT is the cause. A whole bunch of companies are racing off to build devices that are going to project data into something, somewhere, that someone, somehow, is going to visualise and goggle at endlessly. Apparently. So yes, I can set up 1000 sensors on 50 farms to start sending data off to some cloud-based storage somewhere. Repeated thousands of times over, everywhere in the world, these little sensors are going to start pumping data at some massive rate (every second? minute? hour?) into cloud storage. How incredibly stupid and how incredibly wasteful.
We need SMART data, not big data, and anybody who tells me they are going to create tonnes of data from their remote monitoring solutions is confirming their complete lack of credibility in this space. For data to be useful it needs to be relevant and accessible. Terabytes of unstructured drivel is still drivel. So, in order to save the world from collapse, we have worked on a SMART data solution:
Sparsely MAnaged Relevant Transactions
First, we need to be SPARSE.
Let me give you an example. You are monitoring an airlock and you need to know when a door is opened and how often. You might even want to know how long it is open, for bio-security rating purposes. Your sample frequency is per second because the open door will trigger alerts. But if the door is shut, I don’t need to be told that over and over. As long as a heartbeat exists, I only want to know when the state changes – i.e. the door opens or closes. Another example: a cool room temperature which we monitor once a minute. Under normal conditions, the room will chill to between 1.5C and 4C. If the room temperature doesn’t vary by more than (say) 0.5C, then I don’t want to know about it. Don’t send the data! It can easily be extrapolated at the storage end without fuss or complexity. Save the bandwidth and the storage cost.
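To make that concrete, here is a minimal sketch of report-by-exception for the cool room. It assumes a hypothetical read_temperature() and send_reading() (the names are mine, not any particular product’s API), but the logic is the point: report on change or on heartbeat, otherwise stay silent.

```python
import time

# Minimal sketch of report-by-exception for the cool room example.
# read_temperature() and send_reading() are hypothetical placeholders,
# not any particular library's API.

SAMPLE_SECS = 60        # sample once a minute
DEADBAND_C = 0.5        # ignore changes smaller than this
HEARTBEAT_SECS = 3600   # but always report at least once an hour

def monitor(read_temperature, send_reading):
    last_sent_value = None
    last_sent_time = 0.0
    while True:
        now = time.time()
        value = read_temperature()
        changed = (last_sent_value is None
                   or abs(value - last_sent_value) >= DEADBAND_C)
        stale = (now - last_sent_time) >= HEARTBEAT_SECS
        if changed or stale:
            send_reading({"ts": now, "temp_c": round(value, 1)})
            last_sent_value = value
            last_sent_time = now
        time.sleep(SAMPLE_SECS)
```

Under normal conditions this sends a handful of readings a day instead of 1440, and the stored record loses nothing of any value.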
Secondly, the process needs to be MAnaged
Data has to be controlled at the source. It is too late once it gets into the hubs and storage pools. This is my problem with most LoRa solutions: they just pump data regardless, because that is all they can do. There needs to be some intelligence at the device in order to restrict outgoing packets to sparse data rather than big data. My contention is that the sending gateway must have application code and an operating system in order to manage what ends up in our IOT solution space. Managed devices are just as important to bandwidth and storage control as sparse data protocols are.
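As a rough illustration of what “managed at the source” might look like, here is a sketch of a gateway-side policy table. Every name in it is hypothetical; the point is simply that the rules live on the gateway, the management system can replace them remotely, and nothing leaves the site unless a policy says it should.

```python
from dataclasses import dataclass

# Rough illustration of "managed at the source". All names here are
# hypothetical; the rules live on the gateway and can be replaced
# remotely, so the sensors themselves stay dumb.

@dataclass
class Policy:
    sample_secs: int     # how often to read the sensor
    deadband: float      # minimum change worth reporting
    heartbeat_secs: int  # maximum silence before a forced report

# Per-sensor policies held on the gateway, updatable by the management system.
policies = {
    "coolroom_1/temp": Policy(sample_secs=60, deadband=0.5, heartbeat_secs=3600),
    "airlock_3/door":  Policy(sample_secs=1,  deadband=1.0, heartbeat_secs=900),
}

def should_forward(sensor_id, value, last_sent, secs_since_last_send):
    """Gateway-side decision: only sparse, policy-compliant data leaves the site."""
    p = policies[sensor_id]
    if last_sent is None:
        return True
    if abs(value - last_sent) >= p.deadband:
        return True
    return secs_since_last_send >= p.heartbeat_secs
```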
Thirdly, data needs to be Relevant
We must not send data just because we can. That is just vanity: the thought that somebody, one day, might want it. Phooey. We need a reason for the data we capture, if for no other reason than that there is a cost to storage and a cost to processing. Let’s go back to our cool room example. We know that the thermostat should hold the room between 1.5 and 4C, but it is misbehaving. The repairman decides to change the precision to 0.1C and take readings every second instead of every minute. He wants to watch how fast the compressor pulls down the room and how fast it warms up. You see, we don’t really need this data for the last six months, just now, when a problem has been detected. Once the problem is fixed, I revert to minute readings at the lower precision. We could always collect this data, but that just ignores the cost and performance hits we suffer as a consequence. We need to collect relevant data, not everything just because we can. That is the bowerbird syndrome on steroids. One more comment: this process of changing the capture parameters could be automated, if you think about it, assuming the device is managed. The repairman will take a profile of the data so that he can review comparative performance down the track. It is idle wastefulness to collect that data continually, and it also means you don’t understand your system’s basic metrics.
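One way that automation could look is sketched below: when the cool room drifts out of its band, the management system pushes a diagnostic profile to the gateway, then reverts to the normal profile when the alert clears. The profile values and the push_policy() call are invented for illustration; they are not a real API.

```python
# Sketch of automated profile switching on a managed device. The profile
# values and push_policy() are invented for illustration; capture
# parameters tighten only while the problem is being chased, then revert.

NORMAL_PROFILE = {"sample_secs": 60, "precision_c": 0.5}
DIAGNOSTIC_PROFILE = {"sample_secs": 1, "precision_c": 0.1}

def on_alert_state_changed(sensor_id, in_alert, push_policy):
    """Called when the cool room drifts out of (or back into) its 1.5-4C band."""
    profile = DIAGNOSTIC_PROFILE if in_alert else NORMAL_PROFILE
    push_policy(sensor_id, profile)  # remote update to the gateway's policy
```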
Finally, we must think of our data as being Transaction based
So what do I mean by that? Not only should data be sparse, managed and relevant, but it should be part of some logical business process. It should belong to a business transaction, not be recorded just ‘because’. For example, if I am monitoring the pressure differential across a filter system, the differential value will have levels that create alerts, and those alerts are aimed at advising me when to replace the filters. Transactions accumulate into trend sets and usually have events that are generated from level settings. Yes, it might be interesting to watch the pressure differential, but it isn’t useful if it cannot result in a practical transaction of some kind. The important outcome here is how often I have to replace the filters, not the pressure differentials leading up to the event. Ok, that data can be predictive in itself, but after a few months I have enough data for proactive replacement rather than waiting for failure. Note how the transaction actually evolves with experience. It is all about being smart and not just reactive. This is how we give our customers true value, and it doesn’t derive from big data at all. Just smart data.
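To sketch the idea (the levels and field names below are assumptions, not values from any real filter system): a reading only matters when it crosses a level and produces a transaction, and the replacement transactions themselves become the data set that drives proactive scheduling.

```python
# Sketch of turning readings into transactions. The levels and field names
# are assumptions for illustration: a differential-pressure reading only
# matters when it crosses a level and generates a business event.

WARN_KPA = 25.0      # start planning a filter replacement
REPLACE_KPA = 40.0   # raise a work order now

def evaluate(dp_kpa, raise_event):
    if dp_kpa >= REPLACE_KPA:
        raise_event({"type": "REPLACE_FILTER", "dp_kpa": dp_kpa})
    elif dp_kpa >= WARN_KPA:
        raise_event({"type": "FILTER_WARNING", "dp_kpa": dp_kpa})
    # Readings below the warning level generate no transaction at all.

def average_replacement_interval(replacement_dates):
    """Average days between filter replacements (datetime.date objects).
    After a few months this drives proactive scheduling instead of waiting
    for the REPLACE level to trip."""
    gaps = [(b - a).days for a, b in zip(replacement_dates, replacement_dates[1:])]
    return sum(gaps) / len(gaps) if gaps else None
```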
In summary, we need to collect sparsely managed data that is
relevant and transaction oriented. Obviously we need to know why we want it and
what we are going to do with it.
This is where most IOT solutions go wrong. They go out and collect data, then look for a problem to solve with it: data without relevance and without a transaction outcome. Ladies and gentlemen, turn this around. Go out there and find problems that you can build an IOT solution to solve.
Geoff Schaller @ArcoflexIOT
Note: ^Gartner again, 50 billion by 2020, although the
number changes more often than the font