IOT Data Behaving Badly
One of the biggest sins being perpetrated by so-called IOT technology companies is that they simply don’t understand data: its cost and the use to which it is going to be put. Getting either one of these attributes wrong is damaging but getting both wrong will be a business funeral. More importantly, it will cause a loss of business confidence in the IOT space. We all have a duty to correct this.
I am going to divide this discussion into three components: cost, usage and control. Whilst all three have some common territory, they each drive different fundamental properties of the IOT space and all influence business outcomes. The underlying discussion is about the technology chosen to deliver data. Capability and cost are directly related, as we shall see.
1. Data Costs
The cost problem is a function of the technology employed, and there are two sources of data cost: the telecommunications network and the management framework. Starting with the telecoms side, there are essentially three available avenues to the internet:
· Local Area Network – this is a low or no cost connectivity solution but isn’t likely to be widely available, especially in remote monitoring scenarios. For building or factory networks, though, it is a genuine option. Any data volume is possible and two-way flows are implicit.
· 3G/4G Data – This is provided by all major Telcos. By obtaining a data SIM (and not just any old SIM) you can connect anywhere within range of a cell tower. Mostly this is around 3km from a tower but in rural areas without hilly terrain it can reach 7-8km. Any data volume is possible. The cost depends on the SIM plan but will be at least $10/month for up to 1GB per month. Two-way data flow is a genuine capability. Battery powered options are not practical unless you include solar powered chargers.
· LoRa 900MHz Data – Low power wide area networks are being progressively installed on Telco towers throughout the world. In Australia they can only be found in capital cities right now but a rural rollout is envisaged over the next 24 months. Line of sight can yield up to a 25km range with quite a decent battery life, depending on data flow and sensor type. However, you are limited to 144 messages a day per device and usually something quite small like 12 bytes per message. This is also very definitely a one-way solution. Only data coming out from the device is possible, so you will not be controlling devices or updating their firmware remotely. The cost however can be as low as $10/year per device, not counting base stations or repeaters.
These communication costs need to be viewed as a per device cost but the device may be aggregating multiple sensors, except in a LoRa situation where aggregation is not possible. When using 3G/4G, care must be taken with data flows or the bandwidth charges are going to go through the roof. There is also a question of who owns the telecommunications contract and who will pay the excess charges when they occur. The big thing to note is that cost is measured in bytes.
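Because cost is measured in bytes, it is worth putting rough numbers on the LoRa and 3G/4G options above. The following is a minimal sketch using only the figures quoted in this section (144 messages of 12 bytes per day for LoRa, 1GB per month for a data SIM). The function and constant names are my own, and I am assuming 1GB means 10^9 bytes, a 30 day month, and payload bytes only with no protocol overhead.

```python
# Figures quoted in the text; all names here are illustrative only.
LORA_MESSAGES_PER_DAY = 144
LORA_BYTES_PER_MESSAGE = 12
SIM_BYTES_PER_MONTH = 1_000_000_000  # assumption: 1GB = 10^9 bytes
DAYS_PER_MONTH = 30                  # assumption: 30 day billing month


def lora_daily_budget_bytes() -> int:
    """Total payload a single LoRa device can send in one day."""
    return LORA_MESSAGES_PER_DAY * LORA_BYTES_PER_MESSAGE


def lora_min_interval_minutes() -> float:
    """Shortest sustainable gap between LoRa messages, in minutes."""
    return 24 * 60 / LORA_MESSAGES_PER_DAY


def sim_monthly_usage_bytes(payload_bytes: int, interval_seconds: int) -> int:
    """Payload bytes sent per month at a fixed reporting interval.

    Ignores protocol overhead, so real usage will be higher.
    """
    messages_per_month = DAYS_PER_MONTH * 24 * 3600 // interval_seconds
    return messages_per_month * payload_bytes


def fits_sim_allowance(payload_bytes: int, interval_seconds: int) -> bool:
    """True if the reporting plan stays within the monthly SIM allowance."""
    return sim_monthly_usage_bytes(payload_bytes, interval_seconds) <= SIM_BYTES_PER_MONTH


if __name__ == "__main__":
    print(lora_daily_budget_bytes(), "bytes/day on LoRa")
    print(lora_min_interval_minutes(), "minutes between LoRa messages")
    print(fits_sim_allowance(200, 1), "-> 200 bytes every second on a 1GB SIM")
    print(fits_sim_allowance(500, 1), "-> 500 bytes every second on a 1GB SIM")
```

On these assumptions a LoRa device tops out at one small reading every ten minutes, while a chatty 3G/4G device sending 500 bytes a second would blow through a 1GB allowance before the month is out.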
The other data cost arises at the platform to which these data packets are sent. Whether this is Amazon AWS, Cloudera, Microsoft Azure or one of the host of other platforms, there is a cost and they are roughly similar. They all have a free tier but no scalable commercial solution is likely to survive in the free tier alone. For example, Microsoft offers the following for the Azure IoT Hub:
The pricing basis here is messages per day. If you have a device that wants to send one message a second, this is 86,400 messages per day – not even 5 devices for the $50 tier. If you only need to send one message an hour, then your $50 tier supports some 16,000 devices. The complication though is what these services count as a “message”. Command calls, device lookups, heartbeats and other network calls all count as messages. Unfortunately, it doesn’t end there. On top of this you need data storage and web jobs to manage the data flow, as well as visualisation resources or software. More monthly costs and more management complexity. Unless you are designing a genuine multi-tenant solution, these costs will dwarf the platform cost. If you are going multi-tenant, then there is management software to write. More costs.
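The messages-per-day arithmetic is easy to check for your own device mix. Here is a hedged sketch: the daily quota of 400,000 messages is my assumption, inferred from the two figures quoted above (not even 5 devices at one message a second, some 16,000 at one an hour), so check the current tier limits before relying on it. The function names and the overhead parameter are illustrative.

```python
SECONDS_PER_DAY = 86_400

# Assumption: inferred from the figures in the text, not an official number.
ASSUMED_DAILY_QUOTA = 400_000


def messages_per_device_per_day(interval_seconds: int, overhead_per_day: int = 0) -> int:
    """Telemetry messages at a fixed interval, plus any other billable calls
    (commands, lookups, heartbeats) expressed as a daily count."""
    return SECONDS_PER_DAY // interval_seconds + overhead_per_day


def devices_supported(daily_quota: int, per_device_per_day: int) -> int:
    """Whole number of devices that fit inside a daily message quota."""
    return daily_quota // per_device_per_day


if __name__ == "__main__":
    every_second = messages_per_device_per_day(1)
    hourly = messages_per_device_per_day(3600)
    hourly_with_heartbeat = messages_per_device_per_day(3600, overhead_per_day=24)
    for label, per_device in [
        ("one per second", every_second),
        ("one per hour", hourly),
        ("one per hour plus hourly heartbeat", hourly_with_heartbeat),
    ]:
        print(label, "->", devices_supported(ASSUMED_DAILY_QUOTA, per_device), "devices")
```

Note how quickly the hidden messages bite: adding a single heartbeat per hour halves the number of hourly-reporting devices the same tier can carry.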
The summary I offer you here is that if you design badly or don’t know what you need and why, it is going to break the piggy bank. Most of the commercial solutions on offer either do not properly disclose all these costs or come with so many zeros in the price tag that they are untenable.
2. Using Data Properly
In many senses, this is an easier discussion because we should be able to match the data flow requirements to the application. Why then do I see so many solutions being proposed to consumers that cannot deliver what is going to be expected? What is happening is that solution deliverers are concentrating on one specific communications technology and trying to flatten out all data needs into the one model. Duh! Not going to work. Let’s look at some examples and discuss where integration might be of value.
Various companies are targeting LoRa solutions at the agricultural sector and there are three very popular implementations: soil moisture, weather stations and stock counting. You could easily argue that soil moisture won’t change much in an hour so hourly reporting is fine. But what about weather? Do you want to wait one hour to find out the wind picked up to gale force or that a cloud burst occurred over your irrigated paddocks? Stock counting provides another insight into some low-brow thinking out there. Perhaps the aggregate can be sent hourly but if I am loading pens or counting stock through a dip, I need to know immediately the pen is full, not an hour later.
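One way to reconcile hourly reporting with the need to know about a gale or a full pen straight away is to send routine readings on a timer but send immediately when a reading crosses a threshold. The sketch below illustrates that idea only; the class name, fields and the wind speed threshold are hypothetical, and it assumes the device can decide locally when to transmit.

```python
from dataclasses import dataclass


@dataclass
class ReportPolicy:
    """Send on a routine timer, plus immediately when the alert state changes."""

    routine_interval_s: float          # e.g. 3600 for hourly reporting
    alert_threshold: float             # hypothetical limit, e.g. wind speed
    last_sent_s: float = float("-inf")
    in_alert: bool = False

    def should_send(self, now_s: float, value: float) -> bool:
        alert_now = value >= self.alert_threshold
        # Only a change of state triggers an urgent send, so a long gale
        # does not burn the whole daily message allowance.
        state_changed = alert_now != self.in_alert
        routine_due = now_s - self.last_sent_s >= self.routine_interval_s
        self.in_alert = alert_now
        if state_changed or routine_due:
            self.last_sent_s = now_s
            return True
        return False


if __name__ == "__main__":
    policy = ReportPolicy(routine_interval_s=3600, alert_threshold=60.0)
    samples = [(0, 10.0), (600, 12.0), (1200, 75.0), (1800, 80.0), (2400, 20.0)]
    for now_s, wind in samples:
        print(now_s, wind, policy.should_send(now_s, wind))
```

Sending on a change of state rather than on every reading above the limit matters on a network with a hard daily message cap.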
Silo and water level monitoring offers a different challenge. The devices are often very remote and usually do not have a local electricity source. It would seem obvious that a LoRa solution is perfect and sometimes it will be. But in the case of the water tank, we were asked to add EC and pH monitoring. Oops, they also wanted to monitor back pressure in the outflow line to detect leaks. I now have four LoRa devices, and it is starting to look like a battery powered central control unit on 3G/4G with a solar charger is a more practical and cost effective solution.
If we aren’t talking remote, such as in the agricultural sense, then the rules are different. We can take as much data as we can generate, assuming the underlying bandwidth will cope. In a secure facility you might want to monitor every door and every access keypad and on a busy day it might generate a lot of data. Whatever the source or reason, the communication side is not really a problem but the data visualisation might be.
And here lies the next challenge. Are you dumping all that data on the client and expecting them to make head or tail of it, or are you going to provide aggregation outcomes, alerts or management statistics of value? There are some very nice monitoring kits coming out of Scandinavia right now but, apart from needing a genius electrician to install, you are left to design your own visualisations and set your own alerts. They also charge like a wounded bull per sensor. Designing visualisations and alerts is in itself quite a technical operation, and if we all leave it to our prospective users, very few are going to get value for their trouble. Monitoring is fine but doing something sensible with that data is crucial to market acceptance.
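As a small illustration of what “aggregation outcomes” might mean in practice, the sketch below reduces a batch of raw readings to a handful of numbers a client can act on. The function name, the returned keys and the example limits are all hypothetical; a real service would choose statistics to suit the application.

```python
from statistics import mean


def summarise(readings: list[float], low: float, high: float) -> dict:
    """Reduce raw readings to a few management statistics.

    `low` and `high` are the acceptable limits for this sensor.
    """
    if not readings:
        raise ValueError("no readings to summarise")
    out_of_range = [r for r in readings if r < low or r > high]
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
        "out_of_range": len(out_of_range),
    }


if __name__ == "__main__":
    print(summarise([4.0, 4.5, 6.0, 5.0], low=3.8, high=5.5))
```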
3. Control
By control I mean feedback. Refrigeration is a really good example. If a door is left open for too long, I want to set off an audible alarm in the building. Seed potato must be stored between 3.8°C and 5.5°C – too warm and it sprouts, too cold and it dies. If the temperature goes outside these limits I need to turn the unit up or down immediately, not wait for a human to respond to an SMS. (They could be at a party!) Being alerted that I have a disaster on my hands is not enough if I cannot get to the unit fast enough. Even the silo solution has feedback potential. Outflow usage is slow – no problem – but when the truck is blowing the new grain in, it would be nice to know when it is nearing full so that we don’t overflow the silo. We could shut off the pump or sound a klaxon to force the onsite user to take manual control.
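The refrigeration case can be sketched as a simple decision function that runs without waiting for a human. This is only an illustration of the feedback idea: the temperature limits come from the seed potato example above, while the action names and the two minute door limit are my own assumptions, and it presumes a two-way link to the unit.

```python
from enum import Enum


class Action(Enum):
    COOL_MORE = "cool_more"   # store is too warm
    COOL_LESS = "cool_less"   # store is too cold
    HOLD = "hold"             # within limits, do nothing


# Limits from the seed potato example in the text.
SEED_POTATO_MIN_C = 3.8
SEED_POTATO_MAX_C = 5.5


def cold_store_action(temp_c: float,
                      low: float = SEED_POTATO_MIN_C,
                      high: float = SEED_POTATO_MAX_C) -> Action:
    """Decide what to tell the refrigeration unit for a given reading."""
    if temp_c > high:
        return Action.COOL_MORE
    if temp_c < low:
        return Action.COOL_LESS
    return Action.HOLD


def door_alarm(open_seconds: float, limit_seconds: float = 120.0) -> bool:
    """True if the door has been open too long (the limit is hypothetical)."""
    return open_seconds > limit_seconds


if __name__ == "__main__":
    for temp in (3.0, 4.5, 6.0):
        print(temp, cold_store_action(temp).value)
    print("alarm:", door_alarm(open_seconds=180))
```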
There is also a common scenario where, once a problem arises, a technician wants to see much denser data in order to assess what parts or equipment to bring on the inevitable call out. Unless there is some way to proactively change the data flow out of the device, it cannot happen. This might turn a low data flow pattern temporarily into a high flow pattern. There are countless examples of this: heating and cooling, mechanical doors, alarms, lifts, plant and machinery and many, many more. To me, it isn’t enough to just monitor data; we need to provide value back to the source and help control problem situations.
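Temporarily switching a device from a low flow to a high flow pattern can be sketched as a reporting interval that a remote command overrides for a limited time. The class, its fields and the default intervals below are hypothetical, and the sketch assumes a two-way connection so the request can actually reach the device.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reporter:
    """Reporting interval that can be made denser for a limited period."""

    normal_interval_s: float = 3600.0   # hypothetical routine rate
    dense_interval_s: float = 5.0       # hypothetical diagnostic rate
    dense_until_s: Optional[float] = None

    def request_dense(self, now_s: float, duration_s: float) -> None:
        """Handle a remote request for dense data for `duration_s` seconds."""
        self.dense_until_s = now_s + duration_s

    def interval(self, now_s: float) -> float:
        """Current reporting interval; falls back to normal after expiry."""
        if self.dense_until_s is not None and now_s < self.dense_until_s:
            return self.dense_interval_s
        self.dense_until_s = None
        return self.normal_interval_s


if __name__ == "__main__":
    reporter = Reporter()
    print(reporter.interval(0))            # routine rate
    reporter.request_dense(now_s=100, duration_s=600)
    print(reporter.interval(200))          # dense while the technician looks
    print(reporter.interval(700))          # back to routine after expiry
```

Putting an expiry on the dense mode means a forgotten diagnostic session cannot quietly run up the bandwidth bill.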
In summary, our technology choice will dictate or compromise the capability we can supply the client. It will also dramatically influence the cost and complexity of providing that service. What does the customer expect from the data? One shoe size does not fit all and unless you are going to help the client do something practical with all that data then everyone is wasting their time.