To deliver insights, IoT solutions need to support real-time, batch-mode, and predictive analysis of the information generated by the solution. Because each mode of analysis is better informed by historical data, and future approaches to analysis might not yet be understood, IoT solutions must archive data as flexibly as possible to meet future requirements.
IoT solutions ensure that the business can obtain the most current and evolving collection of insights by storing raw unprocessed sensor data in a manner supporting the in-order replay of those raw samples. The store-and-replay ability should make historic raw samples appear almost as if the samples were arriving in the normal time-ordered sequence of non-historic samples.
The Telemetry Archiving design shown in the following diagram can deliver this functionality.
telemetry/deviceID containing the measurement. This message is sent via a transport protocol to a protocol endpoint made available by the Server.
(4) and a raw storage path
telemetry/deviceID/replay topic. The solution processes the replayed messages as necessary.
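The store-and-replay flow in the steps above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the design's actual implementation: `RawMessage`, `RawArchive`, `store`, `replay`, and the `publish` callback are hypothetical names, and a real solution would use durable storage and a message broker client in their place.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RawMessage:
    device_id: str
    timestamp: float  # seconds since epoch, taken from the sample itself
    payload: dict     # raw, unprocessed sensor reading


class RawArchive:
    """Hypothetical in-memory stand-in for the raw storage path."""

    def __init__(self) -> None:
        self._by_device: Dict[str, List[RawMessage]] = {}

    def store(self, msg: RawMessage) -> None:
        # Keep every raw message exactly as it arrived; no aggregation here.
        self._by_device.setdefault(msg.device_id, []).append(msg)

    def replay(
        self,
        device_id: str,
        start: float,
        end: float,
        publish: Callable[[str, RawMessage], None],
    ) -> int:
        """Publish archived messages in [start, end) in timestamp order."""
        selected = [
            m for m in self._by_device.get(device_id, [])
            if start <= m.timestamp < end
        ]
        # Sort by sample time so replay mimics the normal time-ordered flow,
        # even if the messages were originally stored out of order.
        selected.sort(key=lambda m: m.timestamp)
        topic = f"telemetry/{device_id}/replay"
        for m in selected:
            publish(topic, m)
        return len(selected)


archive = RawArchive()
for ts in (120.0, 60.0, 90.0):  # arrival order differs from sample order
    archive.store(RawMessage("dev1", ts, {"kwh": 1.0}))

received = []
count = archive.replay(
    "dev1", 0.0, 100.0,
    lambda topic, msg: received.append((topic, msg.timestamp)),
)
```

Publishing replayed messages to a separate replay topic lets consumers tell historical samples apart from live ones while still processing them in sequence.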
When implementing this design, consider the following questions:
Simply put, yes. Most solutions should assume the answer to this question is “yes”, since the replay of raw unprocessed sensor data enables the IoT solution to support the evolution of the solution's insights over time.
An example of this consideration is below.
A physical site is being monitored for electrical energy (kWh) used. Energy sensors are sampled every 30 seconds and the samples are reported into the solution once a minute. As the raw, unprocessed messages arrive they are stored and a “15-minute average” process automatically calculates the 15-minute average of the monitored energy used. The calculated results are stored as records in the solution’s processed record repository. These new processed records are then used by additional analytic processes and the IoT solution’s user interface.
At a later date, it becomes clear that the solution’s users also want processed records that display the maximum energy used every 5 minutes. To deliver this benefit, a new “5-minute maximum” process is implemented. This new process replays the historic raw unprocessed records and calculates the 5-minute maximum over every historic interval. Once complete, each calculated result is stored as a new type of processed record.
Without the retention of the raw samples, the solution would be limited to performing calculations only on data that arrives after the “5-minute maximum” feature is implemented. Most importantly, without the original raw samples, users would be unable to apply the 5-minute maximum to any period before the feature was implemented.
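The example above can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: `aggregate_windows` is a hypothetical helper, the sample values are invented, and the raw samples are assumed to be available as `(timestamp, value)` pairs read back from the archive.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Iterable, List, Tuple

Sample = Tuple[int, float]  # (epoch seconds, energy reading in kWh)


def aggregate_windows(
    samples: Iterable[Sample],
    window_seconds: int,
    reducer: Callable[[List[float]], float],
) -> List[Tuple[int, float]]:
    """Group samples into fixed windows and reduce each window to one value."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in samples:
        # Align each sample to the start of the window it falls in.
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    return [(start, reducer(values)) for start, values in sorted(buckets.items())]


# Hypothetical archived raw samples: one reading every 30 seconds for 15 minutes.
raw = [(t, float(i)) for i, t in enumerate(range(0, 900, 30))]

# The original process: one average per 15-minute window.
fifteen_min_avg = aggregate_windows(raw, 15 * 60, mean)

# The later feature: replay the same raw samples with a different window and reducer.
five_min_max = aggregate_windows(raw, 5 * 60, max)
```

Because both results are derived from the same archived raw samples, the new 5-minute maximum covers the full history rather than only the data that arrives after the feature ships.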