This post shows a way to find out how many IoT (Edge) devices have been provisioned by a specific enrolment group within the last x minutes.
The solution could be much simpler if I just wanted to know how many devices are registering themselves. In that case the built-in metrics are enough to get that information.
IoT Hub Metrics
The use case required a more sophisticated solution that is able to reflect the tenants, identified by tags.
Solution Architecture
Device Provisioning Service
Different Enrolment Groups separate devices in this scenario into tenants. To be able to identify the customers, an initial tag CustomerId is added to the enrolment group. It is then applied to the devices that are created by DPS in the IoT Hub.
The metrics from DPS did not allow me to distinguish the tags/customers. But IoT Hub will make them available and offers events for newly created devices.
IoT Hub
Within IoT Hub I created an event subscription that passes all necessary events on to an Event Hub.
Event Subscription in IoT Hub
The event will include the device twin, which has been prepopulated with the tags specified in the enrolment group.
Device Twin in IoT Hub
As seen in the architecture diagram, Event Grid has been connected to an Event Hub. #plugandplay 😉 It will fire an event with the documented schema: Azure IoT Hub and Event Grid | Microsoft Docs
Event Hub
Why the additional Event Hub? Event Grid cannot be used as input for an Azure Stream Analytics Job and Event Hub is the universal connector in this case.
You can use the smallest tier (which is Basic), as there are not many events flowing through it. The default of 2 partitions is also fine.
As you can see, the query is pretty simple and can be adjusted easily.
Azure Stream Analytics Job Query
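In text form, the query could look roughly like the following. This is a minimal sketch, not necessarily identical to the pictured query: the input and output aliases ([registrations-input], [registrations-output]), the path to the tag and the window size are assumptions that depend on how your Event Grid subscription delivers the DeviceCreated events to the Event Hub.
SELECT
    data.twin.tags.CustomerId AS customerid,
    COUNT(*) AS devicecount,
    System.Timestamp() AS windowend
INTO
    [registrations-output]
FROM
    [registrations-input]
WHERE
    eventType = 'Microsoft.Devices.DeviceCreated'
GROUP BY
    data.twin.tags.CustomerId,
    TumblingWindow(minute, 15)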
The example uses a blob storage as output, but you can choose to write to an Azure Function or do whatever you want with the knowledge that one customer has onboarded lots of devices in a short period of time.
In this post I would like to show some tweaks you can (and might need to) apply to influence the behavior of your IoT Edge device, when it comes to message retention on devices that are limited in resources.
The setup of this scenario is not uncommon: one module retrieves telemetry from machines, another module parses it, and the messages are sent to an IoT Hub.
The problem
After a while the device stops sending data and is not accessible via SSH anymore. The logs reveal lots of messages still in the queue.
Lots of messages in the queue in edgeHub logs
But why? And how can I find out what causes the problem? Spoiler: Disk full 🙁
Troubleshoot
Looking at logfiles helps a lot – if you have access to the logfiles. Fortunately IoT Edge can expose data in the Prometheus exposition format for the edgeHub and edgeAgent. These endpoints are enabled by default for IoT Edge 1.0.10 (upgrade to this version if you haven’t) and can be enabled for 1.0.9.
The data can then be uploaded to Log Analytics with a sample metrics-collector module for further analysis and to create alerts.
For analysis and to display the metrics, you can use a Workbook in Azure Monitor.
Azure Monitor Workbook with edgeHub log extract
In this particular case I could see that the available disk space was going down, down, down until the whole device did not respond anymore (no SSH access possible, no data sent to Azure).
What to change?
Adding more space to the disk was not an option, so another solution was needed. There are two options I looked at and adjusted to be a better fit for the usage scenario and resource limitation.
After tweaking the settings, the device now cleans up data before the disk runs full, as the following graph shows.
I cannot give you values for your particular setup. You'll need to figure them out depending on the amount of messages going through the Edge device and the hardware sizing. Here are some pointers to settings you might want to investigate if you hit a similar problem on your devices.
The above image shows settings for RocksDB (orange: 512 MB, blue: 128 MB, green: 256 MB). With the default setting the device runs out of disk space.
What can I do to prevent the device crashing?
Well, it depends 🙂 For a known scenario you can find a setting from the above that will prevent a full disk. But what if you don't know which modules are deployed with which settings?
In this case an alarm for low disk space is an option. It then needs to trigger a function that calls a method on the device to restart the edgeHub. This will clear the cache.
In this post I want to show how to use, in Stream Analytics, the properties that are added to messages which IoT devices send to Azure IoT Hub. And while we are talking about properties, let's even use message enrichment 🙂
Stream Analytics Architecture
Sample Message
The green properties will be added by the Message enrichment feature of IoT Hub, as the data is most likely not known on the IoT device or does not need to be transferred with each message.
The code that sends the message with the alert property has been adjusted to this:
string dataBuffer = $"{{\"messageId\":{count},\"temperature\":{_temperature},\"humidity\":{_humidity}}}";
using (var eventMessage = new Message(Encoding.UTF8.GetBytes(dataBuffer)))
{
    eventMessage.Properties.Add("temperatureAlert", (_temperature > TemperatureThreshold) ? "true" : "false");
    // send the message including the application property (deviceClient is the DeviceClient instance created earlier; name assumed)
    await deviceClient.SendEventAsync(eventMessage);
}
Configure IoT Hub
Device Twin
In most cases the IoT (Edge) device does not know which customer it is associated with, as it does not need to know. For further processing of the data – or for device management – this information is relevant. Therefore we add this information to the device twin in Azure IoT Hub.
The property names do not need to match the desired properties that will be added via message enrichment. You can choose a structure that fits best.
Message Enrichment
We want to add the customer number and id from the device twin to the message before it is passed along to an endpoint.
Message Enrichment settings in IoT Hub
As you can see the name of the property that is added does not need to match the name of the twin properties. Make sure you add the message enrichment to the right endpoint(s). You can decide to add different properties to messages that are routed to different endpoints.
Azure Stream Analytics
In the Stream Analytics job we use a SQL-like query to filter the incoming message stream and route the messages to endpoints. The query will work fine as long as you only use the columns that are in the body of the messages (like "temperature" or "humidity" in this example).
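A plain body-only query could, for example, look like the following sketch (the output alias [telemetry-output] and the threshold are made up for illustration):
SELECT
    messageId,
    temperature,
    humidity
INTO
    [telemetry-output]
FROM
    [IoTHub-Messaging]
WHERE
    temperature > 30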
To be able to use the values in the properties, we need to use the GetMetadataPropertyValue function. Please take note of the sentence on the docs page: "This function cannot be tested on the Azure portal using sample data."
Query
SELECT
    GetMetadataPropertyValue([IoTHub-Messaging], '[User].[temperatureAlert]') AS temperaturealert,
    GetMetadataPropertyValue([IoTHub-Messaging], '[User].[CustomerName]') AS customername,
    GetMetadataPropertyValue([IoTHub-Messaging], '[User].[CustomerId]') AS customerid,
    *
FROM
    [IoTHub-Messaging]
The first three columns are our property and message enrichment columns while the other columns are all added as well.
Output
Let’s assume we want to add all messages to a storage account where the customer id is part of the path.
Stream Analytics Blob storage output
This will work, as we added the customerid column in the query and it can be used for the path. Remember this is a demo and we only use the customerid as part of the path.
In the architecture diagram at the beginning of the post an Alert route is drawn. You can achieve this by adding a second query to the job which routes certain messages to that output.
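A sketch of such a second query could look like this, assuming the alert output is called [alert-output]; it routes only the messages the device flagged with temperatureAlert set to "true":
SELECT
    *
INTO
    [alert-output]
FROM
    [IoTHub-Messaging]
WHERE
    GetMetadataPropertyValue([IoTHub-Messaging], '[User].[temperatureAlert]') = 'true'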
With Azure IoT Edge you can deploy modules (also known as Docker containers) to a server. I've created a sample solution on GitHub that deploys a module which monitors the temperature of the hard disk the server is running on.
I thought this had to be an easy task. Well, actually it is. If you find the right documentation and read it in the correct order 🙂
Basically I wanted to be able to login with my AAD (Azure Active Directory) user.
In the first step, the database needs to be configured for Azure Active Directory in order to add users in the second step.
Configure an Administrator
In the Azure portal go to the SQL server and search for "active directory" to add an Active Directory admin.
After you've added an admin and saved the value, you will be able to use SSMS (SQL Server Management Studio) to log on to the server. SSMS will probably prompt you to add a firewall exception.
Use SQL Management Studio to add users and grant permissions
For other users (not the administrator we configured above) to be able to log on, access has to be granted just like with an on-premises SQL Server.
Add a user to the master DB
Create a new query on the master database and execute:
CREATE USER [rene.hezser@something.com] FROM EXTERNAL PROVIDER;
Next grant permissions to the user on the database itself.
Add user to database
Open another query on the database.
CREATE USER [rene.hezser@something.com] FROM EXTERNAL PROVIDER;
ALTER ROLE [db_owner] ADD MEMBER [rene.hezser@something.com];
Set-AzureWebsite : No default subscription has been designated. Use Select-AzureSubscription -Default <subscriptionName> to set the default subscription.
*doh* Again I've used PowerShell cmdlets for Azure classic instead of Resource Manager 🙁
Reminder: Always check for the magic “Rm” chars in the command, if a resource cannot be found.
Don't forget: the Azure Meetup on Build, Test and Deployment with Azure takes place tomorrow in Bielefeld.
Meetup #2 – Build, Test and Deployment with Azure
Wednesday, Oct 11, 2017, 7:00 PM
Arvato Bielefeld / Sennestadt Fuggerstraße 11 Bielefeld, DE
Dear Azure OWL community, on [masked] our second Azure OWL Meetup will take place. This time it will mainly be about build, test and deployment on the Azure platform. Tyler from Microsoft will give a talk on build, test and deployment, focusing in particular on delivery pipelines with Docker, Kubernetes and Visual Studio Team…
A Runbook schedule can only be triggered once per hour at most. If you need a smaller interval, like every minute, you can use the Azure Scheduler to do so.
So I went to the Azure portal, created an Azure Scheduler instance (with a job collection tier of at least Basic, to be able to create schedules that are triggered every minute) and called a Runbook via webhook.
The Runbook contains a cmdlet that results in an error 🙁
Get-AzureRmMetric : The term 'Get-AzureRmMetric' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
Azure cmdlets can be made available through the Automation Account the Runbook is using. The "Browse Gallery" link will let you find and choose the necessary modules.
The error message above appears because a) the cmdlet was not installed and b) the referenced version of AzureRM.profile was too old. Fortunately the problem can be resolved easily by upgrading the Azure modules.
After all modules were up to date, I could add the desired module and my Runbook wasn't complaining anymore 🙂
To be able to connect to a secure Service Fabric Cluster via PowerShell, you need to import the specified certificate into your personal certificate store. Otherwise an Exception will be thrown. Unfortunately the Exception does not point in the right direction 🙁
So in case you get an Exception like this
Connect-ServiceFabricCluster : An error occurred during this operation. Please check the trace logs for more details.
At line:1 char:1
+ Connect-ServiceFabricCluster -ConnectionEndpoint xyz-sf-de …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Connect-ServiceFabricCluster], FabricException
    + FullyQualifiedErrorId : CreateClusterConnectionErrorId,Microsoft.ServiceFabric.Powershell.ConnectCluster
you need to import the certificate with its private key (*.pfx) into the personal certificate store of the PC you are running PowerShell on.
Specifying -Verbose for PowerShell will print additional information that does not help a lot.
PS C:\WINDOWS\system32> Connect-ServiceFabricCluster -ConnectionEndpoint xyz-sf-dev.northeurope.cloudapp.azure.com:19000 -X509Credential -FindType FindByThumbprint -FindValue xyz -StoreLocation CurrentUser -StoreName My -ServerCertThumbprint xyz -Verbose
VERBOSE: System.Fabric.FabricException: An error occurred during this operation. Please check the trace logs for more details. ---> System.Runtime.InteropServices.COMException: Exception from HRESULT: 0x80071C57
   at System.Fabric.Interop.NativeClient.IFabricClientSettings2.SetSecurityCredentials(IntPtr credentials)
   at System.Fabric.FabricClient.SetSecurityCredentialsInternal(SecurityCredentials credentials)
   at System.Fabric.Interop.Utility.<>c__DisplayClass25_0.<WrapNativeSyncInvoke>b__0()
   at System.Fabric.Interop.Utility.WrapNativeSyncInvoke[TResult](Func`1 func, String functionTag, String functionArgs)
   --- End of inner exception stack trace ---
   at System.Fabric.Interop.Utility.RunInMTA(Action action)
   at System.Fabric.FabricClient.InitializeFabricClient(SecurityCredentials credentialArg, FabricClientSettings newSettings, String[] hostEndpointsArg)
   at Microsoft.ServiceFabric.Powershell.ClusterConnection.FabricClientBuilder.Build()
   at Microsoft.ServiceFabric.Powershell.ClusterConnection..ctor(FabricClientBuilder fabricClientBuilder, Boolean getMetadata)
   at Microsoft.ServiceFabric.Powershell.ConnectCluster.ProcessRecord()
Connect-ServiceFabricCluster : An error occurred during this operation. Please check the trace logs for more details.
At line:1 char:1
+ Connect-ServiceFabricCluster -ConnectionEndpoint xyz-sf-de …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Connect-ServiceFabricCluster], FabricException
    + FullyQualifiedErrorId : CreateClusterConnectionErrorId,Microsoft.ServiceFabric.Powershell.ConnectCluster
The first meeting of the Azure Meetup OWL will take place on July 13th. We are currently still looking for a location and for topics 🙂
It will probably be about chatbots and machine learning. But we will also take your feedback into account to find topics for the next meetings.
If you would like to join, please sign up via the Meetup page.
Update:
Location: Arvato Systems, An der Autobahn 100, Gütersloh, Tower I
Time: 19:00 – 21:00
Please register by name, as visitors have to be announced at the reception desk. Either via Meetup, or send me an e-mail.