
News

Posted over 4 years ago
Qt Automotive Suite 5.12.4 was released today.
Posted over 4 years ago
I am pleased to announce that Qt 5.13.1 is released today. As a patch release, Qt 5.13.1 does not add any new functionality but provides many bug fixes and other improvements.
Posted over 4 years ago
We are happy to announce the release of Qt Creator 4.10.0!
Posted over 4 years ago
This is a follow-up to my previous posts about using Qt MQTT to connect to the cloud. MQTT is a prominent standard for telemetry, especially in IoT scenarios. We are often approached by Qt customers and users asking how to connect to a variety of cloud providers while keeping the list of requirements short. With this post I would like to provide more information on how to create a connection using just Qt, without any third-party dependency. For this comparison we have chosen the following cloud providers:

- Amazon IoT Core
- Microsoft Azure IoT Hub
- Google Cloud IoT Core
- Alibaba Cloud IoT Platform

The ultimate summary can be viewed in this table. The source code to locally test the results is available here. However, if you are interested in this topic, I recommend preparing a pitcher of coffee and reading on. And if you want to jump to a specific topic, use these shortcuts: Preface, Getting connected, Standard deviations (limitations), Available (custom) topics, Communication routes, Other / references, Additional notes, How can I test this myself?, and Closing words.

Preface / Setting expectations

Before getting into the details I would like to emphasize a few points. First, the focus is on getting devices connected to the cloud: being able to send and receive messages is the prime target. This post will not talk about the services, features, or costs of the cloud providers themselves once messages are in the cloud. Furthermore, the idea is to use only Qt and/or Qt MQTT to establish a connection. Most, if not all, vendors provide SDKs for devices or for monitoring (web and native) applications. However, using these SDKs adds dependencies, which raises the storage and memory requirements. The order in which the providers are evaluated in this article is based on public usage according to this article.

Getting connected

The very first steps for sending messages are to create a solution for each vendor and then establish a TCP connection.

Amazon IoT Core

We assume that you have created an AWS account and an IoT Core service from your AWS console in the browser. In the dashboard of this service, the Create button opens a wizard that helps set up the first device. The only required information is the name of the device; all other items can be left empty. The service can automatically create a certificate to be used for the connection later. Store the certificates (including the root CA) and keep them available for use in an application. For now, no policy is required; we will get to policies at a later stage. The last missing piece before implementing an example is the hostname to connect to. AWS provides a list of endpoints here. Please note that for MQTT you must use the account-specific prefix; you can also find this information on the settings page of the AWS IoT dashboard.

Using Qt MQTT, a connection is then established with just a few lines, as in the sketch after the list below. A couple of details are important for a successful connection:

- The keepalive value needs to be within a certain threshold; 10 seconds seems to be a good value.
- Port 8883 is the standardized port for encrypted MQTT connections.
- The ClientID must be basicPubSub. This is a valid ID auto-generated during the creation of the IoT Core instance.
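A minimal sketch of such a connection, assuming a Qt MQTT version in which QMqttClient::connectToHostEncrypted() accepts a QSslConfiguration (Qt 5.14 and later); the endpoint and the certificate file names are placeholders:

```cpp
#include <QtMqtt/QMqttClient>
#include <QFile>
#include <QSslCertificate>
#include <QSslConfiguration>
#include <QSslKey>

void connectToAwsIotCore(QMqttClient &client)
{
    // Account-specific endpoint from the AWS endpoint list / settings page.
    client.setHostname(QStringLiteral("<account-prefix>-ats.iot.<region>.amazonaws.com"));
    client.setPort(8883);                              // standard port for encrypted MQTT
    client.setClientId(QStringLiteral("basicPubSub")); // valid auto-generated client ID
    client.setKeepAlive(10);                           // stay within AWS' keepalive threshold

    // TLS setup: root CA plus the device certificate and key stored earlier.
    QSslConfiguration ssl = QSslConfiguration::defaultConfiguration();
    ssl.setCaCertificates(QSslCertificate::fromPath(QStringLiteral("root_ca.pem")));
    const auto localCerts = QSslCertificate::fromPath(QStringLiteral("device.crt"));
    if (!localCerts.isEmpty())
        ssl.setLocalCertificate(localCerts.first());
    QFile keyFile(QStringLiteral("device.key"));
    if (keyFile.open(QIODevice::ReadOnly))
        ssl.setPrivateKey(QSslKey(keyFile.readAll(), QSsl::Rsa));

    client.connectToHostEncrypted(ssl);
}
```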
Microsoft Azure IoT Hub

First, an account for the Azure Portal needs to be created. From the dashboard you then create a new IoT Hub resource. The dashboard can be overwhelming initially, as Microsoft puts many cloud services and features in the foreground. As the focus is on getting a first device connected, the simplest way is to go to Shared access policies and create a new access policy with all rights enabled. (This is highly discouraged in a production environment for security reasons.) Selecting the freshly created policy, we can copy the connection string.

Next, we will use the Azure Device Explorer application, which can be downloaded here; it suits testing purposes perfectly. After launching it, enter the connection string from above into the connection text edit and click Update. The Management tab allows creating new test devices, specifying authentication via either X509 or security keys. Security keys are the preselected standard method, which is what we aim at as well. Lastly, the Device Explorer lets us create a SAS token, which will be needed to configure the MQTT client. A token has the following shape: HostName=... We only need the SharedAccessSignature sr=... part for authentication.

The Azure IoT Hub uses TLS for the connection as well. To obtain the root CA, you can clone the Azure IoT C SDK located here or obtain the DigiCert Baltimore Root Certificate manually; neither the web interface nor the Device Explorer provides it. To establish a connection from a Qt application using Qt MQTT, the code looks like the sketch below.
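A minimal sketch, under the same Qt version assumption as above; the hub name, device ID, and SAS token are placeholders, and the exact api-version suffix in the user name is an assumption that should be checked against the current Azure documentation:

```cpp
#include <QtMqtt/QMqttClient>
#include <QSslCertificate>
#include <QSslConfiguration>

void connectToAzureIotHub(QMqttClient &client)
{
    const QString iotHubName = QStringLiteral("<your-hub>");
    const QString deviceId = QStringLiteral("<your-device>");
    const QString sasToken = QStringLiteral("SharedAccessSignature sr=..."); // from Device Explorer

    client.setHostname(iotHubName + QStringLiteral(".azure-devices.net"));
    client.setPort(8883);
    client.setClientId(deviceId);
    // Azure expects "<hub hostname>/<device id>/?api-version=..." as the user
    // name and the SAS token as the password.
    client.setUsername(iotHubName + QStringLiteral(".azure-devices.net/") + deviceId
                       + QStringLiteral("/?api-version=2018-06-30"));
    client.setPassword(sasToken);

    // TLS against the manually obtained DigiCert Baltimore root certificate.
    QSslConfiguration ssl = QSslConfiguration::defaultConfiguration();
    ssl.setCaCertificates(QSslCertificate::fromPath(QStringLiteral("digicert_baltimore_root.pem")));
    client.connectToHostEncrypted(ssl);
}
```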
Google Cloud IoT Core

Once you have created an account for the Google Cloud Platform, the web interface provides a wizard to get your first project running with Cloud IoT Core. Once the project has been created, it might be hard to find your registry; a registry stores all information on devices, communication, rules, and so on. Similar to Microsoft Azure, all available services are placed on the dashboard; you will find the IoT Core item in the Big Data section on the left side. After using the Google Cloud Platform for a while, you will find the search box very useful for getting to your target page.

From the registry itself you can now add new devices. The interface asks you to provide the keys/certificates for your device, but it has no means to create them from the service itself. Documentation exists on how to create these, and at the production stage those steps will probably be automated in a different manner. For getting started, however, these are additional required steps, which can become a hurdle.

Once your device is entered into the registry, you can start with the client-side implementation. Contrary to other providers, Google Cloud IoT Core does not use the device certificate while creating a connection. Instead, the private key is used to create the password, which needs to be generated as a JSON Web Token. While JSON Web Tokens are an open industry standard, this adds another dependency to your project: something needs to be able to create these tokens. Google provides some sample code here, but adaptations are required to include it in an application. The client ID for the MQTT connection is constructed from multiple parameters and has the following form: projects/PROJECT_ID/locations/REGION/registries/REGISTRY_ID/devices/DEVICE_ID. From personal experience, be aware of case sensitivity: everything but the project ID keeps the capitalization used when you created your project, registry and device, whereas the project ID is stored in all lower-case. Having considered all of this, the simplest implementation to establish a connection looks like the sketch below.
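A hedged sketch of that connection, reusing the file names and host from the original snippet; CreateJwt stands in for the JWT helper adapted from Google's sample code, and its exact signature here is an assumption:

```cpp
#include <QtMqtt/QMqttClient>
#include <QSslCertificate>
#include <QSslConfiguration>

// Assumed helper adapted from Google's sample code: signs a JWT for the
// given project with the device's private RSA key.
extern QByteArray CreateJwt(const QString &privateKeyPath, const QString &projectId);

void connectToGoogleCloudIotCore(QMqttClient &client)
{
    const QString rootCAPath = QStringLiteral("root_ca.pem");
    const QString deviceKeyPath = QStringLiteral("rsa_private.pem");
    const QString clientId = QStringLiteral(
        "projects/PROJECT_ID/locations/REGION/registries/REGISTRY_ID/devices/DEVICE_ID");
    const QString googleiotHostName = QStringLiteral("mqtt.googleapis.com");

    client.setHostname(googleiotHostName);
    client.setPort(8883);
    client.setClientId(clientId);
    // The user name is ignored by Google; the JWT acts as the password.
    client.setUsername(QStringLiteral("unused"));
    client.setPassword(QString::fromUtf8(CreateJwt(deviceKeyPath, QStringLiteral("PROJECT_ID"))));

    QSslConfiguration ssl = QSslConfiguration::defaultConfiguration();
    ssl.setCaCertificates(QSslCertificate::fromPath(rootCAPath));
    client.connectToHostEncrypted(ssl);
}
```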
Alibaba Cloud IoT Platform

The Alibaba Cloud IoT Platform is the only product that comes in multiple variants, a basic and a pro version. As of the writing of this article this product structure seems to have changed, but from what we can tell it does not influence the MQTT-related items investigated here. After creating an account for the Alibaba Cloud, the web dashboard allows you to create a new IoT Platform instance. Following the instantiation, a wizard interface lets you create a product and a device. From these we need a couple of details to establish an MQTT connection: the Product Key, the Product Secret, the Device Name and the Device Secret.

The implementation requires a couple of additional steps. To acquire all MQTT-specific properties, the client ID, username and password are created by concatenation and signing. This procedure is fully documented here; for convenience, the documentation also includes example source code to handle it. If the concern is not to introduce external code, the instructions in the first link have to be followed. Note that, unlike for all the other providers, we will not use QMqttClient::connectToHostEncrypted(): the Alibaba Cloud IoT Platform is the only vendor that uses a non-TLS connection by default. It is documented that a TLS connection is possible and that a root CA can be obtained, but the fact that a plain connection is possible at all is surprising. Connecting a QMqttClient instance then looks like the sketch below.
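A hedged sketch of that connection: iotx_dev_meta_info_t and qstrcpy() appear in the original snippet and come from Alibaba's published example code, while the IOT_Sign_MQTT() helper, its output struct, and the region constant are assumptions based on that example code and should be checked against the current SDK documentation:

```cpp
#include <QtMqtt/QMqttClient>
#include <QtGlobal>

// Types and the signing helper come from Alibaba's published example code
// (dev_sign_mqtt); exact names and fields are assumptions in this sketch.
#include "dev_sign_api.h"

void connectToAlibabaIotPlatform(QMqttClient &client)
{
    iotx_dev_meta_info_t deviceInfo;
    qstrcpy(deviceInfo.product_key, "<product key>");
    qstrcpy(deviceInfo.product_secret, "<product secret>");
    qstrcpy(deviceInfo.device_name, "<device name>");
    qstrcpy(deviceInfo.device_secret, "<device secret>");

    // Concatenate and sign the device details into hostname, client ID,
    // username and password, as documented by Alibaba.
    iotx_sign_mqtt_t signInfo;
    IOT_Sign_MQTT(IOTX_CLOUD_REGION_SHANGHAI, &deviceInfo, &signInfo);

    client.setHostname(QString::fromLatin1(signInfo.hostname));
    client.setPort(signInfo.port);
    client.setClientId(QString::fromLatin1(signInfo.clientid));
    client.setUsername(QString::fromLatin1(signInfo.username));
    client.setPassword(QString::fromLatin1(signInfo.password));

    client.connectToHost(); // note: plain TCP; TLS is not the default here
}
```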
Standard deviations (limitations)

So far, we have established an MQTT connection to each of the IoT vendors. Each uses a slightly different approach to identify and authenticate a device, but all of these services follow the MQTT 3.1.1 standard. However, for the next steps developers need to be aware of certain limitations or variations from the standard. These are discussed next.

None of the providers has built-in support for quality-of-service (QoS) level 2. To some extent that makes sense, as telemetry information does not require multiple steps to verify message delivery; whether a message is processed and validated is not of interest in this scenario. A developer should be aware of this limitation though. To refresh our memory on terminology, let us briefly recap retained and will messages: retained messages are stored on the server side for future subscribers to receive the last information available on a topic; will messages are embedded in the connection request and are only propagated in case of an unexpected disconnect of the client.

Amazon IoT Core

The client ID is used to identify a device. If a second device uses the same ID during a connection attempt, the first device is disconnected without any notice, and the second device connects successfully. If your application code contains some sort of automatic reconnect, this can cause all devices with the same client ID to become unavailable. Retained messages are not supported by AWS, and trying to send a retained message causes the connection to be closed. AWS IoT Core supports will messages within the allowed topics. A full description of standard deviations can be viewed here.

Microsoft Azure IoT Hub

The client ID is used to identify a device. The behavior of two devices with the same ID is the same as for Amazon IoT Core. Retained messages are not supported on the IoT Hub; however, the documentation states that the Hub will internally append a flag to let the backend know that the message was intended as retained. Will messages are allowed and supported, given the topic restrictions discussed below. A full description of standard deviations can be viewed here.

Google Cloud IoT Core

This provider uses the client ID and the password to identify a device. Messages flagged as retained seem to lose this flag during delivery; according to the debug logs they are forwarded as regular messages. We have not found any documentation about whether it might behave similarly to the Azure IoT Hub, which forwards this request to its internal message queue. Will messages do not seem to be supported: while it is possible to store a will message in the connect statement, it gets ignored in case of an irregular disconnect.

Alibaba Cloud IoT Platform

The triplet of client ID, username and password is used to identify a device within a product. Both the retain flag and will messages are ignored on the server side. A message with retain specified is forwarded as a regular message and lost after delivery; will messages are not stored anywhere during a connection.

Available (custom) topics

MQTT uses a topic hierarchy to create a fine-grained context for messages. Topics are similar to a directory structure, going from generic to device-specific, for example: Sensors/Europe/Germany/Berlin/device_xyz/temperature. Each IoT provider handles topics differently, so developers need to be very careful in this area.

Amazon IoT Core

First, one needs to check which topics can be used by default. From the dashboard, browse to Secure -> Policies and select the default created policy. AWS IoT Core specifies policies in JSON format, and you will find some of the previous details specified in this document; for instance, the available client IDs are specified in the Connect resource. A policy also declares which topics are valid for publishing, subscribing and receiving. It is possible to have multiple policies in place, and devices need to have a policy attached. This allows for a fine-grained security model in which different device groups have different access rights. Note that the topic description also allows wildcards, which should not be confused with the wildcards in the MQTT standard: you must use * instead of # to enable all subtopics. Once you have created a topic hierarchy based on your needs, the code itself is simple:

client.publish(QStringLiteral("topic_1"), "{\"message\":\"Somecontent\"}", 1);
client.subscribe(QStringLiteral("topic_1"), 1);

Microsoft Azure IoT Hub

The IoT Hub merely acts as an interface to connect existing MQTT solutions to the Hub. A user is not allowed to specify any custom topic, nor is it possible to introduce a topic hierarchy. A message can only be published in the following shape:

const QString topic = QStringLiteral("devices/") + deviceId + QStringLiteral("/messages/events/");
client.publish(topic, "{id=123}", 1);

Similar limitations exist for subscriptions:

client.subscribe(QStringLiteral("devices/") + deviceId + QStringLiteral("/messages/devicebound/#"), 1);

The wildcard in the subscription is used for additional information that the IoT Hub might add to a message, for instance a message ID. To combine multiple properties, the subtopic itself is URL-encoded. An example message sent from the IoT Hub has this topic included: devices/TestDevice01/messages/devicebound/%24.mid=7493c5cc-d783-4ecd-8129-d3c87590b544&%24.to=%2Fdevices%2FTestDevice01%2Fmessages%2FdeviceBound&iothub-ack=full

Google Cloud IoT Core

By default, an MQTT client publishes to /devices/DEVICE_ID/events. It is also possible to add additional topics using the Google Cloud Shell or other APIs; in this case a topic customCross has been created. Those additional topics are reflected as subtopics on the MQTT side, though, meaning that to publish a message to this topic, it would be /devices/DEVICE_ID/events/customCross. For subscriptions, custom topics are not available; there are only two topics a client can subscribe to, /devices/DEVICE_ID/config and /devices/DEVICE_ID/commands/#. Config messages are retained messages from the cloud; they are sent every time a client connects, to keep the device in sync.

Alibaba Cloud IoT Platform

Topics can easily be managed in the Topic Categories tab of the product dashboard. Each topic can be configured for receiving, sending, or bidirectional communication. Furthermore, a couple of additional topics are generated by default to help create a scalable structure. Note that a topic always contains the device ID; this has implications for the communication routes discussed below.

Communication routes

Communication in the IoT context can be split into three categories:

- Device to Cloud (D2C)
- Cloud to Device (C2D)
- Device to Device (D2D)

The first category is the most common one: devices provide information about their state, sensor data, or any other kind of information. Talking in the other direction happens when providing behavior instructions, managing debug levels, or sending any generic instruction. Regarding device-to-device communication, we need to be a bit more verbose about the definition in this context. A typical example comes from home automation: given a certain light intensity, a sensor propagates the information and the blinds automatically react by going down (something which never seems to work properly in office spaces). Here, all logic is handled on the devices and no cloud intelligence is needed; no additional rules or filters need to be created in the cloud instance itself. Surely, all tested providers can run a method in the cloud that forwards a command to another device, but that process is not part of this investigation.

Amazon IoT Core

The previous section already covered the D2C and C2D cases: once a topic hierarchy has been specified, a client can publish to these topics and also subscribe to them. To verify that the C2D connection works, select the Test tab on the left side of the dashboard; the browser shows a minimal interface which allows sending a message on a specified topic. The device-to-device case is also handled nicely by subscribing and publishing to a topic as specified in the policy, as in the sketch below.
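A hedged sketch of the D2D case: one device reacting to another's messages over a shared, policy-allowed topic. The topic name, payload format, and threshold are illustrative:

```cpp
#include <QtMqtt/QMqttClient>
#include <QtMqtt/QMqttSubscription>
#include <QObject>

// Call once the client is connected: subscribe to a shared topic and react
// to whatever another device publishes there.
void setupDeviceToDevice(QMqttClient &client)
{
    QMqttSubscription *sub = client.subscribe(QStringLiteral("sensors/light"), 1);
    QObject::connect(sub, &QMqttSubscription::messageReceived,
                     [](const QMqttMessage &msg) {
        // e.g. lower the blinds when the reported light intensity is high
        if (msg.payload().toDouble() > 10000.0) {
            // trigger the actuator here
        }
    });
}

// Elsewhere, the sensor device publishes its reading to the same topic:
// client.publish(QStringLiteral("sensors/light"), QByteArray::number(12345.0), 1);
```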
Microsoft Azure IoT Hub

It is possible to send messages from a device to the cloud and vice-versa; however, a user is not free to choose a topic. For sending, the Device Explorer is a good utility, especially for testing the property-bag feature. Device-to-device communication as defined above is not possible with the Azure IoT Hub. During the creation of this post, this article popped up. It covers this exact use case using the Azure SDKs instead of plain MQTT; the approach there is to put the Service SDK on the recipient device, so for bidirectional communication this would be needed on all devices, with the advantage of not routing through any server.

Google Cloud IoT Core

Sending messages from a device to the cloud is possible, with further granularity available through subtopics for publication. Messages are received on the two available topics discussed in the section above. As the custom topics still include the device ID, it is not possible to use a Google Cloud IoT Core instance as a standard broker to propagate messages between devices (D2D). The dashboard for a device allows sending a command, as well as a configuration, from the cloud interface to the device itself.

Alibaba Cloud IoT Platform

Publishing and subscribing can be done in a flexible manner using the IoT Platform, and (sub-)topics can be created to provide more structure. To test sending a message from the cloud to a device, the Topic List in the device dashboard includes a dialog. Device-to-device communication is also possible. Topics for D2D cannot be freely specified: they must reside exactly one level below /broadcast/, but the topic on that sub-level can be chosen freely.

Other / references

Amazon IoT Core: MQTT specifics for AWS, available default topics, and a design guide for topic hierarchies (also a very good reference for non-AWS-related designs).
Microsoft Azure IoT Hub: MQTT specifics for Azure.
Google Cloud IoT Core: MQTT specifics for Cloud IoT Core, key generation for devices, and commands.
Alibaba Cloud IoT Platform: MQTT specifics for the Cloud IoT Platform, and connection guidance.

Additional notes

MQTT version 5 seems to be too young for the biggest providers to have adopted it. This is very unfortunate, given that the latest standard adds a couple of features specifically useful in the IoT world: shared subscriptions would allow automatic balancing of tasks, the new authentication command allows higher flexibility when registering devices, and connection and message properties enable cloud connectivity that is more performant and easier to restrict and configure. But at this point in time, we will have to wait for its adoption. Again, I want to emphasize that we have not looked into any of the features the above IoT solutions provide for handling messages once they are received. That is part of a completely different study, and we would be very interested in hearing about your results in that field.

Additionally, we have not included RPC utilization of the providers. Some have hard-coded topics to handle RPC, like Google differentiating between commands and configuration; Alibaba even uses default topics to handle firmware update notifications via MQTT. TrendMicro has released a study on security-related concerns in MQTT, in which RPC has a prominent spot; it is a must-read for anyone setting up an MQTT architecture from scratch.

How can I test this myself?

I have created a sample application which allows connecting to any of the above cloud vendors once the required details are available.
The interface itself is rather simple. You can find the source code here on GitHub.

Closing words

For any of the big IoT and cloud providers it is possible to connect a telemetry-based application using MQTT (and Qt MQTT). Each has different variations on connection details, and on the extent to which the standard is fully available to developers. Personally, I look forward to the adoption of MQTT version 5. The AUTH command allows for better integration of authentication methods, and other features like topic aliases and properties bring further use cases to the IoT world. Additionally, shared subscriptions are beneficial for creating a data-worker relationship between devices. This last point, however, might step on the toes of cloud vendors, as their purpose is to handle that load inside the cloud.

I would like to close this post with questions to you: What is your experience with these cloud solutions? Is there anything in the list I might have missed? Should other vendors or companies be included as well? Looking forward to your feedback…

The post Cloud providers and telemetry via Qt MQTT appeared first on Qt Blog.
Posted over 4 years ago
The agenda is published and KDAB engineers are offering a wealth of technical talks this year.

Optimizing the Rendering of Qt Quick 2 applications, Giuseppe D'Angelo

If you have ever wondered how Qt Quick applications are rendered to the screen, and how to use that knowledge to find and fix performance problems, then this talk is for you. Qt Quick 2 applications are notably easy to write: with just a few lines of QML code we can create compelling, animated, fluid, 60FPS user interfaces. From time to time, however, we may face performance problems in our UI layer, especially in constrained hardware environments (mobile or embedded). In order to tackle these problems we need to understand the foundations of Qt Quick 2's rendering, and how to design our code to extract the maximum performance from the underlying hardware. And while it is true that premature optimization is the root of all evil, avoiding premature pessimization is also extremely important, especially in the early design stages of a complex application. In this talk we will introduce the main principles behind profiling and optimizing the rendering of a Qt Quick 2 application. We will start by discussing how Qt Quick 2 renders a scene, namely through its scene graph and via OpenGL calls. We will see how to gather information about how effectively Qt Quick is rendering a scene. Some of this information is available via "secret switches", while the rest requires external tools, which it is therefore necessary to master. Finally, we will discuss how to fix some of the most common performance problems: poor batching, excessive overdraw, fill-rate limitation, and poor texture memory utilization. Practical experience with QML applications is required, and some OpenGL knowledge is beneficial (but not necessary).

Testing your code for security issues with automated fuzzing, Albert Astals Cid

Writing secure code that deals with potentially untrusted data (parsers, importers, etc.) is always hard, since there are many potential cases to take into account. One of the techniques used to improve the security of such code is fuzzing, which involves providing invalid or random data to a given piece of code to test its behaviour. Modern fuzzers are smart enough to understand what needs to be changed in the input to make the code go through a different code path, making testing faster and more complete. oss-fuzz is a free set of tools to make fuzzing of C/C++ code easier. It is comprised of various scripts and docker images which, for example, have the base system libraries already compiled with the sanitizers. Coupling a fuzzer with the compiler sanitizers (asan, ubsan, msan) gives even better results, since these sanitizers will make sure the code is run more strictly. In this session we'll show how to fuzz a C++ codebase, as well as give you an update on how Qt is using these tools.

Model models: tools for making better behaved models, André Somers

Qt's QAbstractItemModel API is used in both widgets and QML applications, where it plays a central role whenever larger quantities of data need to be presented. For optimal performance and UI experience, and to be able to use the insert, remove, move and displaced transitions on the QML ListView, it is imperative that updates are signalled properly and in as finely-grained a way as possible. But how do we deal with back-ends that don't deliver enough data to do this? What if they just signal 'changed' or give us a whole new list without indicating what changed?
Instead of relying on repetitive, hard-to-read and error-prone code or, in the worst case, on a model reset, I will present a generic approach leading to a simple drop-in solution for data that arrives in bulk, resulting in proper, finely-grained updates to the model. A similar problem is providing all the needed signals in case the model needs to be sorted. While a QSortFilterProxyModel does a good job doing the actual sorting, it does not provide the signals required to animate items moving in a QML ListView when the value of items changes or when the sort role or column changes. To fix this, I will present a specialized proxy model that does enable this. Using these tools will help you make your models behave like "model" models.

QML Component Design: the two-way binding problem, André Somers

Did you ever create QML components that both display and manipulate state that is stored in some kind of back-end? How do you ensure a property binding set by the user of your component stays intact? What should happen if that back-end is slow, or rejects your changes? In this talk, we will explore this problem in some more detail and demonstrate a few possible solutions that deal with the challenges raised in different ways. The goal is that at the end of this talk, you will have the tools available to make UI components that behave predictably and are intuitive to use in your applications.

Improving your code using Clang Tooling, Kevin Funk

Clang is a C/C++ compiler frontend which parses C++ source code. Since the beginning of its development, the Clang ecosystem has had a strong focus on providing tooling around C++. Several tools exist which help with writing or debugging C++ or even Qt code. This talk will introduce you to several highly valuable tools backed by Clang: the Clang compiler frontend itself (for static analysis), the Clang static analyzer (for static analysis), clang-tidy (for static analysis, linting and refactoring of source code), clang-format (for enforcing a coding style on your code), the Clang sanitizers (for dynamic analysis) and, last but not least, Clazy (a compiler plugin for Clang with additional checks for Qt code). For each tool in this presentation, we'll give a brief introduction to its capabilities and then live-demonstrate its usage on a few code examples. We'll also demonstrate how the Clang tools can be used on projects using build systems other than CMake, by applying a few tricks. Clang is available on all major platforms, so these tools can be used freely on Windows, Linux or macOS (with some limitations of the Clang sanitizers on the Windows platform).

Git and Gerrit for working with and on Qt, Kevin Funk

A basic knowledge of Git is essential if you want to apply patches back to Qt or try out a not-yet-released version of Qt. In this talk we go through the most basic bits of modern software development with the version control system Git. Beginning with basic concepts, such as the initial setup and checking out code, we will show how to manage and commit changes as well as navigate through the Git history. Building on that knowledge, we'll demonstrate doing exactly the same using the Git integration inside the Qt Creator IDE, which provides similar functionality via a convenient graphical interface. After that, we will show how to get started with Gerrit, the code review system in place for the Qt ecosystem.
As part of this talk we'll discuss how to set up your Gerrit account, how to upload your SSH keys and how to configure your Git checkout to be ready to work with Gerrit. We'll make a small change on a Qt module checkout, verify that we did not break existing functionality, and then submit our change for review.

Qt 3D Node Editor and Shader Generator, Paul Lemire

More and more frameworks and tools provide a higher level of development through the use of node editors to create code, behaviors or entire applications. Since Qt 5.12, Qt provides support for loading a tree of nodes and converting it to generate OpenGL GLSL shader code for Qt 3D. This can be harnessed to create a single shader description that can then be translated into different languages. This talk will present that part of the framework, show how it is used, and discuss ideas for how it could be extended to target completely different types of features in the future.

What's new in KUESA and Qt 3D, Paul Lemire

Various things have been overhauled and improved in both KUESA and Qt 3D over the past year. This talk shows what has changed and defines where we'd like to go with both KUESA, the designer-developer workflow package, and the Qt 3D module.

Migrating from MFC to Qt, Nicolas Arnaud-Cormos

The Microsoft Foundation Class Library (MFC) is a legacy C++ object-oriented library on Windows that has existed since the dawn of Windows. It has lately been replaced by more up-to-date frameworks, but MFC legacy code is still widespread. While Qt and MFC are both over 25 years old, that's where the similarity ends: Qt is actively maintained by a vibrant developer community, upgraded regularly to embrace the latest programming improvements, and expanded to cover new devices and operating systems. Qt is therefore a natural candidate for replacing MFC. In this talk, we will discuss the strategy and steps for migrating an MFC legacy code base to Qt and how to map MFC code to Qt, as well as some of the traps to avoid.

Practical application scripting with QML, Kevin Krammer

In my talk "When all goes according to script" at Qt World Summit 2015, I explored the fundamentals of using the QML environment for in-application scripting. In this follow-up I am going to look into more practical aspects, especially how the powerful declarative approach in QML allows very easy handling of asynchronous scripting tasks. The format will be a series of live demos of Qt applications which either have been extended with such a scripting environment or have even been built specifically around this capability. Similar to the original talk, the goal is to show once again that QML is a versatile and powerful component for all Qt application development, above and beyond the already established use case of building advanced UIs.

Full-stack Tracing With LTTng, Milian Wolff

We all love and use C++ because of its performance. But how do we actually measure the performance of an application? How do we check whether an application is CPU- or I/O-bound? How do we see if our application is influenced by others running on the same system? There are many good and valid answers to these questions, and tracing is certainly a very valuable addition to everyone's toolset. It can offer in-depth insights into what a system is doing and why an application is performing in a given way. Done properly, we can use it to piece together multiple parts of the picture: How is our hardware utilized? What is the kernel doing? What is my application doing?
In this talk, we will give an introduction to LTTng, a tracing toolkit for Linux, and show how it can be applied on embedded Linux systems to answer the following question: what can I do to optimize the startup time of my application? We will talk about configuring Qt and LTTng properly. We will discuss the most useful kernel trace points and demonstrate the tracing subsystem in Qt for custom user-space trace points. And we will look at how to analyze the collected data in a way that doesn't make you want to pull your hair out. The contents of this talk stem from the experience of successfully optimizing automotive Qt applications on embedded Linux systems. The lessons learned apply to a much broader audience and can also be used with other tracing toolkits such as ETW on Windows.

QStringView — Past, Present, and Future, Marc Mutz

Since QStringView was added in Qt 5.10, not much has happened, and the presenter duly asks for forgiveness for having stepped away from QStringView development for two years. But he's back now, and Qt 5.14 will contain a significantly more complete QStringView (and QLatin1String) ecosystem than 5.13 did. If you do string processing with Qt, this talk is for you. After a very brief recap of QStringView's general purpose in Qt, we will look at what's new in Qt 5.14 with respect to QStringView and, time permitting, take a brief look into the near future.

Testing & Profiling Qt on Android, Bogdan Vatra

In this session we are going to learn how to test and profile a Qt application on Android. The talk will cover how to use Qt Test on Android and how to use Google tools to do profiling.

Qt on the second Screen, Christoph Sterz

Companion apps, VNC remoting, streaming WebGL to the browser or compiling your code to WebAssembly: adding a secondary screen to your application can extend either its functionality or the physical range for the users operating it. Especially for embedded devices, offering remote monitoring, configuration or off-site maintenance adds benefit for users used to mobile, and opens quick paths for support teams to help effectively. This talk will summarize all the options Qt has to offer for these needs and the usual paradigms to follow when you design your software to reach out further.

See the full agenda here. Check out the training courses on the 4th and sign up for Qt World Summit 2019.

The post KDAB talks at Qt World Summit 2019 appeared first on KDAB.
Posted over 4 years ago
KDAB will be supporting our training partner, froglogic, at this event in Munich dedicated to Squish, the automated GUI tester we are proud to support. Complete with a live demo, Tobias Naetterlund, KDAB's foremost Squish expert, will give a talk showing how to control multiple clients simultaneously on one or more machines using this versatile tool:

Load testing with Squish

In a client-server architecture, ensuring that the server runs perfectly even under load is as vital as making sure the client functionality is correct. In some cases, you can have your Squish tests talk directly to the server in order to test its functionality under load, but to set up server usage scenarios that are as realistic as possible, it is beneficial to instead use Squish's playback functionality to control multiple clients simultaneously and have them talk to the server the way they normally would during daily usage. In this talk, the presenter will outline an approach for load testing a server backend by controlling multiple client frontends simultaneously. We will discuss a number of challenges you may run into and various approaches to attacking them. As we will start from the basics, no previous Squish knowledge is necessary.

Register for Squish Days Europe. Get KDAB's Squish training for your team.

The post KDAB at Squish Days Europe, Munich appeared first on KDAB.
Posted over 4 years ago
In the previous two blogs in this series I showed how solving an apparently simple problem about loading a lot of data into RAM using mmap() also turned out to require a solution that improved CPU use across cores. In this blog, I'll show how we dealt with the bottleneck problems that ensued and, finally, how we turned to coarse threading to utilize the available cores as well as possible whilst keeping the physical memory usage doable. These are the stages we went through:

1. Preprocessing
2. Loading the Data
3. Fine-grained Threading

Now we move to Stage 4, where we tackle the bottleneck problem.

4 Preprocessing Reprise

So far so good, right? Well, yes and no. We've improved things to use multithreading and SIMD, but profiling in vtune still showed bottlenecks, specifically in the IO subsystem when paging the data from disk into system memory (via mmap). The access pattern through the data is the classic thing we see with textures in OpenGL when the texture data doesn't all fit into GPU memory: it ends up thrashing the texture cache with the typical throw-out-the-least-recently-used-stuff, as of course we need the oldest stuff again on the next iteration of the outer loop. This is where the expanded, preprocessed data is biting us in the backside. We saved runtime cost at the expense of disk and RAM usage, and this is now the biggest bottleneck, to the point where we can't feed the data from disk (SSD) to the CPU fast enough to keep it fully occupied.

The obvious thing would be to reduce the data size, but how? We can't use the old BED file format, as the quantization used is too coarse for the offset + scaled data. We can't use lower-precision floats, as that only reduces the size by a small constant factor. Inspecting the data of some columns in the matrix, I noticed that there are very many repeated values, which makes total sense given the highly quantized input data. So, we tried compressing each column using zlib. This worked like magic: the preprocessed data came out only 5% larger than the quantized original BED data file!

Because we are compressing each column of the matrix independently, and the compression ratio varies depending upon the needed dictionary size and the distribution of repeated elements throughout the column, we need a way to find the start and end of each column in the compressed preprocessed bed file. So, whilst preprocessing, we also write out a binary index companion file which, for each column, stores the offset of the column start in the main file and its byte size. When we want to process a column of data in the inner loop, we look up in the index file the extent of the column's compressed representation in the mmap()'d file, decompress that into a buffer of the right size (we know how many elements each column has; it's the number of people) and then wrap that up in the Eigen Map helper, as sketched below.

Using zlib like this really helped in reducing the storage and memory needed. However, profiling now showed that the bottleneck had shifted to the decompression of the column data. Once again, we have improved things, but we still can't keep the CPU fed with enough data to occupy it for that inner loop workload.
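A hedged sketch of that lookup-and-decompress step, assuming the index simply stores (offset, byte size) pairs per column and that columns hold one double per person; the names and the struct layout are illustrative:

```cpp
#include <zlib.h>
#include <Eigen/Dense>
#include <cstdint>
#include <vector>

// Index entry as described: where a column's compressed bytes live in the
// mmap()'d preprocessed file. The exact field layout is an assumption.
struct ColumnIndexEntry {
    uint64_t offset;
    uint64_t compressedSize;
};

// Decompress column 'col' from the mmap()'d file into 'buffer' and wrap it
// in an Eigen::Map for the inner loop. numIndividuals is the number of
// values per column, known up front.
Eigen::Map<Eigen::VectorXd> decompressColumn(const unsigned char *mappedFile,
                                             const std::vector<ColumnIndexEntry> &index,
                                             std::size_t col,
                                             std::vector<double> &buffer,
                                             std::size_t numIndividuals)
{
    buffer.resize(numIndividuals);
    uLongf destLen = static_cast<uLongf>(numIndividuals * sizeof(double));
    // zlib's one-shot API: the column was written with compress() at
    // preprocessing time. (Error handling omitted in this sketch.)
    uncompress(reinterpret_cast<Bytef *>(buffer.data()), &destLen,
               mappedFile + index[col].offset,
               static_cast<uLong>(index[col].compressedSize));
    return Eigen::Map<Eigen::VectorXd>(buffer.data(), numIndividuals);
}
```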
5 Coarse Threading

How to proceed from here? What we need is a way to balance the CPU threads and cycles used for decompressing the column data against the threads and cycles used to then analyze each column. Remember that we are already using SIMD vectorization and parallel_for and parallel_reduce for the inner loop workload.

After thinking over this problem for a while, I decided to have a go at solving it with another feature of Intel TBB, the flow graph. The flow graph is a high-level, data-driven interface for constructing parallel algorithms. Once again, behind the scenes this eventually gets decomposed into the thread pool + tasks used by parallel_for and friends. The idea is that you construct a graph from various pre-defined node types into which you can plug lambdas to perform certain operations. You can then set options on the different node types and connect them with edges to form a data flow graph. Once set up, you send data into the graph via a simple message class/struct, and it flows through until the results fall out of the bottom of the graph. There are many node types available, but for our needs just a few will do:

- Function node: use this along with a provided lambda to perform some operation on your data, e.g. decompress a column of data or perform the doStuff() inner loop work. This node type can be customized as to how many parallel task instantiations it can make, from serial behavior to any positive number. We will need both, as we shall see.
- Sequencer node: use this node to ensure that data arrives at later parts of the flow graph in the correct order. Internally it buffers incoming messages and uses a provided comparison functor to re-order the messages ready for output to successor nodes.
- Limiter node: use this node type to throttle the throughput of the graph. We can tell it a maximum number of messages to buffer from predecessor nodes. Once it reaches this limit, it blocks any more input messages until another node triggers it to continue.

I've made some very simple test cases of the flow graph in case you want to see how it works and how I built up to the final graph we used in practice. A few things to note about the final graph:

- We have a function node to perform the column decompression. This is allowed to use multiple parallel tasks, as each column can be decompressed independently of the others thanks to the way we compressed the data at preprocess time. To stop this node from decompressing the entire data set as fast as possible and blowing up our memory usage, we limit it with a limiter node set to some small number roughly equal to the number of cores.
- We have a second function node, limited to sequential behavior, that calls back to our algorithm class to do the actual work on each decompressed column of data.
- Then we have the two ordering nodes. Why do we need two of them? The latter one ensures that the data coming out of the decompression node tasks arrives in the order that we expect (as queued up by the inner loop). This is needed because, due to the kernel time-slicing the CPU threads, they may finish in a different order to that in which they were enqueued. The requirement for the first ordering node is a little more subtle: without it, the limiter node may select messages from the input in an order such that it fills up its internal buffer without picking up the first message, which it needs to send as an output. Without the ordering node up front, the combination of the second ordering node and the limiter node may cause the graph to effectively deadlock: the second ordering node would be waiting for the nth message, but the limiter node is already filled up with messages which do not include the nth one.
- Finally, the last function node, which processes the "sequential" (but still SIMD and parallel_for pimped up) part of the work, uses a graph edge to signal back to the limiter node when it is done, so that the limiter node can then throw the next column of data at the decompressor function node.

With this setup, we have a high-level algorithm which is self-balancing between the decompression steps and the sequential doStuff() processing! That is really nice; plus, it is super simple to express in just a few lines of code and it remains readable for future maintenance. The code to set up this graph and to queue up the work for each iteration is available at github.
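A hedged sketch of that graph using classic Intel TBB's flow graph API (newer oneTBB versions differ in minor details); the message type and the work done inside the lambdas are illustrative stand-ins for the real code:

```cpp
#include <tbb/flow_graph.h>
#include <cstddef>
#include <vector>

struct Message {
    std::size_t id;               // column number, used by the sequencers
    std::vector<double> column;   // decompressed column data
};

int main()
{
    using namespace tbb::flow;
    graph g;
    const std::size_t maxInFlight = 8; // roughly the number of cores

    // Keep messages in column order before they reach the limiter, so it
    // cannot fill up without the message it needs to emit first.
    sequencer_node<Message> preOrdering(g, [](const Message &m) { return m.id; });

    // Throttle: at most maxInFlight columns being decompressed at once.
    limiter_node<Message> limiter(g, maxInFlight);

    // Decompress columns in parallel; each column is independent.
    function_node<Message, Message> decompress(g, unlimited, [](Message m) {
        // ... zlib decompression of column m.id into m.column ...
        return m;
    });

    // Re-establish column order after the parallel stage.
    sequencer_node<Message> postOrdering(g, [](const Message &m) { return m.id; });

    // Per-column analysis runs serially (but uses SIMD/parallel_for inside).
    function_node<Message, continue_msg> analyse(g, serial, [](const Message &m) {
        // ... doStuff(m.column) ...
        return continue_msg();
    });

    make_edge(preOrdering, limiter);
    make_edge(limiter, decompress);
    make_edge(decompress, postOrdering);
    make_edge(postOrdering, analyse);
    // Signal the limiter when a column is done so it releases the next one.
    make_edge(analyse, limiter.decrement);

    for (std::size_t col = 0; col < 1000; ++col) // queue up one iteration's work
        preOrdering.try_put(Message{col, {}});
    g.wait_for_all();
}
```

The decrement edge from the analysis node back to the limiter is what makes the pipeline self-balancing: the decompressor can only ever run a bounded distance ahead of the consumer.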
The resulting code now uses 100% of all available cores and balances the work of decompressing and processing the data, while the data processing itself also utilizes all cores well. The upside of moving the inner loop to the flow graph representation is that the decompression + column processing went from 12.5s per iteration (on my hexacore i7) to 3s. The 12.5s was measured with the sequential workload already using parallel_for and SIMD, so this is another very good saving.

Summary

We have shown how a simple "How do I use mmap()?" mentoring project grew beyond its initial scope, and how we used mmap, Eigen, parallel_for/parallel_reduce, flow graphs and zlib to make the problem nicely tractable. This gave a nice set of performance improvements whilst keeping the disk and RAM usage within feasible limits. We:

- Shifted work that can be done once to a preprocessing step
- Kept the preprocessed data size as low as possible with compression
- Managed to load even large datasets into memory at once with mmap
- Parallelized the inner loop operations at a low level with parallel_for
- Parallelized the high-level loop using the flow graph and made it self-balancing
- Utilized the available cores fairly optimally whilst keeping the physical memory usage down (roughly number of threads used * column size)

Thanks for reading this far! I hope this helps reduce your troubles when dealing with big data issues.

The post Little Trouble in Big Data – Part 3 appeared first on KDAB.