Seite wählen

Tick Tock: What the heck is time-series data? by Tanay Pant | OSDC 2019

von | Mai 5, 2020 | Monitoring

This entry is part 6 of 6 in the series OSDC 2019 | Recap


The rise of IoT and smart infrastructure has led to the generation of massive amounts of complex data. In his talk at the Open Source Data Center Conference (OSDC) 2019 Tanay Pant brought up a question to gather insights: Tick Tock: What the heck is time-series data? See the video of Tanay‘s presentation and read a summary (below).

The former OSDC will be held for the first time in 2020 under the new name stackconf. With the changes in modern IT in recent years, the focus of the conference has increasingly shifted from a mainly static infrastructure approach to a broader spectrum that includes agile methods, continuous integration, container, hybrid and cloud solutions. This development is taken into account by changing the name of the conference and opening the topic area for further innovations.

Due to concerns around the coronavirus (COVID-19), the decision was made to hold stackconf 2020 as an online conference. The online event will now take place from June 16 – 18, 2020. Join us, live online! Save your ticket now at: stackconf.eu/ticket/

Tick Tock: What the heck is time-series data?

Today we are going to talk about topics like what is time-series and how the load of different file forms are distributed, different use cases where time-series are used frequently. Then we’ll talk about how Create-DB helps to communicate with machine files.

What are time series?

To answer this question we present a sensor that sends the files in a period of time. When we want to read in or display this file, the time would be an axis. Compared to other workloads this file is not added to the database as an update, the time-series is added as an input and this is the primary way for this process. Time-series in database is basically introducing efficiencies through temporal treatment and this allows us to intuitively have this set of files like monitoring in different times in all aspects of our operation.

Now we have a view on time-series. If you create an abstract, look at different use cases of time-series and the way the data was generated. You can categorize them in two different ways. The first one is IT and monitoring, what can be described as a traditional use of time-series databases. When we have a look at the properties in this, one can say there are tens or hundreds of metrics or sensors as well as a lot of complex data and queries that are often larger than several gigabytes. Flux DB is a good example in this category.

We have industrial sensor data and this is an emerging sector that has not been much talked about. There are also hundreds or thousands of sensors or metrics, too. So the real-time queries are under pressure, which must be able to access all the gigabytes of data. Create-DB is a good example in this case.

We start with core technology and see what exactly Create-DB is and how it differs from other databases in this segment. Create-DB is a new type of distribution continuation database that is best suited for handling industrial sensor data, due to its ease of use and ability to handle a lot of different data, as well as a thousand different sensor data. Create-DB supports distributed SQL with full-text search and data queries, and also coordinates different nodes in a DB Cluster seamlessly with one another. In addition, the execution of write and query operations across nodes in clusters are automatically distributed. Create-DB has columnar caches for time-series in memory SQL performance so time-series normally require all data in main memory to fit, which limits the amount of data that can be managed within a specific time.

One solution for time-series performance without data volume restrictions is to implement the residence of memory in filled caches at each node, so that the caches tell the query engine whether there are any records on this node and where those records are. Distributed query processing also contributes to fast performance and a query planner that makes wise decisions about which nodes are best suited for execution. And it has machine data functions with a cloud native that makes it seamless in the cloud. Finally, we look at a few advantages of Create-DB. The Create-DB installation is simple. You can create an instance of Create-DB with a single line on the terminal or docker. It has a distributed query engine that supports full-text queries. It can handle economic hardware and instances well, and it is easy to scale the architecture.

Saeid Hassan-Abadi
Saeid Hassan-Abadi
Junior Consultant

Saeid hat im September 2019 seine Ausbildung zum Fachinformatiker im Bereich Systemintegration gestartet. Der gebürtige Perser hat in seinem Heimatland Iran Wirtschaftsindustrie-Ingenieurwesen studiert. Er arbeitet leidenschaftlich gerne am Computer und eignet sich gerne neues Wissen an. Seine Hobbys sind Musik hören, Sport treiben und mit seinen Freunden Zeit verbringen.
Mehr Beiträge zum Thema Monitoring

Icinga Camp Berlin 2022

Das erste Icinga Camp nach 2019 fand nun fast 3 Jahre später statt. NETWAYS hat als Sponsor und Hilfe in der Organisation zusammen mit der Icinga das Event zusammen auf die Beine gestellt und zu einem vollen Erfolg gemacht. Es wurde der aktuelle Stand der Dinge...

Icinga for Windows Preview: Visualisiert eure Metriken!

Icinga hat in seinem Blogpost mitgeteilt, dass es mit Icinga for Windows v1.10.0 einige Änderungen an den Performance Metriken geben wird. Hier wollen wir einmal grob zusammenfassen, worum es geht und welche Auswirkungen diese Änderungen haben. Für alle Details ist...

Icinga DB v1.0 released

Das Icinga Team hat nach langjähriger Entwicklungszeit die Icinga DB veröffentlicht. Damit steht Icinga ein neues Backend zur Verfügung, welches mittelfristig die IDO ablösen wird. Was ist die Icinga DB? Die Icinga DB ist nicht eine zentrale Komponente, sondern...

Aus Graylog ENTERPRISE wird Graylog Operations

Wie im Blogpost von Graylog bekannt gegeben, wurde die Lösung Graylog ENTERPRISE umbenannt in Graylog Operations. Der neue Name spiegelt dabei das Ziel von Graylog wider, sich verstärkt in den operational Bereich einzugliedern und Herausforderungen, die sich aus...