University of Padua

Alarm logs of industrial packaging machines

Tosato, Diego and Dalle Pezze, Davide and Masiero, Chiara and Susto, Gian Antonio and Beghi, Alessandro (2022) Alarm logs of industrial packaging machines. [Data Collection]

Related publications

Original publication URL:

Collection description

The advent of the Industrial Internet of Things (IIoT) has made large amounts of data available that can be used to train advanced Machine Learning algorithms for tasks such as Anomaly Detection, Fault Classification and Predictive Maintenance. Even though not all pieces of equipment are fitted with sensors yet, most of them can already log the warnings and alarms that occur during operation. Turning this data, which is easy to collect, into meaningful information about the health state of machinery can have a disruptive impact on efficiency and up-time.

The provided dataset consists of a sequence of alarms logged by packaging equipment in an industrial environment. The collection includes data logged by 20 machines, deployed in different plants around the world, from 2019-02-21 to 2020-06-17. There are 154 distinct alarm codes, whose distribution is highly unbalanced.

This data can be used to address the following tasks:

1. Next alarm forecasting: this problem can be framed as a supervised multi-class classification task, or as a binary classification task when a specific alarm code is considered.
2. Predicting alarms occurring in a future time frame: here the goal is to forecast the occurrence of certain alarm types in a future time window. Since many alarms can occur, this is a supervised multi-label classification task.
3. Future alarm sequence prediction: here the goal is to predict an ordered sequence of future alarms, in a sequence-to-sequence forecasting scenario.
4. Anomaly Detection: the task is to detect abnormal equipment conditions based on the pattern of the alarm sequence. This task can be either unsupervised, if only the input sequence is considered, or supervised, if future alarms are taken into account to assess whether or not there is an anomaly.

All of the above tasks can also be studied from a continual learning perspective. Indeed, information about the serial code of the specific piece of equipment can be used to train the model; however, a scalable model should also be easy to apply to new machines, without the need to retrain from scratch.
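As an illustration of the next alarm forecasting task, the following minimal sketch turns a per-machine alarm log into (history, next alarm) pairs suitable for a multi-class classifier. The function name next_alarm_pairs, the tuple layout (timestamp, serial, alarm_code) and the history_len parameter are assumptions made for this example; they are not part of the dataset or of the code distributed with it.

```python
from collections import defaultdict


def next_alarm_pairs(events, history_len=3):
    """Build (history, next_alarm) training pairs.

    events: iterable of (timestamp, serial, alarm_code) tuples in any order.
            This layout is hypothetical; adapt it to the columns of the raw file.
    history_len: number of preceding alarms used as input (illustrative choice).
    """
    # Group alarms by machine so that histories never mix different machines.
    by_machine = defaultdict(list)
    for timestamp, serial, alarm_code in events:
        by_machine[serial].append((timestamp, alarm_code))

    pairs = []
    for serial in sorted(by_machine):
        # Order each machine's alarms chronologically.
        sequence = [code for _, code in sorted(by_machine[serial])]
        # Slide over the sequence: the previous history_len alarms are the
        # input, the alarm that follows them is the class label.
        for i in range(history_len, len(sequence)):
            pairs.append((tuple(sequence[i - history_len:i]), sequence[i]))
    return pairs


# Toy data, not taken from the dataset.
toy_events = [
    ("2019-02-21T10:00", 0, 5),
    ("2019-02-21T10:05", 0, 7),
    ("2019-02-21T10:01", 1, 3),
    ("2019-02-21T10:10", 0, 5),
    ("2019-02-21T10:20", 0, 9),
]
toy_pairs = next_alarm_pairs(toy_events, history_len=2)
```

Restricting the label to "is the next alarm a given code or not" would give the binary variant mentioned in the task list.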

DOI: 10.21227/nfv6-k750
Keywords: Artificial Intelligence; IoT; Machine Learning; Industry 4.0; Industrial IoT; Alarm sequences data
Subjects: Physical:Sciences and Engineering > Computer Science and Informatics: Informatics and information systems, computer science, scientific computing, intelligent systems > Machine learning, statistical data processing and applications using signal processing (e.g. speech, image, video)
Physical:Sciences and Engineering > Computer Science and Informatics: Informatics and information systems, computer science, scientific computing, intelligent systems > Artificial intelligence, intelligent systems, multi agent systems
Physical:Sciences and Engineering > Products and Processes Engineering: Product design, process design and control, construction methods, civil engineering, energy processes, material engineering > Industrial design (product design, ergonomics, man-machine interfaces, etc.)
Department: Departments > Dipartimento di Ingegneria dell'Informazione (DEI)
Depositing User: Research Data Unipd
Date Deposited: 18 Oct 2023 10:26
Last Modified: 18 Oct 2023 10:26
Type of data: Text
Collection period:
Resource language: it
Metadata language: it
Additional information: This dataset provides both raw and processed data. The raw data file, raw/alarms.csv, is a comma-separated file with one row per logged alarm. Each row gives the alarm code, the timestamp of occurrence, and the identifier of the piece of equipment that generated the alarm. From this file, it is possible to generate data for tasks such as those described in the abstract. For completeness, we also provide the Python code that processes the data and generates input and output sequences for the task of predicting which alarms will occur in a future time window, given the sequence of all alarms that occurred in a previous time window (processed/all_alarms.pickle, processed/all_alarms.json, and processed/all_alarms.npz). In the Python module that processes raw data into input/output sequences, the function create_dataset creates sequences already split into train/test and stores them in a pickle file. The functions create_dataset_json and create_dataset_npz produce the processed dataset in other output formats. The ready-to-use datasets provided in the zipped folder were created with an input window of 1720 minutes and an output window of 480 minutes. More information can be found in the attached file.
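The windowing idea behind the processed files can be sketched as follows. This is not the code shipped with the dataset: the function name window_sample, its parameters and the (timestamp, alarm_code) tuple layout are assumptions made for illustration, and the column names of raw/alarms.csv are not specified here. Only the window lengths (1720 and 480 minutes) come from the description above.

```python
from datetime import datetime, timedelta


def window_sample(events, start, input_minutes=1720, output_minutes=480):
    """Build one (input sequence, output labels) sample for a single machine.

    events: list of (datetime, alarm_code) tuples for ONE machine
            (hypothetical layout; adapt it to the raw file).
    start:  datetime at which the input window begins.
    """
    split = start + timedelta(minutes=input_minutes)  # end of input window
    end = split + timedelta(minutes=output_minutes)   # end of output window
    ordered = sorted(events)

    # Input: ordered alarm codes observed in [start, split).
    input_sequence = [code for t, code in ordered if start <= t < split]
    # Output: distinct alarm codes observed in [split, end), which is the
    # label set of a multi-label classification problem.
    output_labels = sorted({code for t, code in ordered if split <= t < end})
    return input_sequence, output_labels


# Toy data, not taken from the dataset.
toy_log = [
    (datetime(2019, 2, 21, 1, 0), 11),
    (datetime(2019, 2, 21, 23, 0), 4),
    (datetime(2019, 2, 22, 4, 40), 7),
    (datetime(2019, 2, 22, 5, 0), 7),
    (datetime(2019, 2, 22, 13, 0), 2),
]
x, y = window_sample(toy_log, start=datetime(2019, 2, 21, 0, 0))
```

Sliding the start time along each machine's log would yield a full set of samples; how the distributed code chooses window positions and the train/test split is described in the attached file, not in this sketch.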
Publisher: IEEE DataPort
Status: Published
Date: 17 May 2022
Date type: Publication
Copyright holders: The Authors

Available Files

Full Archive


