Alessia Zoroaster, Andrea Calore, Paolo Berzaghi, Anisseh Sobhani, Nicoletta Dainese, Anna Granato, Severino Segato and Lorenzo Serva
(2026)
Dataset of honey quality traits analysed with NIR.
[Data Collection]
Collection description
Honey quality and authenticity assessment require rapid and reliable analytical tools capable of supporting both laboratory and on-site applications. Near-infrared (NIR) spectroscopy represents a non-destructive and cost-effective approach; however, its performance depends on instrument characteristics and chemometric strategies. This study compared one benchtop and two portable NIR spectrometers for predicting key physicochemical parameters (moisture, electrical conductivity, glucose, fructose, reducing sugars, pH, hydroxymethylfurfural, and diastatic activity) and for discriminating botanical origin in 80 Italian honey samples. Spectral data were processed using multiple pre-processing techniques and algorithms (PLS, k-NN, Random Forest, SVM), with and without wavelength selection (siPLS and CARS-PLS), under cross-validation schemes. The benchtop device achieved the highest regression performance (R² up to 0.91 for glucose and electrical conductivity) and the most reliable botanical classification (balanced accuracy = 0.90). Portable instruments showed moderate predictive ability for bulk compositional parameters (R² up to 0.86 for glucose) but limited classification performance. Wavelength selection resulted in only marginal improvements. Parameters present at low concentrations, such as hydroxymethylfurfural and diastatic activity, were poorly predicted across all devices. These findings indicate that portable NIR systems are suitable for rapid screening of major compositional traits, whereas benchtop instruments remain more appropriate for laboratory-level quantification and robust botanical authentication.
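The two performance figures quoted in the description, R² for regression and balanced accuracy for botanical classification, can be illustrated with a minimal sketch. It assumes the standard textbook definitions (R² = 1 − SS_res/SS_tot; balanced accuracy = mean of per-class recall); the record does not state the exact formulas or software used, and the function names and toy labels below are hypothetical, not taken from the dataset.

```python
from collections import defaultdict


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def balanced_accuracy(labels_true, labels_pred):
    """Mean of per-class recall, so small classes weigh as much as large ones."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(labels_true, labels_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical toy labels (NOT from the dataset): two large classes, one small.
true_labels = ["PF"] * 6 + ["CH"] * 6 + ["AC"] * 2
pred_labels = ["PF"] * 6 + ["CH"] * 6 + ["PF", "AC"]

plain_accuracy = sum(t == p for t, p in zip(true_labels, pred_labels)) / len(true_labels)
bal_accuracy = balanced_accuracy(true_labels, pred_labels)
print(f"accuracy={plain_accuracy:.3f} balanced={bal_accuracy:.3f}")
```

In this toy case one error in the small class lowers balanced accuracy far more than plain accuracy, which is why the balanced figure is the more informative one when a few botanical classes have only a handful of samples.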
| DOI: |
10.25430/researchdata.cab.unipd.it.00001770 |
| Keywords: |
Portable NIR, honey authentication, chemometric analysis, food quality control, spectral data processing, machine learning, rapid screening |
| Subjects: |
Life Sciences > Applied Life Sciences, Biotechnology and Molecular and Biosystems engineering: Applied plant and animal sciences; forestry; food sciences; applied biotechnology; environmental, and marine biotechnology; applied bioengineering; biomass, biofuels; biohazard > Food sciences (including food technology, food safety, nutrition) |
| Department: |
Departments > Dipartimento di Medicina animale, produzioni e salute (MAPS) |
| Depositing User: |
Andrea Calore
|
| Date Deposited: |
10 Mar 2026 11:16 |
| Last Modified: |
10 Mar 2026 11:16 |
| Creators/Authors: |
|
| Corporate creators: |
Consorzio Nazionale Apicoltori (CONAPI), Istituto Zooprofilattico Sperimentale delle Venezie (IZSVe) |
| Type of data: |
Database |
| Research funder: |
University of Padua |
| Research project title: |
SID 2020 |
| Collection period: |
| From | To |
|---|---|
| 1 June 2025 | 12 December 2025 |
|
| Geographic coverage: |
Southern Italy |
| Bounding Area: |
| North Latitude | East Longitude | South Latitude | West Longitude |
|---|---|---|---|
| 41 | 17 | 39 | 14 |
|
| Data collection method: |
Eighty Italian honey samples collected in June 2024 from three regions of southern Italy – approximately 39–41 °N, 14–17 °E – were provided by the Consorzio Nazionale Apicoltori (CONAPI), headquartered in Monterenzio (Bologna, Italy). The dataset comprised two main botanical categories, including 30 polyfloral (PF) and 30 chestnut (CH) honey samples. To increase spectral variability for botanical origin clustering analysis, additional honey types were included: acacia (AC, n = 4), citrus honeys (CF, n = 4; including orange, clementine and lemon blossom), honeydew (HO, n = 5), French honeysuckle (FH, n = 2), and linden (LN, n = 3). Botanical origin was determined by melissopalynological analysis and certified by CONAPI, in accordance with current European honey legislation [13]. Each sample consisted of 150 g of unfiltered and unpasteurized honey. Samples were stored under refrigerated conditions (4 ± 2 °C) and analysed within 60 days of collection. All analyses were conducted at the Istituto Zooprofilattico Sperimentale delle Venezie, Legnaro (PD), Italy.
|
| Resource language: |
English |
| Metadata language: |
English |
| Publisher: |
Research Data Unipd |
| Date: |
10 March 2026 |
| Copyright holders: |
The Author |
| URI: |
https://researchdata.cab.unipd.it/id/eprint/1770 |