This document details the file system organization and navigation procedures for accessing both unprocessed hyperspectral imaging (HSI) data and the derived phenotypic measurements of the plants captured by the Ag Alumni Seed Phenotyping Facility (AAPF). The data can be categorized into three main parts:
Raw Scanned Data
: This folder structure houses the individual scan files (.hdr, .md5, .raw, .tiff) for each measurement.
Interim Data Product
: This data product contains the files used for quality control and images representing various vegetation indices.
Masterfile (.xlsx)
: This file contains a compilation of spreadsheets and a guide document explaining the data contents.
- File system structure
- How to Read Header Summary (header_summary_plants.csv, header_summary_ref.csv)
- How to Read Hyperspectral Data Processing Summary (.csv)
- Interpretation of VNIR-SWIR Coregistration Quality Assessment (.gif)
- How to Read Information in Masterfile (.xlsx, spreadsheet)
Example (does not contain all associated files):
401_80 (/depot/smarterag/data/HSI/401_80)
├── HSI_R_2401856_240606084952173_SWIR-SIDE-DAT_401_80_240606085015154.hdr
├── HSI_R_2401856_240606084952173_SWIR-SIDE-DAT_401_80_240606085015154.md5
├── HSI_R_2401856_240606084952173_SWIR-SIDE-DAT_401_80_240606085015154.raw
├── HSI_R_2401856_240606084952173_SWIR-SIDE-DAT_401_80_240606085015154.tiff
├── HSI_R_2401856_240606084952173_SWIR-TOP-DAT_401_80_240606085006035.hdr
├── HSI_R_2401856_240606084952173_SWIR-TOP-DAT_401_80_240606085006035.md5
├── HSI_R_2401856_240606084952173_SWIR-TOP-DAT_401_80_240606085006035.raw
└── HSI_R_2401856_240606084952173_SWIR-TOP-DAT_401_80_240606085006035.tiff
- Parent directory
- {Experiment_Number}_{Treatment}
- Example: 407_WL, 407_WW, 401_20-1, 401_20-2, 401_80
- Data files
Header File (.hdr)
: This file stores metadata associated with the data acquisition process, such as sensor specifications and the date and time of capture.
MD5 File (.md5)
: This file contains a Message Digest 5 (MD5) checksum. The checksum is a fingerprint computed from the data itself and is used to verify data integrity: accidental corruption during storage or transmission results in a mismatch between the recalculated checksum and the one stored in the file.
Hyperspectral Datacube (.raw)
: This file stores the raw hyperspectral data. A hyperspectral datacube is a three-dimensional array in which each layer is a two-dimensional image captured at a specific wavelength band.
Image File (.tiff)
: This file stores one of two image representations, depending on the camera. For VNIR data, a True Color Composite (TCC) image is generated by combining the red, green, and blue bands of the hyperspectral datacube, which gives a color representation similar to the natural scene. For SWIR data, a False Color Composite (FCC) image is generated instead: bands from the datacube are combined, but the chosen bands do not necessarily correspond to the red, green, and blue channels. Both TCC and FCC images are intended primarily for quality control.
- Filename convention:
HSI_R_{POT_BARCODE}_{Time_In}_{VNIR|SWIR}-{SIDE|TOP}-DAT_{Experiment_No}_{Treatment}_{Time_Out}.{hdr|md5|raw|tiff}
For additional information, refer to this link.
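The filename convention and the MD5 check described above can be sketched in Python as follows. This is a minimal illustration, not the facility's own tooling. The function names are hypothetical, and the field patterns (digits for the experiment number and timestamps) are inferred from the example filenames. The sketch also assumes that the .md5 file stores the hexadecimal digest as its first whitespace-separated token and that the digest was computed over the matching data file; neither detail is confirmed by this document.

```python
import hashlib
import re
from pathlib import Path

# Pattern following the documented filename convention. The character classes
# used for each field are assumptions based on the example filenames.
SCAN_NAME = re.compile(
    r"^HSI_R_(?P<pot_barcode>[^_]+)_(?P<time_in>\d+)_"
    r"(?P<camera>VNIR|SWIR)-(?P<view>SIDE|TOP)-DAT_"
    r"(?P<experiment_no>\d+)_(?P<treatment>.+)_(?P<time_out>\d+)"
    r"\.(?P<ext>hdr|md5|raw|tiff)$"
)


def parse_scan_filename(name):
    """Split a raw-scan filename (or path) into its named fields (all strings)."""
    match = SCAN_NAME.match(Path(name).name)
    if match is None:
        raise ValueError(f"not a recognised scan filename: {name!r}")
    return match.groupdict()


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(data_path, md5_path):
    """Return True if the recomputed digest matches the stored one.

    Assumption: the first whitespace-separated token of the .md5 file
    is the hex digest of the data file.
    """
    tokens = Path(md5_path).read_text().split()
    if not tokens:
        raise ValueError(f"empty checksum file: {md5_path}")
    return md5_of_file(data_path) == tokens[0].lower()
```

For example, `parse_scan_filename("HSI_R_2401856_240606084952173_SWIR-SIDE-DAT_401_80_240606085015154.hdr")` returns a dictionary in which `pot_barcode` is `"2401856"`, `experiment_no` is `"401"`, and `treatment` is `"80"`.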
Example (does not contain all associated files):
401 (/depot/smarterag/hsiproc/401 or /depot/smarterag/hsiproc/working/401)
├── Hyperspectral_data_AAPF_experiment_401.xlsx (masterfile)
├── quality_control
│ ├── header_summary_plants.csv
│ ├── header_summary_ref.csv
│ ├── HSI_R_2401761_240514080207482_VNIR-SIDE-DAT_401_20-1_240514080219626.gif
│ ├── HSI_R_2401761_240514080207482_VNIR-TOP-DAT_401_20-1_24051408022152.gif
│ ├── HSI_R_2401761_24061408172718_VNIR-SIDE-DAT_401_20-1_240614081746352.gif
└── VI_images
  ├── NDVI_default.zip
  ├── NDWI_default.zip
  ├── NLI_default.zip
  └── NMDI_default.zip
- Parent directory
- {Experiment_Number}
- Example: 401, 407
- Subdirectories
- quality_control
- VI_images
- Data files
Masterfile (.xlsx)
: This file stores the derived phenotypic measurements of the plants and a guide document.
HSI Data Processing Quality Control Data (.csv and .gif files under the quality_control directory)
: These files support quality assessment of the HSI data processing pipeline. They include summaries of header information for the plants and for the white and dark references (header_summary_plants.csv, header_summary_ref.csv), a detailed record of the processing procedures employed (processing_strategy.csv), and animated GIF files used to check the coregistration quality between visible and near-infrared (VNIR) and shortwave infrared (SWIR) data. The GIF filenames follow the same convention as the raw scanned data.
Compressed Vegetation Index Images (.zip, under the VI_images directory)
: This directory houses compressed image files for the various vegetation indices calculated from the HSI data.
The definition of column headers and corresponding values mostly originates from the sensor vendor and system integrator.
Column Head | Description |
---|---|
header path | File path to the header file |
camera model | Camera model |
serial number | Serial number of the camera |
roi left | Camera- and height-specific parameter, not relevant to vegetation segments |
roi top | Camera- and height-specific parameter, not relevant to vegetation segments |
roi width | Camera- and height-specific parameter, not relevant to vegetation segments |
roi height | Camera- and height-specific parameter, not relevant to vegetation segments |
samples | Width of the data cube (conveyor belt direction) |
bands | Number of spectral channels |
lines | Number of scan lines from top to bottom |
scanStartTime | Scan start time |
scanEndTime | Scan end time |
Wavelength | List of central wavelengths for each channel, separated by the pipe character ("\|") |
fwhm | Full width at half maximum (FWHM) of each channel, separated by the pipe character ("\|") |
PlantID | Plant ID (POT_BARCODE) |
CarrierID | System-generated carrier ID |
PlantHeight | Plant height (❗ calculation method unclear) |
PlantWidth | Plant width (❗ calculation method unclear) |
LotID | Experiment ID |
CameraPosition | Set to "Both" when data from the opposing viewpoint (top for side views, side for top views) is also available. (❗ Verification pending) |
illumination | "4" under normal illumination conditions (no malfunction) |
specdir | Spectral range of the camera and viewing direction (VNIR-SIDE, VNIR-TOP, SWIR-SIDE, or SWIR-TOP) |
scan_order | Unique integer assigned to each scan, incrementing in the order of image capture during the experiment within the AAPF. The value is identical for all cameras (VNIR-SIDE, VNIR-TOP, SWIR-SIDE, and SWIR-TOP) operating concurrently during a single scanning session, which captures data for the same plant on the same day |
dark white | "Dark" or "White", only available in header_summary_ref.csv |
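As a rough illustration of reading a header summary, the sketch below loads the CSV with Python's standard `csv` module and splits the pipe-separated `Wavelength` and `fwhm` cells into lists of floats. The helper names are hypothetical, the sample row is made up for demonstration, and the sketch assumes the file is a regular comma-separated file whose column names match the table above.

```python
import csv
import io


def parse_pipe_list(cell):
    """Convert a pipe-separated cell such as '402.8|405.1' into floats."""
    return [float(value) for value in cell.split("|") if value.strip()]


def read_header_summary(handle):
    """Read a header summary CSV from an open text file object.

    Returns a list of dicts; the 'Wavelength' and 'fwhm' cells are
    converted to lists of floats, all other cells stay as strings.
    """
    rows = []
    for row in csv.DictReader(handle):
        for key in ("Wavelength", "fwhm"):
            if row.get(key):
                row[key] = parse_pipe_list(row[key])
        rows.append(row)
    return rows


# Made-up two-band sample, used only to demonstrate the parsing.
sample = io.StringIO(
    "PlantID,specdir,bands,Wavelength,fwhm\n"
    "2401856,SWIR-SIDE,2,1000.5|1005.5,5.0|5.0\n"
)
rows = read_header_summary(sample)

# Sanity check: the wavelength list should have one entry per band.
for row in rows:
    assert len(row["Wavelength"]) == int(row["bands"])
```

With a real file, the same function can be called as `read_header_summary(open("header_summary_plants.csv", newline=""))`.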
This table provides a concise overview of the data processing steps applied concurrently to each HSI acquisition session, encompassing output data products from VNIR-SIDE, VNIR-TOP, SWIR-SIDE, and SWIR-TOP sensors. A detailed explanation of each column header follows:
Column Head | Description |
---|---|
scan_order | Same scan_order as defined in the Header Summary section |
LotID | Same LotID code as defined in the Header Summary section |
VNIR-SIDE, VNIR-TOP, SWIR-SIDE, SWIR-TOP | Header filenames for each camera and view combination (SIDE or TOP) belonging to the given scan order |
step1 | Processing strategy employed for side-view data (SIDE). When both VNIR and SWIR data are available, the strategy is "Combine SIDE", which coregisters the VNIR and SWIR data during processing. If only VNIR data exists, the strategy is "Use VNIR-SIDE": no coregistration is performed and all data extraction relies solely on VNIR data. If VNIR data is missing, whether or not SWIR data exists, the strategy is "Pass". |
step2 | Processing strategy employed for top-down view data (TOP). The values correspond directly to those in step1, with "SIDE" replaced by "TOP". |
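The decision rule behind step1 and step2 can be written as a small function. This is only a restatement of the table above in Python, not the pipeline's actual code; the function name and its boolean arguments are hypothetical.

```python
def processing_strategy(has_vnir, has_swir, view):
    """Return the step1/step2 strategy label for one view.

    has_vnir, has_swir: whether VNIR / SWIR data exists for this scan order.
    view: "SIDE" (step1) or "TOP" (step2).
    """
    if view not in ("SIDE", "TOP"):
        raise ValueError(f"view must be 'SIDE' or 'TOP', got {view!r}")
    if has_vnir and has_swir:
        # Both cameras available: VNIR and SWIR are coregistered.
        return f"Combine {view}"
    if has_vnir:
        # VNIR only: no coregistration, extraction relies on VNIR alone.
        return f"Use VNIR-{view}"
    # VNIR missing (with or without SWIR): the view is skipped.
    return "Pass"
```

For instance, a scan order with VNIR-TOP but no SWIR-TOP data would be labelled "Use VNIR-TOP" in step2.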
[Figure: side view and top-down view]
This section details the interpretation of an animated GIF file used to evaluate the coregistration quality between VNIR (visible and near-infrared) and SWIR (short-wave infrared) imagery. The animation consists of four frames, each displaying a vegetation mask derived from a specific processing step:
1/4 frame
: Vegetation mask obtained from the VNIR data (presented upside-down to match the original data orientation for both side and top-down views).
2/4 frame
: Vegetation mask derived from the SWIR data after an initial coarse alignment using the affine transformation parameters (translation, scaling, and rotation) specified in the HSI data processing parameters.
3/4 frame
: Vegetation mask from the SWIR data following a subsequent geometric refinement using the cv.findTransformECC function. Significant misalignment between the 3/4 frame and the 4/4 frame indicates the need to adjust the translation, scaling, and rotation parameters within the HSI data processing parameters.
4/4 frame
: The vegetation mask from the 1/4 frame with an additional morphological filter applied. The default parameter applies no filtering, while the upper, middle, and lower parameters use only the upper, middle, or lower third of the plant segments, respectively (applicable to side-view images only). Future iterations may incorporate additional morphological filter options.
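One possible reading of the upper, middle, and lower options is sketched below on a binary mask stored as a list of rows. This is an assumption about how the thirds might be defined (equal row ranges between the first and last rows containing vegetation, in array row order), not the pipeline's actual filter; the function name is hypothetical. Because the masks are stored upside-down, "upper" in array order may not correspond to the top of the plant.

```python
def keep_third(mask, part="default"):
    """Keep one third of the vegetated rows of a binary mask.

    mask: list of rows, each a list of 0/1 values.
    part: "default" (no filtering), "upper", "middle", or "lower",
          interpreted in array row order (an assumption).
    """
    thirds = {"upper": 0, "middle": 1, "lower": 2}
    if part != "default" and part not in thirds:
        raise ValueError(f"unknown part: {part!r}")

    # Rows that contain at least one vegetation pixel.
    vegetated = [i for i, row in enumerate(mask) if any(row)]
    if part == "default" or not vegetated:
        return [row[:] for row in mask]

    top, bottom = vegetated[0], vegetated[-1] + 1
    height = bottom - top
    k = thirds[part]
    start = top + height * k // 3
    stop = top + height * (k + 1) // 3

    # Zero out every row outside the selected third.
    return [
        row[:] if start <= i < stop else [0] * len(row)
        for i, row in enumerate(mask)
    ]
```

Integer division keeps the three row ranges contiguous and non-overlapping even when the vegetated height is not a multiple of three.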
Phenotypic measurements derived from the HSI data and auxiliary measurements are stored in spreadsheets. The worksheets in the master data file (*.xlsx) are organized into two distinct groups:
Vegetative_indices_Side, Vegetative_indices_Top
: These worksheets contain the vegetation indices calculated for the side and top views, respectively.
Reflectance_Side, Reflectance_Top
: These worksheets contain the reflectance values at each measured wavelength for the side and top views, respectively.
Here's a breakdown of the column headers for the above worksheets:
Column Head | Description |
---|---|
Filename-VNIR-SIDE | Filename of input VNIR-SIDE data |
Filename-VNIR-TOP | Filename of input VNIR-TOP data |
Filename-SWIR-SIDE | Filename of input SWIR-SIDE data |
Filename-SWIR-TOP | Filename of input SWIR-TOP data |
EXP ID | Internal experiment number assigned in PPEW |
POT_BARCODE | Unique number assigned to plant |
VARIETY | Variety assigned in PPEW |
TREATMENT | Treatment applied to plant |
SCAN_TIME | Scan start time |
SCAN_DATE | Scan start date |
DFP | Age of plant in days from planting at time of imaging |
{vegetation index}_{statistic} (e.g., NDVI_avg) | Vegetation index (VI) statistics within masked vegetation areas. See the list of vegetation indices. Statistics: avg (average), sd (standard deviation), max, min, p## (##-th percentile) |
Wavelength in nm (e.g., 402.8) | Average reflectance over the vegetation mask area (ratio of reflected energy to incident energy) |
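When working with these worksheets programmatically, it can help to separate the three kinds of columns described above. The sketch below classifies a column header as a wavelength, a vegetation index statistic, or metadata, and picks the wavelength column nearest to a target value. The function names are hypothetical, and the assumptions that wavelength headers parse as plain numbers and that index names contain only letters and digits are based solely on the examples in the table.

```python
import re

# {vegetation index}_{statistic}, e.g. NDVI_avg or NDWI_p95.
VI_STAT = re.compile(r"^(?P<index>[A-Za-z0-9]+)_(?P<stat>avg|sd|max|min|p\d{1,2})$")


def classify_column(name):
    """Classify a masterfile column header.

    Returns ("wavelength", float), ("vi_statistic", (index, stat)),
    or ("metadata", text).
    """
    text = str(name).strip()
    try:
        return ("wavelength", float(text))
    except ValueError:
        pass
    match = VI_STAT.match(text)
    if match:
        return ("vi_statistic", (match["index"], match["stat"]))
    return ("metadata", text)


def nearest_wavelength(columns, target_nm):
    """Return the wavelength (in nm) among the headers closest to target_nm."""
    wavelengths = [
        value
        for kind, value in (classify_column(c) for c in columns)
        if kind == "wavelength"
    ]
    if not wavelengths:
        raise ValueError("no wavelength columns found")
    return min(wavelengths, key=lambda w: abs(w - target_nm))
```

For example, `classify_column("NDVI_avg")` yields `("vi_statistic", ("NDVI", "avg"))`, while `classify_column("POT_BARCODE")` is treated as metadata because "BARCODE" is not one of the listed statistics.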