Hybrid Coarse-to-Fine Regression for HVAC Output Prediction under Limited Training Data
Document Type
Article
Publication Date
Summer 5-15-2026
Abstract
Accurate prediction of HVAC outlet conditions and power consumption is essential for efficient system design, monitoring, and energy management. In many practical deployments, however, the availability of high-quality training data is limited, causing conventional regression models to overfit and exhibit poor generalization. This work proposes a coarse-to-fine hybrid regression framework that decomposes the target variable in the output space via quantization, yielding a coarse component and a residual component. The coarse component is modeled using a constrained Gradient Boosted Machine (GBM) to capture dominant nonlinear trends, while the residual is modeled using regularized learners, including Ridge regression (L2-regularized linear regression) and Support Vector Regression (SVR), to capture finer-scale variations. This decomposition reduces estimation variance and improves robustness under data-scarce conditions. The proposed framework is evaluated alongside Gaussian Process Regression (GPR), standalone GBM, SVR, and Ridge regression using experimental datasets collected from six commercial HVAC systems provided by three Original Equipment Manufacturers (OEMs), covering both direct expansion (DX) and hybrid evaporative configurations. Across multiple outputs, the hybrid models demonstrate consistent improvements for power prediction, achieving RMSE reductions of approximately 5–10% relative to the best standalone baseline models under limited-data conditions. To assess scalability, a sensitivity study is conducted using the ASHRAE Great Energy Predictor III dataset, spanning approximately 3.5 × 10² to 2.4 × 10⁵ valid samples. The results show that the hybrid models outperform standalone GBM in data-limited regimes, with the performance gap gradually decreasing as the sample size increases, and both approaches converging toward similar accuracy at higher data availability levels.
Computational analysis indicates that the Hybrid-GBM-Ridge (HGR) configuration exhibits runtime and memory scaling behavior comparable to standalone GBM, introducing only marginal overhead, whereas Hybrid-GBM-SVR (HGS) becomes computationally expensive at large sample sizes due to kernel scaling. These results highlight the proposed framework as a practical and data-efficient solution for HVAC performance prediction.
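The output-space decomposition described in the abstract can be illustrated with a minimal sketch. It assumes uniform quantization to the nearest multiple of a fixed step; the abstract does not state which quantizer or step size is used, and the function names, step value, and power readings below are hypothetical. Only the target split and the recombination are shown. In the proposed framework the coarse part would be fitted by a constrained GBM and the residual by Ridge or SVR.

```python
def quantize_target(y, step):
    """Split each target value into a coarse level and a residual.

    Assumption: uniform quantization to the nearest multiple of `step`.
    The source does not specify the quantizer or the step size.
    """
    coarse = [step * round(v / step) for v in y]
    residual = [v - c for v, c in zip(y, coarse)]
    return coarse, residual


def recombine(coarse_pred, residual_pred):
    """Final prediction is the sum of the two component predictions."""
    return [c + r for c, r in zip(coarse_pred, residual_pred)]


# Hypothetical power readings in kW (illustrative values only).
power_kw = [3.12, 4.87, 5.49, 7.03]
coarse, residual = quantize_target(power_kw, step=1.0)

# With perfect component predictions, the sum recovers the original target.
reconstructed = recombine(coarse, residual)
```

Under this assumed quantizer each residual is bounded by half the step in magnitude, so the residual learner sees a narrow, bounded target range while the coarse learner sees a small set of discrete levels.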
Recommended Citation
Safwat, Hesham, "Hybrid Coarse-to-Fine Regression for HVAC Output Prediction under Limited Training Data" (2026). Mechanical Engineering. 274.
https://buescholar.bue.edu.eg/mech_eng/274