Discrete Neighborhood Representations and Modified Stacked Generalization Methods for Distributed Regression

Héctor Allende; Raúl Monge; Claudio Moraga

doi:10.3217/jucs-021-06-0842

JUCS - Journal of Universal Computer Science 21(6): 842-855, doi: 10.3217/jucs-021-06-0842

Héctor Allende-Cid^‡, Héctor Allende^§, Raúl Monge^§, Claudio Moraga^|

‡ Pontifícia Universidad Católica de Valparaíso, Valparaíso, Chile

§ Universidad Técnica Federico Santa María, Valparaíso, Chile

| European Centre for Soft Computing, Mieres, Spain

Corresponding author: Héctor Allende-Cid

This article is freely available under the J.UCS Open Content License.

Citation: Allende-Cid H, Allende H, Monge R, Moraga C (2015) Discrete Neighborhood Representations and Modified Stacked Generalization Methods for Distributed Regression. JUCS - Journal of Universal Computer Science 21(6): 842-855. https://doi.org/10.3217/jucs-021-06-0842

Abstract

When distributed data sources have different contexts, the problem of Distributed Regression becomes severe. The context of a source is constituted by its underlying law of probability. A new Distributed Regression System is presented, which makes use of a discrete representation of the probability density functions (pdfs). Neighborhoods of similar datasets are detected by comparing their approximated pdfs. This information supports an ensemble-based approach and the improvement of a second-level unit, as is the case in stacked generalization. Two synthetic and six real data sets are used to compare the proposed method with other state-of-the-art models. The obtained results are positive for most datasets.
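As a rough illustration of the idea of comparing approximated pdfs to detect neighborhoods, the following minimal sketch may help. It is not the authors' method: the equal-width histogram, the Hellinger distance, the fixed threshold, and all function and node names are assumptions chosen for illustration, since the abstract does not specify the representation or the similarity measure.

```python
import math
from typing import Dict, List, Sequence


def discrete_pdf(values: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Normalized equal-width histogram of `values` over [lo, hi] (assumed representation)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp the index so that v == hi (or a value outside the range) lands in an edge bin.
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def hellinger(p: Sequence[float], q: Sequence[float]) -> float:
    """Hellinger distance between two discrete distributions, in [0, 1] (assumed measure)."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


def neighborhoods(pdfs: Dict[str, List[float]], threshold: float) -> Dict[str, List[str]]:
    """For each node, list the other nodes whose discrete pdf lies within `threshold`."""
    return {
        a: [b for b in pdfs if b != a and hellinger(pdfs[a], pdfs[b]) <= threshold]
        for a in pdfs
    }


if __name__ == "__main__":
    # Hypothetical one-dimensional samples held by three distributed nodes.
    samples = {
        "node_a": [0.1, 0.2, 0.2, 0.3, 0.4],
        "node_b": [0.15, 0.2, 0.25, 0.3, 0.35],
        "node_c": [0.7, 0.8, 0.8, 0.9, 0.95],
    }
    pdfs = {name: discrete_pdf(v, 0.0, 1.0, bins=4) for name, v in samples.items()}
    print(neighborhoods(pdfs, threshold=0.5))
```

In this toy setting, nodes whose histograms overlap heavily end up in each other's neighborhood, while a node with a disjoint distribution is left alone. A neighborhood of this kind could then be used to decide which local models are combined for a given node.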

Keywords

distributed machine learning, context-aware regression, similarity representation