DC Field
Value
Language
dc.contributor.author
Matys Grygar, Tomas
-
dc.contributor.author
Radojičić, Una
-
dc.contributor.author
Pavlu, Ivana
-
dc.contributor.author
Greven, Sonja
-
dc.contributor.author
Nešlehová, Johanna G.
-
dc.contributor.author
Tůmová, Štěpánka
-
dc.contributor.author
Hron, Karel
-
dc.date.accessioned
2025-01-21T13:09:49Z
-
dc.date.available
2025-01-21T13:09:49Z
-
dc.date.issued
2024-04
-
dc.identifier.citation
Matys Grygar, T., Radojičić, U., Pavlu, I., Greven, S., Nešlehová, J. G., Tůmová, Š., & Hron, K. (2024). Exploratory functional data analysis of multivariate densities for the identification of agricultural soil contamination by risk elements. Journal of Geochemical Exploration, 259, Article 107416. https://doi.org/10.1016/j.gexplo.2024.107416
-
dc.identifier.issn
0375-6742
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/209198
-
dc.description.abstract
Geochemical mapping of risk element concentrations in soils is performed in many countries around the world. It results in numerous large datasets of high analytical quality, which can be used to identify soils that violate individual legislative limits for safe food production. However, there is a lack of advanced data mining tools that would be suitable for sensitive exploratory data analysis of big data while respecting the natural variability of soil composition. To distinguish anthropogenic contamination from natural variations, the analysis of the entire data distribution for smaller subareas is key. In this article, we propose a new data mining methodology for geochemical mapping data based on functional data analysis of probability densities in the framework of Bayes spaces after post-stratification of a big dataset to smaller districts. The tools we propose allow us to analyse the entire distribution, going well beyond a superficial detection of extreme concentration anomalies. We illustrate the proposed methodology on a dataset gathered according to the Czech national legislation (1990–2009), whose information content has not yet been fully exploited. Taking into account specific properties of probability density functions and recent results for orthogonal decomposition of multivariate densities enabled us to reveal real contamination patterns that were so far only suspected in Czech agricultural soils. We process the above Czech soil composition dataset for Cu, Pb, and Zn by first compartmentalizing it into spatial units, the so-called districts, and by subsequently clustering these districts according to diagnostic features of their uni- and multivariate distributions at high concentration levels. These clusters were seen to correspond to compartments that show known features of contamination, such as historical metallurgy of non-ferrous metals and iron and steel production. 
Comparison between compartments, notably neighbouring districts with similar natural factors controlling soil variability, is key to the reliable distinction of diffuse contamination. In this work, we used soil contamination by Cu-bearing pesticides as an example for empirical testing of the proposed data mining approach. In general, there are no natural and justifiable thresholds of risk element concentrations that would be valid for geographical areas with too much natural heterogeneity. Therefore, national (or larger) soil geochemistry datasets cannot be processed as a whole. As we demonstrate in this paper, empirical knowledge and careful tailoring of statistical tools for the characteristic types of soil contamination are essential for unequivocal identification of the anthropogenic component in real datasets.
en
dc.description.sponsorship
FWF - Österr. Wissenschaftsfonds
-
dc.language.iso
en
-
dc.publisher
ELSEVIER
-
dc.relation.ispartof
Journal of Geochemical Exploration
-
dc.subject
Bayes spaces
en
dc.subject
Compartmentalisation
en
dc.subject
Cu-bearing pesticides
en
dc.subject
FDA for geochemical maps
en
dc.subject
FDA of univariate and multivariate densities
en
dc.subject
Identification of Czech agricultural soil contamination
en
dc.title
Exploratory functional data analysis of multivariate densities for the identification of agricultural soil contamination by risk elements
en
dc.type
Article
en
dc.type
Artikel
de
dc.identifier.scopus
2-s2.0-85184990454
-
dc.identifier.url
https://api.elsevier.com/content/abstract/scopus_id/85184990454
-
dc.contributor.affiliation
Czech Academy of Sciences, Institute of Inorganic Chemistry, Czechia
-
dc.contributor.affiliation
Palacký University Olomouc, Czechia
-
dc.contributor.affiliation
Humboldt-Universität zu Berlin, Germany
-
dc.contributor.affiliation
McGill University, Canada
-
dc.contributor.affiliation
Palacký University Olomouc, Czechia
-
dc.relation.grantno
I 5799-N
-
dc.type.category
Original Research Article
-
tuw.container.volume
259
-
tuw.journal.peerreviewed
true
-
tuw.peerreviewed
true
-
wb.publication.intCoWork
International Co-publication
-
tuw.project.title
Generalisierte relative Daten und Robustheit in Bayes Räumen
-
tuw.researchTopic.id
C4
-
tuw.researchTopic.name
Mathematical and Algorithmic Foundations
-
tuw.researchTopic.value
100
-
dcterms.isPartOf.title
Journal of Geochemical Exploration
-
tuw.publication.orgunit
E105-06 - Forschungsbereich Computational Statistics
-
tuw.publisher.doi
10.1016/j.gexplo.2024.107416
-
dc.date.onlinefirst
2024
-
dc.identifier.articleid
107416
-
dc.identifier.eissn
1879-1689
-
tuw.author.orcid
0000-0003-0931-0390
-
tuw.author.orcid
0000-0003-0495-850X
-
tuw.author.orcid
0000-0001-9634-4796
-
tuw.author.orcid
0000-0002-6214-8786
-
tuw.author.orcid
0000-0002-1847-6598
-
wb.sci
true
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Wirtschaftswissenschaften
-
wb.sciencebranch
Mathematik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
5020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
10
-
wb.sciencebranch.value
20
-
wb.sciencebranch.value
70
-
item.openairetype
research article
-
item.cerifentitytype
Publications
-
item.grantfulltext
none
-
item.languageiso639-1
en
-
item.openairecristype
http://purl.org/coar/resource_type/c_2df8fbb1
-
item.fulltext
no Fulltext
-
crisitem.project.funder
FWF - Österr. Wissenschaftsfonds
-
crisitem.project.grantno
I 5799-N
-
crisitem.author.dept
Czech Academy of Sciences, Institute of Inorganic Chemistry
-
crisitem.author.dept
E105-06 - Forschungsbereich Computational Statistics
-
crisitem.author.dept
Palacký University Olomouc
-
crisitem.author.dept
Humboldt-Universität zu Berlin
-
crisitem.author.dept
McGill University
-
crisitem.author.dept
Palacký University Olomouc
-
crisitem.author.orcid
0000-0003-0931-0390
-
crisitem.author.orcid
0000-0003-0495-850X
-
crisitem.author.orcid
0000-0002-6214-8786
-
crisitem.author.orcid
0000-0002-1847-6598
-
crisitem.author.parentorg
E105 - Institut für Stochastik und Wirtschaftsmathematik
-