<div class="csl-bib-body">
<div class="csl-entry">Jung, A., Eldar, Y. C., & Görtz, N. (2016). On the Minimax Risk of Dictionary Learning. <i>IEEE Transactions on Information Theory</i>, <i>62</i>(3), 1501–1515. https://doi.org/10.1109/tit.2016.2517006</div>
</div>
-
dc.identifier.issn
0018-9448
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/148791
-
dc.description.abstract
We consider the problem of learning a dictionary matrix from a number of observed signals, which are assumed to be generated via a linear model with a common underlying dictionary. In particular, we derive lower bounds on the minimum achievable worst-case mean squared error (MSE), regardless of the computational complexity of the dictionary learning (DL) scheme employed. By casting DL as a classical (or frequentist) estimation problem, the lower bounds on the worst-case MSE are derived following an established information-theoretic approach to minimax estimation. The main contribution of this paper is the adaptation of these information-theoretic tools to the DL problem in order to derive lower bounds on the worst-case MSE of any DL algorithm. We derive three different lower bounds, each applying to a different generative model for the observed signals. The first bound requires only the existence of a covariance matrix of the (unknown) underlying coefficient vector. By specializing this bound to the case of sparse coefficient distributions and assuming that the true dictionary satisfies the restricted isometry property, we obtain a second lower bound on the worst-case MSE of DL methods in terms of the signal-to-noise ratio (SNR). The third bound applies to a more restrictive subclass of coefficient distributions, requiring the non-zero coefficients to be Gaussian. Although the applicability of this bound is the most limited, it is the tightest of the three in the low-SNR regime. A particular use of our lower bounds is the derivation of necessary conditions on the required number of observations (sample size) such that DL is feasible, i.e., such that accurate DL schemes can exist. By comparing these necessary conditions with sufficient conditions on the sample size under which a particular DL technique succeeds, we are able to characterize the regimes in which those algorithms are optimal in terms of the required sample size.
en
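Note: the setting described in the abstract admits a compact formalization, sketched below in standard minimax-estimation notation. The symbols (observations y_k, dictionary D, coefficient vectors x_k, noise n_k, dictionary class calligraphic-D, estimator D-hat) are assumed shorthand for this record, not notation quoted from the paper.

% Generative model implied by the abstract: N observed signals
% sharing a common underlying dictionary D (assumed notation).
\[
  \mathbf{y}_k = \mathbf{D}\,\mathbf{x}_k + \mathbf{n}_k ,
  \qquad k = 1, \dots, N .
\]
% Minimax risk: the worst-case MSE over a class \mathcal{D} of
% dictionaries, minimized over all estimators \widehat{\mathbf{D}}
% formed from y_1, ..., y_N; the paper derives lower bounds on
% this quantity.
\[
  \varepsilon^{*} \;=\; \inf_{\widehat{\mathbf{D}}}\,
  \sup_{\mathbf{D} \in \mathcal{D}}
  \operatorname{E}\!\bigl[\, \| \widehat{\mathbf{D}} - \mathbf{D} \|_{\mathrm{F}}^{2} \bigr] .
\]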
dc.language.iso
en
-
dc.publisher
Institute of Electrical and Electronics Engineers (IEEE)
-
dc.relation.ispartof
IEEE Transactions on Information Theory
-
dc.subject
Computer Science Applications
-
dc.subject
Information Systems
-
dc.subject
Library and Information Sciences
-
dc.subject
Signal Processing
-
dc.subject
Dictionary learning
-
dc.subject
Sparse Modelling
-
dc.title
On the Minimax Risk of Dictionary Learning
en
dc.type
Article
en
dc.description.startpage
1501
-
dc.description.endpage
1515
-
dc.type.category
Original Research Article
-
tuw.container.volume
62
-
tuw.container.issue
3
-
tuw.journal.peerreviewed
true
-
tuw.peerreviewed
true
-
wb.publication.intCoWork
International Co-publication
-
tuw.researchTopic.id
I7
-
tuw.researchTopic.name
Telecommunication
-
tuw.researchTopic.value
100
-
dcterms.isPartOf.title
IEEE Transactions on Information Theory
-
tuw.publication.orgunit
E389-03 - Research Unit Signal Processing
-
tuw.publisher.doi
10.1109/TIT.2016.2517006
-
dc.identifier.eissn
1557-9654
-
dc.description.numberOfPages
15
-
wb.sci
true
-
wb.sciencebranch
Electrical Engineering, Electronic Engineering, Information Engineering
-
wb.sciencebranch.oefos
2020
-
wb.facultyfocus
Telecommunications
en
wb.facultyfocus.faculty
E350
-
item.languageiso639-1
en
-
item.grantfulltext
none
-
item.cerifentitytype
Publications
-
item.openairetype
research article
-
item.openairecristype
http://purl.org/coar/resource_type/c_2df8fbb1
-
item.fulltext
no Fulltext
-
crisitem.author.dept
E389 - Institute of Telecommunications
-
crisitem.author.dept
E389-03 - Research Unit Signal Processing
-
crisitem.author.orcid
0000-0002-0715-2627
-
crisitem.author.parentorg
E350 - Faculty of Electrical Engineering and Information Technology