<div class="csl-bib-body">
<div class="csl-entry">Torfah, H., Shah, S., Chakraborty, S., Akshay, S., & Seshia, S. A. (2021). Synthesizing Pareto-Optimal Interpretations for Black-Box Models. In <i>Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021</i> (pp. 153–162). TU Wien Academic Press. https://doi.org/10.34727/2021/isbn.978-3-85448-046-4_24</div>
</div>
We present a new multi-objective optimization approach
for synthesizing interpretations that “explain” the behavior
of black-box machine learning models. Constructing
human-understandable interpretations for black-box models often
requires balancing conflicting objectives. A simple interpretation
may be easier for humans to understand but less precise
in its predictions than a complex one. Existing
methods for synthesizing interpretations use a single objective
function and are often optimized for a single class of interpretations.
In contrast, we provide a more general and multi-objective
synthesis framework that allows users to choose (1) the class of
syntactic templates from which an interpretation should be synthesized,
and (2) quantitative measures on both the correctness
and explainability of an interpretation. For a given black-box model,
our approach yields a set of Pareto-optimal interpretations with
respect to the correctness and explainability measures. We show
that the underlying multi-objective optimization problem can be
solved via a reduction to quantitative constraint solving, such as
weighted maximum satisfiability. To demonstrate the benefits of
our approach, we have applied it to synthesize interpretations
for black-box neural-network classifiers. Our experiments show
that there often exists a rich and varied set of choices for
interpretations that are missed by existing approaches.
en
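The notion of Pareto-optimality used in the abstract can be illustrated with a minimal sketch. It assumes each candidate interpretation has already been scored on two measures where higher is better; the names `Candidate`, `dominates`, and `pareto_front`, and all scores, are hypothetical toy values for illustration and are not taken from the paper. The sketch filters an explicit list by brute force; it is not the reduction to weighted maximum satisfiability that the paper describes.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A candidate interpretation with two scores (hypothetical, higher is better)."""
    name: str
    correctness: float
    explainability: float


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both measures and strictly better on one."""
    at_least_as_good = (a.correctness >= b.correctness
                        and a.explainability >= b.explainability)
    strictly_better = (a.correctness > b.correctness
                       or a.explainability > b.explainability)
    return at_least_as_good and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Toy scores, invented for illustration only.
candidates = [
    Candidate("depth-1 tree", 0.70, 0.90),
    Candidate("depth-2 tree", 0.85, 0.60),
    Candidate("depth-3 tree", 0.85, 0.40),  # dominated by the depth-2 tree
    Candidate("depth-4 tree", 0.95, 0.20),
]

front = pareto_front(candidates)
for c in front:
    print(c.name, c.correctness, c.explainability)
```

In this toy set, three candidates remain on the front. A single-objective method would return only one of them, which is the kind of missed choice the abstract refers to.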
dc.language.iso
en
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
formal methods
en
dc.subject
formale Methode
de
dc.title
Synthesizing Pareto-Optimal Interpretations for Black-Box Models
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.rights.license
Creative Commons Namensnennung 4.0 International
de
dc.rights.license
Creative Commons Attribution 4.0 International
en
dc.identifier.doi
10.34727/2021/isbn.978-3-85448-046-4_24
-
dc.contributor.affiliation
University of California, Berkeley, United States of America
-
dc.contributor.affiliation
Indian Institute of Technology Bombay, India
-
dc.contributor.affiliation
Indian Institute of Technology Bombay, India
-
dc.contributor.affiliation
Indian Institute of Technology Bombay, India
-
dc.contributor.affiliation
University of California, Berkeley, United States of America
-
dc.relation.isbn
978-3-85448-046-4
-
dc.relation.doi
10.34727/2021/isbn.978-3-85448-046-4
-
dc.description.volume
2
-
dc.description.startpage
153
-
dc.description.endpage
162
-
dc.type.category
Full-Paper Contribution
-
dc.relation.eissn
2708-7824
-
tuw.booktitle
Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021
-
tuw.peerreviewed
true
-
tuw.relation.haspart
10.34727/2021/isbn.978-3-85448-046-4
-
tuw.relation.publisher
TU Wien Academic Press
-
tuw.relation.publisherplace
Wien
-
tuw.book.chapter
24
-
tuw.publication.orgunit
E192-04 - Forschungsbereich Formal Methods in Systems Engineering
-
dc.identifier.libraryid
AC17204489
-
dc.description.numberOfPages
10
-
tuw.relation.ispartoftuwseries
Conference Series: Formal Methods in Computer-Aided Design