Synthesizing Pareto-Optimal Interpretations for Black-Box Models

Torfah, Hazem; Shah, Shetal; Chakraborty, Supratik; Akshay, S.; Seshia, Sanjit A.

doi:10.34727/2021/isbn.978-3-85448-046-4_24

DC Element

Value

Language

dc.contributor.author

Torfah, Hazem

dc.contributor.author

Shah, Shetal

dc.contributor.author

Chakraborty, Supratik

dc.contributor.author

Akshay, S.

dc.contributor.author

Seshia, Sanjit A.

dc.date.accessioned

2021-10-14T08:07:23Z

dc.date.available

2021-10-14T08:07:23Z

dc.date.issued

2021-09

dc.identifier.citation

Torfah, H., Shah, S., Chakraborty, S., Akshay, S., & Seshia, S. A. (2021). Synthesizing Pareto-Optimal Interpretations for Black-Box Models. In Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021 (pp. 153–162). TU Wien Academic Press. https://doi.org/10.34727/2021/isbn.978-3-85448-046-4_24

dc.identifier.uri

http://hdl.handle.net/20.500.12708/18643

dc.identifier.uri

https://doi.org/10.34727/2021/isbn.978-3-85448-046-4_24

dc.description.abstract

We present a new multi-objective optimization approach for synthesizing interpretations that “explain” the behavior of black-box machine learning models. Constructing human-understandable interpretations for black-box models often requires balancing conflicting objectives. A simple interpretation may be easier for humans to understand but less precise in its predictions than a complex interpretation. Existing methods for synthesizing interpretations use a single objective function and are often optimized for a single class of interpretations. In contrast, we provide a more general and multi-objective synthesis framework that allows users to choose (1) the class of syntactic templates from which an interpretation should be synthesized, and (2) quantitative measures on both the correctness and explainability of an interpretation. For a given black-box model, our approach yields a set of Pareto-optimal interpretations with respect to the correctness and explainability measures. We show that the underlying multi-objective optimization problem can be solved via a reduction to quantitative constraint solving, such as weighted maximum satisfiability. To demonstrate the benefits of our approach, we have applied it to synthesize interpretations for black-box neural-network classifiers. Our experiments show that there often exists a rich and varied set of choices for interpretations that are missed by existing approaches.

dc.language.iso

dc.rights.uri

http://creativecommons.org/licenses/by/4.0/

dc.subject

formal methods

dc.title

Synthesizing Pareto-Optimal Interpretations for Black-Box Models

dc.type

Inproceedings

dc.type

Conference contribution

dc.rights.license

Creative Commons Attribution 4.0 International

dc.identifier.doi

10.34727/2021/isbn.978-3-85448-046-4_24

dc.contributor.affiliation

University of California, Berkeley, United States of America

dc.contributor.affiliation

Indian Institute of Technology Bombay, India

dc.contributor.affiliation

Indian Institute of Technology Bombay, India

dc.contributor.affiliation

Indian Institute of Technology Bombay, India

dc.contributor.affiliation

University of California, Berkeley, United States of America

dc.relation.isbn

978-3-85448-046-4

dc.relation.doi

10.34727/2021/isbn.978-3-85448-046-4

dc.description.volume

dc.description.startpage

153

dc.description.endpage

162

dc.type.category

Full-Paper Contribution

dc.relation.eissn

2708-7824

tuw.booktitle

Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021

tuw.peerreviewed

true

tuw.relation.haspart

10.34727/2021/isbn.978-3-85448-046-4

tuw.relation.publisher

TU Wien Academic Press

tuw.relation.publisherplace

Wien

tuw.book.chapter

tuw.publication.orgunit

E192-04 - Forschungsbereich Formal Methods in Systems Engineering

dc.identifier.libraryid

AC17204489

dc.description.numberOfPages

tuw.relation.ispartoftuwseries

Conference Series: Formal Methods in Computer-Aided Design

tuw.author.orcid

0000-0002-9628-1200

tuw.author.orcid

0000-0002-7897-4900

tuw.author.orcid

0000-0002-7527-7675

tuw.author.orcid

0000-0002-2471-5997

tuw.author.orcid

0000-0001-6190-8707

dc.rights.identifier

CC BY 4.0

item.mimetype

application/pdf

item.openairetype

conference paper

item.cerifentitytype

Publications

item.grantfulltext

open

item.languageiso639-1

item.openairecristype

http://purl.org/coar/resource_type/c_5794

item.openaccessfulltext

Open Access

item.fulltext

with Fulltext

crisitem.author.dept

University of California, Berkeley

crisitem.author.dept

Indian Institute of Technology Bombay

crisitem.author.dept

Indian Institute of Technology Bombay

crisitem.author.dept

Indian Institute of Technology Bombay

crisitem.author.dept

University of California, Berkeley
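
The abstract in this record describes returning a set of Pareto-optimal interpretations with respect to correctness and explainability measures. As a rough illustration of what Pareto optimality means in that setting, the following Python sketch filters a small list of candidate interpretations by pairwise dominance. It is not the paper's method (which the abstract says reduces the problem to quantitative constraint solving such as weighted maximum satisfiability); the `Candidate` type, the candidate names, and all scores are hypothetical values invented for the example, and both measures are assumed to be "higher is better".

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical candidate interpretation with two scores (higher is better)."""
    name: str
    correctness: float      # assumed: agreement with the black-box model
    explainability: float   # assumed: e.g. a score that rewards smaller interpretations


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is at least as good as b on both measures and strictly better on one."""
    at_least_as_good = (a.correctness >= b.correctness
                        and a.explainability >= b.explainability)
    strictly_better = (a.correctness > b.correctness
                       or a.explainability > b.explainability)
    return at_least_as_good and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (brute force, O(n^2))."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Invented scores, for illustration only.
candidates = [
    Candidate("depth-1 tree", 0.70, 0.90),
    Candidate("depth-2 tree", 0.85, 0.60),
    Candidate("depth-3 tree", 0.92, 0.30),
    Candidate("bloated depth-3 tree", 0.80, 0.30),  # dominated by "depth-2 tree"
]

front = pareto_front(candidates)
for c in front:
    print(c.name, c.correctness, c.explainability)
```

In this toy data, three candidates trade correctness against explainability and none of them beats another on both measures, so all three remain on the front; the fourth is worse than "depth-2 tree" on both measures and is dropped. A single-objective method would return only one of the three.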

Appears in collections:

Open Access Series
Conference Paper

Full text (Version of Record (published version))

Adobe PDF

(321.9 kB)

CC BY 4.0
