%0 Journal Article %D 2020 %T Auditing and Debugging Deep Learning Models via Decision Boundaries: Individual-level and Group-level Analysis %A Roozbeh Yousefzadeh %A Dianne P. O'Leary %X

Deep learning models have been criticized for their lack of easy interpretation, which undermines confidence in their use for important applications. Nevertheless, they are consistently utilized in many applications, consequential to humans' lives, mostly because of their better performance. Therefore, there is a great need for computational methods that can explain, audit, and debug such models. Here, we use flip points to accomplish these goals for deep learning models with continuous output scores (e.g., computed by softmax), used in social applications. A flip point is any point that lies on the boundary between two output classes: e.g. for a model with a binary yes/no output, a flip point is any input that generates equal scores for "yes" and "no". The flip point closest to a given input is of particular importance because it reveals the least changes in the input that would change a model's classification, and we show that it is the solution to a well-posed optimization problem. Flip points also enable us to systematically study the decision boundaries of a deep learning classifier. The resulting insight into the decision boundaries of a deep model can clearly explain the model's output on the individual-level, via an explanation report that is understandable by non-experts. We also develop a procedure to understand and audit model behavior towards groups of people. Flip points can also be used to alter the decision boundaries in order to improve undesirable behaviors. We demonstrate our methods by investigating several models trained on standard datasets used in social applications of machine learning. We also identify the features that are most responsible for particular classifications and misclassifications.

%8 1/2/2020 %G eng %U https://arxiv.org/abs/2001.00682

%0 Journal Article %D 2019 %T Interpreting Neural Networks Using Flip Points %A Roozbeh Yousefzadeh %A Dianne P. O'Leary %X

Neural networks have been criticized for their lack of easy interpretation, which undermines confidence in their use for important applications. Here, we introduce a novel technique, interpreting a trained neural network by investigating its flip points. A flip point is any point that lies on the boundary between two output classes: e.g. for a neural network with a binary yes/no output, a flip point is any input that generates equal scores for "yes" and "no". The flip point closest to a given input is of particular importance, and this point is the solution to a well-posed optimization problem. This paper gives an overview of the uses of flip points and how they are computed. Through results on standard datasets, we demonstrate how flip points can be used to provide detailed interpretation of the output produced by a neural network. Moreover, for a given input, flip points enable us to measure confidence in the correctness of outputs much more effectively than softmax score. They also identify influential features of the inputs, identify bias, and find changes in the input that change the output of the model. We show that distance between an input and the closest flip point identifies the most influential points in the training data. Using principal component analysis (PCA) and rank-revealing QR factorization (RR-QR), the set of directions from each training input to its closest flip point provides explanations of how a trained neural network processes an entire dataset: what features are most important for classification into a given class, which features are most responsible for particular misclassifications, how an adversary might fool the network, etc. Although we investigate flip points for neural networks, their usefulness is actually model-agnostic.

%8 03/20/2019 %G eng %U https://arxiv.org/abs/1903.08789

%0 Journal Article %D 2019 %T A Probabilistic Framework and a Homotopy Method for Real-time Hierarchical Freight Dispatch Decisions %A Roozbeh Yousefzadeh %A Dianne P. O'Leary %X

We propose a real-time decision framework for multimodal freight dispatch through a system of hierarchical hubs, using a probabilistic model for transit times. Instead of assigning a fixed time to each transit, we advocate using historical records to identify characteristics of the probability density function for each transit time. We formulate a nonlinear optimization problem that defines dispatch decisions that minimize expected cost, using this probabilistic information. Finally, we propose an effective homotopy algorithm that (empirically) outperforms standard optimization algorithms on this problem by taking advantage of its structure, and we demonstrate its effectiveness on numerical examples.

%8 2019/12/8 %G eng %U https://arxiv.org/abs/1912.03733

%0 Journal Article %J Journal of Computational Physics %D 2014 %T Adaptive change of basis in entropy-based moment closures for linear kinetic equations %A Graham W. Alldredge %A Cory D. Hauck %A Dianne P. O'Leary %A André L. Tits %X

Entropy-based (M_N) moment closures for kinetic equations are defined by a constrained optimization problem that must be solved at every point in a space-time mesh, making it important to solve these optimization problems accurately and efficiently. We present a complete and practical numerical algorithm for solving the dual problem in one-dimensional, slab geometries. The closure is only well-defined on the set of moments that are realizable from a positive underlying distribution, and as the boundary of the realizable set is approached, the dual problem becomes increasingly difficult to solve due to ill-conditioning of the Hessian matrix. To improve the condition number of the Hessian, we advocate the use of a change of polynomial basis, defined using a Cholesky factorization of the Hessian, that permits solution of problems nearer to the boundary of the realizable set. We also advocate a fixed quadrature scheme, rather than adaptive quadrature, since the latter introduces unnecessary expense and changes the computationally realizable set as the quadrature changes. For very ill-conditioned problems, we use regularization to make the optimization algorithm robust. We design a manufactured solution and demonstrate that the adaptive-basis optimization algorithm reduces the need for regularization. This is important since we also show that regularization slows, and even stalls, convergence of the numerical simulation when refining the space-time mesh. We also simulate two well-known benchmark problems. There we find that our adaptive-basis, fixed-quadrature algorithm uses less regularization than alternatives, although differences in the resulting numerical simulations are more sensitive to the regularization strategy than to the choice of basis.

%B Journal of Computational Physics %V 258 %P 489-508 %8 2014/02/01 %G eng %U http://arxiv.org/abs/1306.2881v1 %! Journal of Computational Physics %R 10.1016/j.jcp.2013.10.049

%0 Journal Article %J MultiLing (Workshop on Multilingual Multi-document Summarization) %D 2013 %T Multilingual Summarization: Dimensionality Reduction and a Step Towards Optimal Term Coverage %A John M. Conroy %A Sashka T. Davis %A Jeff Kubina %A Yi-Kai Liu %A Dianne P. O'Leary %A Judith D. Schlesinger %X

In this paper we present three term weighting approaches for multi-lingual document summarization and give results on the DUC 2002 data as well as on the 2013 Multilingual Wikipedia feature articles data set. We introduce a new interval bounded nonnegative matrix factorization. We use this new method, latent semantic analysis (LSA), and latent Dirichlet allocation (LDA) to give three term-weighting methods for multi-document multi-lingual summarization. Results on DUC and TAC data, as well as on the MultiLing 2013 data, demonstrate that these methods are very promising, since they achieve oracle coverage scores in the range of humans for 6 of the 10 test languages. Finally, we present three term weighting approaches for the MultiLing13 single document summarization task on the Wikipedia featured articles. Our submissions significantly outperformed the baseline in 19 out of 41 languages.

%B MultiLing (Workshop on Multilingual Multi-document Summarization) %P 55-63 %8 2013/08/09 %G eng %U http://aclweb.org/anthology/W/W13/W13-3108.pdf

%0 Journal Article %J Quantum Information and Computation %D 2009 %T Locality Bounds on Hamiltonians for Stabilizer Codes %A Stephen S. Bullock %A Dianne P. O'Leary %X

In this paper, we study the complexity of Hamiltonians whose groundstate is a stabilizer code. We introduce various notions of k-locality of a stabilizer code, inherited from the associated stabilizer group. A choice of generators leads to a Hamiltonian with the code in its groundspace. We establish bounds on the locality of any other Hamiltonian whose groundspace contains such a code, whether or not its Pauli tensor summands commute. Our results provide insight into the cost of creating an energy gap for passive error correction and for adiabatic quantum computing. The results simplify in the cases of XZ-split codes such as Calderbank-Shor-Steane stabilizer codes and topologically-ordered stabilizer codes arising from surface cellulations.

%B Quantum Information and Computation %V 9 %8 2009/09/22 %G eng %U http://www.cs.umd.edu/~oleary/reprints/j91.pdf

%0 Journal Article %J Physical Review A %D 2009 %T Quadratic fermionic interactions yield effective Hamiltonians for adiabatic quantum computing %A Michael J. O'Hara %A Dianne P. O'Leary %X

Polynomially-large ground-state energy gaps are rare in many-body quantum systems, but useful for adiabatic quantum computing. We show analytically that the gap is generically polynomially-large for quadratic fermionic Hamiltonians. We then prove that adiabatic quantum computing can realize the ground states of Hamiltonians with certain random interactions, as well as the ground states of one, two, and three-dimensional fermionic interaction lattices, in polynomial time. Finally, we use the Jordan-Wigner transformation and a related transformation for spin-3/2 particles to show that our results can be restated using spin operators in a surprisingly simple manner. A direct consequence is that the one-dimensional cluster state can be found in polynomial time using adiabatic quantum computing.

%B Physical Review A %V 79 %8 2009/3/24 %G eng %U http://arxiv.org/abs/0808.1768v1 %N 3 %! Phys. Rev. A %R 10.1103/PhysRevA.79.032331

%0 Journal Article %J Physical Review A %D 2008 %T The adiabatic theorem in the presence of noise %A Michael J. O'Hara %A Dianne P. O'Leary %X

We provide rigorous bounds for the error of the adiabatic approximation of quantum mechanics under four sources of experimental error: perturbations in the initial condition, systematic time-dependent perturbations in the Hamiltonian, coupling to low-energy quantum systems, and decoherent time-dependent perturbations in the Hamiltonian. For decoherent perturbations, we find both upper and lower bounds on the evolution time to guarantee the adiabatic approximation performs within a prescribed tolerance. Our new results include explicit definitions of constants, and we apply them to the spin-1/2 particle in a rotating magnetic field, and to the superconducting flux qubit. We compare the theoretical bounds on the superconducting flux qubit to simulation results.

%B Physical Review A %V 77 %8 2008/4/22 %G eng %U http://arxiv.org/abs/0801.3872v1 %N 4 %! Phys. Rev. A %R 10.1103/PhysRevA.77.042319

%0 Journal Article %J Physical Review A %D 2006 %T Parallelism for Quantum Computation with Qudits %A Dianne P. O'Leary %A Gavin K. Brennen %A Stephen S. Bullock %X

Robust quantum computation with d-level quantum systems (qudits) poses two requirements: fast, parallel quantum gates and high fidelity two-qudit gates. We first describe how to implement parallel single qudit operations. It is by now well known that any single-qudit unitary can be decomposed into a sequence of Givens rotations on two-dimensional subspaces of the qudit state space. Using a coupling graph to represent physically allowed couplings between pairs of qudit states, we then show that the logical depth of the parallel gate sequence is equal to the height of an associated tree. The implementation of a given unitary can then optimize the tradeoff between gate time and resources used. These ideas are illustrated for qudits encoded in the ground hyperfine states of the atomic alkalies $^{87}$Rb and $^{133}$Cs. Second, we provide a protocol for implementing parallelized non-local two-qudit gates using the assistance of entangled qubit pairs. Because the entangled qubits can be prepared non-deterministically, this offers the possibility of high fidelity two-qudit gates.

%B Physical Review A %V 74 %8 2006/9/28 %G eng %U http://arxiv.org/abs/quant-ph/0603081v1 %N 3 %! Phys. Rev. A %R 10.1103/PhysRevA.74.032334

%0 Journal Article %J Physical Review Letters %D 2005 %T Asymptotically Optimal Quantum Circuits for d-level Systems %A Stephen S. Bullock %A Dianne P. O'Leary %A Gavin K. Brennen %X

As a qubit is a two-level quantum system whose state space is spanned by |0>, |1>, so a qudit is a d-level quantum system whose state space is spanned by |0>,...,|d-1>. Quantum computation has stimulated much recent interest in algorithms factoring unitary evolutions of an n-qubit state space into component two-particle unitary evolutions. In the absence of symmetry, Shende, Markov and Bullock use Sard's theorem to prove that at least C 4^n two-qubit unitary evolutions are required, while Vartiainen, Moettoenen, and Salomaa (VMS) use the QR matrix factorization and Gray codes in an optimal order construction involving two-particle evolutions. In this work, we note that Sard's theorem demands C d^{2n} two-qudit unitary evolutions to construct a generic (symmetry-less) n-qudit evolution. However, the VMS result applied to virtual-qubits only recovers optimal order in the case that d is a power of two. We further construct a QR decomposition for d-multi-level quantum logics, proving a sharp asymptotic of Theta(d^{2n}) two-qudit gates and thus closing the complexity question for all d-level systems (d finite.) Gray codes are not required, and the optimal Theta(d^{2n}) asymptotic also applies to gate libraries where two-qudit interactions are restricted by a choice of certain architectures.

%B Physical Review Letters %V 94 %8 2005/6/14 %G eng %U http://arxiv.org/abs/quant-ph/0410116v2 %N 23 %! Phys. Rev. Lett. %R 10.1103/PhysRevLett.94.230502

%0 Journal Article %J Physical Review A %D 2005 %T Criteria for Exact Qudit Universality %A Gavin K. Brennen %A Dianne P. O'Leary %A Stephen S. Bullock %X

We describe criteria for implementation of quantum computation in qudits. A qudit is a d-dimensional system whose Hilbert space is spanned by states |0>, |1>,... |d-1>. An important earlier work of Mathukrishnan and Stroud [1] describes how to exactly simulate an arbitrary unitary on multiple qudits using a 2d-1 parameter family of single qudit and two qudit gates. Their technique is based on the spectral decomposition of unitaries. Here we generalize this argument to show that exact universality follows given a discrete set of single qudit Hamiltonians and one two-qudit Hamiltonian. The technique is related to the QR-matrix decomposition of numerical linear algebra. We consider a generic physical system in which the single qudit Hamiltonians are a small collection of H_{jk}^x=\hbar\Omega (|k>

k iff H_{jk}^{x,y} are allowed Hamiltonians. One qudit exact universality follows iff this graph is connected, and complete universality results if the two-qudit Hamiltonian H=-\hbar\Omega |d-1,d-1>