# jemdoc: showsource
= Brendan O'Donoghue

~~~
{}{img_left}{profile.jpeg}{Brendan O'Donoghue}{350}{350}
Brendan O'Donoghue, Ph.D. \n
Research Scientist at [http://deepmind.com/ DeepMind] \n
Former advisor: [http://www.stanford.edu/~boyd/ Professor Stephen Boyd] \n
[https://scholar.google.com/citations?user=0Pzjj-cAAAAJ Google scholar profile] \n
\n
Contact: [bodonoghue85@gmail.com email], [https://twitter.com/bodonoghue85 twitter], [https://github.com/bodono github]
~~~

== Education
- Ph.D., M.S., Computer Science, \n
[http://www.stanford.edu Stanford University], January 2013 \n
- B.A., M.A., M.Eng., Information and Computer Engineering, \n
[http://www.cai.cam.ac.uk/ Gonville and Caius College], \n
[http://www.cam.ac.uk/ Cambridge University], June 2007

== Interests and current research
- Artificial intelligence
- Machine learning
- Reinforcement learning
- Dynamic systems and control
- Convex optimization

== Software
- [https://www.cvxgrp.org/scs/ SCS]: Large-scale convex quadratic cone solver.

== Journal Articles
:{[https://arxiv.org/abs/2212.14530 POMRL: No-regret learning-to-plan with increasing horizons]}
K. Khetarpal, C. Vernade, B. O'Donoghue, S. Singh, and T. Zahavy \n
/Transactions on Machine Learning Research (TMLR)/, 2023.
:{[publications/quad_scs.pdf Operator splitting for a homogeneous embedding of the monotone linear complementarity problem]}
B. O'Donoghue \n
/SIAM Journal on Optimization/, 31(3), pp. 1999-2023, August 2021.
:{[http://www.stanford.edu/~boyd/papers/nonexp_global_aa1.html Globally convergent type-I Anderson acceleration for non-smooth fixed-point iterations]}
J. Zhang, B. O'Donoghue, and S. Boyd \n
/SIAM Journal on Optimization/, 30(4), pp. 3170-3197, November 2020.
:{[https://rdcu.be/4sNU Clinically applicable deep learning for diagnosis and referral in retinal disease]}
J. De Fauw, /et al./ \n
/Nature Medicine/, 24(9), pp. 1342-1350, August 2018.
:{[http://www.stanford.edu/~boyd/papers/scs.html Conic optimization via operator splitting and homogeneous self-dual embedding]}
B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd \n
/Journal of Optimization Theory and Applications/, 169(3), pp. 1042-1068, June 2016.
:{[publications/wireless.pdf Large-scale convex optimization for dense wireless cooperative networks]}
Y. Shi, J. Zhang, B. O'Donoghue, and K. Letaief \n
/IEEE Transactions on Signal Processing/, 63(18), pp. 4729-4743, September 2015. \n
*IEEE 2016 SPS Young Author Best Paper Award*.
:{[http://www.stanford.edu/~boyd/papers/adp_iter_bellman.html Approximate dynamic programming via iterated Bellman inequalities]}
Y. Wang, B. O'Donoghue, and S. Boyd \n
/International Journal of Robust and Nonlinear Control/, 25(10), pp. 1472-1496, July 2015.
:{[publications/adap_restart.pdf Adaptive restart for accelerated gradient schemes]}
B. O'Donoghue and E. J. Candès \n
/Foundations of Computational Mathematics/, 15(3), pp. 715-732, June 2015.
:{[https://scholarship.rice.edu/bitstream/handle/1911/94752/Optimization-Methods.pdf?sequence=4 Fast alternating direction optimization methods]}
T. Goldstein, B. O'Donoghue, S. Setzer, and R. Baraniuk \n
/SIAM Journal on Imaging Sciences/, 7(3), pp. 1588-1623, August 2014.
:{[publications/oplc.pdf A spread-return mean-reverting model for credit spread dynamics]}
B. O'Donoghue, M. Peacock, J. Lee, and L. Capriotti \n
/International Journal of Theoretical and Applied Finance/, 17(3), pp. 1-14, May 2014.
:{[http://www.stanford.edu/~boyd/papers/port_opt_bound.html Performance bounds and suboptimal policies for multi-period investment]}
S. Boyd, M. Mueller, B. O'Donoghue, and Y. Wang \n
/Foundations and Trends in Optimization/, 1(1), pp. 1-69, January 2014.
:{[http://www.stanford.edu/~boyd/papers/oper_splt_ctrl.html A splitting method for optimal control]}
B. O'Donoghue, G. Stathopoulos, and S. Boyd \n
/IEEE Transactions on Control Systems Technology/, 21(6), pp. 2432-2442, November 2013.
== Conference Articles
:{[https://arxiv.org/abs/2311.13294 Probabilistic inference in reinforcement learning done right]}
J. Tarbouriech, T. Lattimore, and B. O'Donoghue \n
/Advances in Neural Information Processing Systems (NeurIPS)/, 2023.
:{[https://arxiv.org/abs/2301.03236 Optimistic meta-gradients]}
S. Flennerhag, T. Zahavy, B. O'Donoghue, H. van Hasselt, A. György, and S. Singh \n
/Advances in Neural Information Processing Systems (NeurIPS)/, 2023.
:{[http://arxiv.org/abs/2302.09339 Efficient exploration via epistemic-risk-seeking policy optimization]}
B. O'Donoghue \n
/Proceedings of the International Conference on Machine Learning (ICML)/, 2023.
:{[https://arxiv.org/abs/2302.01275 ReLOAD: Reinforcement learning with optimistic ascent-descent for last-iterate convergence in constrained MDPs]}
T. Moskovitz, B. O'Donoghue, V. Veeriah, S. Flennerhag, S. Singh, and T. Zahavy \n
/Proceedings of the International Conference on Machine Learning (ICML)/, 2023.
:{[https://arxiv.org/abs/2110.04629 The neural testbed: Evaluating joint predictions]}
I. Osband, Z. Wen, S. Asghari, V. Dwaracherla, B. Hao, M. Ibrahimi, D. Lawson, X. Lu, B. O'Donoghue, and B. Van Roy \n
/Advances in Neural Information Processing Systems (NeurIPS)/, 2022.
:{[http://arxiv.org/abs/2110.15688 Variational Bayesian optimistic sampling]}
B. O'Donoghue and T. Lattimore \n
*Spotlight* /Advances in Neural Information Processing Systems (NeurIPS)/, 2021.
:{[https://arxiv.org/pdf/2106.00661.pdf Reward is enough for convex MDPs]}
T. Zahavy, B. O'Donoghue, G. Desjardins, and S. Singh \n
*Spotlight* /Advances in Neural Information Processing Systems (NeurIPS)/, 2021.
:{[https://arxiv.org/abs/1807.09647 Variational Bayesian reinforcement learning with regret bounds]}
B. O'Donoghue \n
/Advances in Neural Information Processing Systems (NeurIPS)/, 2021.
:{[http://www.optimization-online.org/DB_HTML/2021/06/8439.html Practical large-scale linear programming using primal-dual hybrid gradient]}
D. Applegate, M. Díaz, O. Hinder, H. Lu, M. Lubin, B. O'Donoghue, and W. Schudy \n
/Advances in Neural Information Processing Systems (NeurIPS)/, 2021.
:{[https://arxiv.org/abs/2006.05145 Matrix games with bandit feedback]}
B. O'Donoghue, T. Lattimore, and I. Osband \n
/Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI)/, 2021.
:{[https://openreview.net/forum?id=PUkhWz65dy5 Discovering a set of policies for the worst case reward]}
T. Zahavy, A. Barreto, D. Mankowitz, S. Hou, B. O'Donoghue, I. Kemaev, and S. Singh \n
*Spotlight* /Proceedings of the International Conference on Learning Representations (ICLR)/, 2021.
:{[https://web.stanford.edu/~boyd/papers/conv_reinforce.html Sample efficient reinforcement learning with REINFORCE]}
J. Zhang, J. Kim, B. O'Donoghue, and S. Boyd \n
/Proceedings of the AAAI Conference on Artificial Intelligence/, 35(12), pp. 10887-10895, 2021.
:{[https://arxiv.org/abs/2001.00805 Making sense of reinforcement learning and probabilistic inference]}
B. O'Donoghue, I. Osband, and C. Ionescu \n
*Spotlight* /Proceedings of the International Conference on Learning Representations (ICLR)/, 2020.
:{[https://arxiv.org/abs/1906.02608 Hamiltonian descent for composite objectives]}
B. O'Donoghue and C. J. Maddison \n
/Advances in Neural Information Processing Systems (NeurIPS)/, 2019.
:{[https://debug-ml-iclr2019.github.io/cameraready/DebugML-19_paper_6.pdf Visualizations of decision regions in the presence of adversarial examples]}
G. Swirszcz, B. O'Donoghue, and P. Kohli \n
/Debugging Machine Learning Models Workshop/, ICLR, 2019.
:{[https://openreview.net/pdf?id=HyeFAsRctQ Verification of non-linear specifications for neural networks]}
C. Qin, K. (Dj) Dvijotham, B. O'Donoghue, R. Bunel, R. Stanforth, S. Gowal, J. Uesato, G. Swirszcz, and P. Kohli \n
/Proceedings of the International Conference on Learning Representations (ICLR)/, 2019.
:{[https://arxiv.org/abs/1802.05666 Adversarial risk and the dangers of evaluating against weak attacks]}
J. Uesato, B. O'Donoghue, A. van den Oord, and P. Kohli \n
/Proceedings of the International Conference on Machine Learning (ICML)/, pp. 5025-5034, 2018.
:{[https://arxiv.org/abs/1709.05380 The uncertainty Bellman equation and exploration]}
B. O'Donoghue, I. Osband, R. Munos, and V. Mnih \n
*Oral* /Proceedings of the International Conference on Machine Learning (ICML)/, pp. 3836-3845, 2018.
:{[https://arxiv.org/abs/1611.01626 Combining policy gradient and Q-learning]}
B. O'Donoghue, R. Munos, K. Kavukcuoglu, and V. Mnih \n
/Proceedings of the International Conference on Learning Representations (ICLR)/, 2017.
:{[http://www.stanford.edu/~boyd/papers/it_avf.html Iterated approximate value functions]}
B. O'Donoghue, Y. Wang, and S. Boyd \n
/Proceedings European Control Conference/, pp. 3882-3888, Zurich, July 2013.
:{[http://www.stanford.edu/~boyd/papers/min_max_adp.html Min-max approximate dynamic programming]}
B. O'Donoghue, Y. Wang, and S. Boyd \n
/Proceedings IEEE Multi-Conference on Systems and Control/, pp. 424-431, September 2011.

== Other
:{[https://arxiv.org/abs/2210.12160 On the connection between Bregman divergence and value in regularized Markov decision processes]}
B. O'Donoghue \n
Technical note, 2022.
:{[https://arxiv.org/abs/2012.13349 Solving mixed integer programs using neural networks]}
V. Nair\*, S. Bartunov\*, F. Gimeno\*, I. von Glehn\*, P. Lichocki\*, I. Lobov\*, B. O'Donoghue\*, N. Sonnerat\*, C. Tjandraatmadja\*, P. Wang\*, /et al./ \n
(\* Equal contribution). In submission, 2021.
:{[https://arxiv.org/pdf/2106.00669.pdf Discovering diverse nearly optimal policies with successor features]}
T. Zahavy, B. O'Donoghue, A. Barreto, V. Mnih, S. Flennerhag, and S. Singh \n
Working draft, 2021.
:{[https://arxiv.org/abs/1811.09300 Strength in numbers: Trading-off robustness and computation via adversarially-trained ensembles]}
E. Grefenstette, R. Stanforth, B. O'Donoghue, J. Uesato, G. Swirszcz, and P. Kohli \n
Working draft, 2018.
:{[https://arxiv.org/abs/1809.05042 Hamiltonian descent methods]}
C. J. Maddison, D. Paulin, Y. W. Teh, B. O'Donoghue, and A. Doucet \n
Working draft, 2018.
:{[https://arxiv.org/abs/1805.10265 Training verified learners with learned verifiers]}
K. (Dj) Dvijotham, S. Gowal, R. Stanforth, R. Arandjelovic, B. O'Donoghue, J. Uesato, and P. Kohli \n
Working draft, 2018.
:{[http://www.stanford.edu/~boyd/papers/pdos.html A primal-dual operator splitting method for conic optimization]}
E. Chu, B. O'Donoghue, N. Parikh, and S. Boyd \n
/Stanford internal report/, 2013.

== Ph.D. Thesis
:{[thesis/bod_thesis.pdf Suboptimal control policies via convex optimization]}
B. O'Donoghue