Alber, Maximilian, Sebastian Lapuschkin, Philipp Seegerer, Miriam Hägele, Kristof T. Schütt, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller, Sven Dähne, and Pieter-Jan Kindermans. 2019. “iNNvestigate Neural Networks!” Journal of Machine Learning Research 20 (93): 1–8.

Allaire, JJ, and François Chollet. 2019. keras: R Interface to Keras.

Allison, P. 2014. “Measures of fit for logistic regression.” In Proceedings of the SAS Global Forum 2014 Conference. Cary, NC: SAS Institute Inc.

Alvarez-Melis, David, and Tommi S. Jaakkola. 2018. “On the Robustness of Interpretability Methods.” ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), June.

Apley, Dan. 2018. ALEPlot: Accumulated Local Effects (ALE) Plots and Partial Dependence (PD) Plots.

Apley, Daniel W., and Jingyu Zhu. 2020. “Visualizing the effects of predictor variables in black box supervised learning models.” Journal of the Royal Statistical Society Series B 82 (4): 1059–86.

Azure. 2019. Microsoft Cognitive Services.

Bach, Sebastian, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. “On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation.” Edited by Oscar Deniz Suarez. PLoS ONE 10 (7): e0130140.

Berrar, D. 2019. “Performance measures for binary classification.” In Encyclopedia of Bioinformatics and Computational Biology Volume 1, 546–60. Elsevier.

Biecek, Przemyslaw. 2018. “DALEX: Explainers for complex predictive models in R.” Journal of Machine Learning Research 19 (84): 1–5.

———. 2019. “Model Development Process.” CoRR abs/1907.04461.

Biecek, Przemyslaw, Hubert Baniecki, Adam Izdebski, and Katarzyna Pekala. 2019. ingredients: Effects and Importances of Model Ingredients.

Biecek, Przemyslaw, and Marcin Kosinski. 2017. “archivist: An R Package for Managing, Recording and Restoring Data Analysis Results.” Journal of Statistical Software 82 (11): 1–28.

Binder, Alexander, Grégoire Montavon, Sebastian Bach, Klaus-Robert Müller, and Wojciech Samek. 2016. “Layer-Wise Relevance Propagation for Neural Networks with Local Renormalization Layers.” In Artificial Neural Networks and Machine Learning - 25th International Conference on Artificial Neural Networks, ICANN 2016, Proceedings, LNCS 9887:63–71. Lecture Notes in Computer Science. Springer Verlag.

Bischl, Bernd, Michel Lang, Lars Kotthoff, Julia Schiffner, Jakob Richter, Erich Studerus, Giuseppe Casalicchio, and Zachary M. Jones. 2016. “mlr: Machine Learning in R.” Journal of Machine Learning Research 17 (170): 1–5.

Boehm, Barry. 1988. “A Spiral Model of Software Development and Enhancement.” IEEE Computer 21 (5): 61–72.

Breiman, Leo. 2001a. “Random Forests.” Machine Learning 45: 5–32.

———. 2001b. “Statistical modeling: The two cultures.” Statistical Science 16 (3): 199–231.

Breiman, Leo, Adele Cutler, Andy Liaw, and Matthew Wiener. 2018. randomForest: Breiman and Cutler’s Random Forests for Classification and Regression.

Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classification and Regression Trees. Monterey, CA: Wadsworth & Brooks.

Brentnall, A. R., and J. Cuzick. 2018. “Use of the concordance index for predictors of censored survival data.” Statistical Methods in Medical Research 27: 2359–73.

Buuren, S. van. 2012. Flexible Imputation of Missing Data. Boca Raton, FL: Chapman & Hall/CRC.

Casey, Bryan, Ashkon Farhangi, and Roland Vogl. 2019. “Rethinking explainable machines: The GDPR’s Right to Explanation debate and the rise of algorithmic audits in enterprise.” Berkeley Technology Law Journal 34: 143–88.

Chapman, Pete, Julian Clinton, Randy Kerber, Thomas Khabaza, Thomas Reinartz, Colin Shearer, and Rüdiger Wirth. 1999. The CRISP-DM 1.0 Step-by-step data mining guide. SPSS Inc.

Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–94. KDD ’16. ACM.

Cortes, Corinna, and Vladimir Vapnik. 1995. “Support-Vector Networks.” Machine Learning 20 (3): 273–97.

Dastin, Jeffrey. 2018. “Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women.” Reuters.

Deng, J., W. Dong, R. Socher, L. Li, Kai Li, and Li Fei-Fei. 2009. “ImageNet: A large-scale hierarchical image database.” In 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–55. Los Alamitos, CA, USA: IEEE Computer Society.

Diaz, Mark, Isaac Johnson, Amanda Lazar, Anne Marie Piper, and Darren Gergle. 2018. “Addressing Age-Related Bias in Sentiment Analysis.” In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 412:1–412:14. CHI ’18. Montreal, QC, Canada: ACM.

Dobson, A. J. 2002. Introduction to Generalized Linear Models (2nd Ed.). Boca Raton, FL: Chapman & Hall/CRC.

Donizy, Piotr, Przemyslaw Biecek, Agnieszka Halon, and Rafal Matkowski. 2016. “BILLCD8 – a Multivariable Survival Model as a Simple and Clinically Useful Prognostic Tool to Identify High-Risk Cutaneous Melanoma Patients.” Anticancer Research 36 (September): 4739–48.

Dorogush, Anna Veronika, Vasily Ershov, and Andrey Gulin. 2018. “CatBoost: gradient boosting with categorical features support.” CoRR abs/1810.11363.

Duffy, Clare. 2019. “Apple co-founder Steve Wozniak says Apple Card discriminated against his wife.” CNN Business.

Efron, Bradley, and Trevor Hastie. 2016. Computer Age Statistical Inference: Algorithms, Evidence, and Data Science (1st Ed.). New York, NY: Cambridge University Press.

Ehrlinger, John. 2016. ggRandomForests: Exploring Random Forest Survival.

Faraway, Julian. 2005. Linear Models with R (1st Ed.). Boca Raton, FL: Chapman & Hall/CRC.

Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. 2019. “All Models Are Wrong, but Many Are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously.” Journal of Machine Learning Research 20 (177): 1–81.

Foster, David. 2017. xgboostExplainer: An R Package That Makes XGBoost Models Fully Interpretable.

Friedman, Jerome H. 2000. “Greedy Function Approximation: A Gradient Boosting Machine.” Annals of Statistics 29: 1189–1232.

Galecki, A., and T. Burzykowski. 2013. Linear Mixed-Effects Models Using R: A Step-by-Step Approach. New York, NY: Springer-Verlag New York.

GDPR. 2018. The EU General Data Protection Regulation (GDPR) is the most important change in data privacy regulation in 20 years.

Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2015. “Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation.” Journal of Computational and Graphical Statistics 24 (1): 44–65.

Goodman, Bryce, and Seth Flaxman. 2017. “European Union Regulations on Algorithmic Decision-Making and a ‘Right to Explanation’.” AI Magazine 38 (3): 50–57.

Gosiewska, Alicja, and Przemyslaw Biecek. 2018. auditor: Model Audit - Verification, Validation, and Error Analysis.

———. 2019. iBreakDown: Uncertainty of Model Explanations for Non-additive Predictive Models.

Greenwell, Brandon. 2020. fastshap: Fast Approximate Shapley Values.

Greenwell, Brandon M. 2017. “Pdp: An R Package for Constructing Partial Dependence Plots.” The R Journal 9 (1): 421–36.

Grolemund, Garrett, and Hadley Wickham. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media.

Gulli, Antonio, and Sujit Pal. 2017. Deep Learning with Keras. Birmingham, UK: Packt Publishing Ltd.

Hall, Patrick, Navdeep Gill, and Nicholas Schmidt. 2019. “Proposed Guidelines for the Responsible Use of Explainable Machine Learning.” CoRR abs/1906.03533.

Harrell, F. E. Jr. 2015. Regression Modeling Strategies (2nd Ed.). Cham, Switzerland: Springer.

Harrell, F. E. Jr., K. L. Lee, and D. B. Mark. 1996. “Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors.” Statistics in Medicine 15: 361–87.

Harrell Jr, Frank E. 2018. rms: Regression Modeling Strategies.

Hastie, T., R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning. Data Mining, Inference, and Prediction (2nd Ed.). New York, NY: Springer.

Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long Short-Term Memory.” Neural Computation 9 (8): 1735–80.

Hoover, Benjamin, Hendrik Strobelt, and Sebastian Gehrmann. 2020. “ExBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 187–96. Online: Association for Computational Linguistics.

Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. 2006. “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics 15 (3): 651–74.

Jacobson, Ivar, Grady Booch, and James Rumbaugh. 1999. The Unified Software Development Process. Boston, MA: Addison-Wesley.

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2014. An Introduction to Statistical Learning: With Applications in R. New York, NY: Springer.

Jiangchun, Li. 2018. Python Partial Dependence Plot Toolbox.

Karbowiak, Ewelina, and Przemyslaw Biecek. 2019. EIX: Explain Interactions in Gradient Boosting Models.

Kruchten, Philippe. 1998. The Rational Unified Process. Addison-Wesley.

Kuhn, Max. 2008. “Building Predictive Models in R Using the Caret Package.” Journal of Statistical Software 28 (5): 1–26.

Kuhn, Max, and Kjell Johnson. 2013. Applied Predictive Modeling. New York, NY: Springer.

Kuhn, Max, and Davis Vaughan. 2019. parsnip: A Common API to Modeling and Analysis Functions.

Kutner, M. H., C. J. Nachtsheim, J. Neter, and W. Li. 2005. Applied Linear Statistical Models. New York: McGraw-Hill/Irwin.

Landram, F., A. Abdullat, and V. Shah. 2005. “The coefficient of prediction for model specification.” Southwestern Economic Review 32: 149–56.

Larson, Jeff, Surya Mattu, Lauren Kirchner, and Julia Angwin. 2016. “How We Analyzed the COMPAS Recidivism Algorithm.” ProPublica.

Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 343 (6176): 1203–5.

LeDell, Erin, Navdeep Gill, Spencer Aiello, Anqi Fu, Arno Candel, Cliff Click, Tom Kraljevic, et al. 2019. h2o: R Interface for H2O.

Liaw, Andy, and Matthew Wiener. 2002. “Classification and regression by randomForest.” R News 2 (3): 18–22.

Little, R. J. A., and D. B. Rubin. 2002. Statistical Analysis with Missing Data (2nd Ed.). Hoboken, NJ: Wiley.

Lundberg, Scott. 2019. SHAP (SHapley Additive exPlanations).

Lundberg, Scott M., Gabriel G. Erion, and Su-In Lee. 2018. “Consistent Individualized Feature Attribution for Tree Ensembles.” CoRR abs/1802.03888.

Lundberg, Scott M, and Su-In Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 4765–74. Montreal: Curran Associates.

Maksymiuk, Szymon, Alicja Gosiewska, and Przemyslaw Biecek. 2019. shapper: Wrapper of Python library shap.

Kuhn, Max, and Hadley Wickham. 2018. tidymodels: Easily Install and Load the ’Tidymodels’ Packages.

Meyer, David, Evgenia Dimitriadou, Kurt Hornik, Andreas Weingessel, and Friedrich Leisch. 2019. e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien.

Molenberghs, G., and M. G. Kenward. 2007. Missing Data in Clinical Studies. Chichester, England: Wiley.

Molnar, Christoph. 2019. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable.

Molnar, Christoph, Bernd Bischl, and Giuseppe Casalicchio. 2018. “iml: An R package for Interpretable Machine Learning.” Journal of Open Source Software 3 (26): 786.

Nagelkerke, N. J. D. 1991. “A note on a general definition of the coefficient of determination.” Biometrika 78: 691–92.

Nolan, Deborah, and Duncan Temple Lang. 2015. Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving. New York, NY: Chapman & Hall/CRC.

O’Connell, Mark, Catherine Hurley, and Katarina Domijan. 2017. “Conditional Visualization for Statistical Models: An Introduction to the Condvis Package in R.” Journal of Statistical Software, Articles 81 (5): 1–20.

Olhede, S., and P. Wolfe. 2018. “The AI spring of 2018.” Significance 15 (3): 6–7.

O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York, NY: Crown Publishing Group.

Paluszynska, Aleksandra, and Przemyslaw Biecek. 2017. RandomForestExplainer: A Set of Tools to Understand What Is Happening Inside a Random Forest.

Pedersen, Thomas Lin, and Michaël Benesty. 2019. lime: Local Interpretable Model-Agnostic Explanations.

Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, et al. 2011. “Scikit-Learn: Machine Learning in Python.” Journal of Machine Learning Research 12: 2825–30.

Plotly Technologies Inc. 2015. Collaborative Data Science. Montreal, QC.

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, San Francisco, CA, 1135–44. New York, NY: Association for Computing Machinery.

Ridgeway, Greg. 2017. gbm: Generalized Boosted Regression Models.

Robnik-Šikonja, Marko, and Igor Kononenko. 2008. “Explaining Classifications for Individual Instances.” IEEE Transactions on Knowledge and Data Engineering 20 (5): 589–600.

Robnik-Šikonja, Marko. 2018. ExplainPrediction: Explanation of Predictions for Classification and Regression Models.

Ross, Casey, and Ike Swetlitz. 2018. “IBM’s Watson supercomputer recommended ‘unsafe and incorrect’ cancer treatments, internal documents show.” STAT News.

Rossum, Guido van, and Fred L. Drake. 2009. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace.

Rufibach, K. 2010. “Use of Brier score to assess binary predictions.” Journal of Clinical Epidemiology 63: 938–39.

Ruiz, Javier. 2018. “Machine learning and the right to explanation in GDPR.” Open Rights Group.

Salzberg, Steven. 2014. “Why Google Flu is a failure.” Forbes.

Samek, Wojciech, Thomas Wiegand, and Klaus-Robert Müller. 2018. “Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models.” ITU Journal: ICT Discoveries, Special Issue 1: The Impact of Artificial Intelligence (AI) on Communication Networks and Services, vol. 1.

Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data. Boca Raton, FL: Chapman & Hall/CRC.

Shapley, Lloyd S. 1953. “A Value for n-Person Games.” In Contributions to the Theory of Games II, edited by Harold W. Kuhn and Albert W. Tucker, 307–17. Princeton: Princeton University Press.

Sheather, Simon. 2009. A Modern Approach to Regression with R. Springer Texts in Statistics. New York, NY: Springer.

Shmueli, G. 2010. “To explain or to predict?” Statistical Science 25: 289–310.

Shrikumar, Avanti, Peyton Greenside, and Anshul Kundaje. 2017. “Learning Important Features Through Propagating Activation Differences.” In ICML, edited by Doina Precup and Yee Whye Teh, 70:3145–53. Proceedings of Machine Learning Research.

Simonyan, Karen, Andrea Vedaldi, and Andrew Zisserman. 2014. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.” In ICLR (Workshop Poster), edited by Yoshua Bengio and Yann LeCun.

Simonyan, Karen, and Andrew Zisserman. 2015. “Very Deep Convolutional Networks for Large-Scale Image Recognition.” In International Conference on Learning Representations. San Diego, CA: ICLR 2015.

Sing, T., O. Sander, N. Beerenwinkel, and T. Lengauer. 2005. “ROCR: visualizing classifier performance in R.” Bioinformatics 21 (20): 3940–41.

Sokolova, M., and G. Lapalme. 2009. “A systematic analysis of performance measures for classification tasks.” Information Processing and Management 45: 427–37.

Staniak, Mateusz, Przemyslaw Biecek, Krystian Igras, and Alicja Gosiewska. 2019. localModel: LIME-Based Explanations with Interpretable Inputs Based on Ceteris Paribus Profiles.

Steyerberg, E. W. 2019. Clinical Prediction Models. A Practical Approach to Development, Validation, and Updating (2nd Ed.). Cham, Switzerland: Springer.

Steyerberg, E. W., A. J. Vickers, N. R. Cook, T. Gerds, M. Gonen, N. Obuchowski, M. J. Pencina, and M. W. Kattan. 2010. “Assessing the performance of prediction models: a framework for traditional and novel measures.” Epidemiology 21: 128–38.

Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. 2014. “Sequence to Sequence Learning with Neural Networks.” In NIPS, edited by Zoubin Ghahramani, Max Welling, Corinna Cortes, Neil D. Lawrence, and Kilian Q. Weinberger, 3104–12.

Štrumbelj, Erik, and Igor Kononenko. 2010. “An Efficient Explanation of Individual Classifications Using Game Theory.” Journal of Machine Learning Research 11 (March): 1–18.

———. 2014. “Explaining prediction models and individual predictions with feature contributions.” Knowledge and Information Systems 41 (3): 647–65.

Tibshirani, Robert. 1994. “Regression Shrinkage and Selection via the lasso.” Journal of the Royal Statistical Society, Series B 58: 267–88.

Todeschini, Roberto. 2010. Useful and unuseful summaries of regression models.

Tsoumakas, G., I. Katakis, and I. Vlahavas. 2010. “Mining multi-label data.” In Data Mining and Knowledge Discovery Handbook, 667–85. Boston, MA: Springer.

Tufte, Edward R. 1986. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.

Tukey, John W. 1977. Exploratory Data Analysis. Boston, MA: Addison-Wesley.

van Houwelingen, H.C. 2000. “Validation, calibration, revision and combination of prognostic survival models.” Statistics in Medicine 19: 3401–15.

Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S (4th Ed.). New York, NY: Springer.

McKinney, Wes. 2012. Python for Data Analysis (1st Ed.). O’Reilly Media, Inc.

Wickham, Hadley. 2009. ggplot2: Elegant Graphics for Data Analysis. New York, NY: Springer-Verlag New York.

Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (1st Ed.). O’Reilly Media, Inc.

Wikipedia. 2019. CRISP-DM: Cross-industry standard process for data mining.

Wright, Marvin N., and Andreas Ziegler. 2017. “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software 77 (1): 1–17.

Xie, Yihui. 2018. bookdown: Authoring Books and Technical Documents with R Markdown.