class: center, middle, inverse, title-slide

# Creating and Sharing Code for Reproducible Research and Scalable Impact

## Making your code and research make a difference

### Robin Lovelace, 10DS Fellow, Turing Fellow

### Leeds’ Institute for Transport Studies

### ADR Fellows Code sharing Workshop, 2021-01-21 (updated: 2022-01-19)

Reproducible source code: github.com/Robinlovelace/presentations
---

<!-- Talk in context: -->
<!-- From 11:05 to 11:10 -->

## The problem

![](https://user-images.githubusercontent.com/1825120/105191186-44884400-5b2f-11eb-8c68-53d5078a9d8e.png)

Source: ['Inside the black box' report](https://www.co.pierce.wa.us/DocumentCenter/View/755/A-GuideToModeling?bidId=)

--

- Black boxes obfuscate methods, reduce trust in research, stifle innovation and reduce the ability of future work to build on (and cite) your research

--

- Academic research was developed at a time when computers, let alone open source software and reproducible code, were not available: black boxes are the norm

---

### The solution: get your code out there

![](https://media.springernature.com/full/springer-static/image/art%3A10.1007%2Fs10109-020-00342-2/MediaObjects/10109_2020_342_Fig1_HTML.png)

Source: Lovelace, Robin 2021: Open Source Tools for Geographic Analysis in Transport Planning. Journal of Geographical Systems. https://doi.org/10.1007/s10109-020-00342-2, accessed January 17, 2021.

---

## Preparing code for publication

.pull-left[

#### At a minimum

- Code hosting website (e.g. GitHub)
- Good README with instructions to run code (+optional badges)
- Minimal example (synthetic?) input dataset
- Clear directory structure
- Select and follow a style guide
<!-- - Identify a style guide you like and stick to it -->
- Small readable 'chunks' (functions/scripts)

#### Advanced

- Active issue tracker
- Packaging + documentation
- Code review
- Continuous integration
- Community chat (e.g. Discord)

]

.pull-right[

### Example: stplanr

<!-- ![](https://cranlogs.r-pkg.org/badges/last-month/stplanr) -->
[![](https://cranlogs.r-pkg.org/badges/grand-total/stplanr)](https://github.com/r-hub/cranlogs.app)
[![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version/stplanr)](https://cran.r-project.org/package=stplanr)
[![](https://www.r-pkg.org/badges/version/stplanr)](https://cran.r-project.org/package=stplanr)
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![](https://badges.ropensci.org/10_status.svg)](https://github.com/ropensci/software-review/issues/10)
[![R-CMD-check](https://github.com/ropensci/stplanr/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/stplanr/actions)

<a href='https://docs.ropensci.org/stplanr/'><img src='https://docs.ropensci.org/stplanr/reference/figures/stplanr.png' align="right" height=215/></a>

Demo of packaged code: **stplanr** (Lovelace and Ellison 2018) https://docs.ropensci.org/stplanr/

Example of code for a paper: https://github.com/Robinlovelace/odjitter

]

---

## Beyond code 1: dissemination

![](https://user-images.githubusercontent.com/1825120/105194348-6edc0080-5b33-11eb-92c0-ca9c90c2e3e1.png)

Source: https://twitter.com/robinlovelace/status/1351477455203299328

---

## Beyond code 2: open access tools

![](https://user-images.githubusercontent.com/1825120/105194722-dabe6900-5b33-11eb-9dcf-85df9ebf9cae.png)

- Outlines 'network effects' of open research and putting things 'out there'
- Conclusion: open access is particularly important for policy relevant research

---

## Case study of publishing code

![](https://user-images.githubusercontent.com/1825120/105322384-6efaff00-5bc1-11eb-81e8-4f5dc3bfc2fb.png)

Code processing confidential data hosted online: https://github.com/npct/pct-scripts/blob/master/03.2_school_prepare_OD_file.R

Source: Goodman et al. 2019. Scenarios of Cycling to School in England, and Associated Health and Carbon Impacts: Application of the ‘Propensity to Cycle Tool.’ Journal of Transport & Health 12: 263–278.
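---

### Sketch: public code, confidential data

A minimal sketch of the pattern in this case study: the script is public, the confidential input is not, and a synthetic stand-in lets others run the code. Everything here is an assumption for illustration: the `pupils` data, its column names and the `min_count` threshold are hypothetical, and this is not the code in pct-scripts.

```r
# Hypothetical synthetic stand-in for confidential pupil-level records.
# Column names and values are invented for illustration only.
set.seed(2021)
pupils <- data.frame(
  home_zone = sample(c("Z1", "Z2", "Z3"), size = 200, replace = TRUE),
  school    = sample(c("S1", "S2"), size = 200, replace = TRUE),
  mode      = sample(c("walk", "cycle", "car"), size = 200, replace = TRUE)
)

# One counter per pupil, and one counter per pupil who cycles
pupils$n_all   <- 1L
pupils$n_cycle <- as.integer(pupils$mode == "cycle")

# Aggregate individual records to one row per home zone and school
od <- aggregate(
  cbind(n_all, n_cycle) ~ home_zone + school,
  data = pupils,
  FUN = sum
)

# Suppress small counts before publishing.
# The threshold is an assumed value, not taken from the case study.
min_count <- 10
od$n_cycle[od$n_cycle < min_count] <- NA
```

Only `od`, the aggregated table, would be shared alongside the script; the individual-level records stay in the secure environment.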
---

## Aggregate outputs published

- We commissioned a sensitive dataset from DfE and processed it securely
- Open code and (more importantly) aggregated derived data increased impact

![](https://user-images.githubusercontent.com/1825120/105324243-ac13c100-5bc2-11eb-801f-2a6fd2dfd7d9.png)

---

### Skills and lessons learned along the way

.pull-left[

### Skills

General

- Communication (beyond the usual people)
- Forward planning
- Agile workflow

Technical

- Version control (Git/GitHub)
- Kanban boards
- Receiving feedback on code
- Integrating manuscript prose + code, with RMarkdown and Quarto

]

.pull-right[

### Lessons

- Get collaborators on board
<!-- Talk about working with many people -->
- Don't be afraid to publish 'unfinished' code
- Publishing code and contributing online can lead to surprising benefits
- E.g. link with Italian PhD student and paper
- Ask questions of the *community*

]

---

.left-column[

#### Beyond code 3: Community engagement

Sharing code happens in the context of open source communities

They are usually friendly communities

Getting involved can lead to collaborations

Source: https://github.com/r-spatial/sf/issues/966

]

.right-column[

![](https://user-images.githubusercontent.com/1825120/105349863-98768300-5be0-11eb-8e0a-85c96f2a1bed.png)

```r
# Reproducible example:
milan_car_crashes <- data.frame(
  ID = 1:5,
  X = c(1513037, 1513008, 1515473, 1514039, 1515748),
  Y = c(5034945, 5034750, 5036177, 5036820, 5037396)
)
```

![](https://user-images.githubusercontent.com/1825120/105350380-540fe780-5be1-11eb-8a10-bc1de6da8fc5.png)

Resulting publication: Gilardi, A., Mateu, J., Borgoni, R., Lovelace, R., 2022. Multivariate hierarchical analysis of car crashes.

]

---

# Thanks for listening!

## Further information: robinlovelace.net + @robinlovelace on Twitter

See the slides from the infoRming policy talk for an R-focussed take.

[![](https://user-images.githubusercontent.com/1825120/105352527-24aeaa00-5be4-11eb-92f6-f14b1a10366b.png)](https://www.robinlovelace.net/presentations/c4p-slides.html#1)