November 14, 2018

mwouts/jupytext_pyparis_2018

Outline

  • Introduction
    • Jupyter notebooks
    • And their JSON representation
  • Jupytext demo
    • Refactor a notebook
    • Jupytext in Jupyter
    • Paired notebooks
    • Collaborate using Jupytext
  • Jupytext formats
  • Related projects

Introduction

Capital Fund Management

CFM (www.cfm.fr) is a global asset management company, based in Paris,

  • with offices in New York City, London, Tokyo and Sydney,
  • $10B in assets under management,
  • 250+ collaborators.


Python plays a central role at CFM.

Jupyter notebooks

Jupyter notebooks and version control

Jupyter notebooks are huge JSON files.

Inputs are mixed with outputs: changes are not easy to follow.

Merging JSON is hard: forget one comma, and the notebook becomes invalid.

JSON representation of a Jupyter notebook
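As a minimal sketch of what such a file contains, the snippet below builds a one-cell notebook in the nbformat 4 layout and prints it as JSON. The cell content is invented for illustration. It then simulates a careless manual merge by dropping a single comma, which is enough to make the file unparseable.

```python
import json

# A minimal nbformat 4 notebook with one executed code cell.
# The cell content is made up for illustration.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": ["2\n"],
                }
            ],
            "source": ["print(1 + 1)"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}

text = json.dumps(notebook, indent=1)
print(text)

# Simulate a bad manual merge: remove the first comma in the file.
broken = text.replace(",", "", 1)


def is_valid_json(s):
    """Return True when `s` parses as JSON."""
    try:
        json.loads(s)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(text))    # the original file
print(is_valid_json(broken))  # the file with one comma missing
```

Note how the input (`source`) and the result (`outputs`) sit side by side in the same cell entry: any re-execution changes the file even when the code did not change.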

Jupyter notebooks as plain text

« Turn my beautiful interactive notebook into a plain and static text file?? »

Jupytext’s promise:

  • a text representation focused on input cells only,
  • clear version control,
  • merge, combine, slice and dice notebooks at will.


« What about my outputs and widgets? »

We also have a solution for that!

Demo

Example notebook: Greenhouse Gas Emissions

PWC’s Low Carbon Economy Index 2018:

« Not one of the G20 countries achieved the 6.4% rate required to limit warming to two degrees this year. That goal is slipping further out of reach — at current levels of decarbonisation, the global carbon budget for two degrees will run out in 2036. »

Demo I. Refactoring

  • Alice authored a notebook with an introduction on greenhouse gas emissions and some data from the World Bank.
  • She converts the notebook to a Python script: jupytext --to py Greenhouse_gas_emissions.ipynb.
  • Then, she refactors code in the script.
  • And updates the input cells in the original notebook with jupytext --to ipynb --update Greenhouse_gas_emissions.py.
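The idea behind the `--update` step can be sketched in a few lines: take the inputs from the refactored script and keep the outputs of the cells whose source did not change. The code below is a simplified illustration, not Jupytext's actual implementation. `update_inputs` is a hypothetical helper, and the cells are invented, stripped-down dictionaries.

```python
import copy


def update_inputs(old_cells, new_cells):
    """Combine inputs from `new_cells` with outputs from `old_cells`.

    Both arguments are lists of simplified cell dicts. A code cell keeps
    its outputs only when a code cell with the same source existed before.
    """
    # Index the previous code cells by source text (one list per source,
    # so that duplicated cells are matched in order).
    cells_by_source = {}
    for cell in old_cells:
        if cell["cell_type"] == "code":
            cells_by_source.setdefault(cell["source"], []).append(cell)

    updated = []
    for cell in new_cells:
        cell = copy.deepcopy(cell)  # do not modify the caller's cells
        if cell["cell_type"] == "code":
            matches = cells_by_source.get(cell["source"], [])
            if matches:
                previous = matches.pop(0)
                cell["outputs"] = copy.deepcopy(previous.get("outputs", []))
                cell["execution_count"] = previous.get("execution_count")
            else:
                # New or modified code: no output until it is run again.
                cell["outputs"] = []
                cell["execution_count"] = None
        updated.append(cell)
    return updated


# Invented example: the original notebook, with one executed cell.
old_cells = [
    {"cell_type": "markdown", "source": "# Greenhouse gas emissions"},
    {
        "cell_type": "code",
        "source": "total = 1 + 1\ntotal",
        "execution_count": 1,
        "outputs": [{"output_type": "execute_result", "text": "2"}],
    },
]
# After refactoring the script: one cell unchanged, one cell added.
new_cells = [
    {"cell_type": "markdown", "source": "# Greenhouse gas emissions"},
    {"cell_type": "code", "source": "total = 1 + 1\ntotal"},
    {"cell_type": "code", "source": "total * 2"},
]

updated = update_inputs(old_cells, new_cells)
for cell in updated:
    print(cell["cell_type"], cell.get("outputs"))
```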

Demo II. Jupytext in Jupyter

  • Configure Jupyter notebook:
      # Append to .jupyter/jupyter_notebook_config.py
      c.NotebookApp.contents_manager_class = "jupytext.TextFileContentsManager"
  • Restart jupyter notebook or jupyter lab.
  • Now Jupyter can open any Python script as a Jupyter notebook!

Paired notebooks

Pair a traditional ipynb notebook with a py file:

  • Work on the notebook in Jupyter (saving it updates both files),
  • Edit the py file in your favorite editor,
  • Refresh the notebook in Jupyter:
    • input cells are read from the py file,
    • output cells are read from the ipynb file.


Activate paired notebooks:

  • Add "jupytext": {"formats": "ipynb,py"}, to the notebook metadata.
  • Deactivate Jupyter’s autosave by running %autosave 0 in a cell.
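The metadata edit can also be scripted. The sketch below uses only the `json` module and an in-memory notebook; `pair_with_script` is a hypothetical helper name, and the `kernelspec` entry is a placeholder standing in for whatever metadata the notebook already has.

```python
import json


def pair_with_script(notebook, formats="ipynb,py"):
    """Return a copy of the notebook dict with the Jupytext pairing metadata."""
    paired = dict(notebook)
    metadata = dict(paired.get("metadata", {}))
    metadata["jupytext"] = {"formats": formats}
    paired["metadata"] = metadata
    return paired


# Placeholder notebook; a real one would be loaded with json.load().
notebook = {
    "cells": [],
    "metadata": {"kernelspec": {"name": "python3"}},
    "nbformat": 4,
    "nbformat_minor": 2,
}

paired = pair_with_script(notebook)
print(json.dumps(paired["metadata"], indent=1))
```

The existing metadata is preserved, and the original dictionary is left untouched so the change can be reviewed before the file is written back.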

Demo III: Collaborating with Jupytext

  • Alice shares the py representation of her notebook.
  • Bob opens the py file as a notebook.
  • He contributes a few plots and a conclusion.
  • Simultaneously, Alice contributes an interactive visualization.
  • A merge conflict occurs! We solve it on the py file.
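Conflicts are manageable on the py file because it holds inputs only, so each contribution is a short, line-based change. As a rough illustration, the snippet below diffs two versions of a script with the standard `difflib` module. The script content and the file names are invented stand-ins for the demo files.

```python
import difflib

# Invented stand-in for the py representation that Alice shared.
base = [
    "# # Greenhouse gas emissions",
    "",
    "import pandas as pd",
    "",
    'data = pd.read_csv("emissions.csv")',
]
# Bob appends a plot at the end of the script.
bob = base + ["", "data.plot()"]

diff = list(
    difflib.unified_diff(base, bob, fromfile="alice.py", tofile="bob.py", lineterm="")
)
print("\n".join(diff))
```

The diff contains only the two added lines and their context: no execution counts, no outputs, no JSON punctuation.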

Demo wrap-up

Use Jupytext to:

  • Refactor notebooks using an IDE, pep8, 2to3, etc.,
  • Merge, slice and dice notebooks,
  • Do version control or collaborate on notebooks,
  • Edit and run scripts as notebooks in Jupyter,
  • Run and debug notebooks as scripts in your favorite IDE,
  • Run py.test on a notebook, or import it, etc.

Formats for Jupyter notebooks as text

Jupyter notebooks as Python scripts

In the demo we have used the light format (created for jupytext), which converts notebooks to valid Python scripts:

  • Markdown cells are commented.
  • Code cells are included verbatim (except Jupyter magics, which are commented).
  • Cells are separated with one or more blank lines.
  • When a code cell contains blank lines outside of Python paragraphs, we mark the start of the cell with # +, and its end with # -.
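The four rules above can be illustrated with a small writer. This is a simplified sketch, not Jupytext's implementation: `to_light_script` is a hypothetical function, it treats any line starting with `%` as a magic, and it adds the `# +` / `# -` markers whenever a code cell contains a blank line, which is coarser than the "outside of Python paragraphs" rule.

```python
def to_light_script(cells):
    """Render (cell_type, source) pairs as a light-format-like Python script."""
    blocks = []
    for cell_type, source in cells:
        lines = source.split("\n")
        if cell_type == "markdown":
            # Markdown cells are commented.
            block = ["# " + line if line else "#" for line in lines]
        else:
            # Jupyter magics are commented; other code is kept verbatim.
            block = ["# " + line if line.startswith("%") else line for line in lines]
            # Simplification: any blank line triggers explicit cell markers.
            if "" in lines:
                block = ["# +"] + block + ["# -"]
        blocks.append("\n".join(block))
    # Cells are separated with a blank line.
    return "\n\n".join(blocks) + "\n"


# Invented example cells.
cells = [
    ("markdown", "# Emissions\n\nData from the World Bank."),
    ("code", "%matplotlib inline"),
    ("code", "a = 1\n\nb = 2"),
    ("code", "a + b"),
]

script = to_light_script(cells)
print(script)
```

The result is a valid Python script: the Markdown and the magic command become comments, and only the cell with an inner blank line needs explicit markers.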

Python scripts with # %% cells
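In this format every cell starts with an explicit `# %%` marker, and Markdown cells are marked with `# %% [markdown]` and stored as comments. The reader below is a minimal sketch of that convention, not Jupytext's parser: `split_percent_script` is a hypothetical function, it ignores any lines before the first marker, and the script content is invented.

```python
import re

CELL_MARKER = re.compile(r"^# %%(.*)$")


def split_percent_script(text):
    """Split a script into (cell_type, source) pairs at `# %%` markers."""
    cells = []
    current = None
    for line in text.splitlines():
        match = CELL_MARKER.match(line)
        if match:
            cell_type = "markdown" if "[markdown]" in match.group(1) else "code"
            current = (cell_type, [])
            cells.append(current)
        elif current is not None:
            current[1].append(line)
        # Lines before the first marker are ignored in this sketch.

    result = []
    for cell_type, lines in cells:
        if cell_type == "markdown":
            # Markdown is stored as comments: strip the leading "# ".
            lines = [re.sub(r"^# ?", "", line) for line in lines]
        # Drop the blank lines that only separate cells.
        result.append((cell_type, "\n".join(lines).strip("\n")))
    return result


script = """# %% [markdown]
# # Emissions
# Data from the World Bank.

# %%
a = 1

b = 2

# %%
a + b
"""

for cell_type, source in split_percent_script(script):
    print(cell_type, repr(source))
```

Because the markers are explicit, blank lines inside a code cell need no special treatment, unlike in the light format.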

Jupyter notebooks as (R) Markdown

R Markdown by Yihui Xie (2012) is an inspiring notebook format. See the recent R Markdown: The Definitive Guide.

Jupyter notebooks as (R) Markdown documents:

  • notedown by Aaron O’Leary (2014): Markdown and R Markdown,
  • ipymd by Cyrille Rossant (2015): Markdown, Python scripts, OpenDocument,
  • ipymd forked by Gregor Sturm (2017): R Markdown and R HTML notebooks,
  • And now jupytext: formats md and Rmd.

Why design jupytext? I was obsessed with lossless round-trip conversion!

Related projects

Notebook diff and merge

nbdime, by the Jupyter team:

  • nbdiff: command line diff,
  • nbmerge: three-way merge,
  • nbdiff-web: diff notebooks in a browser,
  • nbmerge-web: merge notebooks in a browser.

nbdime can merge the output cells of Jupyter notebooks, unlike jupytext.

nbmerge-web

Online collaboration on Jupyter notebooks

What’s next?

Jupytext is now available on conda-forge.

And we are looking for contributors for a Jupyter extension. We would like to offer a user-friendly interface (buttons) for

  • configuring Jupytext formats,
  • and options (comment magics, metadata filters).

Documentation

Special thanks to

  • Gregor Sturm, who proposed the idea of paired notebooks, gave a lot of other great advice, and just contributed jupytext to conda-forge.
  • Eric Lebigot, who tested early versions of the program, and offered helpful advice on technical evolutions and on communication.
  • The early adopters, for their encouraging and useful feedback.
  • The people reporting issues, and our contributors.
  • And our stargazers: it’s so great to know that people like the project!

Thanks!