Literature seminar "USING JUPYTERHUB IN THE CLASSROOM: SETUP AND LESSONS LEARNED"

2022/10/19 Paper survey slides

ONOYAMA Shodai

October 19, 2022

Transcript

  1. Today’s paper
    USING JUPYTERHUB IN THE CLASSROOM: SETUP AND LESSONS LEARNED
    Jeff Brown, Department of Mathematics and Statistics, University of North Carolina Wilmington
  2. Abstract
    • This paper describes the details of setting up a Jupyter hub environment on a server running CentOS 7.
    • It includes a discussion of lessons learned from using this system in data science classes.
  3. Jupyter Notebook
    • A Jupyter notebook runs in a Web browser and consists of a series of cells.
    • Markdown cells display formatted text, images, formulas, and video clips.
    • Code cells contain programs that can be executed, showing the results in the notebook.
  4. Jupyter Notebook
    • Jupyter notebooks are very useful for PBL.
    • Students can be presented with examples, and then open new cells to extend or modify the given content.
    • The basic notebook environment is also being extended for use by other educators.
  5. Jupyter hub
    • A Jupyter hub is a Web application that runs on a server.
    • The client browser communicates with the hub through an HTTP proxy, and the proxy communicates with multiple single-user notebook servers.
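The browser-to-proxy-to-notebook-server path described above can be pictured as a routing table. The following is only a toy sketch: the URL prefixes, ports, and user names are assumptions made for illustration, and the longest-prefix rule is a simplification of how such a proxy picks a backend, not code from the paper or from configurable-http-proxy.

```python
# Toy model of a proxy routing table (illustrative only).
# The prefixes, ports, and user names are assumptions, not values from the paper.
ROUTES = {
    "/": "http://127.0.0.1:8081",              # everything else goes to the hub
    "/user/alice/": "http://127.0.0.1:40001",  # hypothetical single-user server
    "/user/bob/": "http://127.0.0.1:40002",    # hypothetical single-user server
}


def resolve(path, routes=ROUTES):
    """Return the backend whose prefix is the longest match for the path."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    best = max(matches, key=len)
    return routes[best]


if __name__ == "__main__":
    for p in ("/hub/login", "/user/alice/tree", "/user/bob/notebooks/hw1.ipynb"):
        print(p, "->", resolve(p))
```

In this picture every student talks to one public address, and the proxy forwards each request either to the hub or to that student's own notebook server.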
  6. Jupyter hub
    • Because it gives access to computing resources without requiring special software on the client machine, a Jupyter hub is ideal for use in a classroom.
  7. Jupyter hub installation
    • We will describe the installation and setup on a server running CentOS 7.4.1708.
    • Installation requires three different package managers.
      • yum: the CentOS package manager
      • npm: a package manager used to install the HTTP proxy
      • pip: a Python package manager used to install the hub and notebook
  8. Jupyter hub installation
    • On a Linux machine, Anaconda is installed through a shell script.
    • We downloaded Anaconda3-5.0.1-Linux-x86_64.sh from https://anaconda.com.
        sudo bash Anaconda3-5.0.1-Linux-x86_64.sh
  9. Jupyter hub installation
    • The installer has you accept the license, accept the default installation location or enter your own, and offers to update the user’s .bashrc file so that the command path contains the new software.
        # added by Anaconda3 installer
        export PATH="/opt/anaconda3/bin:$PATH"
  10. Jupyter hub installation
    • Install Extra Packages for Enterprise Linux (EPEL) in order to install Node.js.
        sudo yum install epel-release
        sudo yum install nodejs
    • Node.js includes the package manager npm, and we use it to install the HTTP proxy.
        sudo npm install -g configurable-http-proxy
  11. Jupyter hub installation
    • Anaconda includes the Python package manager pip, and we used it to install JupyterHub and Notebook with these commands.
        sudo /opt/anaconda3/bin/pip install jupyterhub
        sudo /opt/anaconda3/bin/pip install --upgrade notebook
  12. Configuration
    • We put the configuration files in the directory /etc/jupyterhub. The following commands create that directory and generate a default configuration file named jupyterhub_config.py.
        sudo mkdir /etc/jupyterhub
        cd /etc/jupyterhub
        sudo /opt/anaconda3/bin/jupyterhub --generate-config
  13. Configuration
    • JupyterHub requires HTTPS, so you must acquire a certificate for your server.
    • The following commands create a directory named keys in /etc/jupyterhub and make it accessible only by root. Put your certificate and key in the keys directory. The key file should also be readable only by root.
        sudo mkdir keys
        sudo chmod 700 keys
  14. Configuration
    • There are many configuration options in jupyterhub_config.py. Here are the changes we made to make the hub work.
    • c.JupyterHub.ip is the numeric IP address of the server.
    • c.JupyterHub.ssl_cert is the path to your SSL certificate.
    • c.JupyterHub.ssl_key is the path to your SSL key.
    • The default value of c.Spawner.cmd is "jupyterhub-singleuser". We had to replace that with the full path "/opt/anaconda3/bin/jupyterhub-singleuser".
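As a rough sketch of what those four changes might look like as lines in jupyterhub_config.py, the helper below renders them as text. The function name render_hub_config, the IP address, and the certificate and key file names are placeholders invented for this example; only the four option names and the jupyterhub-singleuser path come from the slide above.

```python
def render_hub_config(ip, ssl_cert, ssl_key, singleuser_cmd):
    """Return jupyterhub_config.py lines for the four settings listed above."""
    settings = [
        ("c.JupyterHub.ip", ip),
        ("c.JupyterHub.ssl_cert", ssl_cert),
        ("c.JupyterHub.ssl_key", ssl_key),
        ("c.Spawner.cmd", [singleuser_cmd]),
    ]
    return "\n".join(f"{name} = {value!r}" for name, value in settings)


if __name__ == "__main__":
    # Placeholder values: the address and the file names are hypothetical.
    print(render_hub_config(
        ip="192.0.2.10",
        ssl_cert="/etc/jupyterhub/keys/server.crt",
        ssl_key="/etc/jupyterhub/keys/server.key",
        singleuser_cmd="/opt/anaconda3/bin/jupyterhub-singleuser",
    ))
```

The printed lines are meant to be pasted into the generated configuration file, where the object c is already provided by JupyterHub.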
  15. Configuration
    • The hub uses a 32-byte key, encoded as hex, to encrypt cookies.
    • This key can be stored in the configuration file, in an environment variable, or in a file whose default location is /etc/jupyterhub/jupyterhub_cookie_secret.
        sudo su
        openssl rand -hex 32 > jupyterhub_cookie_secret
        chmod 700 jupyterhub_cookie_secret
        exit
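For readers who prefer to see the same idea in Python, here is a minimal sketch that generates 32 random bytes as hex and writes them to an owner-only file. The function name write_cookie_secret is made up for this example, and the mode 0o600 (owner read and write) is my own choice, slightly tighter than the 700 used in the commands above.

```python
import os
import secrets
import stat
import tempfile


def write_cookie_secret(path, nbytes=32):
    """Write nbytes of randomness as hex to path, accessible by the owner only."""
    secret = secrets.token_hex(nbytes)  # 2 hex characters per byte
    # Create the file with restrictive permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secret + "\n")
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed
    return secret


if __name__ == "__main__":
    # Demo in a temporary directory; a real setup would target /etc/jupyterhub.
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "jupyterhub_cookie_secret")
        secret = write_cookie_secret(target)
        print(len(secret), oct(stat.S_IMODE(os.stat(target).st_mode)))
```

A 32-byte key becomes 64 hex characters, which is what the openssl command above also produces.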
  16. Firewall
    • The default port for accessing the hub is 8000, so you need to allow access through the firewall on that port.
        sudo firewall-cmd --zone=public --add-port=8000/tcp --permanent
        sudo firewall-cmd --reload
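One way to confirm from a client machine that port 8000 is actually reachable is a plain TCP connection attempt. This is a small sketch, not part of the paper; the function name port_is_open is invented here, and the demo uses 127.0.0.1 as a stand-in for the server address.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with the address of your own server.
    print(port_is_open("127.0.0.1", 8000))
```

A result of False can mean the firewall still blocks the port or simply that the hub is not running yet, so it is only a first check.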
  17. Security-Enhanced Linux (SELinux)
    • CentOS runs SELinux by default, and your hub will not function without making some changes to the SELinux environment.
    • Put SELinux in permissive mode. This means it does not prevent any activities, but it still logs activities that would have been prevented.
        sudo setenforce 0
  18. Security-Enhanced Linux (SELinux)
    • Start the hub from the command line as root, telling it where to find the configuration file.
        sudo /opt/anaconda3/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
    • Log in to the hub from a remote machine by using this URL in a browser.
        https://<address of server>:8000
    • The SELinux audit log file is /var/log/audit/audit.log. Use grep to find the lines in the log file that contain the word denied.
        sudo grep denied /var/log/audit/audit.log
  19. Security-Enhanced Linux (SELinux)
    • For us, the grep command produced several lines, and they all contained comm=jupyterhub. The comm field gives the name of the command that resulted in the denied activity.
    • Next, we use these lines to tell SELinux to allow these activities.
    • That is why you should first check that every line is associated with the jupyterhub command, so that you allow only its activities.
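The check described above (every denied line should belong to the jupyterhub command) can be automated. Below is a minimal sketch; the function name denied_commands is made up, and the sample lines are invented, only roughly shaped like audit records, so real logs may need a different pattern.

```python
import re

# Matches comm=name as well as comm="name".
COMM_RE = re.compile(r'\bcomm="?([^"\s]+)"?')


def denied_commands(log_lines):
    """Return the set of comm values found on lines containing 'denied'."""
    commands = set()
    for line in log_lines:
        if "denied" not in line:
            continue
        match = COMM_RE.search(line)
        if match:
            commands.add(match.group(1))
    return commands


# Made-up sample lines for illustration (not taken from the paper).
SAMPLE = [
    'type=AVC msg=audit(0.0:1): avc:  denied  { name_bind } for  pid=1 comm="jupyterhub" src=8081',
    'type=AVC msg=audit(0.0:2): avc:  denied  { name_connect } for  pid=1 comm="jupyterhub" dest=8001',
    'type=SYSCALL msg=audit(0.0:3): arch=c000003e syscall=49 success=no comm="jupyterhub"',
]

if __name__ == "__main__":
    found = denied_commands(SAMPLE)
    print(found)
    print("only jupyterhub:", found == {"jupyterhub"})
```

If the resulting set contains anything other than jupyterhub, those lines should be filtered out before a policy module is generated from them.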
  20. Security-Enhanced Linux (SELinux)
    • Pipe the audit log lines to the program audit2allow and give the module a name. We named it jh-module.
        sudo grep denied audit.log | audit2allow -M jh-module
    • This command produces two files, jh-module.pp and jh-module.te, and it prompts you to make the new policy active with the following command.
        sudo semodule -i jh-module.pp
  21. Security-Enhanced Linux (SELinux)
    • Set SELinux back to enforcing mode.
        sudo setenforce 1
    • Now you should be able to start and use the Jupyter hub.
  22. Conclusions
    • Jupyter notebooks are widely viewed as valuable pedagogical tools.
    • A Jupyter hub is particularly useful for the classroom. It allows you to provide a fully configured computing environment that only requires a Web browser on the student computer.
    • Depending on the computational demands of the course, a server with limited resources may be sufficient. Steps should be taken to conserve server memory.
    • Setup and configuration of a Jupyter hub is a complex process, but it will probably become easier as the software becomes more widely used.
  23. References
    • jupyterhub/configurable-http-proxy: node-http-proxy plus a REST API
      https://github.com/jupyterhub/configurable-http-proxy
    • What is OpenSSL? An easy-to-understand explanation - IT Glossary e-Words
      https://e-words.jp/w/OpenSSL.html
    • CentOS 7 : SELinux : Using audit2allow : Server World
      https://www.server-world.info/query?os=CentOS_7&p=selinux&f=9
  24. References
    • [SELinux] Installing and using audit2allow: adding policies, writing .te files, how to make, and checking the contents of .pp files | SEの道標
      https://milestone-of-se.nesuke.com/sv-advanced/selinux/add-av-rules-module/
    • How to switch the SELinux operating mode from the command line - @IT
      https://atmarkit.itmedia.co.jp/flinux/rensai/linuxtips/979selinuxenforce.html
  25. Supplemental Data
    • Setting the path: add the location of executable files to the command search path.
        export PATH="/opt/anaconda3/bin:$PATH"
    • grep: search for lines that contain a word.
        sudo grep denied /var/log/audit/audit.log
    • | (pipe): pass the output of one program as input to another program.
        sudo grep denied audit.log | audit2allow -M jh-module
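To illustrate what adding a directory to PATH changes, the sketch below looks up a command as if an extra directory had been prepended to the search path. The helper name which_with_prefix and the executable mytool are hypothetical and exist only for this demo.

```python
import os
import shutil
import stat
import tempfile


def which_with_prefix(command, extra_dir):
    """Look up command as if extra_dir were prepended to PATH."""
    search_path = extra_dir + os.pathsep + os.environ.get("PATH", "")
    return shutil.which(command, path=search_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # "mytool" is a hypothetical executable created just for this demo.
        tool = os.path.join(d, "mytool")
        with open(tool, "w") as f:
            f.write("#!/bin/sh\necho hello\n")
        os.chmod(tool, stat.S_IRWXU)  # same permission bits as chmod 700
        print(which_with_prefix("mytool", d))
```

Directories earlier in PATH win, which is why the Anaconda installer puts /opt/anaconda3/bin at the front.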
  26. Supplemental Data
    • epel-release: a package that adds the EPEL repository to yum.
      • yum fetches packages from the Internet when you run the yum install command. You can define which repository (where the installation packages are stored) it uses.
    • Node.js: a JavaScript runtime environment.
    • configurable-http-proxy: a Node.js library.
      • configurable-http-proxy provides a way to update and manage proxy tables using a command line interface or REST API.
  27. Supplemental Data
    • OpenSSL: a program that implements the functions of SSL and TLS, the standard cryptographic communication protocols used on the Internet.
        $ openssl rand 2
        (two raw bytes, usually not printable)
        $ openssl rand -hex 2
        89a4
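The difference between raw and hex output can be reproduced in Python. This is a small sketch using only the standard library; the value 89a4 is taken from the example above, and os.urandom stands in for openssl rand.

```python
import os

raw = bytes.fromhex("89a4")  # the two bytes behind the hex output shown above
print(raw)                   # bytes representation; neither byte is printable ASCII
print(raw.hex())             # back to the hex form

fresh = os.urandom(2)        # two new random bytes
print(fresh.hex())           # two hex characters per byte, so four characters
```

The same doubling explains why a 32-byte cookie secret appears as 64 hex characters.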
  28. Supplemental Data
    • chmod: change file permissions.
        $ ls -al myfile.txt
        -rw-r--r-- 1 onoyama onoyama 690850711 May 11 03:35 myfile.txt
        $ chmod 700 myfile.txt
        $ ls -al myfile.txt
        -rwx------ 1 onoyama onoyama 690850711 May 11 03:35 myfile.txt
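The relation between an octal mode such as 700 and the letters printed by ls can be checked with Python's stat module. This is a minimal sketch; the helper name mode_string is made up for the example.

```python
import stat


def mode_string(octal_mode):
    """Render the permission bits of a regular file the way ls -l shows them."""
    return stat.filemode(stat.S_IFREG | octal_mode)


print(mode_string(0o644))  # -rw-r--r--
print(mode_string(0o700))  # -rwx------
```

Each octal digit covers one group (owner, group, others), with read = 4, write = 2, and execute = 1.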
  29. Supplemental Data
    • setenforce: switch between Enforcing (1) and Permissive (0).
    • Enforcing (1): deny unauthorized access.
    • Permissive (0): unauthorized access is only logged; access is not actually blocked.
  30. Supplemental Data
    • audit2allow: a command that analyzes denial logs and generates permission rules for SELinux policies.
    • A [module].pp file is generated; install it with the semodule -i command.
        $ grep denied audit.log | audit2allow -M jh-module
        $ semodule -i jh-module.pp