Maizego Summer Tutorial






Lecture 2

Notes on working with HPC


Contents


  • Introduction to NCPGR HPC


  • How to build your working environment


  • Notes on LSF job system


  • Better Manage Your Data


Introduction to NCPGR HPC

Already well documented in the WIKI


Highlights

1. Every bit and every CPU nanosecond counts;

2. Each account has limits:

  • total jobs (400 CPUs)
  • total file number and storage (diskquota)

3. Queue limits: smp, high, normal, interactive

4. Login hosts: time and RAM limits

  • Always run heavy tasks on compute hosts (git clone, conda install, R package installs, downloads ...)


Build your workshop

1. Load pre-built software with module

2. Installation without root: conda, brew, make, rpm

3. Using container: singularity


Build your workshop

1. Load pre-built software with module

module is an environment (ENV) manager

source /public/home/software/opt/moudles/Modules/3.2.10/init/bash  # enable the module command
module avail          # list available software
module load / unload  # enable or disable a module
module purge          # unload all loaded modules
module list           # show currently loaded modules
module show           # display what a module sets
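Under the hood, module load mostly edits environment variables such as PATH. A minimal sketch of the same effect in plain bash (the path below is illustrative, not a real module):

```shell
#!/usr/bin/env bash
# Roughly what `module load myapp` does: prepend the tool's bin directory
# to PATH so its executables are found first (path is illustrative).
export PATH="/opt/myapp/bin:$PATH"
# the new entry is now searched first:
echo "$PATH" | cut -d: -f1   # → /opt/myapp/bin
```

module unload simply removes those entries again, which is why it is safer than editing ~/.bashrc by hand.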


Build your workshop

2. Installation without root: conda, brew, make, rpm

  • conda or mamba: it is better to always create a new environment (e.g. conda create -n mytool_env), unless you clearly know the dependencies

  • homebrew: not friendly to newbies

  • make: change the install path at configure time (e.g. ./configure --prefix=$HOME/local), manually install dependencies

  • rpm:

    # search and download rpm packages
    # (Ubuntu: apt-get; CentOS: yum)
    yum search <package>
    yumdownloader <package>
    rpm2cpio XXX.rpm | cpio -idvm
    # extracts the package contents into the current directory (pwd)
    # then add ./usr/bin to $PATH
    


Build your workshop

3. Using container: singularity

Refer to the docs here: http://hpc.ncpgr.cn/app/007-singularity/

singularity pull tensorflow.sif docker://tensorflow/tensorflow:latest
singularity exec tensorflow.sif python example.py
singularity shell tensorflow.sif
  • mount your local dirs with --bind
  • images can be built from Docker images
  • cannot build a container on the HPC => building a sandbox needs root
  • build the image on your laptop, then copy it to the HPC
  • container images may lag behind upstream updates


Notes on LSF job system

Refer to the docs here: http://hpc.ncpgr.cn/cluster/016-LSF/
and here: https://www.ibm.com/docs/en/spectrum-lsf

bjobs, bhosts, bqueues, bkill, bsub, lsload, bmod, bbot, btop, bstop, bresume ...
bsub -q interactive -XF -Is bash  # interactive shell, 48 h limit only

Things you should know before submitting jobs

# bsub options
-q, -J, -n, -m, -R, -M, -W, -w, -K, -P, -r ...
  • ONLY request multiple cores (-n $cpus) and large memory (-R rusage[mem=${mem_size}G]) when necessary: avoid wasting resources
  • ALWAYS use -R span[hosts=1] unless you know for sure your software supports MPI
  • Trace the LOGs to make sure your jobs finished successfully
  • Leave a time lag between consecutive bsub calls (e.g. sleep 1)
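To keep these options consistent across many jobs, one approach is a small helper that assembles the bsub command line; a sketch assuming the resource-string format above (the function name, queue, and blast command are illustrative):

```shell
#!/usr/bin/env bash
# Sketch: build a bsub command line with a single-host resource request
# and a meaningful log name (all names here are illustrative).
make_bsub_cmd() {
  local queue=$1 name=$2 cpus=$3 mem_gb=$4 cmd=$5
  printf 'bsub -q %s -J %s -n %s -R "span[hosts=1] rusage[mem=%sG]" -o %s.log -e %s.log "%s"\n' \
    "$queue" "$name" "$cpus" "$mem_gb" "$name" "$name" "$cmd"
}

make_bsub_cmd normal blast 2 4 "blastn -query q.fa -db nt"
```

Printing the command first (instead of running it) lets you eyeball the resource request before piping it to bash or writing it into a script.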


Suggestions on bsub:

1. use meaningful log prefix:

```bash
# what the ncpgr wiki suggests:
bsub -q normal -J blast -n 2 -R span[hosts=1] -o %J.out -e %J.err "blast ...."
# what I think is better:
bsub -q normal -J blast -n 2 -R span[hosts=1] -o blast.log -e blast.log "blast ...."
```

2. submit through a script rather than the CLI

  • explicit
  • maintainable
  • reproducible
  • extendable


Suggestions on bsub:

3. batch submit with a bash script

  • HERE DOC basics: escape with \, or quote the delimiter ("EOF" / 'EOF') to disable expansion
  • batch bsub: gstbsub
    ls *.lsf | while read lsf; do bsub < "$lsf"; sleep 1; done
  • batch bkill: gstjobs & gstkill
    gstjobs | grep 'blast_' | cut -f 1 | gstkill
  • batch log check: unsuccess_bjobs
    grep -L "Successfully completed." *.out
  • easy to re-run a single job
  • easy to maintain and reproduce:
    just keep the batch shell script, and remove the .lsf and .log files
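The HERE DOC point above can be sketched as a generator that writes one .lsf script per input file; with an unquoted delimiter (EOF), ${...} expands at generation time (file names and the blast command are illustrative):

```shell
#!/usr/bin/env bash
# Sketch: write one .lsf job script per FASTA file using a heredoc.
# Unquoted EOF -> ${fa} expands now; 'EOF' would keep it literal.
for fa in sample1.fa sample2.fa; do
  cat > "${fa%.fa}.lsf" <<EOF
#BSUB -q normal
#BSUB -J blast_${fa%.fa}
#BSUB -n 2
#BSUB -R span[hosts=1]
#BSUB -o blast_${fa%.fa}.log
#BSUB -e blast_${fa%.fa}.log
blastn -query ${fa} -db nt -out ${fa%.fa}.blast
EOF
done

ls *.lsf
# then submit: ls *.lsf | while read lsf; do bsub < "$lsf"; sleep 1; done
```

Because every job is a file on disk, re-running a single failed job is just `bsub < sample1.lsf` again.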


Better Manage Your Data

>>> Save money for the lab. More importantly, save time for yourself!


  • never challenge your memory

  • make the directory readable even without a README

  • clean up intermediate files in a timely manner

  • separate your formal data, code, test data, analysis results ...

  • tar | compress your data when you are finished using them


Better Manage Your Data

!!! Almost EVERY bioinformatics data format can be compressed (or already is), with only a slight loss of manipulation performance (which 90% of biologists don't care about) !!!


clean up !

  • scan and compress your large data: tutorial here (HZAU VPN only)
  • remove your intermediate files and unused files
  • tar your small-yet-many files together
  • at the very least, mv data from /public to our own disk
  • or avoid generating too many intermediate files in the first place: Lecture 3
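The "small-yet-many files" point, sketched end to end: bundle the files into one compressed archive, verify it, then remove the originals (directory and file names are illustrative):

```shell
#!/usr/bin/env bash
# Sketch: replace many small files with a single compressed archive.
mkdir -p tmp_results
for i in 1 2 3; do echo "result $i" > "tmp_results/part_$i.txt"; done

tar -czf tmp_results.tar.gz tmp_results/   # one archive instead of many files
tar -tzf tmp_results.tar.gz                # verify the contents first ...
rm -r tmp_results                          # ... then remove the originals

# later, restore with: tar -xzf tmp_results.tar.gz
```

One archive counts as a single file against the diskquota file-number limit, which is exactly what the limits in the introduction penalize.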


Better Manage Your Data

backup !

  • github, gitee, gitlab, zenodo, figshare ... for your code
  • NCBI, CNSA, GSA, figshare, CyVerse, ZEAMAP ... for your data
  • keep a local copy of your important unpublished data

but not too much ...

🐮:

~~ Oh boy! I made 100 copies of my sequencing data, and I copied all your genome data into my directories! Now nobody, not even I, can remove all of my data. Fail safe!

🐴:

~~ Holy! You genius! I should go download all of NCBI's data, in case their services die one day ...


Take-Home Messages

  • Master the installation of more than 90% of bioinformatics tools: Conda, Brew, Make, Containers ...
  • The ability to run heavy jobs through HPC: LSF
  • Clean up your working environment and your data



Next Lecture:

1. pitfalls of working with shell/bash


2. how to write safe and reusable pipelines


3. Practice: build an incredibly 💯 accurate app!