2. set -o nounset and set -o errexit: useful in shell scripts
3. Or check the exit status explicitly
if [ $? -ne 0 ]; then
    echo "[ERROR]: Some error msg" >&2
    exit 1
fi
4. Check that the output is valid:
the file exists, a variable is not empty, a number is within its cutoff, a string has the expected length ...
5. Commands you should use with caution:
rm -f, mv, tar, ...
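The output checks from item 4 could be sketched as small helper functions. This is only a minimal illustration under stated assumptions: the function names (check_file, check_not_empty, check_int_range, check_min_length) and the sample values are hypothetical, and the range check only handles non-negative integers.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for validating outputs; all names are made up for this sketch.

# Succeeds if the file exists and is not empty.
check_file () {
    [[ -s "$1" ]]
}

# Succeeds if the given value is a non-empty string.
check_not_empty () {
    [[ -n "${1:-}" ]]
}

# Succeeds if $1 is a non-negative integer within [$2, $3].
# 10# forces base 10, so a value such as 08 is not read as octal.
check_int_range () {
    local value=$1 min=$2 max=$3
    [[ "$value" =~ ^[0-9]+$ ]] || return 1
    (( 10#$value >= min && 10#$value <= max ))
}

# Succeeds if string $1 has at least $2 characters.
check_min_length () {
    local str=$1 min=$2
    (( ${#str} >= min ))
}

# Example usage: report a failed check on stderr.
if ! check_int_range "42" 0 100; then
    echo "[ERROR]: value out of range" >&2
fi
```

Each helper only returns an exit status, so it can be combined with the explicit exit-status check from item 3 or chained with && and ||.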
Maizego Summer Tutorial
Why do we need pipelines?
maintenance
do not reinvent the wheel
share with colleagues
help to sort your thoughts
help to build more complex projects
What do we want from pipelines?
resettable parameters
checked and chained dependencies
controllable steps
Ways to build complex pipelines
make: Very hacker-like, not widely used
shell script
other (usually high-level) programming languages: perl, python, R, julia ...
third-party tools: snakemake, nextflow, wdl ...
We will only cover bash-script-based pipelines here
Pipelines with bash: things you should know beforehand
check dependencies: which, -x, --version ...
logging, stdout & stderr: >&1, >&2
parse arguments: getopts, getopt (GNU), DIY: $@ + case
command chains: |, &&, ||
step check and skip:
# check result files (when one main result file is needed)
if [[ ! -s "output.txt" ]]; then
    # your CMDs go here and generate result file
    $CMDs > output.txt
    if [ $? -ne 0 ]; then rm output.txt; exit 1; fi
fi
# check step tag file (when many result files are needed, or you want to control manually)
if [ ! -s "step1.done" ]; then
    # your CMDs go here
    $CMD1 && \
    $CMD2
    if [ $? -ne 0 ]; then exit 1; fi
    echo "done" > step1.done
fi
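The dependency check and argument parsing listed above could look roughly like the sketch below. It is an illustration only: the function names (check_deps, parse_args), the options -i and -t, and the variables INPUT and THREADS are hypothetical and not part of the tutorial. It also uses the bash builtin command -v in place of which, and getopts from the options listed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: dependency check plus getopts-based argument parsing.

# Return non-zero and report on stderr if any required command is missing.
check_deps () {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "[ERROR]: required command not found: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Parse -i <input> and -t <threads>; results go to the globals INPUT and THREADS.
parse_args () {
    INPUT=""
    THREADS=1
    local opt OPTIND=1   # reset OPTIND so the function can be called repeatedly
    while getopts ":i:t:" opt; do
        case "$opt" in
            i) INPUT=$OPTARG ;;
            t) THREADS=$OPTARG ;;
            :) echo "[ERROR]: option -$OPTARG needs a value" >&2; return 1 ;;
            \?) echo "[ERROR]: unknown option -$OPTARG" >&2; return 1 ;;
        esac
    done
}

# Example usage (messages go to stderr, keeping stdout free for results)
check_deps bash ls || echo "[WARNING]: some tools are missing" >&2
parse_args -i input.txt -t 4 && echo "input=$INPUT threads=$THREADS" >&2
```

In a real pipeline the script would typically call parse_args "$@" once at the top and exit when either function fails, before any step is run.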
Learn with a real problem:
>>> build an extremely accurate body index predictor
A prototype goes here:
List the requirements
A user input prompt
A supaaaar cooool ~~ calculating progress indicator
1. install cmatrix the hard way: compile from source code
# step 1: download the source code and enter the directory
git clone https://github.com/abishekvashok/cmatrix.git
cd cmatrix
# step 2: learn how to compile: from INSTALL, or README
# step 3: build:
autoreconf --install
./configure
make && make install # "make install" may fail if you are not root
# step 3': haha, being root-less brings us here: install into your own prefix
./configure --prefix=/path/to/your/path
make && make install
# step 4: add the executable to the $PATH
cd /path/to/cmatrix/executable
echo "export PATH=$PWD:\$PATH" >> "$HOME/.bashrc"
source "$HOME/.bashrc"
# test
which cmatrix
cmatrix
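Running step 4 more than once appends the same export line to .bashrc each time. One way to avoid duplicate PATH entries is a small guard like the sketch below; the function name add_to_path and the directory in the usage comment are hypothetical, and the function only changes PATH for the current shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend a directory to PATH only if it is not there yet.
add_to_path () {
    local dir=$1
    case ":$PATH:" in
        *":$dir:"*) ;;                   # already present: do nothing
        *) export PATH="$dir:$PATH" ;;   # otherwise prepend it
    esac
}

# Usage (the directory is a placeholder):
# add_to_path "/path/to/cmatrix/executable"
```

Placing the function and its call in .bashrc keeps PATH free of repeated entries, no matter how many times the file is sourced.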
Build the app: Coding
#!/usr/bin/env bash
# first things first: set options
set -o nounset
set -o pipefail
# define common log functions: with color
function mylog () {
    local info=$1
    echo -e "\033[36m[$(date +'%y-%m-%d %H:%M')]\033[0m $info" >&2
}
function myrcd () {
    local info=$1
    echo -e "\033[32m>>>------------>\033[0m $info" >&2
}
function mywarn () {
    local info=$1
    echo -e "\033[35m[WARNING]\033[0m --> $info" >&2
}
export -f mylog myrcd mywarn
# step 1: get user's height in cm
# set the main variable to store input
export height=0
echo "Please input your height (in cm):" >&2 # hint to stderr
read height
# get input height until it is valid
while [[ $height -gt 1000 || $height -lt 10 ]]
do
    mywarn "Really? Your height is $height cm? \nI can only predict humans.\nPlease input again (height in cm):" >&2
    read height
done
# step 2: fake calculating function
function fake_calc_prg(){
    # we generate random strings for logging
    mylog "Start calculating ..."
    sleep 2
    mylog "Building models ..."
    sleep 2
    cat /dev/urandom | strings -n 10 | head -n 20 |\
    while read str;do
        myrcd "$str$str"
        sleep 0.2
    done
}
export -f mylog myrcd fake_calc_prg
# step 3: main procedure: do fake calc and out
fake_calc_prg
mylog "Generating result reports ..."
sleep 3
export outtext="##= Your height is $height cm =##"
cmatrix -rM "$(echo $outtext)" -u 1
Save the above code locally as mzg_height_predictor.sh, then run: bash mzg_height_predictor.sh
Build the app: Debug and Improvement
Try running the script with 179.999 as input
❓ Homework:
1. How can we fix the bug above?
2. Can we add a fake multi-thread feature?
If the user chooses, say, 4 threads, the whole "calculation" should become 3~4 times faster