Swiss Army Knife

A short introduction to some very useful command-line tools.

Tools

pup

pup is a command line tool for processing HTML. It reads from stdin, prints to stdout, and allows the user to filter parts of the page using CSS selectors.

curl -s https://news.ycombinator.com/ | pup 'table table tr:nth-last-of-type(n+2) td.title a'

jq

jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text.

curl -s 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | jq '.[0] | {message: .commit.message, name: .commit.committer.name}'
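The same filter, select, and reshape ideas can be tried offline. The sketch below uses a small made-up JSON array in place of an API response (the names and ages are hypothetical) and assumes `jq` is installed.

```shell
# Hypothetical sample data standing in for an API response.
json='[{"name":"alice","age":30},{"name":"bob","age":25}]'

# Iterate over the array, keep entries with age > 26, print the name as raw text (-r).
names=$(printf '%s' "$json" | jq -r '.[] | select(.age > 26) | .name')

# Count the elements of the array.
count=$(printf '%s' "$json" | jq 'length')

echo "$names"
echo "$count"
```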

xargs

xargs builds and executes command lines from standard input.

# download all pdfs from a website
URL='https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/'; curl -s "$URL" | pup 'a attr{href}' | grep pdf | xargs -I {} wget "${URL}{}"
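How xargs groups its input into command lines is easier to see with `echo` than with `wget`. The following sketch runs offline and compares the default batching, `-n`, and the `-I {}` form used above. The input letters are placeholders.

```shell
# Default: xargs packs as many arguments as fit into one command line.
batched=$(printf '%s\n' a b c | xargs echo)

# -n 2: at most two arguments per invocation, so echo runs twice.
pairs=$(printf '%s\n' a b c | xargs -n 2 echo)

# -I {}: one invocation per input line, with {} replaced by that line.
per_line=$(printf '%s\n' a b c | xargs -I {} echo "item: {}")

echo "$batched"
echo "$pairs"
echo "$per_line"
```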

Intro to xargs

parallel

GNU parallel is a shell tool for executing jobs in parallel using one or more computers.

Cases

Fetch the available reservation time slots for the RAW restaurant in Taipei

PEOPLE=4; START_DATE='2015-12-24'; URL='https://api.eztable.com/v3/restaurants/2128/quotas?date=%s&people=%s&premium=true\n'; seq 0 10 | xargs -I {} date -d "$START_DATE {} days" +%Y-%m-%d | xargs -I {} printf "$URL" {} "$PEOPLE" | parallel "curl -s {} | jq '.premium_quotas[] | select(.availability==true) | {datetime: .datetime, purchase_link: .purchase_link}'"
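The first two stages of the pipeline above can be inspected without touching the API. This sketch reuses the same variables with a shorter range of three days and stops before the `parallel` step. It assumes GNU `date`, since `-d` with a relative offset is a GNU extension.

```shell
PEOPLE=4
START_DATE='2015-12-24'
URL='https://api.eztable.com/v3/restaurants/2128/quotas?date=%s&people=%s&premium=true\n'

# Stage 1: turn day offsets 0..2 into calendar dates (requires GNU date).
dates=$(seq 0 2 | xargs -I {} date -d "$START_DATE {} days" +%Y-%m-%d)

# Stage 2: fill each date and the party size into the URL template.
urls=$(printf '%s\n' "$dates" | xargs -I {} printf "$URL" {} "$PEOPLE")

echo "$dates"
echo "$urls"
```

Each line of `urls` is what `parallel` would then hand to `curl`.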

References