MIT - The missing semester writeup
Course details: https://missing.csail.mit.edu/
About
I have been working through The Missing Semester recently, and this is my writeup of the exercises that I found useful.
Data Wrangling
Find the number of words (in /usr/share/dict/words) that contain at least three a's and don't have a 's ending.
For this question, we use cat to read the data, sed to delete every word that ends in s, and awk to print every word that contains at least three a's. Piping the result into wc -l gives the count, but I leave that off here because the following questions reuse this output.
cat /usr/share/dict/words | sed -E '/^.*s$/d' | awk -Fa '{if (NF-1>=3) {print $0}}'
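To try the pipeline on a small word list before running it on the whole dictionary, it can be wrapped in a function. This is only a sketch: triple_a_words is a name I made up, and like the command above it counts lowercase a's only.

```shell
# Hypothetical helper: print the words in the file "$1" (one word per line)
# that contain at least three lowercase 'a's and do not end in 's'.
triple_a_words() {
    # Splitting on 'a' yields NF fields, so the word contains NF-1 'a's.
    sed -E '/s$/d' "$1" | awk -Fa 'NF - 1 >= 3'
}

# Counting the matches for the system dictionary would then be:
# triple_a_words /usr/share/dict/words | wc -l
```

Keeping the filter and the count separate means the later questions can reuse the same word list.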
What are the three most common last two letters of those words?
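Note that uniq only collapses adjacent duplicate lines, so the data must be sorted first. sort -u also removes duplicates, but it cannot count them the way uniq -c does. The second sort uses -n so that the counts are compared as numbers.
cat /usr/share/dict/words | sed -E '/^.*s$/d' | awk -Fa '{if (NF-1>=3){ print substr($0,length($0)-1)}}' | sort | uniq -c | sort -n | tail -n 3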
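The counting part (sort, uniq -c, numeric sort) is easy to check on a handful of words. Below is a minimal sketch, assuming the words arrive on stdin; top_endings is a hypothetical name, and it uses sort -rn with head instead of sort -n with tail so that the most frequent ending comes first.

```shell
# Hypothetical helper: read words on stdin and print the N most common
# two-letter endings with their counts (N defaults to 3).
top_endings() {
    awk 'length($0) >= 2 { print substr($0, length($0) - 1) }' \
        | sort \
        | uniq -c \
        | sort -rn \
        | head -n "${1:-3}"
}

# Example: printf 'banana\ncabana\nalabama\n' | top_endings 1
```

Without the first sort, uniq -c would report the same ending several times, once for each run of adjacent lines.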
How many of those two-letter combinations are there?
cat /usr/share/dict/words | sed -E '/^.*s$/d' | awk -Fa '{if (NF-1>=3){ print substr($0,length($0)-1)}}' | sort | uniq -c | wc -l
Which combinations do not occur?
First, we need a word list containing all combinations, which can be generated using echo {a..z}{a..z}. Then use tr to replace the spaces with newlines and use cat to concatenate the list with the stream. Finally, count how often each combination occurs. The combinations with a count of 1 are the ones we want: every combination appears once because of the generated word list, and a combination that also occurs in the dictionary appears a second time, giving it a count of 2.
cat /usr/share/dict/words | sed -E '/^.*s$/d' | awk -Fa '{if (NF-1>=3){ print substr($0,length($0)-1)}}' | sort | uniq | cat <(echo {a..z}{a..z} | tr ' ' '\n') - | sort | uniq -c | awk '$1=="1" {print $2}'
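The count-of-one trick is really a set difference, and comm can express that directly. The following is an alternative sketch, not the approach used above: missing_endings is a made-up name, the observed endings are assumed to arrive on stdin, and the aa..zz list is generated with awk so that the function also runs in a plain sh without brace expansion.

```shell
# Hypothetical helper: read two-letter endings on stdin and print every
# lowercase two-letter combination that is NOT among them.
missing_endings() {
    seen=$(mktemp) || return 1
    all=$(mktemp) || return 1
    # comm needs both inputs sorted the same way; LC_ALL=C forces byte order.
    LC_ALL=C sort -u > "$seen"
    # Generate aa, ab, ..., zz (ASCII 97..122), already in byte order.
    awk 'BEGIN { for (i = 97; i <= 122; i++) for (j = 97; j <= 122; j++) printf "%c%c\n", i, j }' > "$all"
    # -23 suppresses lines found only in "$seen" and lines found in both,
    # leaving the combinations that appear only in "$all".
    LC_ALL=C comm -23 "$all" "$seen"
    rm -f "$seen" "$all"
}
```

Unlike the counting version, an ending that is not two lowercase letters can never show up in the result, because only lines from the generated list are printed.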
2. To do in-place substitution it is quite tempting to do something like sed s/REGEX/SUBSTITUTION/ input.txt > input.txt. However this is a bad idea, why? Is this particular to sed? Use man sed to find out how to accomplish this.
When we use the > operator to redirect STDOUT, the shell truncates the target file before sed even starts, so sed reads an empty file and writes nothing, and the contents of input.txt are lost. This is not particular to sed: it happens with any command whose input file is also its redirection target. Use the -i option to edit the file in place.
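Since the exact form of -i differs between sed implementations (GNU sed accepts a bare -i, while the BSD sed on macOS expects an explicit backup suffix such as -i ''), a portable way to get the same effect is to write to a temporary file first. This is a minimal sketch of that idea; sed_inplace is a hypothetical helper, not something sed provides.

```shell
# Hypothetical helper: apply the sed expression "$1" to the file "$2"
# without reading from and redirecting into the same file at once.
sed_inplace() {
    expr=$1
    file=$2
    tmp=$(mktemp) || return 1
    # sed finishes reading "$file" before anything is written back to it.
    if sed -E "$expr" "$file" > "$tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example: sed_inplace 's/REGEX/SUBSTITUTION/' input.txt
```

The original file is only touched once sed has succeeded, so a typo in the expression does not wipe it.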
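3. Find your average, median, and max system boot time over the last ten boots. Use journalctl on Linux and log show on macOS, and look for log timestamps near the beginning and end of each boot.
Here grep picks the systemd lines that report the total startup time, tail keeps the ten most recent ones, sed extracts the number of seconds, and summary in R prints the statistics.
journalctl | grep "Startup finished in" | tail -n10 | sed -E 's/^.*= (.*)s\./\1/g' | R --slave -e 'x <- scan(file="stdin", quiet=TRUE); summary(x)'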
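If R is not installed, the three statistics can also be computed with sort and awk. Below is a rough sketch, assuming the input is one boot time in seconds per line (which is what the sed step above is meant to produce); boot_stats is a name I made up.

```shell
# Hypothetical helper: read one number per line on stdin and print the
# mean, median and maximum.
boot_stats() {
    # Sorting numerically first makes the median and maximum easy to pick.
    sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) median = v[(NR + 1) / 2]
            else        median = (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "mean=%.3f median=%.3f max=%.3f\n", sum / NR, median, v[NR]
        }'
}

# Example: printf '5.2\n4.8\n6.1\n' | boot_stats
```

For an even number of values the median is taken as the average of the two middle ones.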
Command-line Environment
From what we have seen, we can use some ps aux | grep commands to get our jobs' pids and then kill them, but there are better ways to do it. Start a sleep 10000 job in a terminal, background it with Ctrl-Z and continue its execution with bg. Now use pgrep to find its pid and pkill to kill it without ever typing the pid itself. (Hint: use the -af flags).
On Linux, -a is a pgrep-only option that prints the full command line, so pkill only needs -f.
pgrep -af "sleep 10000"
pkill -f "sleep 10000"
Say you don't want to start a process until another completes. How would you go about it? In this exercise, our limiting process will always be sleep 60 &. One way to achieve this is to use the wait command. Try launching the sleep command and having an ls wait until the background process finishes.
However, this strategy will fail if we start in a different bash session, since wait only works for child processes. One feature we did not discuss in the notes is that the kill command's exit status will be zero on success and nonzero otherwise. kill -0 does not send a signal but will give a nonzero exit status if the process does not exist. Write a bash function called pidwait that takes a pid and waits until the given process completes. You should use sleep to avoid wasting CPU unnecessarily.
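One possible pidwait is sketched below. It simply polls with kill -0 once per second, as the exercise suggests; the one-second interval is my own choice. One caveat: kill -0 also returns nonzero when the process exists but we are not allowed to signal it, so this sketch is only reliable for our own processes.

```shell
# Wait until the process with pid "$1" no longer exists.
pidwait() {
    # kill -0 sends no signal; it only reports whether the pid can be signalled.
    while kill -0 "$1" 2>/dev/null; do
        # Sleep between checks instead of spinning on the CPU.
        sleep 1
    done
}

# First part of the exercise, in the same shell (wait works on children):
#   sleep 60 &
#   wait $!
#   ls
#
# From another session, where <pid> is the pid of the sleep process:
#   pidwait <pid>; ls
```

A shorter interval makes ls start sooner after the process exits, at the cost of more wake-ups.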
Version Control (Git)
Who was the last person to modify README.md? (Hint: use git log with an argument).
git log README.md | grep "Author" | sed -E 's/Author: (.*) <.*@.*>/\1/' | head -n1
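git log can also print just the author name on its own, which avoids the grep and sed steps. This is an alternative sketch rather than the command used above; last_author is a hypothetical wrapper name, while -1 and --format=%an are standard git log options.

```shell
# Hypothetical helper: print the author name of the most recent commit
# that touched the path "$1" in the current repository.
last_author() {
    # -1 limits the log to one commit; %an expands to the author name.
    git log -1 --format=%an -- "$1"
}

# Example: last_author README.md
```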
What was the commit message associated with the last modification to the collections: line of _config.yml? (Hint: use git blame and git show).
git blame _config.yml | grep collections | head -c8 | xargs git show --format=%B | head -n1