Learn just enough Linux to get things done
Different operating systems have long catered to different audiences: Windows for the business professional, Mac for the creative professional and Linux for the software developer. For OS providers, this sort of market segmentation greatly simplified product vision, technical requirements, user experience and marketing direction. However, it also reinforced workplace norms which bucket individuals into narrow, non-overlapping domains: business people can offer no insight into the creative process, and developers no insight into business problems.
In reality, knowledge and skill are fluid, spanning multiple disciplines and fields. The notion that "you can only be good at one thing" is not a roadmap to mastery but rather a prescription for premature optimization. You can only know what you're good at once you've sampled a lot of things - and you may just find that you're good at a lot of them.
For modern business analysts, bridging the gap between business and software development is especially important. Business analysts must be "dual platform," able to leverage command-line tools available only on Linux (or OS X) yet still benefit from the power of Microsoft Office on Windows. Understandably, the world of Linux is intimidating for those with a business degree. Fortunately, as with most things, you only need to learn 20% of the information to accomplish 80% of the work. Here is my 20%.
Why modern business analysts should know Linux
Due to its open source roots, Linux benefited from the contributions of thousands of developers over time. They built programs and utilities not only to make their jobs easier, but also the jobs of programmers who followed them. As a result, open source development created a network effect: the more developers built utilities on the platform, the more other developers could leverage those utilities to write their programs right away.
What resulted was an expansive suite of programs and utilities (collectively, software) that were written in Linux, for Linux - much of which was never ported to Windows, or was ported only later. One example is the popular version control system (VCS) called git, which was written first for the Linux command line. Its developers could have targeted Windows from the start, but they didn't. They wrote it for Linux because it was the ecosystem which already had all the tools they needed.
Concretely, development on Windows runs into two main problems:
- Basic tasks, like file parsing, job scheduling, and text search are more involved than running a command-line utility
- Programming languages (eg. Python, C++) and their associated code libraries will throw errors because they are expecting certain Linux parameters or file system locations
Together, this means more time spent rewriting basic tools already available in Linux and troubleshooting OS compatibility errors. This is not a surprise - the Windows ecosystem simply wasn't designed with software development in mind.
With the case made for Linux development, let's begin with the basics.
The fundamental unit of Linux: the "shell"
The shell (also known as the terminal, console or command line) is a text-based user interface through which commands are sent to the machine. On most Linux distributions, the default shell is called bash. Unlike Windows users, who primarily point and click inside windows, Linux developers stick to their keyboard and type commands into the shell. While this transition is at first unnatural for those without a programming background, the benefits of developing in Linux easily outweigh the initial learning investment.
Learning the few important concepts
Compared to a full-fledged programming language, bash only has a few major concepts that need to be learned. Once these are covered, the rest of bash is just memorization. I'll restate for clarity: being good at bash is simply memorizing about 20-30 commands and their most common arguments.
Linux seems impenetrable to non-developers because of the way that developers seem to effortlessly regurgitate esoteric terminal commands at will. The reality is that they committed only a few dozen commands to memory - for anything more complicated, they too (like all mere mortals) consult Google.
With that out of the way, here are the main concepts in bash.
Command syntax
Commands follow the syntax of: {command} {arguments..}
For example, in 'grep -inr', grep is the command (to search for a string of text) and -inr are flags/arguments which change what grep does by default. The only way to learn what these mean is to look them up through Google or by typing 'man grep'. I recommend learning the commands and their most common arguments together; it's too burdensome otherwise to remember what each and every flag does.
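As a small sketch of that syntax, the snippet below runs grep with the -inr flags against a throwaway directory. The directory, the file name notes.txt and its contents are made up for illustration; only the grep flags come from the text above.

```shell
# Create a scratch directory with one small file to search (hypothetical names).
demo_dir=$(mktemp -d)
printf 'First line\nhello World\nlast line\n' > "$demo_dir/notes.txt"

# grep is the command; -i, -n and -r are flags; the pattern and path are arguments.
#   -i  ignore upper/lower case
#   -n  prefix each match with its line number
#   -r  search directories recursively
matches=$(grep -inr 'WORLD' "$demo_dir")
echo "$matches"   # prints <path>/notes.txt:2:hello World

# Tidy up the scratch directory.
rm -rf "$demo_dir"
```

The three flags are written together as -inr, but -i -n -r would do the same thing.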
Directory aliases
- The present directory (ie. where am I?): .
- The parent directory of the present directory: ..
- The user's home directory: ~
- The file system root (or the parent of all parents): /
For example, to change from the current directory to the parent directory, one would type: "cd .."
Similarly, to copy a file located at "/path/to/file.txt" into the present directory, one would enter "cp /path/to/file.txt ." (note the period at the end of the command). Since these are no more than aliases, the actual path name could be used in their place instead.
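Here is a rough walk-through of those aliases in a scratch directory. The folder names (project, reports) and the file contents are invented for the example.

```shell
start_dir=$PWD                      # remember where we began
demo_dir=$(mktemp -d)               # scratch area, removed at the end
mkdir -p "$demo_dir/project/reports"
echo 'quarterly numbers' > "$demo_dir/file.txt"

cd "$demo_dir/project/reports"
cd ..                               # .. is the parent, so we are now in project
parent_name=$(basename "$PWD")
echo "$parent_name"                 # prints project

cp "$demo_dir/file.txt" .           # the trailing . means "into the present directory"
copied=$(cat ./file.txt)            # . also works at the start of a path
echo "$copied"                      # prints quarterly numbers

cd /                                # / is the file system root
root_dir=$PWD
echo ~                              # prints the path that ~ stands for

cd "$start_dir"                     # go back and clean up
rm -rf "$demo_dir"
```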
STDIN / STDOUT
Anything you type into the window and submit (via ENTER) is called standard input (STDIN).
Anything that a program prints back out to the terminal (eg. text from within a file) is called standard output (STDOUT).
Piping
|
A pipe takes the STDOUT of the command to the left of the pipe and makes it the STDIN of the command to the right of the pipe.
example: echo 'test text' | wc -l
>
A greater-than sign takes the STDOUT of the command on the left and writes it to the file named on the right, creating the file or overwriting it if it already exists.
example: ls > tmp.txt
>>
Two greater-than signs take the STDOUT of the command on the left and append it to a new or existing file on the right.
example: date >> tmp.txt
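To see the three operators side by side, here is a minimal sketch that writes to a file in a scratch directory. The words written to the file are arbitrary; tmp.txt is the file name used in the examples above.

```shell
demo_dir=$(mktemp -d)
out_file="$demo_dir/tmp.txt"

# Pipe: echo's STDOUT becomes wc's STDIN; wc -l counts the lines it receives.
line_count=$(echo 'test text' | wc -l)
echo "$line_count"               # one line was piped in

# > creates the file, or overwrites it if it already exists.
echo 'first' > "$out_file"
echo 'second' > "$out_file"      # the file now holds only "second"

# >> appends to the end of the file instead of overwriting it.
echo 'third' >> "$out_file"

contents=$(cat "$out_file")
echo "$contents"                 # prints second, then third

rm -rf "$demo_dir"
```

The overwrite behaviour of > is the one to watch: pointing it at an existing file discards whatever was there.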
Wildcards
A wildcard stands in for any run of characters, and in bash it is written as an asterisk (*). You can think of it like SQL's % symbol - for example, you might write "WHERE first_name LIKE 'John%'" to catch any first name starting with John.
In bash, you would write "John*". If you want to list all of the files ending with ".json" in a folder, you would write: "ls *.json"
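A quick sketch of the wildcard in action, using three empty files with made-up names:

```shell
start_dir=$PWD
demo_dir=$(mktemp -d)
touch "$demo_dir/a.json" "$demo_dir/b.json" "$demo_dir/notes.txt"
cd "$demo_dir"

# The shell expands *.json into the matching file names before ls runs,
# so ls only ever sees a.json and b.json.
json_files=$(ls *.json)
echo "$json_files"

cd "$start_dir"
rm -rf "$demo_dir"
```

notes.txt is skipped because its name does not end in .json.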
Tab completion
Bash will often finish off commands intelligently for you if you start typing a command and hit your TAB key.
That being said, you should really use something like zsh or fish for autocomplete since it is hard to remember the commands and all their parameters - rather, these tools will autocomplete your commands based on your command history!
Quitting
Sometimes you'll get stuck in some program and you can't get out. This is a very frequent occurrence for beginners in Linux and it is extremely demotivating. Often, quitting has something to do with q. It's good to memorize the following and try them all when you're trapped.
- Bash
- CTRL+c
- q
- exit
- Python: quit()
- Nano: CTRL+x
- Vim: <Esc> :q!
My memorized list of bash commands
Here are the commands I use most frequently in Linux (sorted from most to least frequently used). As I mentioned before, knowing just a handful of commands will accomplish the vast majority of programmable tasks you need to perform.
- cd {directory}
- change directory
- ls -lha
- list directory (verbose)
- vim or nano
- command line editor
- touch {file}
- create a new empty file
- cp -R {original_name} {new_name}
- copy a file or directory (and all of its contents)
- mv {original_name} {new_name}
- move or rename a file
- rm {file}
- delete a file
- rm -rf {file/folder}
- permanently delete a file or folder [use with caution!]
- pwd
- print the present working directory
- cat or less or tail or head -n10 {file}
- print the contents of a file to STDOUT
- mkdir {directory}
- make an empty directory
- grep -inr {string}
- find a string in any files in this directory or child directories
- tabview <delimited_file>
- display a delimited file in columnar format
- ssh {username}@{hostname}
- connect to a remote machine
- tree -LhaC 3
- show directory structure 3 levels down (with file sizes and including hidden directories)
- htop
- task manager
- pip install --user {pip_package}
- Python package manager; --user installs the package under ~/.local (command-line scripts go to ~/.local/bin)
- pushd . ; popd ; dirs; cd -
- push/pop/view directories onto the stack + change back to last directory
- sed -i "s/{find}/{replace}/g" {file}
- replace a string in a file
- find . -type f -name '*.txt' -exec sed -i "s/{find}/{replace}/g" {} \;
- replace a string for each file in this and child folders with a name like *.txt
- tmux new -s session, tmux attach -t session
- create another terminal session without creating a new window [advanced]
- wget {link}
- download a webpage or web resource
- curl -X POST -d "{key: value}" http://www.google.com
- send an HTTP request to a web server
- find <directory>
- list all directory contents and their children, recursively
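The two sed entries are the least obvious in the list, so here is a rough sketch of both on throwaway files. The file names and the region=east text are invented, and the -i flag is used as GNU sed (the usual sed on Linux) accepts it; other sed versions may expect a different form.

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
echo 'region=east' > "$demo_dir/a.txt"
echo 'region=east' > "$demo_dir/sub/b.txt"
echo 'region=east' > "$demo_dir/c.csv"

# One file: replace east with west in a.txt only (assumes GNU sed for -i).
sed -i "s/east/west/g" "$demo_dir/a.txt"

# Every *.txt file at any depth: replace region with area.
# find hands each matching file to sed in place of {}.
find "$demo_dir" -type f -name '*.txt' -exec sed -i "s/region/area/g" {} \;

top_txt=$(cat "$demo_dir/a.txt")          # area=west   (both edits applied)
nested_txt=$(cat "$demo_dir/sub/b.txt")   # area=east   (only the find edit)
csv_text=$(cat "$demo_dir/c.csv")         # region=east (name is not *.txt)
echo "$top_txt" "$nested_txt" "$csv_text"

rm -rf "$demo_dir"
```

Because -i edits files in place, it is worth trying the command on a copy first.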
Advanced and infrequently used commands
I find it's good to keep a list of commands that are useful in certain situations (eg. which process is blocking a certain network port), even though those situations don't happen very often. These are some uncommon commands I keep nearby:
- lsof -i :8080
- list open file descriptors (-i flag restricts the list to network connections, here those on port 8080)
- netstat | head -n20
- list currently open Internet/UNIX sockets and related information
- dstat -a
- stream current disk, network, CPU activity & more
- nslookup <IP address>
- find hostname for a remote IP address
- strace -f -e <syscall> <cmd>
- trace system calls of a program (-e flag to filter for certain system calls)
- ps aux | head -n20
- print currently active processes
- file <file>
- check what a file type is (eg. executable, binary, ASCII-text file)
- uname -a
- kernel information
- lsb_release -a
- OS information
- hostname
- check the hostname of your machine (ie. the name so other computers can reach you)
- visualize process forks
- time <cmd>
- execute a command and report statistics about how long it took
- CTRL + z ; bg; jobs; fg
- send a process in current tty into background and back to foreground
- cat file.txt | xargs -n1 | sort | uniq -c
- count how many times each unique word appears in a file
- wc -l <file>
- line count in a file
- du -ha
- show size on disk for directories and their contents
- zcat <file.gz>
- display contents of a gzip-compressed text file
- scp <user>@<remote_host>:<remote_path> <local_path>
- copy a file from remote to local server, or vice versa
- man {command}
- show manual (ie. documentation) for a command, but you're probably better off using Google
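The word-count pipeline in this list chains four commands, so here is a sketch of it on a two-line sample file. The file contents are made up; the pipeline itself is the one listed above.

```shell
demo_dir=$(mktemp -d)
printf 'red blue red\ngreen red blue\n' > "$demo_dir/file.txt"

# cat      sends the file to STDOUT
# xargs -n1  re-emits the text one word per line
# sort     groups identical words next to each other
# uniq -c  collapses each group into a count followed by the word
word_counts=$(cat "$demo_dir/file.txt" | xargs -n1 | sort | uniq -c)
echo "$word_counts"   # counts: 2 blue, 1 green, 3 red (padded with spaces)

rm -rf "$demo_dir"
```

The sort step matters: uniq only merges lines that sit next to each other.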