Introduction to Command Line (Unix/Linux)

Abstract:

The command line (also known as the terminal, shell, or CLI) is a text-based interface that lets you interact with your computer by typing commands instead of using a graphical user interface (GUI). It is one of the most powerful tools for data scientists and developers: you can navigate your system, manage files, run scripts, and automate tasks efficiently.


Why Learn the Command Line?

  • Speed and Efficiency: Tasks like file management and program execution are faster with commands.
  • Script Automation: Automate repetitive tasks using shell scripts.
  • Remote Work: Most remote servers (e.g., AWS, Google Cloud) are accessed via command line.
  • Tool Integration: Many data science tools rely on command line usage, like Git, Python scripts, Docker, etc.

Getting Started with the Unix/Linux Shell

If you're on Linux or macOS, open the Terminal application.
On Windows, use WSL (Windows Subsystem for Linux) for a Unix-like experience. See the WSL installation instructions.


Basic Command Line Concepts

The Prompt

When you open the terminal, you'll see a prompt like this:

user@machine:~$
   

This is where you type your commands.


Essential Commands to Know

Here are foundational commands every data scientist should be comfortable with:

Command         Description
pwd             Print Working Directory – shows your current location
ls              List files and directories
cd              Change Directory
mkdir           Make a new directory
touch           Create a new empty file
cp              Copy files
mv              Move or rename files
rm              Remove files
cat             Display file contents
echo            Print text to the terminal
clear           Clear the terminal screen
man [command]   Show the manual for a command

Examples:


pwd
cd Documents
ls
mkdir data
touch data/sample.csv
cp data/sample.csv backup.csv
mv backup.csv archive.csv
rm archive.csv
   

Wildcards and Shortcuts

  • * matches any number of characters
    ls *.csv lists all CSV files
  • .. refers to the parent directory
    cd .. moves one level up
  • . refers to the current directory
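
To see these shortcuts in action, here is a minimal sketch you can paste into a terminal. It works in a throwaway directory created with mktemp, and the file and directory names (a.csv, notes.txt, reports) are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

mkdir reports
touch a.csv b.csv notes.txt reports/q1.csv

ls *.csv          # lists a.csv and b.csv; notes.txt does not match
ls ./reports      # "." is the current directory, so this lists reports/

cd reports
cd ..             # ".." moves back up to the demo directory

# Count the CSV files the wildcard matches here (tr strips padding from wc).
csv_count="$(ls *.csv | wc -l | tr -d ' ')"
echo "CSV files here: $csv_count"
```

Note that the wildcard only matches files in the directory you are in, so reports/q1.csv is not counted.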

Working with Files and Directories

cd ~/projects/data-science
ls -l     # Long listing format
ls -a     # Show hidden files
   

Viewing and Searching File Contents

cat filename.txt         # Print file contents
head filename.txt        # First 10 lines
tail filename.txt        # Last 10 lines
grep "search_term" file  # Search for a pattern
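
If you want to try these on real data, the sketch below builds a tiny CSV file first. The file contents are invented for the example, and the -n and -c flags are additions that are not covered above: head -n and tail -n set how many lines to show, grep -n prefixes each match with its line number, and grep -c counts matching lines.

```shell
# Create a small sample file in a temporary location.
sample_file="$(mktemp)"
printf '%s\n' "id,name,score" "1,ada,90" "2,bob,72" "3,cy,85" > "$sample_file"

head -n 2 "$sample_file"          # header plus the first data row
tail -n 1 "$sample_file"          # last row only
grep -n "bob" "$sample_file"      # matching line, prefixed with its line number

# Count the lines that contain a comma (every line in this file).
match_count="$(grep -c "," "$sample_file")"
echo "Lines with a comma: $match_count"
```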
   

Running Python or Shell Scripts

If you have a Python script named analyze.py:

python analyze.py
   

To run a shell script:

chmod +x script.sh     # Make it executable
./script.sh            # Run the script
   

Writing a Simple Shell Script

Create a file called hello.sh:

#!/bin/bash
echo "Hello, Data Scientist!"
   

Then run:

chmod +x hello.sh
./hello.sh
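
If you would rather not open an editor, one way to create the same script from the terminal is a here-document. This is a sketch under two assumptions: bash is installed at /bin/bash, and the script is written to a temporary directory (script_dir is a name chosen for this example) rather than your current one.

```shell
script_dir="$(mktemp -d)"

# Write the two-line script with a here-document instead of an editor.
# Quoting 'EOF' keeps the shell from expanding anything inside the body.
cat > "$script_dir/hello.sh" <<'EOF'
#!/bin/bash
echo "Hello, Data Scientist!"
EOF

chmod +x "$script_dir/hello.sh"          # make it executable
greeting="$("$script_dir/hello.sh")"     # run it and capture its output
echo "$greeting"
```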
   

Working with Pipes and Redirection

  • > redirects output to a file, overwriting the file if it already exists
  • >> appends output to the end of a file
  • | pipes output from one command to another

Example:

ls -l > filelist.txt
cat filelist.txt | grep "\.py"     # Escape the dot so it matches a literal "."
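
The difference between > and >> is easy to miss, so here is a small sketch that shows both alongside a pipe. The file names written into files.txt are invented, and everything happens in a temporary directory.

```shell
work_dir="$(mktemp -d)"
list_file="$work_dir/files.txt"

echo "alpha.py"  >  "$list_file"   # > creates the file, replacing any old contents
echo "beta.csv"  >> "$list_file"   # >> adds a line to the end
echo "gamma.py"  >> "$list_file"

# | chains commands: grep keeps lines ending in .py, wc -l counts them.
py_count="$(grep '\.py$' "$list_file" | wc -l | tr -d ' ')"
echo "Python files: $py_count"

echo "reset" > "$list_file"        # > again: the three earlier lines are gone
line_count="$(wc -l < "$list_file" | tr -d ' ')"
echo "Lines after overwrite: $line_count"
```

Because > silently replaces existing contents, it is worth double-checking the target file name before you press Enter.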
   

Installing Tools and Packages

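Use your system's package manager to install software: apt on Debian and Ubuntu, brew (Homebrew) on macOS, or yum on Red Hat-based distributions.
For example, on Debian or Ubuntu: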

sudo apt install git
   

sudo runs a command with administrative permissions, so use it carefully.


Remote Access & Servers

First, create an SSH key (Linux/macOS/WSL):

  1. Open a terminal.
  2. Run the command:
    ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
  3. When prompted with "Enter file in which to save the key", press Enter to accept the default path (~/.ssh/id_rsa).
  4. Choose whether to set a passphrase (adds security, optional), or press Enter again to skip.
  5. Your public key will be saved as ~/.ssh/id_rsa.pub and the private key as ~/.ssh/id_rsa.
  6. Don't share the private key.
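
If you want to rehearse these steps without touching your real ~/.ssh directory, the sketch below generates a key pair in a scratch directory instead. It assumes OpenSSH's ssh-keygen is installed, and it answers the prompts with flags: -f sets the output file and -N "" sets an empty passphrase. The key_dir location is chosen only for this example; for a real key, prefer the interactive steps above and consider a passphrase.

```shell
# Assumption: a scratch directory stands in for ~/.ssh so no real key is overwritten.
key_dir="$(mktemp -d)"
key_path="$key_dir/id_rsa"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -q silences the banner, -f names the key file, -N "" means no passphrase.
    ssh-keygen -q -t rsa -b 4096 -C "your_email@example.com" -f "$key_path" -N ""

    ls -l "$key_dir"        # id_rsa (private) and id_rsa.pub (public)
    cat "$key_path.pub"     # the public half is the one that is safe to share
else
    echo "ssh-keygen not found; install OpenSSH first"
fi
```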

To copy the public key to a server (assuming you have a password for that server):

ssh-copy-id user@server_address

If you don't have a password, copy the contents of ~/.ssh/id_rsa.pub and send them to the server admin.

You can then connect to remote machines using:

ssh username@hostname
   

This is essential for accessing cloud environments or remote research servers.


Advanced Tips

  • Use tab completion to auto-complete file names
  • Use Ctrl + C to stop a running command
  • Use history to view your command history
  • Use the up/down arrow keys to repeat previous commands

Common Pitfalls & How to Avoid Them

Issue                                 Solution
Permission denied                     Use chmod to give permission or run with sudo
Command not found                     Ensure the program is installed and in your PATH
Accidentally deleted files with rm    Use rm -i for confirmation prompts or alias safer alternatives
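
Here is a rough sketch of how you might check for each of these pitfalls yourself. It uses git only as an example program, and scratch_file is a temporary stand-in for one of your own files.

```shell
# "Command not found": ask the shell whether it can find a program on PATH.
if command -v git >/dev/null 2>&1; then
    echo "git found at: $(command -v git)"
else
    echo "git is not installed or not on PATH"
fi
echo "$PATH" | tr ':' '\n'          # print one PATH directory per line

# "Permission denied": check the permission bits, then add execute for the owner.
scratch_file="$(mktemp)"            # hypothetical stand-in for your own script
ls -l "$scratch_file"
chmod u+x "$scratch_file"
ls -l "$scratch_file"               # the owner's permissions now include x

# Accidental deletion: rm -i asks before each removal; answering "n" keeps the file.
echo "n" | rm -i "$scratch_file"
ls "$scratch_file"                  # still there
```

To make the confirmation prompt the default in interactive sessions, one common option is adding alias rm='rm -i' to your shell's startup file.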

Learn More:

https://www.youtube.com/watch?v=LKCVKw9CzFo

https://www.youtube.com/watch?v=ROjZy1WbCIA

https://fosswire.com/post/2007/08/unixlinux-command-cheat-sheet/

