DSA(DataStructures and Algo)
The Hot Cake Course of every institute
Algorithmic complexity / Big-O / Asymptotic analysis
Nothing to implement
There are a lot of videos here. Just watch enough until you understand it. You can always come back and review
If some lectures are too mathy, you can jump down to the bottom and watch the discrete mathematics videos to get the background knowledge
TopCoder (includes recurrence relations and master theorem):
Data Structures
Arrays
Implement an automatically resizing vector.
Description:
UC Berkeley CS61B - Linear and Multi-Dim Arrays (video) (Start watching from 15m 32s)
Implement a vector (mutable array with automatic resizing):
Practice coding using arrays and pointers, and pointer math to jump to an index instead of using indexing.
New raw data array with allocated memory
can allocate int array under the hood, just not use its features
start with 16, or if starting number is greater, use power of 2 - 16, 32, 64, 128
size() - number of items
capacity() - number of items it can hold
is_empty()
at(index) - returns item at given index, blows up if index out of bounds
push(item)
insert(index, item) - inserts item at index, shifts that index's value and trailing elements to the right
prepend(item) - can use insert above at index 0
pop() - remove from end, return value
delete(index) - delete item at index, shifting all trailing elements left
remove(item) - looks for value and removes index holding it (even if in multiple places)
find(item) - looks for value and returns first index with that value, -1 if not found
resize(new_capacity) // private function
when you reach capacity, resize to double the size
when popping an item, if size is 1/4 of capacity, resize to half
Time
O(1) to add/remove at end (amortized for allocations for more space), index, or update
O(n) to insert/remove elsewhere
Space
contiguous in memory, so proximity helps performance
space needed = (array capacity, which is >= n) * size of item, but even if 2n, still O(n)
Linked Lists
C Code (video) - not the whole video, just portions about Node struct and memory allocation
Linked List vs Arrays:
Gotcha: you need pointer to pointer knowledge: (for when you pass a pointer to a function that may change the address where that pointer points) This page is just to get a grasp on ptr to ptr. I don't recommend this list traversal style. Readability and maintainability suffer due to cleverness.
Implement (I did with tail pointer & without):
size() - returns number of data elements in list
empty() - bool returns true if empty
value_at(index) - returns the value of the nth item (starting at 0 for first)
push_front(value) - adds an item to the front of the list
pop_front() - remove front item and return its value
push_back(value) - adds an item at the end
pop_back() - removes end item and returns its value
front() - get value of front item
back() - get value of end item
insert(index, value) - insert value at index, so current item at that index is pointed to by new item at index
erase(index) - removes node at given index
value_n_from_end(n) - returns the value of the node at nth position from the end of the list
reverse() - reverses the list
remove_value(value) - removes the first item in the list with this value
Doubly-linked List
No need to implement
Stack
Will not implement. Implementing with array is trivial
Queue
Implement using linked-list, with tail pointer:
enqueue(value) - adds value at position at tail
dequeue() - returns value and removes least recently added element (front)
empty()
Implement using fixed-sized array:
enqueue(value) - adds item at end of available storage
dequeue() - returns value and removes least recently added element
empty()
full()
Cost:
a bad implementation using linked list where you enqueue at head and dequeue at tail would be O(n) because you'd need the next to last element, causing a full traversal each dequeue
enqueue: O(1) (amortized, linked list and array [probing])
dequeue: O(1) (linked list and array)
empty: O(1) (linked list and array)
Hash table
Online Courses:
distributed hash tables:
Implement with array using linear probing
hash(k, m) - m is size of hash table
add(key, value) - if key already exists, update value
exists(key)
get(key)
remove(key)
More Knowledge
Binary search
Implement:
binary search (on sorted array of integers)
binary search using recursion
Bitwise operations
Bits cheat sheet - you should know many of the powers of 2 from (2^1 to 2^16 and 2^32)
Get a really good understanding of manipulating bits with: &, |, ^, ~, >>, <<
Good intro: Bit Manipulation (video)
Swap values:
Absolute value:
Trees
Trees - Notes & Background
basic tree construction
traversal
manipulation algorithms
BFS(breadth-first search) and DFS(depth-first search) (video)
BFS notes:
level order (BFS, using queue)
time complexity: O(n)
space complexity: best: O(1), worst: O(n/2)=O(n)
DFS notes:
time complexity: O(n)
space complexity: best: O(log n) - avg. height of tree worst: O(n)
inorder (DFS: left, self, right)
postorder (DFS: left, right, self)
preorder (DFS: self, left, right)
Binary search trees: BSTs
C/C++:
Implement:
insert // insert value into tree
get_node_count // get count of values stored
print_values // prints the values in the tree, from min to max
delete_tree
is_in_tree // returns true if given value exists in the tree
get_height // returns the height in nodes (single node's height is 1)
get_min // returns the minimum value stored in the tree
get_max // returns the maximum value stored in the tree
is_binary_search_tree
delete_value
get_successor // returns next-highest value in tree after given value, -1 if none
Heap / Priority Queue / Binary Heap
visualized as a tree, but is usually linear in storage (array, linked list)
Implement a max-heap:
insert
sift_up - needed for insert
get_max - returns the max item, without removing it
get_size() - return number of elements stored
is_empty() - returns true if heap contains no elements
extract_max - returns the max item, removing it
sift_down - needed for extract_max
remove(i) - removes item at index x
heapify - create a heap from an array of elements, needed for heap_sort
heap_sort() - take an unsorted array and turn it into a sorted array in-place using a max heap or min heap
Sorting
Notes:
Implement sorts & know best case/worst case, average complexity of each:
no bubble sort - it's terrible - O(n^2), except when n <= 16
Stability in sorting algorithms ("Is Quicksort stable?")
Which algorithms can be used on linked lists? Which on arrays? Which on both?
I wouldn't recommend sorting a linked list, but merge sort is doable.
For heapsort, see Heap data structure above. Heap sort is great, but not stable
Merge sort code:
Quick sort code:
Implement:
Mergesort: O(n log n) average and worst case
Quicksort O(n log n) average case
Selection sort and insertion sort are both O(n^2) average and worst case
For heapsort, see Heap data structure above
Not required, but I recommended them:
As a summary, here is a visual representation of 15 sorting algorithms. If you need more detail on this subject, see "Sorting" section in Additional Detail on Some Subjects
Graphs
Graphs can be used to represent many problems in computer science, so this section is long, like trees and sorting were.
Notes:
There are 4 basic ways to represent a graph in memory:
objects and pointers
adjacency matrix
adjacency list
adjacency map
Familiarize yourself with each representation and its pros & cons
BFS and DFS - know their computational complexity, their trade offs, and how to implement them in real code
When asked a question, look for a graph-based solution first, then move on if none
MIT(videos):
Skiena Lectures - great intro:
Graphs (review and more):
Full Coursera Course:
I'll implement:
DFS with adjacency list (recursive)
DFS with adjacency list (iterative with stack)
DFS with adjacency matrix (recursive)
DFS with adjacency matrix (iterative with stack)
BFS with adjacency list
BFS with adjacency matrix
single-source shortest path (Dijkstra)
minimum spanning tree
DFS-based algorithms (see Aduni videos above):
check for cycle (needed for topological sort, since we'll check for cycle before starting)
topological sort
count connected components in a graph
list strongly connected components
check for bipartite graph
Even More Knowledge
Recursion
Stanford lectures on recursion & backtracking:
When it is appropriate to use it?
How is tail recursion better than not?
Dynamic Programming
You probably won't see any dynamic programming problems in your interview, but it's worth being able to recognize a problem as being a candidate for dynamic programming.
This subject can be pretty difficult, as each DP soluble problem must be defined as a recursion relation, and coming up with it can be tricky.
I suggest looking at many examples of DP problems until you have a solid understanding of the pattern involved.
Videos:
the Skiena videos can be hard to follow since he sometimes uses the whiteboard, which is too small to see
List of individual DP problems (each is short): Dynamic Programming (video)
Yale Lecture notes:
Object-Oriented Programming
SOLID OOP Principles: SOLID Principles (video)
Design patterns
Learn these patterns:
strategy
singleton
adapter
prototype
decorator
visitor
factory, abstract factory
facade
observer
proxy
delegate
command
state
memento
iterator
composite
flyweight
I know the canonical book is "Design Patterns: Elements of Reusable Object-Oriented Software", but Head First is great for beginners to OO.
Combinatorics (n choose k) & Probability
Khan Academy:
Course layout:
Just the videos - 41 (each are simple and each are short):
NP, NP-Complete and Approximation Algorithms
Know about the most famous classes of NP-complete problems, such as traveling salesman and the knapsack problem, and be able to recognize them when an interviewer asks you them in disguise.
Know what NP-complete means.
Peter Norvig discusses near-optimal solutions to traveling salesman problem:
Pages 1048 - 1140 in CLRS if you have it.
Processes and Threads
Computer Science 162 - Operating Systems (25 videos):
for processes and threads see videos 1-11
Covers:
Processes, Threads, Concurrency issues
Difference between processes and threads
Processes
Threads
Locks
Mutexes
Semaphores
Monitors
How they work?
Deadlock
Livelock
CPU activity, interrupts, context switching
Modern concurrency constructs with multicore processors
Process resource needs (memory: code, static storage, stack, heap, and also file descriptors, i/o)
Thread resource needs (shares above (minus stack) with other threads in the same process but each has its own pc, stack counter, registers, and stack)
Forking is really copy on write (read-only) until the new process writes to memory, then it does a full copy.
Context switching
How context switching is initiated by the operating system and underlying hardware?
Testing
To cover:
how unit testing works
what are mock objects
what is integration testing
what is dependency injection
Dependency injection:
Scheduling
In an OS, how it works?
Can be gleaned from Operating System videos
String searching & manipulations
If you need more detail on this subject, see "String Matching" section in Additional Detail on Some Subjects.
Tries
Note there are different kinds of tries. Some have prefixes, some don't, and some use string instead of bits to track the path
I read through code, but will not implement
Short course videos:
Floating Point Numbers
Endianness
Big And Little Endian Inside/Out (video)
Very technical talk for kernel devs. Don't worry if most is over your head.
The first half is enough.
Networking
if you have networking experience or want to be a reliability engineer or operations engineer, expect questions
Otherwise, this is just good to know
DSA practice problems:
Big-O Notation (Asymptotic Analysis) Practice Problems
Check some MCQs on space and time complexity here.
You can see some problems with solutions here: Time complexity of an algorithm
Stack and Queue Practice Problems
spoj.com - JNEXT
spoj.com - STPAR
spoj.com - ONP
codechef.com - COMPILER
spoj.com - MMASS
spoj.com - HISTOGRA
codeforces.com - D. Maximum Xor Secondary
spoj.com - ANARC09A
codeforces.com - C. Minimal string
codeforces.com - B. Alternating Current
codeforces.com - C. Longest Regular Bracket Sequence
Prime Numbers and Divisibility Practice Problems:
community.topcoder.com - DivisorInc
community.topcoder.com - Prime Polynom
community.topcoder.com - Prime Anagrams
community.topcoder.com - Refactoring
Dynamic programming (Basic DP) Practice Problems
Binary Search Problems
Last updated
Was this helpful?