Algorithms and Data Structures Tutorial - Full Course for Beginners -- notes
Readwise highlights:: Algorithms and Data Structures Tutorial - Full Course for Beginners
Algorithms and Algorithmic thinking
- An algorithm is a set of steps to be followed in order to solve a specific problem, with clearly defined input and output values.
- Algorithmic thinking is about being able to break a problem down and identify which algorithms and/or data structures best solve that problem.
- In Computer Science, some problems are very common and have been studied deeply, which has produced solutions known to solve them in the best way possible.
- Learning about algorithm complexity and efficiency allows programmers to evaluate those aspects in their own solutions.
Time complexity in algorithms
- Time is one of the main measures of efficiency when it comes to algorithms.
- There are 3 ways to measure how well an algorithm performs:
- Best case scenario
- Worst case scenario
- Average of various scenarios
- The worst case scenario is used most often because it guarantees the algorithm can never perform worse than that bound, which makes it the most useful metric.
Big O
- Big O is a notation for algorithmic complexity that's based on the worst case scenario.
- It can be used for both time and space complexity.
- It's represented by the capital letter "O" followed by an expression in parentheses. Here are some examples:
- O(n)
- O(log n)
- O(1)
- Big O gives us a function describing how an algorithm's runtime grows as its input size increases.
Comparing search algorithms
Linear Search
O(n)
- Given a list and a target value, this algorithm starts at the beginning of the list and checks elements one by one until it finds the target value.
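That description can be sketched in a few lines of Python (the function name is my own):

```python
def linear_search(values, target):
    """Check each element in order; worst case O(n) comparisons."""
    for index, value in enumerate(values):
        if value == target:
            return index  # found the target, return its position
    return -1  # target is not in the list
```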
Binary Search
O(log n)
- This algorithm needs the list to be sorted to work. Otherwise, it might discard the target value by mistake and not get the desired result.
- Given a list and a target value, this algorithm repeatedly splits the list in half based on how high the target number is in it.
- First it evaluates the middle element. If it's the target, the algorithm is done.
- If the target is lower than the middle value, the algorithm discards the upper half of the list and repeats the previous steps using the lower half as the new list.
- If the target is higher, the steps are repeated with the upper half instead.
- That set of steps is repeated until the target is found or the list has only one element.
- If the last element is not the target, then it wasn't in the list.
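The steps above can be sketched as an iterative Python function (a sketch, not the course's exact code):

```python
def binary_search(values, target):
    """Return the index of target in a sorted list, or -1 if absent. O(log n)."""
    low, high = 0, len(values) - 1
    while low <= high:
        mid = (low + high) // 2  # evaluate the middle element
        if values[mid] == target:
            return mid
        if target < values[mid]:
            high = mid - 1  # discard the upper half
        else:
            low = mid + 1   # discard the lower half
    return -1  # the range is empty: target wasn't in the list
```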
Common complexity times
- Constant runtimes - O(1)
- Reading a value from a list
- Getting the length of a list (depends on the language)
- Logarithmic runtimes - O(log n)
- Binary search (if the list is ordered)
- Linear runtimes - O(n)
- Linear search algorithm
- Quadratic runtimes - O(n^2)
- Algorithms with 2D matrices
- Cubic runtimes - O(n^3)
- Algorithms with 3D matrices
- Quasilinear runtimes - O(n log n)
- Sorting algorithms, such as Merge Sort
- Lies somewhere between the linear and quadratic runtimes
Polynomial runtimes vs exponential runtimes
Polynomial runtimes
- These algorithms' complexity can be defined by O(n^k). If plotted on a graph, all algorithms that lie below that line have polynomial runtimes.
- They are considered efficient algorithms and are the most likely to be used in common scenarios.
Exponential runtimes
- Algorithms with exponential runtimes are defined by O(x^n), which means their runtimes increase exponentially with the input size. They lie above the polynomial runtime line on the graph.
- These algorithms are not considered efficient and are not used for larger input sets. Even if a computer could handle the computation, it would take so much time to execute that it wouldn't be useful.
How to determine an algorithm's complexity?
- Calculate the complexity of each step.
- An algorithm's overall complexity is determined by its least performant (slowest-growing) step.
- So, if an algorithm has two steps with constant time (O(1)) and one step with logarithmic time (O(log n)), then its time complexity is O(log n).
Space complexity
- Space complexity takes into account the additional space required for the algorithm to run, not the total required space.
- So, if the space used by an algorithm doesn't change throughout its execution, it has constant space complexity, or O(1).
Iterative vs Recursive algorithms
Recursive algorithms
- Recursive algorithms have functions that call themselves.
- Need stop conditions (base cases) in order to avoid infinite loops.
- The number of calls a recursive function makes to itself when running an algorithm is called recursion depth.
- Some languages implement tail-call optimization, which allows recursive functions to run without growing the call stack, so recursion doesn't increase space complexity.
- Recursion is usually preferred in functional programming languages because it fits well with writing immutable code.
Iterative algorithm
- These usually use some kind of loop structure instead of recursion.
- Depending on the programming language, these use less additional space than recursive alternatives.
- Also, not every language can handle high recursion depth values.
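A small illustration of the two styles, using factorial as an example of my own choosing (not one prescribed by the course): the recursive version grows the call stack with n, while the iterative one uses constant extra space. Note that CPython does not perform tail-call optimization and limits recursion depth by default.

```python
def factorial_recursive(n):
    # Base case: the stop condition that prevents infinite recursion
    if n <= 1:
        return 1
    # Each call adds a stack frame, so recursion depth grows with n
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    # A loop keeps space usage constant regardless of n
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```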
Data Structures
- What is a data structure?
- What are arrays?
- They can be homogeneous or heterogeneous
- The data structure that most resembles an array in Python is the list (different from linked list)
- Stored in contiguous memory (sequential addresses)
- This means the structure only needs to store a reference to the first element, then it can loop through the other values by adding an offset from the first element
- Accessing contiguous memory spaces takes almost constant time
- Not using contiguous memory spaces can add significant overhead when accessing values from a data structure
- Heterogeneous arrays don't allow for contiguous memory because you can't know the size of each element
- Python gets around this problem by storing pointers instead of values and uses other mechanisms to decrease the overhead
- Data structures are expected to allow 4 kinds of operations
- accessing/reading values
- searching for specific values
- inserting more values
- deleting values
- Searching for items in an array using Python
- The in operator, which runs linear search under the hood
- Manually looping through the array and comparing each element
- Or, if the array was sorted, using a binary search algorithm
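The three approaches can be sketched in Python; for the sorted case, the standard library's bisect module provides a binary search rather than writing one by hand:

```python
import bisect

numbers = [4, 8, 15, 16, 23, 42]

# 1. The `in` operator runs a linear search under the hood
has_target = 23 in numbers

# 2. Manually looping through the list and comparing each element
found = any(value == 23 for value in numbers)

# 3. Binary search on an already-sorted list via bisect
i = bisect.bisect_left(numbers, 23)
found_sorted = i < len(numbers) and numbers[i] == 23
```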
- Inserting elements into an array
- Adding at a specific position
- Has to shift elements to the right, which means linear runtime for worst case scenario
- Appending
- Always adds the element to the end so it has constant runtime
- Concatenating another array to it
- Adding all the elements from a second array to the end of the original one.
- Inserting elements might require resizing the array
- Arrays might need to be resized before adding new elements
- Depending on the language, the resizing is automatic and happens at specific points.
- In Python, for instance, when a resize operation occurs, it adds more than 1 extra space so that it doesn't need to resize every time.
- Even though the resize operation might influence some steps' complexity, because it doesn't happen on every insert the overall complexity doesn't change. For instance, the append operation has amortized constant time complexity.
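The three insertion operations look like this on a Python list:

```python
numbers = [1, 2, 4, 5]

numbers.insert(2, 3)    # insert at index 2: shifts later elements right, O(n) worst case
numbers.append(6)       # add to the end: amortized O(1)
numbers.extend([7, 8])  # concatenate another list onto the end
# numbers is now [1, 2, 3, 4, 5, 6, 7, 8]
```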
- Deleting an array element
- Just as with inserting an element at a specific position, deleting a specific element might require shifting elements to fill the gap.
- So, deletions have linear runtime in the worst case.
- Arrays are good for reading values but are bad for inserting and deleting. Linked Lists can be a better option for those operations.
- Linked Lists
- Linear data structures
- Formed by Nodes, structures that store references to their neighbors
- To loop through a Linked List you start at the head element and follow the references to the next elements until you get to the tail.
- Can be singly or doubly-linked
- Singly: nodes only store references to the next neighbor
- Doubly: nodes store references to both the previous and next neighbor, so you can loop through the list in both directions
- Insert operations don't need to shift elements in memory; they just need to update the neighbor references in the adjacent nodes.
- Still, before inserting, you need to loop through the list to find the position where you want to insert the new element.
- Linked Lists are useful for learning purposes but are rarely used in real scenarios. There are usually more efficient options.
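A minimal singly linked list sketch in Python (class and method names are my own):

```python
class Node:
    """One node: a value plus a reference to the next node (None at the tail)."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

class LinkedList:
    def __init__(self):
        self.head = None

    def prepend(self, value):
        # Inserting at the head never shifts elements: O(1)
        self.head = Node(value, self.head)

    def to_list(self):
        # Follow the chain of next references from head to tail
        values, node = [], self.head
        while node is not None:
            values.append(node.value)
            node = node.next
        return values

items = LinkedList()
for v in (3, 2, 1):
    items.prepend(v)  # each prepend is O(1)
# items.to_list() is now [1, 2, 3]
```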
- Merge sort algorithm
- Steps
- Split the array in 2 recursively until you have multiple single-element arrays
- Then start comparing the sub-arrays and merge them back until the whole array is sorted
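The two steps above can be sketched as a recursive Python function (a sketch, not the course's exact code):

```python
def merge_sort(values):
    # Base case: lists of 0 or 1 elements are already sorted
    if len(values) <= 1:
        return values
    # Split the list in two and sort each half recursively
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    # Merge the two sorted halves back together
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # one of these two is non-empty at most
    merged.extend(right[j:])
    return merged
```

Each level of splitting does O(n) merge work across O(log n) levels, which is where the O(n log n) runtime comes from.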