Sorting: rearrange a collection of items into increasing or decreasing order.
$$ T(n) = \begin{cases} O(n^2), & Naive/Brute-force~~algorithm, insertion sort\\ O(n log n), & heapsort, merge sort, quick sort \end{cases} $$
Insertion sort is both faster and well-arguably more simplistic than both bubble sort and selection sort. Funny enough, it’s how many people sort their cards when playing a card game!
On each loop iteration, insertion sort removes one element from the array. It then finds the location where that element belongs within another sorted array and inserts it there. It repeats this process until no input elements remain.
class Solution:
"""
Basic idea:
- take out the number from left to right one by one
- For the i-th number, find out the position in the previous (left) i-1 numbers
by shifting the larger one to right by one position recursively.
Time complexity: O(n^2)
"""
def __init__(self,array):
self.array = array
def insert_sort(self):
array = self.array
for i in range(len(array)):
cursor = array[i] # take out the i-th number
position = i
# looking for the position in which to insert the i-th number
while position >0 and array[position-1]>cursor:
array[position] = array[position-1] # shift the number to right by 1
position = position -1 # shift to next-left one
array[position] = cursor # finally find out the position for the number cursor
return array
import numpy as np
import time
A = np.random.randint(0,100,50) # randomly take 50 integer numbers in between 0 and 100
ts = time.time()
print(Solution(A).insert_sort() )
print("Time elapse:",time.time()-ts)
Merge sort is a perfectly elegant example of a Divide and Conquer algorithm. It simple uses the 2 main steps of such an algorithm:
Continuously divide the unsorted list until you have N sublists, where each sublist has 1 element that is “unsorted” and N is the number of elements in the original array.
Repeatedly merge i.e conquer the sublists together 2 at a time to produce new sorted sublists until all elements have been fully merged into a single sorted array.
def merge_sort(arr):
# The last array split
if len(arr) <= 1:
return arr
mid = len(arr) // 2
# Perform merge_sort recursively on both halves
left, right = merge_sort(arr[:mid]), merge_sort(arr[mid:])
# Merge each side together
return merge(left, right, arr.copy())
def merge(left, right, merged):
left_cursor, right_cursor = 0, 0
while left_cursor < len(left) and right_cursor < len(right):
# Sort each one and place into the result
if left[left_cursor] <= right[right_cursor]:
merged[left_cursor+right_cursor]=left[left_cursor]
left_cursor += 1
else:
merged[left_cursor + right_cursor] = right[right_cursor]
right_cursor += 1
for left_cursor in range(left_cursor, len(left)):
merged[left_cursor + right_cursor] = left[left_cursor]
for right_cursor in range(right_cursor, len(right)):
merged[left_cursor + right_cursor] = right[right_cursor]
return merged
import time
#A = np.random.randint(0,100,4) # randomly take 50 integer numbers in between 0 and 100
A = [5,4,3,2,1,0]
ts = time.time()
print(merge_sort(A))
print(time.time()-ts)
Quick sort is also a divide and conquer algorithm like merge sort. Although it’s a bit more complicated, in most standard implementations it performs significantly faster than merge sort and rarely reaches its worst case complexity of $O(n^2)$. It has 3 main steps:
def partition(array, begin, end):
pivot_idx = begin
for i in range(begin+1, end+1):
if array[i] <= array[begin]:
pivot_idx += 1
array[i], array[pivot_idx] = array[pivot_idx], array[i]
array[pivot_idx], array[begin] = array[begin], array[pivot_idx]
return pivot_idx
def quick_sort_recursion(array, begin, end):
if begin >= end:
return
pivot_idx = partition(array, begin, end)
quick_sort_recursion(array, begin, pivot_idx-1)
quick_sort_recursion(array, pivot_idx+1, end)
def quick_sort(array, begin=0, end=None):
if end is None:
end = len(array) - 1
return quick_sort_recursion(array, begin, end)
import time
A = np.random.randint(0,100,50) # randomly take 50 integer numbers in between 0 and 100
ts = time.time()
print(quick_sort(A))
print(time.time()-ts)
Strengths:
Linear time. Counting sort runs in O(n), making it asymptotically faster than comparison-based sorting algorithms like quicksort or merge sort.
Weaknesses:
Restricted inputs. Counting sort only works when the range of potential items in the input is known ahead of time. Space cost. If the range of potential values is big, then counting sort requires a lot of space.
def counting_sort(theList,max_value):
counts = [0]*(max_value+1)
# put the numbers in theList on the coordinate
# if the number exist, counting its frequency and put the frequency at the position
# for example: theList=[2,4,3,4], the max value is 4
# counts = [0,0,0,0,0]
# / / / / /
# 0 1 2 3 4
for num in theList:
counts[num] += 1
# Now, counts becomes [0,0,1,1,2]
# Overwrite counts to hold the next index an item with
# a given value goes.
accumulate = 0
for i, count in enumerate(counts):
counts[i] = accumulate
accumulate += count
# Now counts becomes [0,0,0,1,2]
sorted_list = [None]*len(theList)
# theList[2,4,3,4]
for num in theList:
# Place the item in the sorted list
# counts[2] = 0 -> sorted_list[0] = 2 -> counts[2]=1
# counts[4] = 2 -> sorted_list[2] = 4 -> counts[4]=3
# counts[3] = 1 -> sorted_list[1] = 3 -> counts[3]=2
# counts[4] = 3 -> sorted_list[2] = 4 -> counts[4]=4
sorted_list[counts[num]] = num
# And, make sure the next item we see with the same value
# goes after the one we just placed
counts[num] += 1
return sorted_list
A = [2,4,3,4]
print(counting_sort(A,4))
import time
A = np.random.randint(0,100,50) # randomly take 50 integer numbers in between 0 and 100
ts = time.time()
print(counting_sort(A,A.max()))
print(time.time()-ts)
sort(key=None, reverse=False): in place sort, the calling list is updated.
import numpy as np
a=list(np.random.randint(1,100,10))
a
a.sort()
a
sorted(a, key=lambda x: str(x)): return a new list containing all the iterms in ascending order
class Student(object):
def __init__(self,name,GPA):
self.name= name
self.GPA = GPA
def __lt__(self,other):
return self.name < other.name
students = [Student('John',4.0),Student('Alice',4.5),Student('Peter',5.0), Student('Nancy',3.0)]
students[0].name
sort by name, students is unchanged
students_sorted_byname = sorted(students)
students_sorted_byname[0].name,students[0].name
sort in place by GPA using sorted(key=)
students.sort(key= lambda Student: Student.GPA)
students[0].name