Bubble sort: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Jafet (talk | contribs)
complete revamp of top half
Line 1: Line 1:
'''Bubble sort''', also known as '''exchange sort''', is a simple [[sorting algorithm]]. It works by repeatedly stepping through the list to be sorted, comparing two items at a time, swapping these two items if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which means the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top (i.e. head) of the list via the swaps. Because it only uses comparisons to read elements, it is a [[comparison sort]].
'''Bubble sort''', sometimes shortened to '''bubblesort''', also known as '''exchange sort''', is a simple [[sorting algorithm]]. It works by repeatedly stepping through the list to be sorted, comparing two items at a time and [[swap]]ping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which means the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top (i.e. the beginning) of the list via the swaps. Because it only uses comparisons to operate on elements, it is a [[comparison sort]].


In more detail, the bubble sort algorithm works as follows:
A simple way to express bubble sort in [[pseudocode]] is as follows:
'''function''' bubble_sort(''list'' L, ''number'' listsize)
#Compare adjacent elements. If the first is greater than the second, swap them.
'''loop'''
#Do this for each pair of adjacent elements, starting with the first two and ending with the last two. At this point the last element should be the greatest.
has_swapped := 0 <span style="color:green">//reset flag</span>
#Repeat the steps for all elements except the last one.
'''for''' ''number'' i '''from''' 1 '''to''' listsize - 1
#Keep repeating for one fewer element each time, until you have no more pairs to compare. (Alternatively, keep repeating until no swaps are needed.)
'''if''' L[i] > L[i + 1] <span style="color:green">//if they are in the wrong order</span>
swap(L[i], L[i + 1]) <span style="color:green">//exchange them</span>
has_swapped := 1 <span style="color:green">//we have swapped at least once, list may not be sorted yet</span>
'''endif'''
'''endfor'''
<span style="color:green">//if no swaps were made during this pass, the list has been sorted</span>
'''if''' has_swapped = 0
'''exit'''
'''endif'''
'''endloop'''
'''endfunction'''


'''function''' bubblesort (A : ''list''[1..n]) {
'''var''' ''int'' i, j;
'''for''' i '''from''' n '''downto''' 1 {
'''for''' j '''from''' 1 '''to''' i-1 {
'''if''' (A[j] > A[j+1])
swap(A[j], A[j+1])
}
}
}


==Analysis==


===Worst-case performance===
Bubble sort has worst-case complexity ''[[Big-O notation|O]](n<sup>2</sup>)'' on lists of size ''n''. To see why, note that each element is moved no more than one step each time. No element can be more than a distance of ''n'' away from its final sorted position, so we use at most ''O(n)'' operations to move an element to its final sorted position, and use no more than ''O(n<sup>2</sup>)'' operations in the worst case.


This bound is tight. On a list where the smallest element is at the bottom, each pass moves that element up by only one step, so ''n - 1'' passes are needed to bring it to its final sorted position, plus one more pass to confirm that no swaps remain. Each pass traverses the whole list and therefore takes ''O(n)'' time, so the worst case also requires at least on the order of ''n<sup>2</sup>'' operations.
== Performance ==


===Best-case performance===
Bubble sort needs <math>O(n^2)</math> comparisons to sort n items and can sort in-place. Although the algorithm is one of the simplest sorting algorithms to understand and implement, it is too inefficient for use on lists having more than a few elements. Even among simple <math>O(n^2)</math> sorting algorithms, algorithms like [[insertion sort]] are considerably more efficient.
When a list is already sorted, bubblesort will pass through the list once, and find that it does not need to swap any elements. This means the list is already sorted. Thus bubblesort will take ''O(n)'' time when the list is completely sorted. It will also use considerably less time if the elements in the list are not too far from their sorted places.


===Rabbits and turtles===
Due to its simplicity, the bubble sort is often used to introduce the concept of an algorithm to introductory programming students. However, some researchers such as Owen Astrachan have gone to great lengths to disparage bubble sort and its
The positions of the elements in bubble sort will play a large part in determining its performance. Large elements at the top of the list do not pose a problem, as they are quickly swapped downwards. Small elements at the bottom, however, as mentioned earlier, move to the top extremely slowly. This has led to these types of elements being named [[the Tortoise and the Hare|rabbits and turtles]], respectively.
continued popularity in computer science education, recommending that it no longer even be taught.[http://www.cs.duke.edu/~ola/papers/bubble.pdf] The [[Jargon file]], which famously calls [[bogosort]] "[t]he archetypical perversely awful algorithm", also calls bubble sort "the generic ''bad'' algorithm".[http://www.jargon.net/jargonfile/b/bogo-sort.html] Don Knuth, in his famous ''[[The Art of Computer Programming]]'', concluded that "the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems", some of which he discusses therein.


Various efforts have been made to eliminate turtles and so improve the speed of bubble sort. [[Cocktail sort]] alternates the direction of its passes, so turtles are carried to the top as quickly as rabbits are carried to the bottom, but it still retains ''O(n<sup>2</sup>)'' worst-case complexity. [[Comb sort]] compares elements separated by large gaps and can move turtles extremely quickly, before proceeding to smaller and smaller gaps to smooth out the list. Its worst case is still ''O(n<sup>2</sup>)'', but on typical inputs it is often reported to run close to ''O(n log n)'', which makes it a simple alternative to more complex algorithms like [[quicksort]] and [[heapsort]].
Bubble sort is [[Asymptotic notation|asymptotically]] equivalent in running time to [[insertion sort]] in the worst case, but the two algorithms differ greatly in the number of swaps necessary. Insertion sort needs only <math>O(n)</math> operations if the list is already sorted, whereas naïve implementations of bubble sort (like the pseudocode above) require <math>O(n^2)</math> operations. (This can be reduced to <math>O(n)</math> if code is added to stop the outer loop when the inner loop performs no swaps.) Experimental results such as those of Astrachan have also shown that insertion sort performs considerably better even on random lists. For these reasons many modern algorithm textbooks avoid using the bubble sort algorithm in favor of insertion sort.


===Alternative implementations===
Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more [[branch prediction|branch mispredictions]]. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort and 40% slower than [[selection sort]].
One way to optimize bubblesort is to note that, after each pass, the largest element will always move down to the bottom. During each comparison, it is clear that the largest element will move downwards. Given a list of size ''n'', the ''n<sup>th</sup>'' element will be guaranteed to be in its proper place. Thus it suffices to sort the remaining ''n - 1'' elements. Again, after this pass, the ''n - 1<sup>th</sup>'' element will be in its final place.


We can then do bubbling passes over increasingly smaller parts of the list. More precisely, instead of making ''n - 1'' comparisons in every pass, for up to ''n(n - 1)'' comparisons in total, we make only ''(n-1) + (n-2) + ... + 2 + 1'' comparisons. This [[arithmetic progression|sums]] to ''n(n - 1) / 2'', which is still ''O(n<sup>2</sup>)'' but roughly halves the number of comparisons in the worst case.
Reversing the order in which the list is traversed for each pass improves the efficiency somewhat. This bi-directional bubblesort is sometimes called [[shuttle sort]] since the algorithm shuttles from one end of the list to the other.


==In practice==
Although bubble sort is one of the simplest sorting algorithms to understand and implement, its ''O(n<sup>2</sup>)'' complexity means it is far too inefficient for use on lists having more than a few elements. Even among simple ''O(n<sup>2</sup>)'' sorting algorithms, algorithms like [[insertion sort]] are considerably more efficient.

Due to its simplicity, bubble sort is often used to introduce the concept of an algorithm, or a sorting algorithm, to introductory [[computer science]] students. However, some researchers such as Owen Astrachan have gone to great lengths to disparage bubble sort and its continued popularity in computer science education, recommending that it no longer even be taught.[http://www.cs.duke.edu/~ola/papers/bubble.pdf] The [[Jargon file]], which famously calls [[bogosort]] "the archetypical perversely awful algorithm", also calls bubble sort "the generic '''bad''' algorithm".[http://www.jargon.net/jargonfile/b/bogo-sort.html] [[Donald Knuth]], in his famous ''[[The Art of Computer Programming]]'', concluded that "the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems", some of which he discusses therein.

Bubble sort is [[Asymptotic notation|asymptotically]] equivalent in running time to [[insertion sort]] in the worst case, but the two algorithms differ greatly in the number of swaps necessary. Experimental results such as those of Astrachan have also shown that insertion sort performs considerably better even on random lists. For these reasons many modern algorithm textbooks avoid using the bubble sort algorithm in favor of insertion sort.

Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more [[branch prediction|branch mispredictions]]. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort and 40% slower than [[selection sort]].


== Variations ==
== Variations ==
*'''Odd-even sort''' is a parallel version of [[bubblesort]], for message passing systems.
*'''Odd-even sort''' is a parallel version of bubble sort, for message passing systems.


== References ==
== References ==

Revision as of 16:11, 12 August 2006

Bubble sort, sometimes shortened to bubblesort, also known as exchange sort, is a simple sorting algorithm. It works by repeatedly stepping through the list to be sorted, comparing two items at a time and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which means the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top (i.e. the beginning) of the list via the swaps. Because it only uses comparisons to operate on elements, it is a comparison sort.

A simple way to express bubble sort in pseudocode is as follows:

function bubble_sort(list L, number listsize)
    loop
        has_swapped := 0 //reset flag
        for number i from 1 to listsize - 1 //stop one short so that L[i + 1] stays inside the list
            if L[i] > L[i + 1] //if they are in the wrong order
                swap(L[i], L[i + 1]) //exchange them
                has_swapped := 1 //we have swapped at least once, list may not be sorted yet
            endif
        endfor
        //if no swaps were made during this pass, the list has been sorted
        if has_swapped = 0
            exit
        endif
    endloop
endfunction
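
The pseudocode above might be translated into Python roughly as follows. This is only a sketch: the name `bubble_sort` and the choice to sort in place and return the same list are assumptions made for illustration, and Python lists are indexed from 0 rather than 1.

```python
def bubble_sort(items):
    """Sort a list in place with bubble sort and return it."""
    n = len(items)
    while True:
        has_swapped = False  # reset flag at the start of every pass
        # Stop at n - 2 so that items[i + 1] never runs past the end.
        for i in range(n - 1):
            if items[i] > items[i + 1]:  # pair is in the wrong order
                items[i], items[i + 1] = items[i + 1], items[i]
                has_swapped = True
        # A pass without swaps means the list is sorted.
        if not has_swapped:
            return items


print(bubble_sort([5, 1, 4, 2, 8]))
```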


Analysis

Worst-case performance

Bubble sort has worst-case complexity O(n²) on lists of size n. To see why, note that each swap moves an element by exactly one position. No element starts more than n - 1 positions away from its final sorted position, so moving any one element into place takes O(n) swaps, and moving all n elements takes no more than O(n²) swaps in the worst case. The comparisons obey the same bound: at most n passes are needed, and each pass makes n - 1 comparisons.

This bound is tight. On a list where the smallest element is at the bottom, each pass moves that element up by only one step, so n - 1 passes are needed to bring it to its final sorted position, plus one more pass to confirm that no swaps remain. Each pass traverses the whole list and therefore takes O(n) time, so the worst case also requires at least on the order of n² operations.

Best-case performance

When a list is already sorted, bubble sort passes through it once, performs n - 1 comparisons, finds nothing to swap, and stops. Thus bubble sort takes O(n) time on a completely sorted list. It also takes considerably less time when every element starts close to its sorted position, because only a few passes are then needed.
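
The worst-case and best-case behaviour can be checked by instrumenting the algorithm. The sketch below is illustrative only; the function name `bubble_sort_counts` and the tuple it returns are assumptions, not part of any standard library.

```python
def bubble_sort_counts(items):
    """Bubble-sort a copy of items; return (sorted, passes, comparisons, swaps)."""
    data = list(items)
    passes = comparisons = swaps = 0
    while True:
        passes += 1
        swapped = False
        for i in range(len(data) - 1):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swaps += 1
                swapped = True
        if not swapped:
            return data, passes, comparisons, swaps


# Already sorted: one pass, n - 1 comparisons, no swaps.
print(bubble_sort_counts([1, 2, 3, 4, 5, 6]))
# Reverse order: n passes, n(n - 1) comparisons, n(n - 1)/2 swaps.
print(bubble_sort_counts([6, 5, 4, 3, 2, 1]))
```

For n = 6 the sorted input needs 1 pass and 5 comparisons, while the reversed input needs 6 passes, 30 comparisons and 15 swaps.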

Rabbits and turtles

The positions of the elements play a large part in determining bubble sort's performance. Large elements near the top (the beginning) of the list do not pose a problem, because a single pass can carry them all the way down. Small elements near the bottom (the end), however, as mentioned earlier, move toward the top by only one position per pass. This has led to these two kinds of elements being named rabbits and turtles, respectively.
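
To make the difference concrete, the following sketch records where a chosen element sits after each pass. The helper name `positions_after_each_pass` is hypothetical, and the example assumes the tracked value occurs only once in the list.

```python
def positions_after_each_pass(items, value):
    """Return the 0-based index of value after every bubble sort pass."""
    data = list(items)
    positions = []
    while True:
        swapped = False
        for i in range(len(data) - 1):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        positions.append(data.index(value))
        if not swapped:  # the last pass only confirms the order
            return positions


# Rabbit: the large element 9 reaches the bottom in a single pass.
print(positions_after_each_pass([9, 1, 2, 3, 4], 9))  # [4, 4]
# Turtle: the small element 1 climbs one position per pass.
print(positions_after_each_pass([2, 3, 4, 5, 1], 1))  # [3, 2, 1, 0, 0]
```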

Various efforts have been made to eliminate turtles and so improve the speed of bubble sort. Cocktail sort alternates the direction of its passes, so turtles are carried to the top as quickly as rabbits are carried to the bottom, but it still retains O(n²) worst-case complexity. Comb sort compares elements separated by large gaps and can move turtles extremely quickly, before proceeding to smaller and smaller gaps to smooth out the list. Its worst case is still O(n²), but on typical inputs it is often reported to run close to O(n log n), which makes it a simple alternative to more complex algorithms like quicksort and heapsort.
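
A bidirectional pass of the kind cocktail sort uses could look like the sketch below. It is one possible formulation, not a reference implementation; the name `cocktail_sort` and the decision to return a sorted copy are assumptions.

```python
def cocktail_sort(items):
    """Return a sorted copy of items using alternating forward/backward passes."""
    data = list(items)
    lo, hi = 0, len(data) - 1  # bounds of the still-unsorted region
    swapped = True
    while swapped and lo < hi:
        swapped = False
        # Forward pass: carries the largest remaining element down to hi.
        for i in range(lo, hi):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        hi -= 1
        if not swapped:
            break
        swapped = False
        # Backward pass: carries the smallest remaining element up to lo.
        for i in range(hi, lo, -1):
            if data[i - 1] > data[i]:
                data[i - 1], data[i] = data[i], data[i - 1]
                swapped = True
        lo += 1
    return data


print(cocktail_sort([2, 3, 4, 5, 1]))
```

On the list `[2, 3, 4, 5, 1]` the single backward pass moves the turtle `1` to the front, whereas plain bubble sort needs four passes to do the same.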

Alternative implementations

One way to optimize bubble sort is to note that each pass moves the largest remaining element down to the bottom: once a comparison reaches the largest element, every later comparison in that pass swaps it one step further down. Given a list of size n, the nth element is therefore guaranteed to be in its proper place after the first pass, so it suffices to sort the remaining n - 1 elements. After the second pass, the (n - 1)th element is in its final place as well, and so on.

We can then do bubbling passes over increasingly smaller parts of the list. More precisely, instead of making n - 1 comparisons in every pass, for up to n(n - 1) comparisons in total, we make only (n-1) + (n-2) + ... + 2 + 1 comparisons. This sums to n(n - 1) / 2, which is still O(n²) but roughly halves the number of comparisons in the worst case.
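
The shrinking-range idea can be sketched as follows. The function name `bubble_sort_shrinking` and the returned comparison counter are assumptions added for illustration. A pass over k elements makes k - 1 comparisons, so this version always performs n(n - 1) / 2 comparisons on a list of n elements.

```python
def bubble_sort_shrinking(items):
    """Return (sorted copy, comparison count), shortening each pass by one."""
    data = list(items)
    comparisons = 0
    # After each pass the element at index `end` is in its final place.
    for end in range(len(data) - 1, 0, -1):
        for j in range(end):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data, comparisons


print(bubble_sort_shrinking([6, 5, 4, 3, 2, 1]))  # 15 comparisons for n = 6
```

This variant omits the early-exit flag to keep the comparison count easy to verify; the two optimizations can be combined.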


In practice

Although bubble sort is one of the simplest sorting algorithms to understand and implement, its O(n²) complexity means it is far too inefficient for use on lists having more than a few elements. Even among simple O(n²) sorting algorithms, algorithms like insertion sort are considerably more efficient.

Due to its simplicity, bubble sort is often used to introduce the concept of an algorithm, or a sorting algorithm, to introductory computer science students. However, some researchers such as Owen Astrachan have gone to great lengths to disparage bubble sort and its continued popularity in computer science education, recommending that it no longer even be taught.[1] The Jargon file, which famously calls bogosort "the archetypical perversely awful algorithm", also calls bubble sort "the generic bad algorithm".[2] Donald Knuth, in his famous The Art of Computer Programming, concluded that "the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems", some of which he discusses therein.

Bubble sort is asymptotically equivalent in running time to insertion sort in the worst case, but the two algorithms differ greatly in the number of swaps necessary. Experimental results such as those of Astrachan have also shown that insertion sort performs considerably better even on random lists. For these reasons many modern algorithm textbooks avoid using the bubble sort algorithm in favor of insertion sort.

Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more branch mispredictions. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort and 40% slower than selection sort.

Variations

  • Odd-even sort is a parallel version of bubble sort, for message passing systems.
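
A sequential simulation of odd-even sort might look like the sketch below. The name `odd_even_sort` is an assumption, and the code only imitates the parallel algorithm: within each phase the compared pairs are disjoint, which is what would allow separate processors to handle them at the same time.

```python
def odd_even_sort(items):
    """Return a sorted copy of items using alternating odd and even phases."""
    data = list(items)
    n = len(data)
    is_sorted = False
    while not is_sorted:
        is_sorted = True
        # Phase starting at index 1, then phase starting at index 0.
        for start in (1, 0):
            # Pairs (i, i + 1) in one phase never overlap.
            for i in range(start, n - 1, 2):
                if data[i] > data[i + 1]:
                    data[i], data[i + 1] = data[i + 1], data[i]
                    is_sorted = False
    return data


print(odd_even_sort([5, 2, 9, 1, 5, 6]))
```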

References
