Serial and Binary Search in Data Structures Using C
Thus, for an array of n elements, the best case for serial search requires just 1 array access. Unless the best case occurs with high probability, the best-case running time is generally not used in analysis.
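To make the access count concrete, here is a minimal sketch of serial search in C. The function name serial_search and the use of an int array are assumptions for illustration; the text does not give an implementation.

```c
/* Serial (linear) search over an int array.
   Returns the index of the first element equal to target, or -1 if absent. */
int serial_search(const int data[], int n, int target)
{
    for (int i = 0; i < n; i++) {
        if (data[i] == target) {
            return i;   /* best case: i == 0, found after 1 array access */
        }
    }
    return -1;          /* worst case: all n elements were examined */
}
```

If the target happens to sit in data[0], the loop returns after a single access, which is the best case described above.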
Hashing has linear worst-case behavior for finding a target, but with some care it can be dramatically faster in the average case. Hashing also makes it easy to add and delete elements from the collection that is being searched.
To be specific, suppose the information about each student is an object with the student ID stored in a key field. We call each of these objects a record. Of course, there might be other information in each student record. If student IDs all fall in a small range of valid array indices, the record for student ID k can be retrieved immediately, since we know it is in data[k].
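A minimal sketch of this direct lookup in C follows. The record layout is an assumption: only the key field comes from the text, while the in_use flag, the name field, the bound MAX_ID, and the function names store and lookup are hypothetical and chosen for illustration.

```c
#include <stddef.h>

#define MAX_ID 99                 /* assumed upper bound on student IDs */

typedef struct {
    int  key;                     /* student ID */
    int  in_use;                  /* hypothetical flag: nonzero once stored */
    char name[32];                /* hypothetical extra information */
} Record;

static Record data[MAX_ID + 1];   /* zero-initialized: every slot starts unused */

/* Store a record at the index equal to its key. Returns 0 on success,
   -1 if the key is outside the range of the array. */
int store(const Record *r)
{
    if (r->key < 0 || r->key > MAX_ID) return -1;
    data[r->key] = *r;
    data[r->key].in_use = 1;
    return 0;
}

/* Retrieve the record for student ID k with a single array access.
   Returns NULL if k is out of range or no record was stored there. */
const Record *lookup(int k)
{
    if (k < 0 || k > MAX_ID) return NULL;
    return data[k].in_use ? &data[k] : NULL;
}
```

No searching is needed here: the key itself is the array index.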
What, however, if the student IDs do not form a neat range like that? Suppose we only know that there will be a hundred or fewer IDs and that they will be distributed over a much larger range. We could then use an array large enough to cover the whole range, but that seems wasteful since only a small fraction of the array will be used. It appears that we have to store the records in an array with 100 elements and use a serial search through this array whenever we wish to find a particular student ID.
If we are clever, we can store the records in a relatively small array and still retrieve students by ID much faster than we could by serial search. When the IDs follow a regular pattern, we can store the records in an array called data with only as many components as there are students.
We'll store the record with student ID k at a location computed from k, so the record for a given student ID is always found in one known component of data.
This general technique is called hashing. Each record requires a unique value called its key. In our example the student ID is the key, but other, more complex keys are sometimes used. A function, called the hash function, maps keys to array indices. Suppose we name our hash function hash.
If a record has a key of k, then we will try to store that record at location data[hash(k)]. Using the hash function to compute the correct array index is called hashing the key to an array index. The hash function must be chosen so that its return value is always a valid index for the array. Given this hash function and the regularly spaced keys of our example, every key produces a different index when it is hashed. Thus, hash is a perfect hash function. Unfortunately, a perfect hash function cannot always be found.
Suppose one of the student IDs in our example is replaced by a different ID that hashes to the same index as an existing record. The existing record will be stored in data as before, but where will the new student ID be placed? There are now two different records that belong in the same component of data.
This situation is known as a collision. In this case, we could redefine our hash function to avoid the collision, but in practice you do not know the exact numbers that will occur as keys, and therefore, you cannot design a hash function that is guaranteed to be free of collisions. Typically, though, you do know an upper bound on how many keys there will be. The usual approach is to use an array size that is larger than needed.
The extra array positions make collisions less likely. A good hash function will distribute the keys uniformly throughout the locations of the array. If the array indices range from 0 to 99, then you might use a hash function that produces an index in that range for a record with a given key.
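One common choice for such a function is the remainder of the key divided by the table size. The sketch below assumes that choice and non-negative integer keys; the original hash function is not shown in the text, and the names TABLE_SIZE and hash_index are hypothetical.

```c
#define TABLE_SIZE 100   /* array indices 0 to 99, as in the text */

/* Division-method hash (an assumed choice): maps any unsigned key
   to an index in the range 0 .. TABLE_SIZE - 1. */
unsigned hash_index(unsigned key)
{
    return key % TABLE_SIZE;
}
```

With this function, keys 1234 and 5634 both map to index 34, which is exactly the kind of collision described above.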
One way to resolve collisions is to place the colliding record in another location that is still open. This storage algorithm is called open addressing. Open addressing requires that the array be initialized so that the program can test whether an array position already contains a record. With this method of resolving collisions, we still must decide how to choose the locations to search for an open position when a collision occurs. There are two main ways to do so: linear probing and double hashing.
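Linear probing, the simpler approach, tries each following slot in turn and wraps around at the end of the array. The sketch below is a minimal illustration under assumptions that are not in the text: a table of 11 non-negative integer keys, a sentinel LP_EMPTY marking vacant slots, no support for deletion, and the hypothetical names lp_init, lp_insert, and lp_find.

```c
#include <stddef.h>

#define LP_CAPACITY 11      /* assumed table size, for illustration */
#define LP_EMPTY    (-1)    /* assumed sentinel: valid keys are non-negative */

static int lp_table[LP_CAPACITY];

/* Mark every slot as vacant. Must be called before any insert or find. */
void lp_init(void)
{
    for (size_t i = 0; i < LP_CAPACITY; i++) lp_table[i] = LP_EMPTY;
}

static size_t lp_hash(int key) { return (size_t)key % LP_CAPACITY; }

/* Insert key. Returns the slot used, or -1 if the table is full. */
int lp_insert(int key)
{
    size_t start = lp_hash(key);
    for (size_t step = 0; step < LP_CAPACITY; step++) {
        size_t i = (start + step) % LP_CAPACITY;   /* next slot, wrapping */
        if (lp_table[i] == LP_EMPTY || lp_table[i] == key) {
            lp_table[i] = key;
            return (int)i;
        }
    }
    return -1;
}

/* Find key. Returns its slot, or -1 if absent. Reaching a vacant slot
   means the key was never inserted, so the search can stop early. */
int lp_find(int key)
{
    size_t start = lp_hash(key);
    for (size_t step = 0; step < LP_CAPACITY; step++) {
        size_t i = (start + step) % LP_CAPACITY;
        if (lp_table[i] == key) return (int)i;
        if (lp_table[i] == LP_EMPTY) return -1;
    }
    return -1;
}
```

Inserting 5, 16, and 27 into an empty table places them in slots 5, 6, and 7, because all three keys hash to 5. Adjacent occupied slots like these are what the text calls a cluster.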
There is a problem with linear probing. When several different keys hash to the same location, the result is a cluster of elements, one after another. As the table approaches its capacity, these clusters tend to merge into larger and larger clusters. This is the problem of clustering. Clustering makes insertions take longer because the insert function must step all the way through a cluster to find a vacant location.
Searches require more time for the same reason. The most common technique to avoid clustering is called double hashing, in which a second hash function determines how far to step from one probe to the next. With double hashing, we could return to our starting position before we have examined every available location.
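A minimal sketch of double hashing in C may help here. Everything specific in it is an assumption for illustration: the table size of 11, the sentinel DH_EMPTY, the particular formulas for hash1 and hash2, and the function names. The table size is deliberately a prime, so every step size from 1 to 9 shares no common factor with it and the probe sequence visits every slot before repeating.

```c
#include <stddef.h>

#define DH_CAPACITY 11      /* assumed table size; a prime number */
#define DH_EMPTY    (-1)    /* assumed sentinel: valid keys are non-negative */

static int dh_table[DH_CAPACITY];

/* Mark every slot as vacant. Must be called before any insert or find. */
void dh_init(void)
{
    for (size_t i = 0; i < DH_CAPACITY; i++) dh_table[i] = DH_EMPTY;
}

/* First hash: the starting slot. */
static size_t hash1(int key) { return (size_t)key % DH_CAPACITY; }

/* Second hash: the step size, always in the range 1 .. DH_CAPACITY - 2. */
static size_t hash2(int key) { return 1 + (size_t)key % (DH_CAPACITY - 2); }

/* Insert key. Returns the slot used, or -1 if the table is full. */
int dh_insert(int key)
{
    size_t i = hash1(key);
    size_t step = hash2(key);
    for (size_t tries = 0; tries < DH_CAPACITY; tries++) {
        if (dh_table[i] == DH_EMPTY || dh_table[i] == key) {
            dh_table[i] = key;
            return (int)i;
        }
        i = (i + step) % DH_CAPACITY;   /* jump by the key's own step size */
    }
    return -1;
}

/* Find key. Returns its slot, or -1 if absent (no deletion supported). */
int dh_find(int key)
{
    size_t i = hash1(key);
    size_t step = hash2(key);
    for (size_t tries = 0; tries < DH_CAPACITY; tries++) {
        if (dh_table[i] == key) return (int)i;
        if (dh_table[i] == DH_EMPTY) return -1;
        i = (i + step) % DH_CAPACITY;
    }
    return -1;
}
```

Keys 5, 16, and 27 all start at slot 5, but their step sizes differ (8 for 16, 1 for 27), so they land in slots 5, 2, and 6 instead of forming one contiguous run.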
An easy way to avoid this problem is to make sure that the array size is relatively prime with respect to the value returned by hash2; in other words, these two numbers must not have any common factor apart from 1.

Binary search looks at the middle item of a sorted table. If that value is bigger than what we are looking for, then look in the first half; otherwise, look in the second half. Repeat this until the desired item is found. The table must be sorted for binary search. It eliminates half the data at each iteration.
If we have about a thousand elements to search, binary search takes about 10 steps, whereas linear search can take up to a thousand.

Binary search finds the middle element of the array and checks whether that middle value is greater or lower than the search value. If the search value is smaller, it takes the left side of the array and finds the middle element of that part. If the search value is greater, it takes the right part of the array. It repeats the operation until it finds the searched value or, if the value is not in the array, until no elements remain to check.
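The procedure described above can be sketched in C as follows. This is one possible iterative version, assuming an int array sorted in ascending order; the function name binary_search is chosen for illustration.

```c
/* Binary search over an int array sorted in ascending order.
   Returns the index of target, or -1 if it is not present. */
int binary_search(const int data[], int n, int target)
{
    int low = 0;
    int high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* avoids overflow of low + high */
        if (data[mid] == target) {
            return mid;                     /* found */
        } else if (data[mid] > target) {
            high = mid - 1;                 /* keep the left half */
        } else {
            low = mid + 1;                  /* keep the right half */
        }
    }
    return -1;                              /* range is empty: not present */
}
```

Each pass through the loop discards half of the remaining range, which is where the logarithmic step count comes from.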
What is the difference between Linear search and Binary search?

Please read the appropriate sections in your course material, which has hopefully been selected and prepared by your instructor(s).
Failing that, a general Wikipedia, c2, or Google search can answer many of these sorts of questions.

- Binary search requires the input data to be sorted; linear search doesn't.
- Binary search requires an ordering comparison; linear search only requires equality comparisons.
- Binary search has complexity O(log n); linear search has complexity O(n), as discussed earlier.
- Binary search requires random access to the data; linear search only requires sequential access (this can be very important: it means a linear search can stream data of arbitrary size).
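The first two differences are visible in the C standard library, where bsearch expects data that is already sorted and takes an ordering comparison function. The sketch below sorts with qsort before searching; the helper names compare_ints and sorted_contains are hypothetical, and sorting on every call is done only to keep the example short.

```c
#include <stdlib.h>

/* Ordering comparison for ints: negative, zero, or positive result,
   as qsort and bsearch require. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts values in place, then binary-searches for target.
   Returns 1 if target is present, 0 otherwise. */
int sorted_contains(int *values, size_t n, int target)
{
    qsort(values, n, sizeof values[0], compare_ints);
    return bsearch(&target, values, n, sizeof values[0], compare_ints) != NULL;
}
```

In practice the data would be sorted once and then searched many times, since a sort followed by one binary search costs more than one linear search.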
A better analogy would be the "guess my number" game, with responses of "you got it", "too high", or "too low".

The dictionary analogy seems fine to me, though it's a better match for interpolation search.

The dictionary analogy is better for me.

With the dictionary approach, the takeaway is sorting. Most importantly, you must make sure the data is sorted before the binary search starts. If not, you will be jumping all over the oceans without finding the value. If you do not mark the ones already tried, this can become worse.
So always do the sorting. A Java-based binary search implementation can be found at digizol.

Yes, the requirement that the input data is sorted is my first bullet point.

I would like to add one difference: for linear search, the values need not be sorted, but for binary search, the values must be in sorted order.

Pick a random name ("Lastname, Firstname") and look it up in your phonebook. Time both methods and report back!

Prabu - Incorrect - Best case would be 1, worst , with an average of