This article contains information regarding various searching techniques that may prove useful in various situations without regard to programming language.

Description Of Searching?

Ready for a completely convoluted example? Read on…

Let’s say that you are visiting your buddy in his hometown and your buddy called from his workplace to inform you that he was about to go into a meeting and would be unable to answer the phone for the next hour. Before hanging up, he tells you that you should pick him up at his workplace in approximately an hour.

Upon hanging up, you realize that you have no idea how to get to your buddy’s workplace since you are new to this town and you do not even know the name of the company for which he works! Now you are in a pickle. The good news is that you have caller ID, but the bad news is that you are very naive and feel that your buddy would be angry with you if you were to call him at work. Of course, at this company, when one person has a meeting, everyone has a meeting and nobody picks up the phone.

Funny thing is that the hotel you are staying at has a “reverse” yellow pages available in the room (I told you this was a convoluted story). There are tons of numbers listed in this book starting with 111-111-1111 and ending with 999-999-9999. You notice that your buddy’s number is 345-456-5678. You jump right to the correct page, determine your buddy’s workplace, go to mapquest.com and find directions. Problem solved.

Of course, you used some sort of searching algorithm when trying to locate your buddy’s number in the reverse phone book. You probably did not realize it at the time, but there are sevearl algorithms that you could have used. Many of which are discussed here.

Various Algorithms

There are several different ways to perform a search. Primarily, the version that you choose to use will depend on how the dataset is constructed. For instance, is the dataset ordered (i.e. sorted)?

Binary Search
Best Case: O(1)
Worst Case: O(log2 n)
Average Case: O(log n)

Description:
A binary search is a search algorithm for searching a set of sorted data for a particular value. A binary search requires random access to the data being searched (i.e. a tape drive will not do). In its simplest form, binary search assumes the data is sorted (usually by a sort algorithm) and takes advantage of that characteristic.

Algorithm:

  1. Guess the middle element of a list as the desired value.
  2. If the element has not been found, divide the list into two equal partitions
  3. Apply the guess of the middle element to each of the divided partitions
  4. Repeat this process until there are either less than two elements in any particular partition or the guessed middle element is the key that you were searching for.

Code:

[sourcecode language=”cpp”]
bool BinarySearch( int aList[], int iKey, int iLeft, int iRight )
{
int iMid;

// no more partitions to search
if ( iRight < iLeft )
return false;

// guess a new middle element and compare
iMid = FLOOR( ( iLeft + iRight ) >> 1 );
if ( aList[iMid] == iKey )
return true; // position in aList is iMid
else if ( iKey < aList[iMid] )
BinarySearch( aList, iKey, iLeft, iMid – 1 );
else if ( iKey > aList[iMid] )
BinarySearch( aList, iKey, iMid + 1, iRight );
}
[/sourcecode]

Hash Table Search
Best Case: O(1)
Worst Case: O(n)
Average Case: O(1)

Description:
A hash table is more of a concept or an aide in performing a searching algorithm. The process is generally used to implement sets of data (i.e. maps).

A hash table is a data structure that speeds up searching for information by a particular aspect of that information, called a key (if the information is a set of customer records, for example, the keys might include first name, last name, address, etc). The idea behind the hash table is to process the key with a function that returns a hash value; that hash value then determines where in the data structure the record will (or probably will) be stored. When the same key is used to search for the stored information, the same hash value is generated, leading the searcher either directly to the location of the information or to the best place to start looking.

Algorithm: Hash table algorithms are in abundance. There are several popular ones to use since they tends to minimize the time spent implementing the hash; therefore, more time actually getting to the data originally desired. One of these algorithms is knowns as “MD5”. Because it is quite lengthy and complicated, one would be well-versed to perform a search online for more information.

Linear Search
Best Case: O(1)
Worst Case: O(n)
Average Case: O(n/2)

Description:
A linear search (or sequential search) is the simplest of the algorithms and probably most likely to be the one you use in most of your coding practices because it’s the most straightforward and obvious. In addition, this algorithm can be used on any dataset.

Algorithm:

  1. Walk through the values of the list either until the key value is found or until the end of the list is reached

Code:

[sourcecode language=”cpp”]
bool LinearSearch( int aList[], int iKey, int iNumElements )
{
int i;
for ( i = 0; i < iNumElements; i++ )
{
if ( aList[i] == iKey )
return true;
}
return false; // key not found
}
[/sourcecode]

Tree Search
Best Case: Various
Worst Case: Various
Average Case: Various

Description:
A tree search is actually a term used to describe one of several sub-algorithms that fall into this category. What makes an algorithm a child of a tree search is in the dataset’s architecture. Specifically, the dataset must be set up in such a way that each element resides as a leaf (or node) of a tree.

As mentioned, there are several sub-algorithms available:

  • Depth-first search
  • Breadth-first search
  • Best-first search
  • A*
  • Depth-limited search
  • Iterative deepening depth-first search
  • Bidirectional search
  • Uniform-cost search

Applications

When writing AI code for a video game like Warcraft, one might use the A* algorithm for a unit’s pathfinding. It involves scoring various nodes according to their location from one point to another. It is one of the fastest, if not the fastest algorithms for this application.

For writing a paint program’s “fill” command, you might use a flavor of the breadth-first algorithm, even though you are not actually searching for a particular key — the idea about expanding in all directions still applies.

As an IT professional, I have not come across any particular situation where implementing the more complex algorithms was necessary. In general, a binary search is adequate for most any application.