Search Strategies: Internet and Library Databases
Searching the Internet, Searching Library Databases
People are so accustomed to searching the Internet that they assume searching library catalogs and library databases will work exactly the same way. This couldn't be further from the truth. Here's why:
- The Internet is actually a conglomeration of thousands of unrelated networks of computers that share one thing in common: they are all able to communicate with each other using a shared set of computer protocols. Other than this, there is no inherent organization to the Internet that makes it particularly easy to navigate. For this reason, we have search services like Google.
- Internet search services such as Google regularly pull content from the Internet and create indexes of the content that can then be searched, much like a database can be searched. Web page designers who adhere to standards provide contextual clues in their documents that make finding their documents via a search service like Google easier. The problem is that much of the content on the Internet does not adhere to standards.
- Indexes of Internet content are created by computers and are largely unexamined for accuracy by people. Anyone who has searched for information using any Internet search service will have noticed all kinds of "strange" matches in the search results. This is because the indexes that help searchers find pages are created wholly by computers and have not been evaluated for relevance by humans. Therefore, many results in any search listing will not be what the searcher expects.
- Library databases are designed by professionals to make finding information a more precise process. A library's catalog is based on a specific description, following well-defined rules, of each and every item contained in a library's collection. A researcher who knows how the system operates will be able to eliminate irrelevant materials with relative ease. The same goes for research databases that provide access to articles from journals, magazines, and newspapers. In each of these types of systems, professional catalogers or indexers have examined the materials covered in the databases and have provided descriptions that a searcher can take advantage of to bring more precision to a search.
- Library databases and research databases make use of "controlled vocabularies." Psychology has its own vocabulary. Sociology has its own vocabulary. Art, music, history, all have their own vocabularies. A database that focuses on identifying materials useful to history researchers will make use of the professional vocabulary for history, thus making it easier for a historian to find what's needed on a particular topic.
- Library databases may or may not include the ability to search the full text of their covered publications. Systems like ProQuest and WilsonWeb do provide full text search capabilities, while the CSA databases do not. Systems that do not search full text rely heavily on indexing and abstracting to provide researchers with results.
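The point about computer-built indexes can be made concrete with a minimal sketch in Python. The page texts and the helper names `build_index` and `search_any` are invented for illustration and do not describe how any particular search service works. The sketch only shows that a purely automatic word index matches strings, not meanings.

```python
from collections import defaultdict

# Hypothetical page contents, invented for this illustration.
PAGES = {
    "page1": "solar water heaters reduce energy costs",
    "page2": "the solar system and the cost of space travel",
    "page3": "water heaters for residential buildings",
}

def build_index(pages):
    """Map each word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search_any(index, query):
    """Return ids of pages containing at least one of the query words."""
    hits = set()
    for word in query.lower().split():
        hits |= index.get(word, set())
    return sorted(hits)

index = build_index(PAGES)
print(search_any(index, "solar water heaters"))
```

In this toy data, page2 is returned because it contains the word solar, even though it has nothing to do with water heaters. No person reviewed the index, so nothing filters out that kind of "strange" match.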
Even though there are fundamental differences in structure and coverage among databases and the open Internet, researchers using any search system will benefit from thinking carefully about what they're searching and how to construct a search strategy.
Constructing Searches -- Google/Database Comparisons
Constructing a search in Google can be very different from constructing a search in a research database. Google is designed to return results based on any or all of the words entered into the Google search box. A Google user does not need to specify connections between the terms and doesn't have to follow any particular rules to come up with some kind of result listing. Most databases, on the other hand, work better using logical connectors between search words. The most commonly used connector is the Boolean connector AND. If it is important that the search results include all of the words entered into the search box, the database should be queried using AND between each individual word or phrase. The following example compares a search executed in Google and in a library research database, in this case, CSA ILLUMINA's Engineering Research Database.
[Figure: Google search for information on costs associated with the use of solar water heaters.]
[Figure: CSA ILLUMINA search for information on costs.]
The Google search is entered as most people would normally enter a search in Google. In other words, a string of words has been entered to describe the main topics being searched. Entering the same strategy in CSA will not work. CSA suggests entering phrases in quotation marks and connecting search terms using the Boolean connector AND. The search example shows this strategy. While some of the database systems do accommodate Google-like strategies, most of them still work better using the kind of strategy illustrated in the CSA example above.
The result of the Google search pictured above is a listing of 616,000 web pages. The result of the CSA search is 10 articles coming mostly from engineering journals. Most researchers are not going to want to examine 616,000 websites, so finding a strategy that works better in Google will be of benefit.
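The database-style strategy described above (phrases in quotation marks, terms joined with AND) can be sketched in a few lines of Python. The function name `build_and_query` is hypothetical, and the output is only a query string in the style the text describes. It is not a reproduction of the exact search shown in the illustration, and individual databases may expect slightly different syntax.

```python
def build_and_query(terms):
    """Quote multi-word phrases and join all terms with the Boolean connector AND."""
    cleaned = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip empty entries
        # Multi-word phrases go in quotation marks so they are matched as a unit.
        cleaned.append(f'"{term}"' if " " in term else term)
    return " AND ".join(cleaned)

print(build_and_query(["solar water heaters", "costs"]))
```

Writing the strategy this way forces a searcher to decide which words belong together as a phrase and which concepts must all appear in a result.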
[Figure: Google search using phrase searching.]
[Figure: Google search using phrase searching and required matches.]
The Google searches pictured above demonstrate two strategies for getting fewer and better results in Google. The first strategy shows a reduction in results from 616,000 to 225,000, still a huge number, but considerably fewer items to go through. Getting to this point was as simple as enclosing the phrase solar water heaters in quotation marks. This indicates to Google that it should only bring back web pages that contain this exact phrase and the word costs, if possible. The second strategy shows the use of required terms in Google to further reduce retrieval. Placing the plus sign (+) in front of each term that must be included in the results ensures that web pages retrieved will contain all of the required terms. The results now rest at 84,800 web pages. Further reductions are possible by using Google's Advanced Search panel. A searcher can limit by date and even by language if need be. Further reduction can be gained by using Google Scholar instead of Google. Scholar searches only scholarly websites and includes articles that can be located in some database systems. The same search of Google Scholar results in 2,050 matching web pages, clearly a huge reduction in the number of materials a researcher will need to review.
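The two Google strategies just described can be written out as a small Python sketch. The function name `build_google_query` and its `require_all` parameter are hypothetical. The plus-sign syntax simply follows the description in the text above, so it is an assumption that it still behaves this way; check Google's current search help before relying on it.

```python
def build_google_query(terms, require_all=False):
    """Build a Google-style query string.

    Multi-word phrases are wrapped in quotation marks. When require_all is
    True, each term is prefixed with a plus sign, as described in the text.
    """
    parts = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip empty entries
        token = f'"{term}"' if " " in term else term
        if require_all:
            token = "+" + token
        parts.append(token)
    return " ".join(parts)

# Strategy 1: phrase searching only.
print(build_google_query(["solar water heaters", "costs"]))
# Strategy 2: phrase searching plus required terms.
print(build_google_query(["solar water heaters", "costs"], require_all=True))
```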
The Google examples above demonstrate the importance of thinking about how to construct a search. Database searches also benefit from careful wording of the search. With both Google and library databases, the better the choice of words used to describe what a researcher is looking for, the better the results of the search. The huge difference in results brought back by Google and by the CSA database can be explained simply: Google searches the full text of all searchable Internet content; databases search descriptions and sometimes full text of journals, magazines, and newspapers selected for inclusion by the individual databases. For example, the ProQuest Research Library database covers nearly 4000 different magazines, journals, and newspapers and can be searched as a full text system. It does not, however, include the ability to search websites, so its focus differs completely from that of Google. The CSA ILLUMINA system does not include full text searching, so a researcher is relying on the database's indexing and abstracting to locate relevant materials. While this might seem like a shortcoming, in many cases this can provide a researcher with more focused search results.
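The difference between searching descriptions and searching full text can be illustrated with a toy example in Python. The two records, their field names, and the `search` helper are all invented for illustration; real database records and search engines are far more elaborate.

```python
# Hypothetical records, invented for illustration only.
RECORDS = [
    {
        "id": 1,
        "abstract": "Installation costs of solar water heaters in new homes.",
        "full_text": "We compare installation costs of solar water heaters by region.",
    },
    {
        "id": 2,
        "abstract": "A survey of roofing materials.",
        "full_text": "One respondent mentioned solar water heaters in passing.",
    },
]

def search(records, phrase, fields):
    """Return ids of records where the phrase appears in any of the given fields."""
    phrase = phrase.lower()
    return [
        record["id"]
        for record in records
        if any(phrase in record[field].lower() for field in fields)
    ]

# Searching abstracts only (similar in spirit to an index-and-abstract system).
print(search(RECORDS, "solar water heaters", ["abstract"]))
# Searching abstracts and full text (similar in spirit to a full-text system).
print(search(RECORDS, "solar water heaters", ["abstract", "full_text"]))
```

The abstract-only search returns just the record that is really about the topic, while the full-text search also picks up a record that mentions the phrase in passing. This is why a system without full-text searching can return fewer but more focused results.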
Constructing Searches -- Using Database Organization to Construct Better Searches
Further examination of the CSA ILLUMINA Engineering Research Database will help to illustrate some of the methods that can be used in databases to get more precise results. First of all, knowing what is covered in a database will help drive a researcher's choice of that database over another. The UNF Library categorizes databases according to their usefulness to a particular discipline. The CSA ILLUMINA Engineering Research Database covers professional journals and conference papers that focus on engineering and includes access to the following systems:
- ANTE: Abstracts in New Technologies and Engineering, 1981 to date
- CSA/ASCE Civil Engineering Abstracts, 1966 to date
- Earthquake Engineering Abstracts, 1971 to date
- Environmental Engineering Abstracts, 1990 to date, and
- Mechanical & Transportation Engineering Abstracts, 1966 to date
If a researcher is looking to cover published professional literature in any of these areas, then this database is a good choice. The search used previously to contrast with Google resulted in a listing of 10 articles, coming mostly from journals. The illustrations below show the search strategy and the result screen in the database.
[Figure: CSA search strategy.]
[Figure: Beginning of CSA search result listing.]
One of the things that databases do well is show researchers what types of published materials are available. In the result screen above, the CSA system divides results into journal articles, peer-reviewed journal articles, and conference reports. This makes it easier for a researcher to review the types of materials available at a glance. Many databases also include newspaper articles, magazine articles, dissertations, and other types of sources. If a researcher is primarily interested in retrieving journal articles, having the articles separated into browsable tabs, as in CSA, makes it easier to sort through the results.
Another feature of databases is the use of "controlled vocabularies" that rely on terminology that makes sense to professionals who work in particular fields. For example, engineers will have a firm grasp of the terminology used in engineering and will likely rely on that terminology in conducting reviews of published literature. The CSA system shows terminology assigned to each article to the right of the article description. Many databases call these terms descriptors. In the example above, the first article is labeled with a number of descriptors including construction, water heaters, construction costs, residential buildings, frames, etc. A researcher looking specifically for construction costs related to the installation of solar hot water heaters in residential construction can make use of the descriptors to find articles that target his or her particular interests. The Advanced Search in CSA accommodates entering multiple descriptors, or a researcher can review a relevant article description and select the descriptors directly from the article's information. Of course, a researcher doesn't have to limit a search to descriptors, but doing so is a good way to narrow searches that return large numbers of items.
The following illustrations demonstrate how to conduct a new search using descriptors discovered in a relevant article.
[Figure: Full article description from CSA.]
[Figure: Selecting descriptors from an article description.]
The full article description in CSA includes all the information necessary to document the article in a bibliography: author's name, article title, journal title, volume, issue, date, and pages. Additionally, CSA provides contact information for the author(s), author affiliation(s), subject descriptors used to categorize the article, an abstract of the article, and any additional information that might be useful to a researcher. The CSA system also allows a researcher to review and check descriptors found in an article description to search for articles that are similar. The second illustration above shows the descriptors Water heaters, Construction costs, Residential buildings, and Solar energy checked with options for using AND to narrow the search or OR to broaden the search. Clicking the Go button will send the new search strategy and return a result listing of any articles that match all the selected descriptors. Optionally, a researcher could click the Advanced Search tab at the top of the article listing and directly type in the desired descriptors to complete a new search.
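The effect of choosing AND versus OR when combining checked descriptors can be sketched with Python sets. The article ids and their descriptor assignments below are invented for illustration (only the descriptor names come from the example above), and `match_descriptors` is a hypothetical helper, not part of the CSA system.

```python
# Hypothetical articles and descriptor assignments, invented for illustration.
ARTICLES = {
    "A": {"Water heaters", "Construction costs", "Residential buildings", "Solar energy"},
    "B": {"Water heaters", "Solar energy"},
    "C": {"Construction costs", "Residential buildings"},
    "D": {"Wind turbines"},
}

def match_descriptors(articles, selected, connector="AND"):
    """Return ids of articles matching the selected descriptors.

    AND keeps articles that carry every selected descriptor (narrower);
    OR keeps articles that carry at least one of them (broader).
    """
    if connector not in ("AND", "OR"):
        raise ValueError("connector must be 'AND' or 'OR'")
    selected = set(selected)
    hits = []
    for article_id, descriptors in articles.items():
        if connector == "AND":
            matched = selected <= descriptors  # all selected descriptors present
        else:
            matched = bool(selected & descriptors)  # at least one present
        if matched:
            hits.append(article_id)
    return sorted(hits)

chosen = ["Water heaters", "Construction costs"]
print(match_descriptors(ARTICLES, chosen, "AND"))
print(match_descriptors(ARTICLES, chosen, "OR"))
```

With this toy data, AND returns only the one article that carries both descriptors, while OR returns every article that carries either one.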
Many databases will even recommend search strategies that a researcher need only click to execute. For example, the ProQuest Research Library will provide clickable searches as a part of the result of a basic keyword search of its system. An example of a ProQuest search result with recommendations is pictured below.
[Figure: ProQuest search results with recommendations.]
The search illustrated above was a basic keyword search on the phrase "solar water heaters" and the keyword cost. The search resulted in a possible 94 matching documents. Some of the materials are from scholarly journals, some are from magazines, some are from trade publications, and some are from newspapers. The tabs listed above the results listing allow a researcher to choose articles from any of those types of publications. ProQuest is a full-text system, so the search result comes from articles that match the strategy anywhere: in article titles, author names, journal names, abstracts, full article text, etc. The Advanced search tab allows a researcher to restrict results just to full text or to subject headings or to any other searchable field if needed. In the panel above the search results, the ProQuest system provides recommendations for new searches based on the strategy entered by the researcher and on the results retrieved in the current search. Notice that the Suggested Topics box lists suggested searches and gives the option to view additional search suggestions. A researcher can click any of the Suggested Topics links to perform a new search in the database. Depending on the strategy selected, results might increase or decrease. For example, a search on Water Heaters will likely result in more items retrieved, whereas a search on Water heaters AND Solar energy might result in fewer listings.
As a researcher works with a database, he or she will find it useful to keep track of key terminology that keeps cropping up in article descriptions as a means of identifying additional search strategies for finding more information or for further refining the search topic. The better armed a searcher is with vocabulary to describe the topic of a search, the more likely the searches will be successful.
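One simple way to keep track of recurring terminology is to tally the descriptors attached to the articles in a result list. The sketch below uses Python's `collections.Counter`; the descriptor lists are invented for illustration, and `tally_descriptors` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical descriptor lists, one per retrieved article (invented for illustration).
RESULT_DESCRIPTORS = [
    ["Water heaters", "Construction costs", "Residential buildings"],
    ["Water heaters", "Solar energy"],
    ["Solar energy", "Residential buildings", "Water heaters"],
]

def tally_descriptors(results):
    """Count in how many articles each descriptor appears."""
    counts = Counter()
    for descriptors in results:
        # Use a set so a descriptor repeated within one article counts once.
        counts.update(set(descriptors))
    return counts

counts = tally_descriptors(RESULT_DESCRIPTORS)
for term, n in counts.most_common():
    print(f"{n}  {term}")
```

Descriptors that appear in most of the relevant results are good candidates for the next search, either on their own or combined with AND.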
Shaping Questions Into Search Strategies
All of the examples shown above have revolved around isolating key phrases and words that would adequately describe a topic in a search. Note that none of the strategies was phrased in the form of a question. This was quite deliberate. While database designers are constantly working toward building search interfaces that work with natural language, realistically, most databases rely on logically combining words and phrases rather than expressing ideas in complete sentences. Researchers should spend some time thinking carefully about their topics and how to phrase their questions about the topics before launching into database searches. A good strategy to use is to start with a research question or questions and to make a list of all of the key concepts that the question(s) might touch upon.
| Research Question | Possible Search Key Words and Phrases |
| --- | --- |
| Do celebrity endorsements of products influence consumer purchases? | "celebrity endorsement" |
| How effective are municipal recycling programs in reducing landfill waste? | recycling |
Beginning with the list of key words and phrases, think about logical connections between the ideas and how to phrase these for use in a database. Most databases make use of the Boolean connector AND as a means of stringing together two or more concepts in one search. A few examples should help to clarify.
| Possible Search Strategies for Celebrity Endorsements question | Possible Search Strategies for Municipal Recycling question |
| --- | --- |
| "celebrity endorsement" AND "consumer behavior" | "solid waste recycling" AND effectiveness |
| "celebrity endorsement" AND "purchasing decisions" | "municipal recycling programs" AND "citizen participation" |
| "professional athletes" AND "purchasing decisions" | landfills AND recycling AND management |
A researcher should try a number of strategies, carefully reviewing results along the way. As relevant results come up, the researcher should review key terminology used in the articles, in the article abstracts, and in the database subject headings or descriptors with an eye toward identifying other ideas for constructing related searches. Not every search strategy is going to work, so the better the researcher's grasp of the topic and its related terminology, the more likely he or she is to eventually find materials suitable for the topic being scrutinized.
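Generating several candidate strategies from lists of key concepts can be sketched in Python. The grouping of terms into two concept lists is an assumption made for illustration (the terms themselves come from the celebrity endorsements example above), and `candidate_strategies` is a hypothetical helper.

```python
from itertools import product

def candidate_strategies(*concept_groups):
    """Yield one AND-connected strategy for each combination of terms,
    taking one term from every concept group."""
    for combo in product(*concept_groups):
        # Quote multi-word phrases so they are searched as a unit.
        yield " AND ".join(f'"{term}"' if " " in term else term for term in combo)

# Assumed concept groups for the celebrity endorsements question.
endorsers = ["celebrity endorsement", "professional athletes"]
outcomes = ["consumer behavior", "purchasing decisions"]

for strategy in candidate_strategies(endorsers, outcomes):
    print(strategy)
```

A list like this is only a starting point. Each strategy still needs to be tried and its results reviewed, since some combinations will retrieve little or nothing of use.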
Updated 1/09.








