URL stands for Uniform Resource Locator.
A URL is just the internet address for any given webpage:
Understanding the component parts of a URL can be helpful in a variety of situations. Here are just a few reasons why understanding URLs is useful:
- The URL often reveals key information about a site
- An understanding of URLs provides the needed foundation for many advanced search strategies
- A heightened attention to URLs helps searchers recognize fraudulent sites
Each section below focuses on a different part of the URL. At the end of the webtext is a quiz that you can take to test your understanding of URLs.
Locate the protocol
The “protocol” is the first part of a URL. Some browsers simplify how addresses are displayed by hiding the protocol: for example, in Chrome and Firefox, http://writingcommons.org displays as writingcommons.org.
The protocol https indicates that information sent through the page will be encrypted, and therefore harder to read if some third party intercepts the information. (The next time you are entering a username and password on a page, check for the “https” protocol.)
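As a rough illustration of locating the protocol programmatically, the Python sketch below uses the standard library's urllib.parse.urlsplit. The helper names protocol_of and is_encrypted are invented for this example, and the check only reads the text of the URL; it does not confirm that a site's encryption is set up correctly.

```python
from urllib.parse import urlsplit


def protocol_of(url: str) -> str:
    """Return the protocol (scheme) of a URL in lowercase, or "" if it has none."""
    return urlsplit(url).scheme


def is_encrypted(url: str) -> bool:
    """Return True when the URL uses the https protocol."""
    return protocol_of(url) == "https"


if __name__ == "__main__":
    print(protocol_of("http://writingcommons.org"))    # http
    print(is_encrypted("http://writingcommons.org"))   # False
    print(is_encrypted("https://writingcommons.org"))  # True
    # An address copied from a browser that hides the protocol has no scheme.
    print(repr(protocol_of("writingcommons.org")))     # ''
```

Note that an address copied from a browser that hides the protocol yields an empty result, which is a reminder to look at the full URL before drawing conclusions.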
Locate the domain name
The “domain name” identifies the site that contains the page you are viewing. It appears just before the first single slash (/). If there is no single slash, then the domain name is whatever appears at the end of the URL.
For example, the following URLs all refer to pages on the Writing Commons site:
If you look carefully, you will see that most browsers try to help users out by boldfacing the domain name in the address bar.
Being able to locate the domain name in a URL allows you to identify the entity that hosts the page you are viewing—a piece of information that is often crucial to understanding the nature of your source.
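The rule described above (look just before the first single slash) can be sketched in Python with the standard library's urllib.parse.urlsplit, which separates the host from everything after it. The function name host_of and the sample paths are assumptions made for illustration. For addresses without subdomains, such as the Writing Commons ones used here, the host is the same as the domain name.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the host portion of a URL in lowercase, or "" if none is found.

    The URL must include its protocol (for example http://); without it,
    urlsplit cannot tell where the host begins.
    """
    return urlsplit(url).hostname or ""


if __name__ == "__main__":
    # With a single slash: the host is what sits just before it.
    print(host_of("http://writingcommons.org/open-text"))  # writingcommons.org
    # Without a single slash: the host runs to the end of the URL.
    print(host_of("http://writingcommons.org"))            # writingcommons.org
```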
Elements of the URL that appear after the domain indicate different sub-directories. For example:
In the example above, “open-text,” “information-literacy,” and “rhetorical-analysis” are sub-directories of the domain writingcommons.org. Think of these as folders within folders.
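The “folders within folders” idea can be made concrete with a short Python sketch that splits the part of a URL after the domain into its pieces. The function name sub_directories is hypothetical, and the sample URL is assembled from the sub-directory names mentioned above purely for illustration. The last piece of a real URL may be a page rather than a folder.

```python
from urllib.parse import urlsplit


def sub_directories(url: str) -> list[str]:
    """Return the slash-separated pieces that follow the domain, in order."""
    path = urlsplit(url).path
    # Skip the empty strings produced by leading, trailing, or doubled slashes.
    return [segment for segment in path.split("/") if segment]


if __name__ == "__main__":
    # Illustrative URL assembled from the sub-directory names discussed above.
    url = "http://writingcommons.org/open-text/information-literacy/rhetorical-analysis"
    for depth, name in enumerate(sub_directories(url)):
        print("  " * depth + name)  # indent each level like nested folders
```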
Subdomains are similar to sub-directories in that they provide a way for website developers to separate content, but subdomains appear before the domain name in the URL. Don’t let this trip you up. The domain name is still the content that appears pressed up against the first single slash (/) or—if there is no single slash—at the very end of the URL.
For example, the domain name in all of the following URLs is google.com
Pay attention to the placement of the dots. The following is not a Google page:
Here the domain is mgoogle.com, not google.com.
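The importance of the dots can be expressed as a small Python check: a host belongs to a domain only if it equals that domain or ends with a dot followed by that domain. This is a minimal sketch; the function name is_on_domain and the sample URLs are assumptions chosen for illustration, not the examples from the original page.

```python
from urllib.parse import urlsplit


def is_on_domain(url: str, domain: str) -> bool:
    """Return True if the URL's host is the given domain or one of its subdomains."""
    host = urlsplit(url).hostname or ""
    domain = domain.lower()
    # The leading dot is what separates a true subdomain (maps.google.com)
    # from a look-alike that merely ends in the same letters (mgoogle.com).
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    # Illustrative URLs only.
    print(is_on_domain("http://maps.google.com/", "google.com"))  # True
    print(is_on_domain("http://www.mgoogle.com/", "google.com"))  # False
```

Comparing against the dot plus the domain, rather than the domain alone, is the design choice that keeps mgoogle.com from passing as google.com.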
Recognize top-level domains
In the domain name writingcommons.org, the “top-level domain” is .org. The top-level domain .org was originally intended for use by non-profit organizations—and many non-profits continue to use it—but it is now open to anyone.
In the domain name amazon.com, the top-level domain is .com. Short for “commercial,” .com is the most common top-level domain in the world and is now used for a wide variety of sites—not just the sites of commercial enterprises.
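For readers who want to pull out the top-level domain programmatically, a minimal Python sketch is to take whatever follows the last dot in the host. The function name top_level_domain is made up for this example, and the sketch assumes the host is a name rather than a numeric address (for a numeric address the result would be meaningless).

```python
from urllib.parse import urlsplit


def top_level_domain(url: str) -> str:
    """Return the text after the last dot of the host, with a leading dot."""
    host = urlsplit(url).hostname or ""
    if "." not in host:
        return ""
    return "." + host.rsplit(".", 1)[-1]


if __name__ == "__main__":
    print(top_level_domain("http://writingcommons.org"))  # .org
    print(top_level_domain("http://www.amazon.com/"))     # .com
```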
Some top-level domains have retained their original meanings and are especially helpful to know:
Newer top-level domains such as .museum, .bike, and .clothing are not yet widely used.
Some domains include a country domain extension, also called a “country code top-level domain.”
Here are some examples:
Pay attention to country domain extensions. When present in a URL, they represent a core component of the domain. Note, for example, that hydra.com and hydra.com.gr are different domains. The two are unrelated sites run by unrelated entities.
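The hydra.com versus hydra.com.gr distinction shows why simply taking the last two parts of a host is not enough. The Python sketch below handles it with a tiny hand-written set of two-part endings. That set (TWO_PART_SUFFIXES) and the function name domain_name are assumptions made for illustration; a real tool would more likely consult a maintained list such as the Public Suffix List.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately incomplete sample of two-part endings.
TWO_PART_SUFFIXES = {"com.gr", "com.bb", "co.uk"}


def domain_name(url: str) -> str:
    """Return the domain name of a URL, keeping any country extension attached."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Decide whether the ending is one part (.com) or two parts (.com.gr).
    suffix_len = 2 if ".".join(labels[-2:]) in TWO_PART_SUFFIXES else 1
    # The domain name is the ending plus the one label in front of it.
    return ".".join(labels[-(suffix_len + 1):])


if __name__ == "__main__":
    print(domain_name("http://www.hydra.com/"))     # hydra.com
    print(domain_name("http://www.hydra.com.gr/"))  # hydra.com.gr
```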
For a comprehensive list of top-level domains, consult one of the following:
Use your understanding of URLs to enhance your web searching
Once you understand URLs, certain kinds of advanced search strategies become easier to conceptualize, remember, and implement—for example, filtering by domain and top-level domain.
Filter by top-level domain
If you know that the kind of information you are seeking is most likely to appear on a site with a particular type of top-level domain, you can restrict your search to this type of site using the site: search operator.
For example, if you are seeking government documents on the topic of student loans, then a search for student loans site:gov will return only results with the top-level domain gov, filtering out a large number of sites that are not relevant to your research needs.
Filter by domain
If you know the domain of the site on which your information will appear, you can use site: to search only that site.
For example, a search for sample tests site:dmv.ca.gov will return only pages located on the California Department of Motor Vehicles (DMV) website (the domain of which is dmv.ca.gov).
The site: operator works in all major search engines (Google, Bing, Baidu, DuckDuckGo, etc.).
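Both filters follow the same pattern: the search terms, a space, and then site: followed by either a top-level domain or a full domain. As a small sketch, the Python below builds such a query and encodes it for use in a web address. The function names and the q parameter name are assumptions for illustration; each search engine defines its own address format.

```python
from urllib.parse import urlencode


def site_query(terms: str, site: str) -> str:
    """Combine search terms with a site: filter (a domain or a top-level domain)."""
    return f"{terms} site:{site}"


def encoded_query(terms: str, site: str) -> str:
    """Encode the query as a 'q=...' string (the parameter name is an assumption)."""
    return urlencode({"q": site_query(terms, site)})


if __name__ == "__main__":
    # Filter by top-level domain, then by domain, using the examples above.
    print(site_query("student loans", "gov"))
    print(site_query("sample tests", "dmv.ca.gov"))
    print(encoded_query("student loans", "gov"))
```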
Practice identifying deceptive URLs
The immediate benefit of the drill below will be to improve your ability to distinguish between real and fraudulent sites, but the exercise will also help you sharpen your overall URL-analysis skills by heightening your attention to the component parts of URLs.
A) Which of the following are eBay.com web pages? Do not go to the sites. (Some sites masquerading as legitimate sites may contain harmful underlying code.) Just examine the URLs.
B) Find the domain name in this URL:
A) eBay page?
|1. http://pages.ebay.com||YES||This is an eBay page. The domain name is ebay.com|
|2. http://movies.half.ebay.com||YES||This is an eBay page. The domain name is ebay.com (“movies” and “half” indicate subdomains).|
|3. http://pages.ebey.com||NO||This is not an eBay page. Note that “ebay” is misspelled as ebey.|
|4. http://188.8.131.52:8866/ebay.htm||NO||This is not an eBay page. The first single slash (/) is not preceded by the domain name ebay.com.|
|5. http://email@example.com||NO||This is not an eBay page. Notice that there is no slash (/) after “ebay.com.”|
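|6. http://page.@ebay.com||YES (but be wary)||Browsers treat whatever sits between the protocol and an @ sign as login information rather than as part of the address, so the domain name here is ebay.com and “page.” is not a subdomain. Stay alert whenever you see @ in a URL, because it can push the real domain to the end: whatever follows the @, not what precedes it, is where the browser goes.|
|7. http://signin-ebay.com||NO||This is not an eBay page. The domain name is signin-ebay.com (not ebay.com). If the hyphen were a period, we’d be fine. But it isn’t. A hyphen is an ordinary character within a name, so signin-ebay.com is as different from ebay.com as zebay.com, bebay.com, mebay.com, etc. One character can make all the difference.|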
|8. http://www.ebay.com/electronics/ipad||YES||This is an eBay page. The domain name is ebay.com. The first single slash (/) is directly preceded by .ebay.com|
|9. http://www.ebay.deals.com||NO||This is not an eBay page. The domain name is deals.com (not ebay.com).|
|10. http://www.ebay.pro||NO||This is not an eBay page. The domain name is ebay.pro (not ebay.com).|
|11. http://www.ebay.com.bb/motors/motorcycles||NO||This is not an eBay page. The domain name is ebay.com.bb (not ebay.com).|
|12. http://www.ebay.com/itm/A-Planet-of-Viruses-by-Carl-Zimmer-2011-Hardcover-/191063912359||YES||This is an eBay page. The domain name is ebay.com. The first single slash (/) is directly preceded by .ebay.com|
B) The domain name in the following URL is bernadinec.com (not bankofamerica.com). Notice that bernadinec.com is what appears just before the first single slash (/):