Google anomalies

Post by dalehileman » Fri Feb 25, 2005 6:42 pm

When I run a Google search with an expression in the "exact phrase" box and "intitle:definition" in the "all" box, I get hundreds of hits, but in only a very few cases does "definition" appear in a title.

I have encountered many such anomalies in Google. Can anyone explain this?

Post by haro » Fri Feb 25, 2005 8:01 pm

Dale, without knowing what expression you placed in the 'exact phrase' box, I cannot confirm that. I ran a test using a random phrase in the 'exact phrase' box and your "intitle:definition" in the 'all' box. I got thousands of hits and checked the first 100; every one of them had the word "definition" in the title.

Please keep in mind that, in this context, "title" means the text that is displayed in the title bar of the browser window (the page's HTML TITLE element), not a title line at the top of the page layout. The title in the title bar may be the same as the title of the page text proper, but that need not be the case. Google's 'intitle:' operator searches for terms in the title bar only, so you may get hits that do not have your term at the top of the page text. Could that be your problem?
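To make the distinction concrete, here is a minimal Python sketch that pulls the title-bar text and the first top-of-page heading out of a page separately. The sample page and the helper name title_and_heading are made up for illustration; this only shows how the two can differ and says nothing about how Google itself indexes titles.

```python
from html.parser import HTMLParser


class TitleAndHeadingParser(HTMLParser):
    """Collects the text of <title> and of the first <h1> separately."""

    def __init__(self):
        super().__init__()
        self._current = None        # tag we are currently collecting text for
        self._heading_done = False  # only the first <h1> is of interest
        self.title = ""
        self.heading = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag == "h1" and not self._heading_done:
            self._current = "h1"

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._heading_done = True
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.heading += data


def title_and_heading(html):
    """Return (title-bar text, first page heading) for an HTML string."""
    parser = TitleAndHeadingParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.heading.strip()


# Hypothetical page: the word "definition" is in the title bar only.
SAMPLE = """<html><head><title>Pumpkin - definition of pumpkin</title></head>
<body><h1>All about pumpkins</h1><p>...</p></body></html>"""

title, heading = title_and_heading(SAMPLE)
print("title bar:", title)
print("page top :", heading)
print("'definition' in title bar:", "definition" in title.lower())
print("'definition' at page top :", "definition" in heading.lower())
```

A page like this would count as a title hit even though a reader looking at the top of the page text would not see the word there.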
Signature: Hans Joerg Rothenberger
Switzerland

Post by russcable » Sat Feb 26, 2005 8:26 am

Although this doesn't match the help on the site, the "intitle:" keyword does not always seem to omit search results that don't have the word in the title, no matter which box you put it in. It seems merely to make the results with that word in the title sort first (as better matches).
For what you want, the "allintitle:" keyword may work a little better.

I was using the word "pumpkin" and noticed that I still got several hits that had the word "meaning" instead of "definition", almost as if I had used "~definition", which does, however, give a few more hits.
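For anyone who wants to repeat these tests without the advanced search form, here is a small Python sketch that builds the equivalent plain query string: the exact phrase in double quotes, followed by one intitle: prefix per title word. The helper names build_query and search_url are my own, and putting everything in the plain q parameter is an assumption about what the form does, not something taken from Google's help pages.

```python
from urllib.parse import urlencode


def build_query(exact_phrase, title_words):
    """Quote the phrase and prefix each title word with its own intitle:."""
    parts = [f'"{exact_phrase}"']
    parts += [f"intitle:{word}" for word in title_words]
    return " ".join(parts)


def search_url(query):
    """Percent-encode the query into a search URL (q is the plain query box)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query("pumpkin", ["definition"])
print(query)
print(search_url(query))
```

Giving each word its own intitle: prefix sidesteps the question of how "allintitle:" behaves with a single keyword.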

Post by Phil White » Sat Feb 26, 2005 2:32 pm

A further issue is that Web pages that use frames are made up of a number of separate HTML (or ASP or PHP) pages, each of which may have its own "TITLE". Only the title from the main frameset actually appears in the title bar of the browser.
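As a rough illustration, the Python sketch below walks a tiny made-up frameset site held in a dictionary and reports which title a visitor would see and which titles are tucked away in the individual frames. The file names, page contents and the function titles_in_frameset are all hypothetical.

```python
from html.parser import HTMLParser


class FramePageParser(HTMLParser):
    """Collects a page's <title> text and the src of every <frame> it contains."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
        self.frame_sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "frame":
            src = dict(attrs).get("src")
            if src:
                self.frame_sources.append(src)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page(html):
    parser = FramePageParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.frame_sources


def titles_in_frameset(site, entry):
    """Return (title shown in the title bar, {frame file: its own title})."""
    visible_title, sources = parse_page(site[entry])
    frame_titles = {src: parse_page(site[src])[0] for src in sources if src in site}
    return visible_title, frame_titles


# Hypothetical three-file site: one frameset page plus two frame pages.
SITE = {
    "index.html": '<html><head><title>Acme Dictionary</title></head>'
                  '<frameset cols="20%,80%"><frame src="nav.html">'
                  '<frame src="entry.html"></frameset></html>',
    "nav.html": "<html><head><title>Navigation</title></head><body>...</body></html>",
    "entry.html": "<html><head><title>Pumpkin - definition</title></head>"
                  "<body>...</body></html>",
}

visible, hidden = titles_in_frameset(SITE, "index.html")
print("title bar shows:", visible)
for src, frame_title in hidden.items():
    print(f"frame {src} has its own title: {frame_title}")
```

Here the word "definition" sits in the title of a frame page, while the title bar shows only the frameset's title.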
Signature: Phil White
Non sum felix lepus

Post by haro » Sat Feb 26, 2005 6:56 pm

Yeah Phil, and, as far as I know, Google's crawler bots still aren't able to get into frames unless there are hyperlinks pointing into them.

Meanwhile, using Dale's and Russ' parameters, I found lots of Web pages that do not contain the word 'definition' at all, neither in the title bar nor in the body nor in frames, and many of them do not contain 'meaning' either. I had never encountered this phenomenon before, because I never search for titles.

I think the explanation is fairly simple: Web pages are dynamic creatures. They may have changed since Google last scanned them. On pages that are not referred to from many other pages, it may take months for Google to revisit them. Meanwhile the contents including the title may have completely changed while the URL, as listed in Google from the previous visit, is still the same.
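The idea can be sketched in Python as follows: compare the title a crawler recorded on its last visit with the title the page carries today, and sort each hit into one of three groups. The Hit records and URLs below are invented examples, and I have no inside knowledge of how Google actually stores its snapshots; this is only the reasoning above written out.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    indexed_title: str  # title as recorded at the last crawl (hypothetical snapshot)
    current_title: str  # title on the live page today


def classify(hit, term):
    """Say whether a title hit still matches, went stale, or never matched."""
    term = term.lower()
    matched_at_crawl = term in hit.indexed_title.lower()
    matches_now = term in hit.current_title.lower()
    if matched_at_crawl and matches_now:
        return "still matches"
    if matched_at_crawl:
        return "stale: title changed since crawl"
    return "never matched in title"


# Invented examples, one for each outcome.
HITS = [
    Hit("http://example.com/a", "Pumpkin - definition", "Pumpkin - definition"),
    Hit("http://example.com/b", "Definition of pumpkin", "Autumn recipes"),
    Hit("http://example.com/c", "Pumpkin patch", "Pumpkin patch"),
]

for hit in HITS:
    print(hit.url, "->", classify(hit, "definition"))
```

Only the second kind of hit is explained by page changes; hits of the third kind would need some other explanation.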
Signature: Hans Joerg Rothenberger
Switzerland

Post by Phil White » Sat Feb 26, 2005 7:21 pm

Hans Jörg,
Google and the other bots cope pretty well with most framesets now, but most good web designers are still avoiding them.
Signature: Phil White
Non sum felix lepus

Post by dalehileman » Sat Feb 26, 2005 7:28 pm

Hans: virtually any random expression will do; "Whoop and a Holler", for instance.
Russ: "allintitle:" evidently doesn't work unless there's more than one keyword.

There are probably very good reasons for Google's strange algorithm that suit others' purposes better than mine. I've advised Google that there should be an option for more literal results, which would save the serious researcher much time by rejecting hits whose titles are not pertinent. I expect results about the time Christ comes down, the Devil comes up, and they warmly embrace.

Post by haro » Sat Feb 26, 2005 10:17 pm

Phil, of course I try to avoid frames too, but I must admit I do have a few pages on the Web that simply cannot drop their frames without losing a lot of their user-friendliness. Most of them haven't been found by Google in more than two years, although the bots have carefully listed the whole caboodle around those frames.

Dale, I know what you mean and I concur. However, I still believe that most of those false hits are caused by changes on those pages before the next scan by Google. Google presently lists eight billion pages. It takes a while to wade through them all to keep the listings updated.
Signature: Hans Joerg Rothenberger
Switzerland