google to fix blog noise problem | the register
-noblog
by andrew orlowski in san francisco
published friday 9th may 2003 02:12 gmt
google is to create a search tool specifically for weblogs, most likely giving material generated by the self-publishing tools its own tab.
ceo eric schmidt made the announcement on monday, at the jp morgan technology and telecom conference. 'soon the company will also offer a service for searching web logs, known as "blogs,"' reported reuters.
it isn't clear if weblogs will be removed from the main search results, but precedent suggests they will be. after google acquired usenet groups from deja.com, it developed a unique user interface and a refined search engine, and removed the groups from the main index. after a sticky start, usenet veterans welcomed the new interface. google recently acquired blogger, and sources suggest this is the most likely option.
bloggers too are likely to welcome their very own tab as a legitimization of the publishing format. but many others will breathe a sigh of relief as blogs disappear from the main index.
"i just want a search engine that works," laments chris roddy, a politics and linguistics undergraduate at emory university.
"i can get a google search with porn turned off; why can't i get blogs turned off too?" he asked on slashdot.
google has striven in vain to maintain the quality of its search results in the face of a blizzard of links generated by a small number of sources. (google searches 3,083,324,652 pages as of 4pm pt today. assuming there are one million bloggers, and generously assuming they have a hundred pages each, that amounts to about 3.2 per cent of web content indexed by google. recent research by pew put the number of blog readers, as opposed to writers, as "statistically insignificant".)
however, through dense and incestuous linking, results from blogs can drown out other sources.
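the back-of-the-envelope share quoted above can be checked in a few lines. this is only a sketch of the article's own assumptions (one million bloggers, a hundred pages each); the function name `blog_share` is made up for illustration. with those inputs the share comes out as a fraction of roughly 0.032 of the index, which is about 3.2 per cent.

```python
def blog_share(indexed_pages, bloggers, pages_per_blogger):
    """return (blog_pages, share_of_index) for the given assumptions."""
    blog_pages = bloggers * pages_per_blogger
    return blog_pages, blog_pages / indexed_pages


# index size as quoted in the article; blogger figures are the article's guesses
blog_pages, share = blog_share(3_083_324_652, 1_000_000, 100)

print(f"blog pages: {blog_pages:,}")
print(f"share of index: {share:.4f} (about {share:.1%})")
```

however small the share, it says nothing about ranking: a page's position depends on who links to it, not on how much of the index its kind occupies.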
"the main problem with blogs is that, as far as google is concerned, they masquerade as useful information when all they contain is idle chatter," wrote roddy. "and through some fluke of their evil software, they seem to get indexed really fast, so when a major political or social event happens, google is noised to the brim with blogs and you have to start at result number 40 or so before you get past the blogs." we'd noticed.
"taking usenet out of the general search was great, because it is not really interfering with general internet searching," roddy told us. "usenet was a public forum in the first place."
a slashdot discussion prompted a suggestion that google add a -noblog option, which it effectively appears to be introducing by default.
gary stock, chief technology officer of nexcerpt, inc., agrees.
"a year or two ago you could hit 'i'm feeling lucky' and there was a good chance that you could find a good and authoritative page," he told us.
"it is less the case today. more and more people have more text to type, and may not have anything authoritative to say - they just throw up characters on the screen."
he says that the link-based algorithm called pagerank™ was designed, at stanford university, with very different assumptions about the quality of information.
"they didn't foresee a tightly-bound body of writers," reckons stock. "they presumed that technicians at usc would link to the best papers from mit, to the best local sites from a land trust or a river study - rather than a clique, a small group of people writing about each other constantly. they obviously bump the rankings system in a way for which it wasn't prepared."
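a toy model shows the effect stock describes. the sketch below is the simplified textbook form of pagerank (power iteration with a damping factor of 0.85), not google's production system, and the seven-page graph is invented: two independent sites link to one paper, while four hypothetical weblogs link only to each other.

```python
def pagerank(links, damping=0.85, iterations=100):
    """textbook pagerank; links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # rank held by pages with no outgoing links is spread evenly over all pages
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for target in links[p]:
                new_rank[target] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank


# hypothetical graph: two independent sites cite one paper,
# while four weblogs form a clique that links only to itself
blogs = ["blog_a", "blog_b", "blog_c", "blog_d"]
toy_web = {"usc": ["mit_paper"], "river_study": ["mit_paper"], "mit_paper": []}
for blog in blogs:
    toy_web[blog] = [other for other in blogs if other != blog]

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:12s} {score:.3f}")
```

in this toy graph every weblog in the clique ends up ranked above the paper, even though the paper is the only page cited by independent sources: rank that enters the clique keeps circulating inside it.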
information quality
for stock and roddy, the problem is that the resulting degradation in the quality of information makes it even harder to find primary source material. roddy said the realization came after searching through 500 blog entries to find a primary source.
exacerbating the problem, says stock - who devised 'googlewhacking', or the art of producing a search query that returns just one result - is the frequency with which the sites are indexed.
"if they are really spidering all 3 billion pages, then they must have changed some law of physics," he explains.
"someone has made a choice whether to go to a site every hour or every three years. that begs the question - if i know something to be a high traffic site and i train my robots to visit often, do i discount it when i feed my information to pagerank?"
he offers a hypothetical example.
"suppose turtle-rescue.org has authoritative information about turtles. and it changes every month. then boingboing puts up a page about turtles and that becomes a big deal.
"each of us gets a vote," jokes stock. "and someone votes every day and i vote once every four years."
"the blogs push very quickly up to the top of the search results."
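stock's point about visit frequency can be made concrete with a small sketch. it assumes a crawler that revisits each site on a fixed interval and asks how long a change waits before the index sees it. the intervals used here (hourly for a busy weblog, every 30 days for turtle-rescue.org) are hypothetical figures chosen for illustration, not numbers from google.

```python
import math


def hours_until_indexed(change_hour, crawl_interval_hours):
    """hours a change waits for the next crawl, with crawls at 0, interval, 2*interval, ..."""
    next_crawl = math.ceil(change_hour / crawl_interval_hours) * crawl_interval_hours
    return next_crawl - change_hour


# both sites publish something new 36.5 hours after the last full crawl
change_hour = 36.5
blog_wait = hours_until_indexed(change_hour, 1)          # assumed hourly revisit
turtle_wait = hours_until_indexed(change_hour, 30 * 24)  # assumed 30-day revisit

print(f"weblog change indexed after {blog_wait} hours")
print(f"turtle site change indexed after {turtle_wait} hours")
```

under these assumptions the weblog's new page reaches the index within the hour, while the turtle site's update waits about four weeks for its turn.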
databases
"to me the power of what dave winer and ev williams have done, and it's great, is that i can easily publish resourceshelf in seconds, giving me time to do other things," says respected author and librarian gary price. price doesn't regard his site as a weblog, even though he uses blogger tools, now owned by google. price co-authored the invisible web, a guide to little-known public resources on the internet.
"but what happens when the weblog fad dies down?" he asks.
"the public think that they can put 2.1 words into google and the best answer will appear; they don't ask how long it is taking them to get it. for the average person, it's very good, but there are choices out there, and a lot of people aren't aware of them and don't know."
"you have to realize there are other information sources, and that information costs money."
"this is why new york public libraries has a sign 'here's where you find the stuff that isn't in google.' and much of this is publicly accessible," price points out.
or as seth finkelstein reminds us, "google is good, but not god."
(we'll follow up on what those sources are, and how to get at them, soon.)
unearned reputations
ironically, the low information quality of blog-infested google results is a consequence of bloggers' attempts to introduce community aspects to what remains a solitary activity. the auto-citation feature 'trackback' is frequently fingered as the culprit: many search results google returns are trackbacks.
and yet dealing with trackback noise can be as much an opportunity as a challenge for google's user interface designers. just as the standalone usenet tab allowed sophisticated metadata searching and threading, so could the google blog tab.
granting about 3 per cent of the web its own google tab might rankle with some, but others could argue it produces the best of both worlds, for general search users and webloggers.
one group is likely to protest long and hard, however: people who have taken advantage of this quirk to use google as their primary promotion channel or reputation creator. folk whose reputations were forged before the dawn of the blogroll will not be affected and need not worry, but the reaction of the rest may be predictable.
it's a bit like challenging a monarch with the viability of the hereditary principle: you can guess what they'll say.
just as one-man one-vote democracy terrifies the bejesus out of some people, so surely will a fairer google. ®