Sometimes it pays to look up a story at the source, even when you have already read a hundred "stories about the story". That happened to me last week with the topic "facts about Google": The magic that makes Google tick at ZDNet. What got quoted everywhere were the impressive numbers: 30 clusters of 2,000 cheap PCs each, etc. Yes, that impresses me too. But the rest of the article is far more interesting (yes, you really do have to keep reading beyond the first two paragraphs ...)
For example:
Google's vision is broader than most people imagine, said Hölzle: "Most people say Google is a search engine but our mission is to organise information to make it accessible." [See also "Warum Google keine Suchmaschinen-Company ist"]
And as an old Arthur C. Clarke fan, I was particularly taken with the following comparison:
When Arthur C. Clarke said that any sufficiently advanced technology is indistinguishable from magic, he was alluding to the trick of hiding the complexity of the job from the audience, [...]. Nobody hides the complexity of the job better than Google does; [...] all hidden behind a deceptively simple, white, Web page that contains a single one-line text box and a button that says Google Search.
Behind it there really is enormous computing power (whose care and feeding the article sketches nicely). I don't think it is an exaggeration to say that Google is today the company with the best grip on the problem of building such a powerful machine cheaply and keeping it running reliably. And you can do much more with that than operate a simple search engine (see also Google Desktop und die Google-Strategie). Google is currently diversifying like mad to exploit this unique capability. Nice article on that by John Battelle:
With the news that Google has locked down googlereviews.com, incorporated reviews (of sorts) into Froogle, re-launched Google Groups in a bid to get competitive with Yahoo, started to update Blogger, has released desktop search (with its obvious developer platform implications), and is quickly scaling Local with mobile and the like, it's a pretty obvious conclusion to draw: Google is joining the architecture of participation party in a big way.
And, to keep extending its lead, the company consistently tries to recruit the best computer scientists. See Personalmarketing bei Google and Google ist cool - aber kein Wohltäter.
And here are a few more nice side notes from the ZDNet article on Google's technology:
Today the company mirrors everything across multiple independent data centres, and the fault tolerance works across sites, "so if we lose a data centre we can continue elsewhere -- and it happens more often than you would think. Stuff happens and you have to deal with it."
A new data centre can be up and running in under three days. "Our data centre now is like an iMac," said Schulz. "You have two cables, power and data. All you need is a truck to bring the servers in and the whole burn-in, operating system install and configuration is automated."
[...]
As the scale of the operation increases, it introduces some particular problems that would not be an issue on smaller systems. For instance, Google uses IDE drives for all its storage. They are fast and cheap, but not highly reliable. To help deal with this, Google developed its own file system -- called the Google File System, or GFS -- which assumes an individual unit of storage can go away at any time either because of a crash, a lost disk or just because someone stepped on a cable.
The power of three
There are no disk arrays within individual PCs; instead Google stores every bit of data in triplicate on three machines on three racks on three data switches to make sure there is no single point of failure between you and the data. "We use this for hundreds of terabytes of data," said Hölzle.[...]
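The replication scheme described above can be sketched in a few lines. This is a toy model, not GFS itself: the class and function names are invented, and the real system works with 64 MB chunks, a master server, and far more machinery. It only illustrates the core idea that three copies on three distinct racks survive any single machine or rack failure.

```python
# Toy sketch of GFS-style triple replication (names are illustrative,
# not from the actual Google File System).

class Chunkserver:
    def __init__(self, rack):
        self.rack = rack
        self.chunks = {}     # chunk_id -> bytes
        self.alive = True

    def write(self, chunk_id, data):
        self.chunks[chunk_id] = data

    def read(self, chunk_id):
        if not self.alive:
            raise IOError("chunkserver down")
        return self.chunks[chunk_id]

def place_replicas(servers, n=3):
    """Pick n servers on n distinct racks, so that no single rack
    failure can take out every copy of a chunk."""
    chosen, racks = [], set()
    for s in servers:
        if s.rack not in racks:
            chosen.append(s)
            racks.add(s.rack)
        if len(chosen) == n:
            return chosen
    raise RuntimeError("not enough distinct racks")

def write_chunk(servers, chunk_id, data):
    replicas = place_replicas(servers)
    for s in replicas:
        s.write(chunk_id, data)
    return replicas

def read_chunk(replicas, chunk_id):
    for s in replicas:           # fall through to the next copy on failure
        try:
            return s.read(chunk_id)
        except IOError:
            continue
    raise IOError("all replicas lost")

servers = [Chunkserver(rack=r) for r in ("A", "A", "B", "C", "C")]
replicas = write_chunk(servers, "chunk-0", b"hello")
replicas[0].alive = False        # lose one machine: reads still succeed
assert read_chunk(replicas, "chunk-0") == b"hello"
```

The point of placing copies across racks and switches, rather than just across disks in one box, is exactly the "no single point of failure between you and the data" quoted above.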
Running thousands of cheap servers with relatively high failure rates is not an easy job. Standard tools don't work at this scale, so Google has had to develop them in-house. Some of the other challenges the company continues to face include:
Debugging: "You see things on the real site you never saw in testing because of some special set of circumstances that creates a bug," said Hölzle. "This can create non-trivial but fun problems to work on."
Data errors: A regular IDE hard disk will have an error rate in the order of 10^-15 -- that is, one millionth of one billionth of the data written to it may get corrupted and the hard disk's own error checking will not pick it up. "But when you have a petabyte of data you need to start worrying about these failures," said Hölzle. "You must expect that you will have undetected bit errors on your disk several times a month, even with hardware checking built-in, so GFS does have an extra level of checksumming. Again this is something we didn't expect, but things happen."
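The "extra level of checksumming" can be illustrated with a minimal sketch. The real GFS scheme is internal to Google (it checksums 64 KB regions within chunks); the block size and function names below are assumptions chosen for the example. The idea is simply: store a CRC alongside each block, and verify it on every read so that silent corruption the disk's own ECC missed is caught before the data is served.

```python
import zlib

# Minimal sketch of per-block checksumming against silent bit errors.
# Block size and structure are assumptions, not GFS's actual layout.

BLOCK = 64 * 1024  # 64 KB blocks

def write_blocks(data):
    """Split data into blocks and record a CRC32 for each."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    return [(b, zlib.crc32(b)) for b in blocks]

def read_blocks(stored):
    """Verify every block's checksum before returning the data."""
    out = bytearray()
    for block, crc in stored:
        if zlib.crc32(block) != crc:
            raise IOError("silent corruption detected, read from a replica")
        out += block
    return bytes(out)

stored = write_blocks(b"x" * 200_000)
assert read_blocks(stored) == b"x" * 200_000

# Flip one bit to simulate an undetected disk error:
block, crc = stored[0]
stored[0] = (bytes([block[0] ^ 1]) + block[1:], crc)
try:
    read_blocks(stored)
except IOError:
    pass  # the checksum catches the flipped bit
```

On a checksum mismatch, a replicated system does not have to fail the read: it can fetch one of the two other copies, which is why checksumming and triplication complement each other.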
Spelling: Google wrote its own spell checker, and maintains that nobody knows as many spelling errors as it does. The amount of computing power available at the company means it can afford to begin teaching the system which words are related -- for instance "Imperial", "College" and "London". It's a job that took many CPU years, and which would not have been possible without these thousands of machines. "When you have tons of data and tons of computation you can make things work that don't work on smaller systems," said Hölzle. One goal of the company now is to develop a better conceptual understanding of text, to get from the text string to a concept.
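The principle behind a data-driven speller can be shown in miniature. Google's production spell checker is far more sophisticated and learned from query logs at massive scale; the tiny corpus and single-edit candidate generation below are assumptions for illustration only. It shows the "tons of data beats clever rules" point: the correction is simply the most frequent known word within one edit.

```python
from collections import Counter

# Toy frequency-based spelling correction. The corpus is invented for
# the example; a real system would learn from billions of queries.

corpus = "imperial college london imperial college imperial london".split()
freq = Counter(corpus)
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one edit (delete/swap/replace/insert) away from word."""
    splits   = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes  = [a + b[1:] for a, b in splits if b]
    swaps    = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts  = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)

def correct(word):
    """Return the most frequent known word within one edit, else word."""
    candidates = [w for w in edits1(word) | {word} if w in freq]
    return max(candidates, key=freq.get) if candidates else word

print(correct("collage"))   # -> "college"
```

With enough machines, the same counting approach scales from three words to the whole web, which is exactly what the quote claims only works "when you have tons of data and tons of computation".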
Power density: "There is an interesting problem when you use PCs," said Hölzle. "If you go to a commercial data centre and look at what they can support, you'll see a typical design allowing for 50W to 100W per square foot. At 200W per square foot you notice the sales person still wants to sell it but their international tech guy starts sweating. At 300W per square foot they cry out in pain."
Eighty mid-range PCs in a rack, of which you will find many dozens in a Google data centre, produce over 500W per square foot. [...]
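A quick back-of-the-envelope check makes the 500 W figure plausible. The per-PC wattage and rack floor footprint below are my assumptions for typical early-2000s hardware, not numbers from the article:

```python
# Sanity check of the quoted density: 80 mid-range PCs per rack
# at over 500 W per square foot. Inputs are assumed, not sourced.

pcs_per_rack = 80
watts_per_pc = 120           # assumed draw of one mid-range PC
rack_footprint_sqft = 18     # assumed floor area per rack, incl. aisle share

density = pcs_per_rack * watts_per_pc / rack_footprint_sqft
print(f"{density:.0f} W per square foot")
```

With these assumptions the density lands above 500 W per square foot, five to ten times what a commercial data centre of the time was designed for, which explains the sweating sales engineers in the quote.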