IBM is betting $100 million that it can whoop Google in the search engine realm. Its new project, dubbed WebFountain, is an intelligent database that can separate the wheat from the chaff on the Internet. Its algorithms check for accuracy, truthfulness, and popularity, translate languages, compare prices, track chat rooms, and more, using a cluster of 30 dual Xeon processors and 160 TB of disk storage. IBM plans to sell data such as your company's public reputation, as gleaned from newspapers, TV, radio transcripts, magazines, etc., for around $150k. A commercial service, Factiva, will launch WebFountain's capabilities in mid-2004. What has it learned so far? 30% of the web is porn, and 30% is duplicated data. There are 50 million new or changed pages every day, and 65% of web pages are now written in English, but by 2010 English will be in the minority.