Google analyzed search terms entered into a Beijing-based website to help develop blacklists for a censored search engine it has been planning to launch in China, according to confidential documents seen by The Intercept.
Engineers working on the censorship sampled search queries from 265.com, a Chinese-language web directory service owned by Google.
Unlike Google.com and other Google services, such as YouTube, 265.com is not blocked in China by the country’s so-called Great Firewall, which restricts access to websites deemed undesirable by the ruling Communist Party regime.
265.com was founded in 2003 by Cai Wensheng, a Chinese entrepreneur known as “the godfather of Chinese webmasters.” In 2008, Google acquired the website, which it now operates as a subsidiary. Records show that 265.com is hosted on Google servers, but its physical address is listed under the name of the “Beijing Guxiang Information and Technology Co.,” which is based out of an office building in northwest Beijing’s Haidian district.
265.com provides news updates, links to information about financial markets, and advertisements for cheap flights and hotels. It also has a function that allows people to search for websites, images, videos, and other content. However, search queries entered on 265.com are redirected to Baidu, the most popular search engine in China and Google’s main competitor in the country.
It appears that Google has used 265.com as a de facto honeypot for market research, storing information about Chinese users’ searches before sending them along to Baidu. Google’s use of 265.com offers an insight into the mechanics behind its planned Chinese censored search platform, code-named Dragonfly, which the company has been preparing since spring 2017.
After gathering sample queries from 265.com, Google engineers used them to review lists of websites that people would see in response to their searches. The Dragonfly developers used a tool they called “BeaconTower” to check whether the websites were blocked by the Great Firewall. They compiled a list of thousands of websites that were banned, and then integrated this information into a censored version of Google’s search engine so that it would automatically manipulate Google results, purging links to websites prohibited in China from the first page shown to users.
According to documents and people familiar with the Dragonfly project, teams of Google programmers and engineers have already created a functioning version of the censored search engine. Google’s plan is for its China search platform to be made accessible through a custom Android app, different versions of which have been named “Maotai” and “Longfei,” as The Intercept first reported last week.
The app has been designed to filter out content that China’s authoritarian government views as sensitive, such as information about political opponents, free speech, democracy, human rights, and peaceful protest. The censored search app will “blacklist sensitive queries” so that “no results will be shown” at all when people enter certain words or phrases, according to internal Google documents.
The documents seen by The Intercept indicate that Google’s search project is being carried out as part of a “joint venture” with another company, presumably one based in China, because internet companies providing services in China are required by law to operate their servers and data centers in the country. In January, Google entered into an agreement with the Chinese company Tencent, which Google said at the time would allow it to “focus on building better products and services.” A bipartisan group of six U.S. senators is asking Google CEO Sundar Pichai to explain whether the Tencent deal is linked to the censored search app.
It is unclear whether, as part of the joint venture, Google’s partner company would be able to unilaterally update the blacklists. Documents seen by The Intercept state that the “joint venture will have the ability” to blacklist websites and “sensitive queries.”
One source familiar with the project told The Intercept that Google has planned to provide the partner company with an “application programming interface,” or API, that it could potentially use to add blacklisted words or phrases. The source said they believed it was likely that the third-party company would be able to “update the blacklist without Google’s approval,” though the source could not confirm this with certainty. The details about the API have not been reported before.