We present two sets of results: (1) Internet measurements of keyword
filtering by the Great “Firewall” of China (GFC); and (2) initial
results of using latent semantic analysis as an efficient way to
reproduce a blacklist of censored words via probing.
Our Internet measurements suggest that the GFC’s keyword filtering
is more a panopticon than a firewall, i.e., it need not block
every illicit word, but only enough to promote self-censorship.
China’s largest ISP, ChinaNET, performed 83.3% of all filtering
of our probes, and 99.1% of all filtering that occurred at the first
hop past the Chinese border. Filtering occurred beyond the third
hop for 11.8% of our probes, and there were sometimes as many as
13 hops past the border to a filtering router. Approximately 28.3%
of the Chinese hosts we sent probes to were reachable along paths
that were not filtered at all. While more tests are needed to provide
a definitive picture of the GFC’s implementation, our results disprove
the notion that GFC keyword filtering is a firewall strictly at
the border of China’s Internet.
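
As a rough illustration of the second set of results, the sketch below shows how latent semantic analysis can rank candidate words by their closeness to a seed concept, so that the closest words can be probed first. It is a minimal sketch under stated assumptions, not the implementation used for the results above: the function name lsa_rank_terms, the toy corpus, and the choice of k are hypothetical, and a real run would use a large document collection.

```python
import numpy as np

def lsa_rank_terms(docs, seed_term, k=2):
    """Rank vocabulary terms by cosine similarity to seed_term in a
    rank-k latent space (hypothetical helper, for illustration only)."""
    # Build the vocabulary and a term-by-document count matrix.
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0

    # Truncated SVD: keep only the k largest singular values.
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    T = U[:, :k] * s[:k]  # one latent-space vector per term

    # Cosine similarity of every term vector to the seed term's vector.
    seed = T[index[seed_term]]
    norms = np.linalg.norm(T, axis=1) * np.linalg.norm(seed)
    sims = (T @ seed) / np.where(norms == 0, 1.0, norms)

    # Most similar terms first, excluding the seed itself.
    order = np.argsort(-sims)
    return [(vocab[i], float(sims[i])) for i in order if vocab[i] != seed_term]

# Toy corpus (an assumption): two documents per topic.
docs = [
    "protest square police",
    "protest police arrest",
    "recipe noodle soup",
    "noodle soup broth",
]
ranked = lsa_rank_terms(docs, "protest", k=2)
```

In this toy corpus, words that share documents with the seed term rank above words from the unrelated topic. A probing tool would then test the top-ranked words first rather than the whole vocabulary.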