DNS Changes & Twiceler, the Cuil Search Robot

Over the past few days we have been helping a client consolidate their servers onto a single system to save on co-lo/hosting charges. With the new system acquired, we began to migrate sites onto this eight-core Xeon / 12GiB server (Ubuntu Jaunty 64-bit).

We performed an audit up front to confirm which sites were no longer used: we spoke to staff, viewed various logs and grep’d source code. Certain we had everything, we began to move sites, all the while watching logs to ensure that each site moved properly and that traffic to the old server tapered off and then stopped altogether. (With a 1h TTL in the DNS, traffic can sometimes taper to about 2 hits/h in less than 2.5h.)
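For anyone wanting to watch that taper in numbers rather than by eyeballing tail output, here is a minimal sketch in Python. It assumes the old server writes common/combined log format lines; the function name hits_per_hour and the sample lines are made up for illustration and are not taken from our client's logs.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the timestamp of a common/combined log format line,
# e.g. [12/Aug/2009:14:03:27 -0700], capturing it down to the hour.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def hits_per_hour(lines):
    """Return a Counter mapping each hour (a datetime) to its request count."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:  # lines without a recognisable timestamp are skipped
            hour = datetime.strptime(match.group(1), "%d/%b/%Y:%H")
            counts[hour] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the call.
    demo = [
        '10.0.0.1 - - [12/Aug/2009:14:03:27 -0700] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '10.0.0.2 - - [12/Aug/2009:15:10:02 -0700] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    ]
    for hour, count in sorted(hits_per_hour(demo).items()):
        print(hour.strftime("%Y-%m-%d %H:00"), count)
```

Feeding it the lines of the old server's access log gives one count per hour, which makes it easy to see when traffic has really stopped.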

After moving, we continued to tail log files on the deprecated servers to catch any last or missed resources. After all other traffic had stopped for over five hours, only one client remained: Twiceler.
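Spotting the last straggler is easier if the remaining requests are grouped by user agent. The following is a rough sketch, again assuming combined log format (where the user agent is the last quoted field); remaining_clients and the sample user agent strings are hypothetical names for illustration.

```python
import re
from collections import Counter

# In combined log format the user agent is the last double-quoted field.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')


def remaining_clients(lines):
    """Return (user_agent, hits) pairs, busiest first."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT.search(line)
        if match:  # skip lines that do not end in a quoted field
            counts[match.group(1)] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the call.
    demo = [
        '10.0.0.1 - - [12/Aug/2009:20:03:27 -0700] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '10.0.0.1 - - [12/Aug/2009:20:05:11 -0700] "GET /a HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '10.0.0.9 - - [12/Aug/2009:20:06:40 -0700] "GET /b HTTP/1.1" 200 512 "-" "OtherAgent/2.0"',
    ]
    for agent, hits in remaining_clients(demo):
        print(hits, agent)
```

Run against only the log lines written after the move, a single dominant user agent stands out immediately.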

All the requests were typical HTTP/1.1 robot style requests, just to a very old IP.

Speculation: Cuil may have internal caching DNS servers that hold IP information for well beyond the TTL from the authoritative NS.
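The reasoning behind that speculation is simple arithmetic: a resolver that honours the TTL cannot hold the old record for longer than one TTL after the change, so any request reaching the old IP after that point came from a client that cached the address some other way. A small sketch of the check follows; is_stale is a hypothetical helper and the dates are invented for illustration, not the actual cutover time.

```python
from datetime import datetime, timedelta


def is_stale(request_time, cutover_time, ttl_seconds):
    """True if the request reached the old IP after every TTL-respecting
    cache should already have dropped the old record."""
    return request_time > cutover_time + timedelta(seconds=ttl_seconds)


if __name__ == "__main__":
    # Hypothetical times: DNS changed at 09:00 with a 1h TTL.
    cutover = datetime(2009, 8, 12, 9, 0)
    print(is_stale(datetime(2009, 8, 12, 9, 30), cutover, 3600))
    print(is_stale(datetime(2009, 8, 12, 15, 30), cutover, 3600))
```

With a 1h TTL, a hit half an hour after the change is expected, while one arriving six and a half hours later is not.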

Wild Speculation: We’ve heard that Twiceler (the Cuil robot), like other search robots, makes multiple passes on a site. It would appear, then, that on the first pass it caches the site’s IP address and uses that for subsequent passes. I wonder what impact moving a site’s IP has on the Cuil algorithm.

Oh, and if anybody cares, the robot came from 216.129.119.X