The next search paradigm

The Library of Congress has Twitter’s tweets, but the tech isn’t advanced enough to search them!


[T]he Library noted in a separate White Paper released Friday that it can take 24 hours to perform a single search for a term in the Twitter archive.

Also consider that the US government, through its wiretapping apparatus, must have figured out a way to do keyword searches in such a mess of data.

I say “must have” only slightly warily. The use cases aren’t quite the same. Making a load of data publicly available for search at any time (as is being done for Twitter’s archive) is different from looking for fiendish patterns in communications. With the latter, there is also likely more control over the kind of strain put on the hardware doing the work. Opening up data that the whole world would be interested in, though, is a different kettle of fish entirely from making certain data available under certain domains.

Even so, it’s another example of the vast march of tech, and of the size and breadth of Washington, when one branch of the government has a need that another has likely already solved.

