The Library of Congress is adding tweets to its archives, but users still aren’t able to search through the more than 170 billion (and growing) tweets.
The agreement between the library and Twitter was signed in 2010, and the library announced Monday that it has now completed its initial objectives for the tweet project, which included “to acquire and preserve the 2006-10 archive; to establish a secure, sustainable process for receiving and preserving a daily, ongoing stream of tweets through the present day; and to create a structure for organizing the entire archive by date.”
The next challenge for the Library of Congress is making that archive accessible in a way that is useful to researchers, a problem it has not yet solved, the Washington Post reports.
“People expect fully indexed — if not online searchable — databases, and that’s very difficult to apply to massive digital databases in real time,” Deputy Librarian of Congress Robert Dizard Jr. told the Post. “The technology for archival access has to catch up with the technology that has allowed for content creation and distribution on a massive scale.”
The archive already holds about 170 billion tweets and grows by approximately half a billion more each day.