This week I started working with the #YesAllWomen tweet archive I built.  It is the largest archive I have dealt with, so there have been some new challenges.  The #WeNeedDiverseBooks campaign produced over 107,000 tweets and retweets; this archive is about five times that size.

The geocoding for the map below happens in batches of 10,000 tweets, so it will be a while before it is completely finished.  In the meantime, you can explore this copy of the map while the geocoding continues.
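
For anyone curious what working in batches looks like, here is a minimal sketch in Python.  It is not the tool I actually use for the map; the function name `batches` and the way the tweets are stored are assumptions made purely for illustration.  It only shows the idea of cutting a long list of tweets into chunks of 10,000 so each chunk can be geocoded in turn.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int = 10_000) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        # Pull the next `size` items; the final batch may be shorter.
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Stand-in for the archive: 25 fake tweets, batched 10 at a time.
    fake_tweets = [{"id": i} for i in range(25)]
    for number, batch in enumerate(batches(fake_tweets, size=10), start=1):
        print(f"batch {number}: {len(batch)} tweets")
```

Because each batch is handled separately, a partly finished map can be published while the remaining batches are still waiting their turn.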

Notes:  While some people have granted Twitter permission to embed their exact location in the metadata of their tweets, I strip that data out of the archive before I publish anything or make the download available.  I know people had to opt in to include that data, but I am not convinced that everyone who opts in really understands what it means.  The locations used on the map here are the general locations people have specified on their profiles and are not tied to any specific tweet.
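
As a rough illustration of that stripping step, here is a small Python sketch.  It assumes each tweet is stored as a dictionary shaped like the JSON from Twitter's classic API, where exact location data lives in the top-level `geo`, `coordinates`, and `place` fields; the function name `strip_exact_location` is made up for this example and your archive format may differ.

```python
# Top-level fields that can carry tweet-specific location data
# (an assumption based on Twitter's classic tweet JSON layout).
EXACT_LOCATION_FIELDS = ("geo", "coordinates", "place")


def strip_exact_location(tweet: dict) -> dict:
    """Return a copy of the tweet without tweet-level location fields."""
    cleaned = dict(tweet)  # shallow copy, so the original record is untouched
    for field in EXACT_LOCATION_FIELDS:
        cleaned.pop(field, None)
    return cleaned


if __name__ == "__main__":
    tweet = {
        "id": 1,
        "text": "example tweet",
        "coordinates": {"type": "Point", "coordinates": [-89.5, 37.3]},
        "user": {"location": "SEMO"},
    }
    print(strip_exact_location(tweet))
```

The free-text profile location under `user` is left alone, since that general location is what the map uses.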

This method is safer, in my opinion, but it definitely introduces errors into the map.  Many people use odd abbreviations or whimsical phrases for their locations.  The geocoding algorithms try their best, but they are not perfect.  For example, someone from Southeast Missouri used SEMO as their location, and the geocoder placed their tweet in the town of Semo in Fiji.  Another person used “Under the moon” as their location, and the geocoder believes that to be somewhere in western Russia.

So with that in mind… enjoy the map.