Every #wrimosagainsttheeatingofsushi related hashtag I’ve tweeted (and a few I haven’t)

A long time ago, a couple of Twitter followers and I were discussing a movement to prevent the eating of me–that is, because my nickname is Sushi and that also happens to be a food. We eventually came up with #wrimosagainsttheeatingofsushi. In the years since, the #wrimosagainsttheeatingofsushi movement has grown, as well as the #wrimosfortheeatingofsushi movement. But more and more semi-related hashtags have cropped up, almost all of them taking the form #___[for/against]the____ingof______.

I’ve tried to search Twitter for them, but online searches don’t usually handle fill-in-the-blank searches very well. Or at all, really. But that’s okay.

You hovered for the alt text, didn't you? Don't worry, I did too when previewing this.
(Image credit: xkcd)

I know (some) regular expressions.

What’s a regular expression?

A regular expression (or regex) is a pattern that matches something. You can think of a word as a regular expression when you search for something–whether you’re searching online or for a file on your computer. Most of your results are going to match the word(s) you type in. But what if you want only the first and last words of your search phrase to match? Or in my case, only certain words within a hashtag to match?

That’s where regular expressions come in.

Here’s a tutorial if you’d like to dig deeper.

But knowing regular expressions doesn’t do me much good if I don’t have a way to search with them. This is why grep exists.

grep is just plain awesome

Grep is amazing. It searches your plaintext files and shows you the results. You can also search across multiple files at once, which makes grep very powerful. I can figure out which novel I wrote that one awesome line in or when I used some made up word in chat… or find Twitter hashtags since grep supports regular expressions.
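
For a small taste of searching several files at once, here is a minimal sketch. The directory and the two file names are made up for the example; they are not real files from my computer.

```shell
# Build a throwaway directory with two hypothetical text files.
demo_dir=$(mktemp -d)
printf 'It was a dark and stormy night.\nThe spork gleamed.\n' > "$demo_dir/novel2009.txt"
printf 'Nothing to see here.\n' > "$demo_dir/novel2010.txt"

# -r searches every file under the directory. Each matching line is
# printed with the name of the file it was found in.
matches=$(grep -r 'spork' "$demo_dir")
echo "$matches"
```

Only the line from novel2009.txt should come back, with the file name in front of it, which is how you find out which novel that one line lives in.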

Sound cool? Good, because we’re going to have some greppy fun.

The last tool: the Twitter archive

This is the easiest step. Twitter lets you download a full archive of your tweets from your site settings, so I downloaded mine and unzipped the files to my Backup directory.

The files containing the tweets themselves are JavaScript files sorted by month. We can grep our way through these. Sweet.

Putting it all together

Let’s put this together and find all the hashtags. Here’s what I did.

[sushi@marigold ~]$ grep -r -E '[a-zA-Z]+the[a-zA-Z]+ingof[a-zA-Z]+' ~/Backup/tweets/data/js/tweets > ~/Documents/wrimosforagainst.csv

Let’s break this down.

[sushi@marigold ~]$: This is my shell prompt, showing my username (sushi) and my hostname (marigold). The ~ says I’m sitting in my home directory, which affects things like where I save files. More on this in a minute.

grep -r -E: grep says to use the grep command. -r means to search recursively–that is, to search all the files in that directory and in any directories inside it. This is important because we want to search multiple files in the same directory. -E is for extended regular expressions, which let me use things like the + you’re about to see without escaping them.

[a-zA-Z]+: Check for any letter a through z, capitalized or not. The + means to check and see if a letter appears at least once.

the: Exactly what it says. Look for the letters “the” in order, exactly once.

[a-zA-Z]+: Check for any letter a through z, capitalized or not. The + means to check and see if a letter appears at least once.

ingof: Exactly what it says. Look for the letters “ingof” in order, exactly once.

[a-zA-Z]+: Check for any letter a through z, capitalized or not. The + means to check and see if a letter appears at least once.
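
To check that the pieces fit together, the pattern can be tried on a few sample lines before it is turned loose on a whole archive. This is a minimal sketch: the first two hashtags come from this post, the third is a made-up non-match, and -o is an extra grep option (not used in the command above) that prints only the matched text.

```shell
# The pattern from the breakdown, in single quotes so the shell
# passes it to grep untouched.
pattern='[a-zA-Z]+the[a-zA-Z]+ingof[a-zA-Z]+'

# Three sample lines: two that should match and one that should not.
samples='#wrimosagainsttheeatingofsushi
#wrimosfortheeatingofsushi
#amwriting'

# -o prints only the part of each line that matched, one match per line.
found=$(printf '%s\n' "$samples" | grep -oE "$pattern")
echo "$found"
```

Note that the pattern never mentions the # sign. It matches any run of letters with that shape, hashtag or not, so the # is left out of what -o prints.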

This is a lot of information. If I had just hit enter here, my screen would have overflowed with hashtags. That’s where the > (greater than) sign comes in. The > means to send the command’s output to a file instead of the screen. The rest of that command says what to name the file and where to save it. In this case I chose to send the output to a comma separated value file, which I can then open with LibreOffice and do whatever I want with it.
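
Here is what > does on a much smaller scale, as a minimal sketch. The output file is a temporary file created just for this example.

```shell
# Create a temporary file to hold the results (a stand-in for the CSV file).
out_file=$(mktemp)

# Without the >, the matching line would print to the screen.
# With it, the screen stays quiet and the line lands in the file.
printf 'apple\nsushi\nbanana\n' | grep 'sushi' > "$out_file"

# Show what was saved.
cat "$out_file"
```

One thing to watch: > replaces whatever was already in the file. Using >> instead adds to the end of it.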

A CSV file is basically a spreadsheet in text form, which is what I saved the file as the first time. When you open the file, you can then tell Excel or LibreOffice what to use to separate the values. LibreOffice chose comma, tab, and semicolon; I added colon since colons were useful in separating the hashtags from the rest of the data.

Once I opened the CSV file in LibreOffice I sorted the column with the hashtags. This gave me almost exactly what I wanted, as the majority of these hashtags are at the bottom in the w section.

There are a couple of catches to this. Each tweet in the JavaScript file includes the hashtags as a separate entity from the tweets, as well as what tweet you’re replying to. This means hashtags can show up multiple times. This also means that hashtags used in tweets I replied to can show up, even if I never used them. This part is doubly useful, but it means that not every spinoff hashtag is here simply because I didn’t reply to all of those tweets.
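
The duplicates could also be removed on the command line instead of in a spreadsheet. This is a possible shortcut, not the method used above. The two files below are hypothetical stand-ins with a made-up layout; the real archive files look different.

```shell
# Hypothetical stand-in for the archive: two small files with a repeated hashtag.
archive_dir=$(mktemp -d)
printf '"text" : "So hungry #wrimosfortheeatingofsushi"\n"hashtags" : "wrimosfortheeatingofsushi"\n' > "$archive_dir/2013_02.js"
printf '"text" : "No! #wrimosagainsttheeatingofsushi"\n' > "$archive_dir/2013_03.js"

# -h drops the file names and -o keeps only the matched text.
# sort -u then sorts the matches and throws out the repeats.
unique_tags=$(grep -rhoE '[a-zA-Z]+the[a-zA-Z]+ingof[a-zA-Z]+' "$archive_dir" | sort -u)
echo "$unique_tags"
```

Each hashtag should appear once in the result, however many times it turned up in the files.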

Here are those hashtags. There are 118 of them in all, and I have only edited to remove duplicates. Happy reading, and happy hashtagging!

So you want to follow me on Twitter

Last updated 17 March 2013

Congratulations! You have a Twitter account, and somehow you have stumbled across my account. That follow button may look awfully tempting after you read over the past few pearls of wisdom that I’ve dropped onto the Internet. Before you click that follow button and regret it a few weeks later when I’m linkspamming or retweet-spamming or talking about abstract algebra or open source software or silly hashtags, it’s best that you know what you’re getting yourself into. A page of tweets (or two, or three) rarely does anyone much good. So what are you getting yourself into, anyway?

Here’s my Twtrland profile so you can get a good idea for yourself. It categorizes me as a power user and, for some reason, under the food category. Twtrland is for the eating of sushi. How unfortunate.

First, I direct tweets at people a lot. This means that while it looks like I tweet a lot, you may not see a lot of them for a while–unless, of course, we happen to follow several of the same people. If you know me through NaNoWriMo, this is not impossible and in fact may be the case. I’ve jumped into several conversations with other Wrimos in my Twitter circle, and they’ve jumped into conversations that I was having because we share followers.

I tweet a lot of links. Honestly, I probably tweet more links than anything else. You’ll probably learn something interesting or want to gouge your eyes out with a spork, or at least wonder where on earth I find my material. Thankfully, I do not tweet every single link that goes into my Delicious account because if I did, at least twenty more tweets would show up in my stream automatically, and that would annoy even me. Why not? There’s already far too much feed pollution going on with everyone linking their Twitters and Facebooks and Tumblrs and blogs and what have you, and it is time to stop that nonsense. I don’t even tweet every single post I make here–just the particularly poignant ones, like possibly this one.

Some days my tweets will be more interesting than others. What determines that? It’s a complicated formula that involves how interesting my day is, what’s on my mind, my mood, my whereabouts, and the phase of the moon. There’s a similar formula that calculates how many tweets I’ll make in a given day that uses similar variables. In short: it varies a lot. I tweet a lot when I should be doing something else, and sometimes I’ll go nearly a day sans update. Some things just aren’t worth a Twitter update. No, I do not need to broadcast every single aspect of my programming progress to the world unless it’s spectacular in some way. Be grateful.

I livetweet things on occasion. Not often, but it does happen. The last time it happened was with the national spelling bee finals. The time before that was something Apple-related, not because I’m an Apple fangirl, but because I knew I’d be hearing all about it in the tech blogs I read, so I may as well get the news firsthand. In case you’re wondering, I’m not an Apple hater, but I do use Linux. That should tell you a lot. However, when I do livetweet, I try to keep it interesting. Well, except for the spelling bee. Then I just do whatever I want.

It’d be a good idea to know what I like to talk about, wouldn’t it? After all, my Twitter profile (at the moment) says “The Sushi from NaNoWriMo. Writer. Math nerd. Geek of many trades. wikiwrimo.org admin. I’m the Internet, not food. thisnameATgmail”. The first line is because yes, people have asked, despite my using the same name at both sites. I’ll talk about anything. Between October and December, I talk about NaNoWriMo a lot. Current interests are a popular topic, and they tend toward the geeky and techy.

So for the big question: Will I follow you back? Well… probably not. Not because you’re a terrible human; I’m sure you’re wonderful. But I’ve crept up past 1300 followers (aah! who are you people? where did you find me?), and there is no reasonable way for me to keep up with every single person and still keep Twitter fun while getting work done. (I work from home. This is a serious issue.) Fun fact: I now have far more Twitter followers than Facebook friends and am completely okay with this. These days many of my new followings are folks I know in some way from outside of Twitter. I do read all my @mentions and reply to many of them, though.

Convinced yet? Scared off? Follow away.