Some queries to try for the Instagram challenge

#1

I was playing around with the data available from the two Instagram accounts for this challenge and I thought I’d throw out some example queries people could use for building a story out of the data here. I used @shinestythreads data for these, but they should work for any Instagram account (with the names changed, obviously, to suit the account you’ll be pulling from).

After reading up on what data people usually track in their Instagram accounts, I decided to check out these metrics:

Engagement

Likes per post

I wanted to look at likes per post, so I put together this query:

SELECT id, likes_count as "likes" FROM instagram_media

But just plotting that table alone made a total mess of a chart, so I made a histogram instead:

Histogram of likes per post
likes_per_post_hist

Comments per post

And then I did basically the same thing for comments using this query:

SELECT id, comments_count as "comments" FROM instagram_media

And made this plot:
Histogram of comments per post

Best time of day for posting

I wanted to see what hour of the day was the best for generating engagement on posts, so I put together this query and generated some bar plots using Chartio:

SELECT to_char(created_time, 'HH24') as "time", AVG(likes_count) as "average likes", AVG(comments_count) as "average comments" FROM instagram_media GROUP BY time ORDER BY time ASC

This will show you the average number of likes and comments in one-hour increments and gives a plot that looks like this (this might look funny if you’re not used to thinking in 24 hour time, but…you should be):

Average likes per post by hour of day

Average comments per post by hour of day

Engagement rate

This measures the number of likes and comments per follower by post. I decided to make a histogram for “like” engagement rate and “comment” engagement rate from a dataframe generated using this query:

SELECT instagram_media.id, likes_count as "likes", likes / cast(instagram_user.counts_followed_by as float) as "like engagement rate", comments_count as "comments", comments_count / cast(instagram_user.counts_followed_by as float) as "comments engagement rate" FROM instagram_media, instagram_user

Histogram of like engagement

Histogram of comment engagement

Engagement by hashtag

Then I thought it would be cool to see what the average engagement is for hashtags that get used a lot in this account, so I ran this query:

SELECT value as "hashtag", AVG(likes_count) as "average likes", AVG(comments_count) as "average comments" . FROM instagram_media JOIN instagram_media_tags ON instagram_media.id = instagram_media_tags.instagram_media_id GROUP BY value ORDER BY "average likes" DESC

And visualized the data using Python’s matplotlib:

Average likes per hashtag

Average comments per hashtag

And if you take a look at the two plots, some interesting trends should pop out: first of all, #returnhisbong is a top performer as far as average likes (sure!), but #buttstuff is just absolutely crushing it as far as the average number of comments this hashtag can pull in on a post