My goal in writing this article is to show the basics of using Instagram programmatically. My approach can be applied to data analysis, computer vision, and any other cool project you can think of.

Instagram is the largest photo-sharing social media platform, with around 500 million monthly active users and 95 million photos and videos uploaded every day. That is an enormous amount of data with enormous potential. This article shows how to treat Instagram as a data source rather than just a platform, and how to apply the methods presented here in your own projects.

Introduction to APIs and tools

Instagram has official APIs, but they are outdated and currently offer very limited functionality. So for this article, I’m using the unofficial Instagram API provided by LevPasha. It supports all the key features, such as liking, following, and uploading pictures and videos. It’s written in Python, and this article focuses on the data side.

I recommend Jupyter Notebook with IPython. The plain Python interpreter works too, but it lacks features such as inline image display.

Installation

You can install the library using pip as follows:


     
    python -m pip install -e git+https://github.com/LevPasha/Instagram-API-python.git#egg=InstagramAPI

If ffmpeg is not already installed on your system, you can install it on Linux using the following command:


     
    sudo apt-get install ffmpeg

On Windows, you need to run the following commands in the Python interpreter:


     
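    import imageio

    # Downloads an ffmpeg binary (imageio releases before 2.5; newer releases
    # get ffmpeg from the separate imageio-ffmpeg package instead)
    imageio.plugins.ffmpeg.download()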

Here’s how to log in to Instagram using the API:

    from InstagramAPI import InstagramAPI

    username = "YOURUSERNAME"

    # Note: this rebinds the name InstagramAPI from the class to a client instance
    InstagramAPI = InstagramAPI(username, "YOURPASSWORD")
    InstagramAPI.login()

If the login is successful, you will receive a “successful login” message.

Basic request

With the above preparations in place, we can proceed to implement the first request:


     
    InstagramAPI.getProfileData()   # send the request
    result = InstagramAPI.LastJson  # parsed JSON of the most recent response

Result:

    {u'status': u'ok',
     u'user': {u'biography': u'',
      u'birthday': None,
      u'country_code': 20,
      u'email': u'[email protected]',
      u'external_url': u'',
      u'full_name': u'Nour Galaby',
      u'gender': 1,
      u'has_anonymous_profile_picture': False,
      u'hd_profile_pic_url_info': {u'height': 1080,
       u'url': u'https://instagram.fcai2-1.fna.fbcdn.net/t51.2885-1aaa7448121591_1aa.jpg',
       u'width': 1080},
      u'hd_profile_pic_versions': [{u'height': 320,
        u'url': u'https://instagram.fcai2-1.fna.fbcdn.net/t51.2885-19/s320x320/19aa23237_4337448121591_195310aaa32_a.jpg',
        u'width': 320},
       {u'height': 640,
        u'url': u'https://instagram.fcai2-1.fna.fbcdn.net/t51.2885-19/s640x640/19623237_45581744812153_44_a.jpg',
        u'width': 640}],
      u'is_private': True,
      u'is_verified': False,
      u'national_number': 122,
      u'phone_number': u'+201220',
      u'pk': 22412229,
      u'profile_pic_id': u'1550239680720880455_22',
      u'profile_pic_url': u'https://instagram.fcai2-1.fna.fbcdn.net/t51.2885-19/s150x150/19623237_455817448121591_195310166162_a.jpg',
      u'show_conversion_edit_entry': False,
      u'username': u'nourgalaby'}}

As shown above, the results are presented in JSON format with all requested data.

You can access the result data using ordinary dictionary key lookups.
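As a small illustration of key-value access, the sketch below works on a hand-trimmed copy of the profile response shown earlier rather than a live request. With a real session you would read the same keys from InstagramAPI.LastJson.

```python
# Hand-trimmed copy of the profile response above (illustrative values only).
result = {
    "status": "ok",
    "user": {
        "full_name": "Nour Galaby",
        "username": "nourgalaby",
        "is_private": True,
        "hd_profile_pic_url_info": {"height": 1080, "width": 1080},
    },
}

user = result["user"]                                  # nested dictionary
username = user["username"]                            # plain key lookup
pic_width = user["hd_profile_pic_url_info"]["width"]   # chained lookups
website = user.get("external_url", "")                 # .get() avoids a KeyError if the key is missing

print(username, pic_width, repr(website))
```

Using .get() with a default is a reasonable choice for optional fields, since not every response is guaranteed to contain every key.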

You can also use tools (such as Notepad++) to view JSON data and explore.

Get and view the Instagram timeline

Let’s implement some more useful features. We will request the last post in the timeline and view it in Jupyter Notebook.

The following code gets the timeline:


     
    InstagramAPI.timelineFeed()

As with the previous request, we read the result from the LastJson attribute. Looking at the resulting JSON data, we can see that it includes a list called “items”. Each element in the list holds information about a specific post on the timeline, including the following elements:

  • [text]: the caption text of the post, including hashtags.

  • [likes]: the number of likes on the post.

  • [created_at]: the time the post was created.

  • [comments]: the comments on the post.

  • [image_versions]: links to the actual JPG files, which can be used to display the image in Jupyter Notebook.

Functions

The functions get_images_from_list() and get_url() loop over the list of posts, look up the image URL in each post, and append it to an initially empty list.
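A minimal sketch of what such helpers could look like, named get_url and get_images_from_list to match the calls used later in this article. The nested layout used here (`image_versions2` → `candidates` → `url`) is an assumption about the post JSON, and the sample posts are made up, so inspect your own response and adjust the keys.

```python
def get_url(post):
    """Return the first image URL of a post, or None if it has none.

    Assumes (hypothetically) that image posts look like
    {"image_versions2": {"candidates": [{"url": ...}, ...]}}.
    """
    candidates = post.get("image_versions2", {}).get("candidates", [])
    return candidates[0]["url"] if candidates else None


def get_images_from_list(posts):
    """Collect one image URL per post, skipping posts without an image."""
    image_urls = []
    for post in posts:
        url = get_url(post)
        if url is not None:
            image_urls.append(url)
    return image_urls


# Made-up posts standing in for the items of a timeline response.
sample_posts = [
    {"image_versions2": {"candidates": [{"url": "https://example.com/a.jpg"}]}},
    {"caption": "a post without an image entry"},
    {"image_versions2": {"candidates": [{"url": "https://example.com/b.jpg"}]}},
]
image_urls = get_images_from_list(sample_posts)
print(image_urls)
```

Returning None from get_url instead of raising keeps the loop simple when a feed mixes images with other kinds of entries.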

Once the functions have run, we have a list of image URLs, one per post.

We can use the IPython.display module to view the images.
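One way this could be done is sketched below. The function name display_images_from_url matches the call used later in this article, but the body is an assumption, and the width parameter is added here for illustration only.

```python
def display_images_from_url(urls, width=300):
    """Render each image URL inline in a Jupyter/IPython notebook."""
    # Imported inside the function so this file still loads where IPython is absent.
    from IPython.display import Image, display

    for url in urls:
        # Image(url=...) makes the notebook load the picture from the URL
        display(Image(url=url, width=width))
```

In a notebook cell you would then call display_images_from_url(image_urls) with the list of URLs collected from the posts.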

Viewing images inside the notebook is a very useful feature, and we’ll reuse these functions later to look at other results.

Get the most popular posts

Now that we know how to make basic requests, let’s try something more complex: finding our most popular posts. To do this, we first need to get all the posts of the currently logged-in user and then sort them by number of likes.

Get all posts from the user

To get all the posts, we will page through the results using the next_max_id and more_available values.

    import time

    myposts = []
    has_more_posts = True
    max_id = ""

    while has_more_posts:
        InstagramAPI.getSelfUserFeed(maxid=max_id)
        if InstagramAPI.LastJson['more_available'] is not True:
            has_more_posts = False  # stop condition: this is the last page
            print("stopped")

        max_id = InstagramAPI.LastJson.get('next_max_id', '')  # cursor for the next page
        myposts.extend(InstagramAPI.LastJson['items'])  # merge lists
        time.sleep(2)  # slow the script down to avoid flooding the servers

    print(len(myposts))
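The paging pattern above can be tried without a network connection. The sketch below is only an illustration: fetch_all and the fake PAGES data are hypothetical names invented here, and the page layout (items, more_available, next_max_id) simply mirrors the keys used in the loop above.

```python
def fetch_all(fetch_page):
    """Collect items from every page of a cursor-paginated source.

    fetch_page(cursor) is a hypothetical callable returning a dict with
    'items', 'more_available' and (optionally) 'next_max_id'.
    """
    items, cursor = [], ""
    while True:
        page = fetch_page(cursor)
        items.extend(page["items"])
        cursor = page.get("next_max_id", "")
        # Stop when the source reports no more pages or gives no cursor
        if not page.get("more_available") or not cursor:
            return items


# Fake three-page feed standing in for the real API.
PAGES = {
    "": {"items": [1, 2], "more_available": True, "next_max_id": "p2"},
    "p2": {"items": [3, 4], "more_available": True, "next_max_id": "p3"},
    "p3": {"items": [5], "more_available": False},
}
all_items = fetch_all(lambda cursor: PAGES[cursor])
print(all_items)
```

Checking both the more_available flag and the cursor guards against an endless loop if a response ever reports more pages without supplying a next cursor.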

Save and load data to disk

Since the above request can take a long time to complete, we don’t want to run it more often than necessary, so it’s good practice to save the result and load it again when work continues. To do this, we’ll use pickle, which serializes most Python objects to a file and loads them back later. Here is a working example:

Save:

    import pickle

    filename = username + "_posts"
    with open(filename, "wb") as f:  # pickle needs binary mode
        pickle.dump(myposts, f)

Load:

    import pickle

    filename = "nourgalaby_posts"
    with open(filename, "rb") as f:  # read back in binary mode
        myposts = pickle.load(f)

Sort by likes

Now we have a list of dictionaries named “myposts”. To sort it by a key in each dictionary, we can use a lambda expression as follows:

    myposts_sorted = sorted(myposts, key=lambda k: k['like_count'], reverse=True)

    top_posts = myposts_sorted[:10]      # ten most-liked posts
    bottom_posts = myposts_sorted[-10:]  # ten least-liked posts

The same helper functions as before can display the top posts:

    image_urls = get_images_from_list(top_posts)
    display_images_from_url(image_urls)

Filter images

We may want to filter our posts. For example, some posts may be videos when we only want image posts. We can filter like this:

    # media_type 1 is a photo, 2 is a video
    myposts_photos = [p for p in myposts if p['media_type'] == 1]
    myposts_vids = [p for p in myposts if p['media_type'] == 2]
    print(len(myposts))
    print(len(myposts_photos))
    print(len(myposts_vids))

Of course, you can filter any variable in the results, so be creative!

Notifications

The getRecentActivity() request returns your recent notifications:

    InstagramAPI.getRecentActivity()
    get_recent_activity_response = InstagramAPI.LastJson
    for notification in get_recent_activity_response['old_stories']:
        print(notification['args']['text'])

The output might look like this:

    userohamed3 liked your post.
    userhacker32 liked your post.
    user22 liked your post.
    userz77 liked your post.
    userwww77 started following you.
    user2222 liked your post.
    user23553 liked your post.

Notifications from specific users only

Now we can play around with the notifications. For example, I can list only the notifications that come from a particular user:

    username = "diana"
    for notification in get_recent_activity_response['old_stories']:
        text = notification['args']['text']
        if username in text:
            print(text)

Let’s try something more interesting: finding the time of day when you get the most likes. To do this, we’ll create a chart of the number of likes received against the hour of the day.

The following code plots the number of notifications for each hour of the day:

    import pandas as pd

    # dates: a list of datetime objects, one per notification
    df = pd.DataFrame({"date": dates})
    df.groupby(df["date"].dt.hour).count().plot(kind="bar", title="Hour")
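The plotting snippet relies on a list named dates that is not defined in the code shown here. One way it could be built is sketched below, under the assumption that each notification carries a Unix timestamp in args['timestamp']; that field name and the sample notifications are illustrative, not confirmed by the text above. The hourly counts are tallied with the standard library so the result can be checked without pandas.

```python
from collections import Counter
from datetime import datetime, timezone

# Made-up notifications; the 'timestamp' field (Unix seconds) is an assumption.
sample_stories = [
    {"args": {"text": "user22 liked your post.", "timestamp": 1500000000}},
    {"args": {"text": "userz77 liked your post.", "timestamp": 1500003600}},
    {"args": {"text": "user2222 liked your post.", "timestamp": 1500001000}},
]

# UTC keeps the example reproducible; use your local zone for real analysis.
dates = [
    datetime.fromtimestamp(story["args"]["timestamp"], tz=timezone.utc)
    for story in sample_stories
]

# Count notifications per hour of the day (0-23)
likes_per_hour = Counter(d.hour for d in dates)
print(sorted(likes_per_hour.items()))
```

A list built this way can be passed straight to pd.DataFrame({"date": dates}), after which the .dt.hour accessor groups the rows by hour.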

As you can see in this example, I get the most likes between 6pm and 10pm. If you know anything about social media, you know that this is the peak usage time, when most businesses post to get the most engagement.

Get the lists of followers and following

Now I’m going to get the lists of followers and following and run some operations on them.

To use getUserFollowings and getUserFollowers, you first need the user_id of the account. One way to get it is from the “pk” field of the profile data requested earlier.
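As a sketch of one possible way to obtain the ID, the snippet below reads it from a trimmed copy of the profile response shown earlier, assuming that the pk field is the account’s numeric ID. With a live session you would read the same field from InstagramAPI.LastJson after calling getProfileData().

```python
# Trimmed profile response (illustrative); 'pk' is assumed to be the numeric account ID.
profile = {"status": "ok", "user": {"pk": 22412229, "username": "nourgalaby"}}

user_id = profile["user"]["pk"]
print(user_id)
```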

Now you can call the functions as follows. Note that if you have a very large number of followers, you need to make multiple requests (more on that below). Here we request the list of accounts we follow and the list of our followers. The JSON result contains a list of users with information about each one.


     
    InstagramAPI.getUserFollowings(user_id)
    print(len(InstagramAPI.LastJson['users']))
    following_list = InstagramAPI.LastJson['users']

    InstagramAPI.getUserFollowers(user_id)
    print(len(InstagramAPI.LastJson['users']))
    followers_list = InstagramAPI.LastJson['users']

If the number of followers is large, the results may not be a complete list.

Get all the followers

Getting the complete list of followers is similar to getting all posts. We make a request and keep iterating for as long as the result contains a next_max_id key.

Thanks to Francesc Garcia for his support.


     
    import time

    followers = []
    next_max_id = ''  # an empty cursor requests the first page
    while True:
        InstagramAPI.getUserFollowers(user_id, maxid=next_max_id)
        followers.extend(InstagramAPI.LastJson.get('users', []))
        next_max_id = InstagramAPI.LastJson.get('next_max_id', '')
        if not next_max_id:  # no cursor means this was the last page
            break
        time.sleep(1)  # pause between requests

    followers_list = followers

The same could be done for the following list, but I don’t do that here because, in my case, one request is enough to get everyone I follow.

We now have the lists of all followers and following in JSON format. I’ll convert them to a more convenient data type, the set, to make a series of operations on the data easier.

I just take the “username” key and use set() on it.

    user_list = [user['username'] for user in following_list]
    following_set = set(user_list)
    print(len(following_set))

    user_list = [user['username'] for user in followers_list]
    followers_set = set(user_list)
    print(len(followers_set))

Here I’m taking the set of all the usernames. The same can be done with “full_name”, which gives more readable results. But the results may not be unique, as some users may not provide full names.

So now we have two sets, and we can apply ordinary set operations to them, such as intersections and differences.
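A few set operations that may be useful here are sketched below, with made-up usernames standing in for the real following_set and followers_set. The result names (mutual, not_following_back, fans_only) are illustrative.

```python
# Made-up usernames standing in for following_set and followers_set.
following_set = {"alice", "bob", "carol", "dave"}
followers_set = {"bob", "carol", "erin"}

mutual = following_set & followers_set              # you follow each other
not_following_back = following_set - followers_set  # you follow them, they don't follow you
fans_only = followers_set - following_set           # they follow you, you don't follow them

print(len(mutual), sorted(not_following_back), sorted(fans_only))
```

Sets make these comparisons a single operator each, which is the main reason for converting the JSON lists before doing this kind of analysis.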

These operations give some statistics about your followers. You can do much more, such as saving a list of your followers and comparing it with a later one to see who has unfollowed you.
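Comparing follower lists over time could be sketched as follows. The helper names save_snapshot and load_snapshot, the JSON file format, and the sample usernames are all assumptions made for this illustration.

```python
import json
import os
import tempfile


def save_snapshot(usernames, path):
    """Write a set of usernames to a JSON file (sorted for stable output)."""
    with open(path, "w") as f:
        json.dump(sorted(usernames), f)


def load_snapshot(path):
    """Read a snapshot written by save_snapshot back into a set."""
    with open(path) as f:
        return set(json.load(f))


# Made-up data: an earlier follower list on disk, the current one in memory.
path = os.path.join(tempfile.mkdtemp(), "followers_snapshot.json")
save_snapshot({"bob", "carol", "erin"}, path)
todays_followers = {"bob", "erin", "frank"}

previous = load_snapshot(path)
lost = previous - todays_followers      # unfollowed since the snapshot
gained = todays_followers - previous    # new followers since the snapshot
print(sorted(lost), sorted(gained))
```

JSON is used here instead of pickle because a list of usernames is plain text and stays readable in any editor.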

Above we showed what you can do with Instagram data. I hope you’ve learned how to use the Instagram API and have a basic idea of what you can do with it. Keep an eye on the official APIs, which are still under development and may support more uses in the future. If you have any questions or suggestions, please feel free to contact me.
