Instagram’s official API was gutted in 2018 after the Cambridge Analytica scandal. What was once a rich public API now only gives access to your own content unless you’re a verified business partner. Anyone doing competitor research, influencer analysis, or social listening needs an alternative approach.
What You Can Extract from Instagram
- Profile data — username, followers, following, post count, bio, website, verified status
- Posts — image/video URLs, captions, likes, comments count, hashtags, mentions, location
- Reels — view count, likes, comments, audio used
- Hashtag feed — top and recent posts for any hashtag
- Comments — text, author, timestamp, likes
Instagram’s Defence Mechanisms
Instagram uses Meta’s full ML-powered anti-abuse stack:
- Login required for most data since 2023 — guest access is heavily restricted
- Rate limiting per account and per IP — even logged-in accounts get rate limited
- Account checkpoints — automated behaviour triggers phone verification
- GraphQL signature verification — their internal API requires rotating tokens
Manual Approach — Instaloader (Python)
Instaloader is the most popular open-source tool for Instagram scraping. Here’s how to use it:
Step 1 — Install
pip install instaloader
Step 2 — Scrape a profile
import instaloader

L = instaloader.Instaloader()

# Public profile: no login needed (for now)
profile = instaloader.Profile.from_username(L.context, "natgeo")
print(f"Followers: {profile.followers}")
print(f"Posts: {profile.mediacount}")
print(f"Bio: {profile.biography}")

# Print the URL, like count, and comment count of the most recent post
for post in profile.get_posts():
    print(post.url, post.likes, post.comments)
    break  # just the first one
Step 3 — Handle rate limits
Instagram rate-limits aggressively. Even with login, pulling data on 50+ profiles in one session typically triggers a temporary block. You’ll need:
- Multiple Instagram accounts in rotation
- Random delays between requests (30–120 seconds)
- Residential proxies — each account on a different IP
- Session persistence to avoid re-login fingerprinting
Accounts used for scraping are routinely suspended, so managing a pool of them is time-consuming and fragile.
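The random-delay item from the list above can be sketched as a small generator. This is a minimal illustration, not part of Instaloader: `throttled_fetch` and its parameters are hypothetical names, and the 30–120 second window is simply the range suggested above. Account rotation, proxies, and session persistence are not covered here.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def throttled_fetch(
    usernames: Iterable[str],
    fetch: Callable[[str], T],
    min_delay: float = 30.0,
    max_delay: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[str, T]]:
    """Call fetch(username) for each username, pausing a random
    interval between consecutive calls (hypothetical helper)."""
    for index, username in enumerate(usernames):
        if index > 0:
            # Wait a random number of seconds before every call after the first
            sleep(random.uniform(min_delay, max_delay))
        yield username, fetch(username)
```

With Instaloader, `fetch` could be something like `lambda u: instaloader.Profile.from_username(L.context, u)`. The `sleep` parameter is injectable only so the helper can be exercised without actually waiting.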
Using Scrapios Instead
Scrapios handles the session management, proxy rotation, and rate limiting for you. Here’s the same profile pull:
curl -X POST https://api.scrapios.com/api/v1/ext/jobs \
  -H "X-API-Key: scr_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.instagram.com/natgeo/",
    "catalog_scraper_id": 5,
    "catalog_version_id": 14
  }'
Example response:
{
  "status": "completed",
  "result": {
    "preview_data": [{
      "username": "natgeo",
      "full_name": "National Geographic",
      "followers": 283000000,
      "following": 168,
      "posts_count": 27432,
      "bio": "Experience the world through the eyes of National Geographic photographers.",
      "website": "https://www.nationalgeographic.com",
      "verified": true,
      "is_private": false
    }]
  }
}
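For readers working in Python rather than curl, the same request could be assembled roughly as follows. The endpoint, header names, and payload fields are taken from the curl example above; the helper names `build_job_request` and `first_profile` are hypothetical, and the sketch assumes the response has the shape shown above. Whether a job always comes back already completed, or needs polling, is not covered here.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

API_URL = "https://api.scrapios.com/api/v1/ext/jobs"


def build_job_request(
    api_key: str,
    profile_url: str,
    scraper_id: int = 5,   # values copied from the curl example
    version_id: int = 14,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a profile job."""
    payload = {
        "url": profile_url,
        "catalog_scraper_id": scraper_id,
        "catalog_version_id": version_id,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def first_profile(body: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the first preview_data row, or None if the job is not
    completed or returned no rows (assumes the response shape above)."""
    if body.get("status") != "completed":
        return None
    rows = body.get("result", {}).get("preview_data", [])
    return rows[0] if rows else None


# Sending the request (requires network access and a real key):
#   with urllib.request.urlopen(build_job_request(key, url)) as resp:
#       profile = first_profile(json.load(resp))
```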
Influencer Vetting Checklist
When evaluating an Instagram influencer for a campaign, pull these data points:
- Follower count — baseline reach
- Engagement rate — aim for >3%, computed as (likes + comments) ÷ followers and averaged over recent posts
- Follower growth trend — scrape weekly for 4 weeks
- Recent post frequency — active accounts post 3–7x per week
- Comment quality — generic comments (“Nice!” “🔥🔥”) signal fake engagement
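Two of the checklist metrics reduce to simple arithmetic, sketched below. The function names and sample numbers are illustrative assumptions, not real account data. Engagement is taken as average (likes + comments) per post divided by followers, and growth as the percentage change between the first and last weekly follower snapshot.

```python
from typing import Sequence, Tuple


def engagement_rate(posts: Sequence[Tuple[int, int]], followers: int) -> float:
    """Average (likes + comments) per post as a percentage of followers.
    Each post is a (likes, comments) pair."""
    if not posts or followers <= 0:
        return 0.0
    avg_interactions = sum(likes + comments for likes, comments in posts) / len(posts)
    return 100.0 * avg_interactions / followers


def follower_growth(snapshots: Sequence[int]) -> float:
    """Percentage change from the first to the last follower snapshot."""
    if len(snapshots) < 2 or snapshots[0] <= 0:
        return 0.0
    return 100.0 * (snapshots[-1] - snapshots[0]) / snapshots[0]


# Made-up example: three recent posts on a 40,000-follower account
rate = engagement_rate([(1200, 45), (900, 30), (1500, 75)], followers=40_000)
# Made-up example: five weekly snapshots (start plus four weeks)
growth = follower_growth([10_000, 10_200, 10_450, 10_600, 10_900])
```

Here `rate` works out to 3.125%, just above the 3% target, and `growth` to 9% over the four weeks.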
Vet influencers with real data
500 free credits every month. Pull 50 Instagram profiles per month at zero cost.