looks like X/twitter(?) broke something again #983
Comments
broke again zedeus/nitter#983
|
Also, the syndication api for |
|
Yes, it is not working now. I hope the Nitter people fix this soon. |
|
Is there an online/CLI tool converting |
Not really. showReplies=false shows years-old content when not logged in. |
|
Down again... |
That's because in that specific example, those tweets were years ago. Look again at the like count, notice anything? |
We can just search for the first |
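Indeed:
#!/usr/bin/python3
import re
import urllib.request

url = "https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk"
with urllib.request.urlopen(url) as response:
    # Fall back to UTF-8 if the server does not declare a charset.
    encoding = response.info().get_param('charset', 'utf8')
    html = response.read().decode(encoding)
# The timeline data is embedded as JSON inside the __NEXT_DATA__ script tag.
match = re.search(r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>', html, re.DOTALL)
if match is None:
    raise SystemExit("__NEXT_DATA__ script tag not found")
print(match[1]) |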
Interesting, but this doesn't return RSS with 'item', 'pubDate' etc. tags. Maybe a script using https://github.com/lkiesow/python-feedgen would do the job? |
Not sure I understand? It exposes far more information than needed, and it does expose the date and everything else. Here's an example for one tweet only:
{
"type": "tweet",
"entry_id": "tweet-1519480761749016577",
"sort_index": "1691455400412446720",
"content": {
"tweet": {
"id": 0,
"location": "",
"conversation_id_str": "1519480761749016577",
"created_at": "Thu Apr 28 00:56:58 +0000 2022",
"display_text_range": [
0,
52
],
"entities": {
"user_mentions": [],
"urls": [],
"hashtags": [],
"symbols": [],
"media": []
},
"favorite_count": 4600599,
"favorited": false,
"full_text": "Next I’m buying Coca-Cola to put the cocaine back in",
"id_str": "1519480761749016577",
"lang": "en",
"permalink": "/elonmusk/status/1519480761749016577",
"possibly_sensitive": false,
"quote_count": 171975,
"reply_count": 187438,
"retweet_count": 649833,
"retweeted": false,
"text": "Next I’m buying Coca-Cola to put the cocaine back in",
"user": {
"blocking": false,
"created_at": "Tue Jun 02 20:12:29 +0000 2009",
"default_profile": false,
"default_profile_image": false,
"description": "Blades of Glory",
"entities": {
"description": {
"urls": []
},
"url": {}
},
"fast_followers_count": 0,
"favourites_count": 30569,
"follow_request_sent": false,
"followed_by": false,
"followers_count": 153112066,
"following": false,
"friends_count": 410,
"has_custom_timelines": false,
"highlightedLabel": {
"badge": {
"url": "https://pbs.twimg.com/profile_images/1683899100922511378/5lY42eHs_bigger.jpg"
},
"description": "X",
"userLabelType": "BusinessLabel",
"userLabelDisplayType": "Badge"
},
"id": 0,
"id_str": "44196397",
"is_translator": false,
"listed_count": 126597,
"location": "𝕏Ð",
"media_count": 1659,
"name": "Elon Musk",
"normal_followers_count": 153112066,
"notifications": false,
"profile_banner_url": "https://pbs.twimg.com/profile_banners/44196397/1690621312",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/1683325380441128960/yRsRRjGO_normal.jpg",
"protected": false,
"screen_name": "elonmusk",
"show_all_inline_media": false,
"statuses_count": 29441,
"time_zone": "",
"translator_type": "none",
"url": "",
"utc_offset": 0,
"verified": false,
"withheld_in_countries": [],
"withheld_scope": "",
"is_blue_verified": true
}
}
}
},
EDIT: Maybe you meant a solution directly usable by an end user. It is not, of course; the snippet needs to be adapted by a dev. |
Ok, thank you, I'll try this. |
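For anyone who wants to adapt it, here is a rough sketch of turning entries shaped like the example above into RSS with 'item' and 'pubDate' tags. It only uses the standard library instead of python-feedgen. The function name `entries_to_rss` is made up for illustration, and it assumes you have already pulled the list of entries out of the parsed `__NEXT_DATA__` JSON; where exactly that list sits in the JSON is not shown here.

```python
import datetime
import email.utils
import xml.etree.ElementTree as ET

# Format of the "created_at" field, e.g. "Thu Apr 28 00:56:58 +0000 2022".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def entries_to_rss(entries, title, link):
    """Build an RSS 2.0 string from entries shaped like the example tweet.

    Hypothetical helper: `entries` is assumed to be a list of dicts with
    "type" and "content" -> "tweet" keys, as in the JSON posted above.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for entry in entries:
        if entry.get("type") != "tweet":
            continue  # skip anything that is not a tweet entry
        tweet = entry["content"]["tweet"]
        created = datetime.datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = tweet["full_text"]
        ET.SubElement(item, "link").text = "https://twitter.com" + tweet["permalink"]
        ET.SubElement(item, "guid", isPermaLink="false").text = tweet["id_str"]
        # RSS wants RFC 822 style dates; email.utils produces them.
        ET.SubElement(item, "pubDate").text = email.utils.format_datetime(created)
    return ET.tostring(rss, encoding="unicode")
```

Call it as `entries_to_rss(entries, "elonmusk", "https://twitter.com/elonmusk")` and write the returned string to a file your feed reader can fetch. It does nothing about the ordering of the tweets the endpoint returns.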
Calling the syndication URL without being logged in to Twitter doesn't retrieve the most recent tweets. If I call this URL in Postman, I retrieve 100 tweets from 10/19/2018 to 07/31/2023; no tweets from August... |
It retrieves the tweets with the highest like count from that user, which doesn't sound good if your goal is retrieving the most recent tweets, as there's no guarantee new tweets will make it into that user's top 100. And even if they did, it might take a considerable amount of time. |
|
I've noticed that for smaller accounts that have fewer than 100 tweets, that syndication URL does not load any tweets. |
No. That is the case for all big accounts. I am interested in the most recent Tweets and this approach will lead to nothing. |
I'm using this for my bot and it's working fine with cookies and headers. |
Yes. And there is at least one tweet from August with more likes (>807K) than some older tweets which are included (e.g. <680K). |
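A quick way to check claims like these against whatever the endpoint returns to you is to summarize the date range and the ordering. This is only a sketch: `summarize_timeline` is a made-up helper, and it assumes a non-empty list of tweet dicts carrying the `created_at` and `favorite_count` fields seen in the example JSON earlier in the thread.

```python
import datetime

# Format of the "created_at" field, e.g. "Thu Apr 28 00:56:58 +0000 2022".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize_timeline(tweets):
    """Report the date range, the lowest like count, and the apparent ordering.

    Hypothetical helper; `tweets` is a non-empty list of dicts with
    "created_at" and "favorite_count" keys, in the order they were returned.
    """
    dates = [datetime.datetime.strptime(t["created_at"], CREATED_AT_FORMAT) for t in tweets]
    likes = [t["favorite_count"] for t in tweets]
    return {
        "oldest": min(dates),
        "newest": max(dates),
        "min_likes": min(likes),
        # True if the list is in descending order of likes / of date.
        "sorted_by_likes": likes == sorted(likes, reverse=True),
        "sorted_by_date": dates == sorted(dates, reverse=True),
    }
```

If `newest` stays stuck in the past, or a known recent tweet with more likes than `min_likes` is missing, that tells you more about the selection than guessing from one profile.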
|
is there any forecast for solving this problem? |
|
Looks like https://nitter.privacydev.net/ is working |
|
That one is a fork which uses account credentials. See #830 |
|
I am aware, but couldn't Nitter implement a system like the one Aurora uses, with lots of accounts that rotate per user?
That's hard to maintain and simple for twitter to ban by just filtering "if number of accounts per IP > SOME_CONSTANT: ban all of them" |
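To spell out what that one-liner means, here is a sketch of such a filter. Nothing here is known about how Twitter actually does it; `accounts_to_ban`, the threshold value, and the shape of the log are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical threshold; the real value (if any) is unknown.
SOME_CONSTANT = 3


def accounts_to_ban(requests_seen, max_accounts_per_ip=SOME_CONSTANT):
    """Return every account seen on an IP that hosts too many accounts.

    `requests_seen` is an iterable of (ip, account) pairs, e.g. from a log.
    """
    accounts_by_ip = defaultdict(set)
    for ip, account in requests_seen:
        accounts_by_ip[ip].add(account)
    banned = set()
    for accounts in accounts_by_ip.values():
        if len(accounts) > max_accounts_per_ip:
            banned |= accounts  # ban all of them, not just the extras
    return banned
```

A public instance funnels all of its accounts through one or two IPs, so it would trip a check like this long before an ordinary household does.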
User feeds not working on this |
|
I switched to the PrivacyDevel fork with credentials in, but it's still 404ing on the same endpoint that upstream is having problems with
|
Strange. privacydev (without credentials) works more or less for @ElonMusk, but not for other users like for instance @BarackObama. |
(Apologies all for the slight segue here.) My bad. I assumed your Nitter instance was dual stack (I've edited my previous comment to correct for that). I'm still not sure why you would restrict/block access to your instance from a block of IPv4 addresses from a major US ISP. I personally think it's a bit brute force for ISPs, but that's just me. Then again it seems like your services attract quite a bit of hostile traffic (#983 (comment)).... |
|
Pardon my inexperience, but considering the abundance of throwaway bot accounts on Twitter, could a Nitter instance without guest account tokens use a communal bot account to browse Twitter? It would confuse the trackers to see traffic from a bunch of different users pouring into one account. |
|
The problem is that, for whatever reason, Twitter is spending all its moderation efforts solely on stopping people from scraping their website. Those bots stay up for quite a while, but anyone that reads posts faster than humanly possible is banned. |
|
Why not rate-limit the Nitter server, or better yet, use a bunch of throwaway accounts, each browsing at an undetectable speed? |
I'm sure that is possible.... though the end-user experience would probably be quite poor for a public instance with the traffic volume of say @animegrafmays' instance. That and you're just playing a game of cat-and-mouse with Twitter/Xitter. It's not a long-term sustainable strategy. Private instances seem to be the future. One person mentioned a guide to deploy an instance to a Synology NAS: #983 (comment), this PR #830, and this Nitter fork's pre_guest_accounts (https://github.com/PrivacyDevel/nitter/tree/pre_guest_accounts) branch. |
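To put a number on why the experience would be poor, here is a sketch of a pool where each account is held to a minimum interval between requests. It is not how Nitter works; `ThrottledPool` and its parameters are hypothetical, and the interval that would count as "undetectable" is unknown.

```python
class ThrottledPool:
    """Hand out accounts so that each one is used at most once per interval."""

    def __init__(self, accounts, min_interval):
        self.min_interval = min_interval  # seconds between uses of one account
        # Earliest time each account may be used again; 0.0 means free now.
        self.next_free = {account: 0.0 for account in accounts}

    def acquire(self, now):
        """Return (account, wait_seconds) for the account that frees up first."""
        account = min(self.next_free, key=self.next_free.get)
        wait = max(0.0, self.next_free[account] - now)
        # Reserve the account for the moment the caller will actually use it.
        self.next_free[account] = now + wait + self.min_interval
        return account, wait

    def max_requests_per_second(self):
        """Upper bound on sustained throughput for the whole pool."""
        return len(self.next_free) / self.min_interval
```

With, say, 50 accounts and one request per account every 30 seconds, the whole instance tops out under 2 requests per second, and every visitor beyond that just waits.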
|
A private instance may not be useful if the point is to avoid having a Twitter account. I will simply stop trying to read Twitter timelines at all if I cannot get them in chronological order without an account. Elon Musk can clean my toilet. |
|
Twstalker still works, amazingly. Is it because very few people use it, and it gets under Musk's radar? |
|
Has anyone tried and had any success in automating user account creation? Seems like that could help public instances survive but the captcha seems like a big issue... |
Not the best place to say it but thank you for keeping nitter.poast.org running, seems to be the last Nitter server now and we desperately need it (I run @splatoon@wetdry.world on fedi). Still works really well, I'm not sure how we would be continuing on without your service (I assume bird.makeup isn't doing that well lol). |
he was still scraping our instance up until a couple days ago. I asked him publicly why he was still doing it and it seems to have stopped
if you reach out to me on there I will give you access to a private ip-whitelisted instance I run just for bots, that way you're not limited by the current measures I've put in place on the public one |
|
I am still using nitter.net, with Firefox set never to override user choices in overriding the certificate error. Since I am not logging into anything and those are public posts I am reading, an MITM attack isn't worth much. |
|
Can anyone with a private account actually watch videos on nitter? If so, what's the exact commit you are running to perform that black magic? |
|
Videos should work again with the latest guest_accounts commit. |
I tested earlier on nitter.poast.org with a video on NintendoUK which had the latest Splatoon song lol |
Not sure exactly how you're doing that when nitter.net has an expired cert + HTTP Strict Transport Security (HSTS) enabled.....😕 |
he explained how in his post. im starting to question your username |
You do understand that your GitHub issue has roughly 683 comments, right? I probably did read @lukefromdc's comment at some point. Him or someone else is using another tool to scrape those instances.
Let's not go down that road....and stay somewhat civil? |
sure, you are replying to a public issue i created and every time you do that i get an email. in an effort to try to aid the issue i raised and to prevent getting multiple emails about the same issue or comment i try to be as descriptive as possible. getting an email at 3am from you that says "Let's not go down that road....and stay somewhat civil?" |
|
At this point, seeing how long this thread has lasted, I expect I will continue to be getting emails about this thread until the cows come home. 😆 |
|
@animegrafmays et al.: hope that helps. |
I would not even know HOW to set up a scraper. Rather I use nitter instances to manually monitor remaining Twitter timelines from my community. This is because of Twitter's refusal to show chronological timelines (the only kind useful to me) without an account. |
|
Again: to get past the certificate issue in Firefox (do NOT log in to anything showing a certificate error!): |
|
in my experience over the last week it appears twitter limits based on account+ip. using the same list of guest accounts on two dedicated servers, the limited accounts are always different. right now there's 11 accounts limited on one server and 6 on the other. i wonder if we can use this to our advantage, perhaps a shared pool of accounts would actually work |
|
Easiest way I can think of to get a ton of different IPs to operate under is Tor, but it's probably not a wise move to use that because it is easy to block outright |
|
There are tons of websites that list public proxies in residential areas. 90% don't work, but if you can sift through them they're very useful for scraping. |
|
@animegrafmays can i ask why your instance is shown as not healthy at https://status.d420.de/ ? |
sure. being listed on that page is a death sentence. the warning on the top of the page begging people to not scrape instances is ignored. being scraped is bad. so the status page is blocked so we don't show up. i still also aggressively monitor logs for scraping attempts and have blocked entire AS ranges from major providers (google, aws, digitalocean) |
|
Twitter recently broke videos on Nitter. Can somebody please fix them? |
try this: #1178 |