looks like X/twitter(?) broke something again #983

Open
animegrafmays opened this issue Aug 15, 2023 · 632 comments

Comments

@animegrafmays

8493b396fd05f26fe681a6abe9a849dc983d091fd4472a53dc3aa72547b030c4

BANKA2017 added a commit to BANKA2017/twitter-monitor-assets that referenced this issue Aug 15, 2023

broke again

zedeus/nitter#983
@ghost

ghost commented Aug 15, 2023

Also, the syndication api for showReplies=true does not work anymore:
https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=true
but showReplies=false still works, showing the tweets ordered by like count...
https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=false

@beingnajib

Yes, it is not working now. I hope the Nitter people fix this soon.

@paulamei

paulamei commented Aug 15, 2023

Is there an online/CLI tool that converts
https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk
to an RSS feed? Then we could each download the HTML from a logged-in profile and do the conversion in a second step.

@nerra0pos

Also, the syndication api for showReplies=true does not work anymore: https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=true but showReplies=false still works, showing the tweets ordered by like count... https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=false

Not really. showReplies=false shows years-old content when not logged in.

@iceFbr

iceFbr commented Aug 15, 2023

Down again...

@Dheatly23

Also, the syndication api for showReplies=true does not work anymore: https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=true but showReplies=false still works, showing the tweets ordered by like count... https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=false

Not really. showReplies=false shows years-old content when not logged in.

That's because in that specific example, those tweets are from years ago. Look again at the like count, notice anything?

@ghost

ghost commented Aug 15, 2023

Is there an online/CLI tool that converts https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk to an RSS feed? Then we could each download the HTML from a logged-in profile and do the conversion in a second step.

We can just search for the first { from the beginning and the last } from the end, then parse what lies between as JSON.
If a user doesn't have that many tweets (500-1000), chances are good that newer tweets are also among the 100 most popular.
Of course it's not a very good solution, but Nitter should at least have it as a fallback for when nothing else works.
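The first-brace/last-brace idea above could be sketched roughly like this. It assumes the page contains exactly one JSON object and no stray braces before or after it (inline CSS or other scripts would break it); `extract_embedded_json` and `sample_page` are made-up names for illustration, not anything from Nitter.

```python
import json

def extract_embedded_json(html: str) -> dict:
    """Parse the JSON object spanning the first '{' to the last '}' in html."""
    start = html.find("{")
    end = html.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in page")
    return json.loads(html[start:end + 1])

# Hypothetical stand-in for a downloaded syndication page.
sample_page = (
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"entries": []}}'
    "</script>"
)
data = extract_embedded_json(sample_page)
print(data["props"])
```

Matching on the script tag itself is probably more robust than counting braces, but this is the simplest version of the idea.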

@Write

Write commented Aug 15, 2023

Is there an online/CLI tool that converts https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk to an RSS feed? Then we could each download the HTML from a logged-in profile and do the conversion in a second step.

We can just search for the first { from the beginning and the last } from the end, then parse what lies between as JSON. If a user doesn't have that many tweets (500-1000), chances are good that newer tweets are also among the 100 most popular. Of course it's not a very good solution, but Nitter should at least have it as a fallback for when nothing else works.

Indeed

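#!/usr/bin/python3

import re
import urllib.request  # "import urllib" alone does not load the request submodule

url = "https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk"

# Fetch the syndication page and decode it with the charset the server declares.
with urllib.request.urlopen(url) as response:
    encoding = response.info().get_param('charset', 'utf8')
    html = response.read().decode(encoding)

# The timeline data is embedded as JSON inside the __NEXT_DATA__ script tag.
match = re.search(
    r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>',
    html,
    re.DOTALL,
)
if match is None:
    raise SystemExit("__NEXT_DATA__ script tag not found")

print(match.group(1))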

@paulamei

paulamei commented Aug 15, 2023

Indeed

Interesting, but this doesn't return RSS with 'item', 'pubDate' etc. tags. Maybe a script using https://github.com/lkiesow/python-feedgen would do the job?

@Write

Write commented Aug 15, 2023

Indeed

Interesting, but this doesn't return RSS with 'item', 'pubDate' etc. tags. Maybe a script using https://github.com/lkiesow/python-feedgen would do the job?

Not sure I understand? It exposes far more information than needed, including the date.

Here's an example for one tweet only :

 {
            "type": "tweet",
            "entry_id": "tweet-1519480761749016577",
            "sort_index": "1691455400412446720",
            "content": {
              "tweet": {
                "id": 0,
                "location": "",
                "conversation_id_str": "1519480761749016577",
                "created_at": "Thu Apr 28 00:56:58 +0000 2022",
                "display_text_range": [
                  0,
                  52
                ],
                "entities": {
                  "user_mentions": [],
                  "urls": [],
                  "hashtags": [],
                  "symbols": [],
                  "media": []
                },
                "favorite_count": 4600599,
                "favorited": false,
                "full_text": "Next I’m buying Coca-Cola to put the cocaine back in",
                "id_str": "1519480761749016577",
                "lang": "en",
                "permalink": "/elonmusk/status/1519480761749016577",
                "possibly_sensitive": false,
                "quote_count": 171975,
                "reply_count": 187438,
                "retweet_count": 649833,
                "retweeted": false,
                "text": "Next I’m buying Coca-Cola to put the cocaine back in",
                "user": {
                  "blocking": false,
                  "created_at": "Tue Jun 02 20:12:29 +0000 2009",
                  "default_profile": false,
                  "default_profile_image": false,
                  "description": "Blades of Glory",
                  "entities": {
                    "description": {
                      "urls": []
                    },
                    "url": {}
                  },
                  "fast_followers_count": 0,
                  "favourites_count": 30569,
                  "follow_request_sent": false,
                  "followed_by": false,
                  "followers_count": 153112066,
                  "following": false,
                  "friends_count": 410,
                  "has_custom_timelines": false,
                  "highlightedLabel": {
                    "badge": {
                      "url": "https://pbs.twimg.com/profile_images/1683899100922511378/5lY42eHs_bigger.jpg"
                    },
                    "description": "X",
                    "userLabelType": "BusinessLabel",
                    "userLabelDisplayType": "Badge"
                  },
                  "id": 0,
                  "id_str": "44196397",
                  "is_translator": false,
                  "listed_count": 126597,
                  "location": "𝕏Ð",
                  "media_count": 1659,
                  "name": "Elon Musk",
                  "normal_followers_count": 153112066,
                  "notifications": false,
                  "profile_banner_url": "https://pbs.twimg.com/profile_banners/44196397/1690621312",
                  "profile_image_url_https": "https://pbs.twimg.com/profile_images/1683325380441128960/yRsRRjGO_normal.jpg",
                  "protected": false,
                  "screen_name": "elonmusk",
                  "show_all_inline_media": false,
                  "statuses_count": 29441,
                  "time_zone": "",
                  "translator_type": "none",
                  "url": "",
                  "utc_offset": 0,
                  "verified": false,
                  "withheld_in_countries": [],
                  "withheld_scope": "",
                  "is_blue_verified": true
                }
              }
            }
          },

EDIT: Maybe you meant a solution an end user could use directly. Of course it isn't one; the snippet needs to be adapted by a dev.
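For the RSS side, here is a minimal sketch of how one tweet object shaped like the example above might be mapped to an RSS `item` with a `pubDate`, using only the standard library. The `https://twitter.com` link prefix, the choice of `full_text` as the title, and the `tweet_to_rss_item` name are assumptions for illustration; a real feed would also need the surrounding `rss`/`channel` elements.

```python
from datetime import datetime
from email.utils import format_datetime
import xml.etree.ElementTree as ET

# Timestamp layout seen in the example, e.g. "Thu Apr 28 00:56:58 +0000 2022".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def tweet_to_rss_item(tweet: dict) -> ET.Element:
    """Build an RSS <item> element from one tweet dict."""
    created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
    item = ET.Element("item")
    ET.SubElement(item, "title").text = tweet["full_text"]
    # Assumed base URL; the JSON only carries a relative permalink.
    ET.SubElement(item, "link").text = "https://twitter.com" + tweet["permalink"]
    ET.SubElement(item, "guid", isPermaLink="false").text = tweet["id_str"]
    # RSS expects RFC 822 style dates, which format_datetime produces.
    ET.SubElement(item, "pubDate").text = format_datetime(created)
    return item

# Fields copied from the example tweet above.
sample_tweet = {
    "created_at": "Thu Apr 28 00:56:58 +0000 2022",
    "full_text": "Next I’m buying Coca-Cola to put the cocaine back in",
    "id_str": "1519480761749016577",
    "permalink": "/elonmusk/status/1519480761749016577",
}
item = tweet_to_rss_item(sample_tweet)
print(ET.tostring(item, encoding="unicode"))
```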

@paulamei

paulamei commented Aug 15, 2023

Indeed

Interesting, but this doesn't return RSS with 'item', 'pubDate' etc. tags. Maybe a script using https://github.com/lkiesow/python-feedgen would do the job?

Not sure I understand? It exposes far more information than needed, including the date.

Ok, thank you, I'll try this.

@jcmag

jcmag commented Aug 15, 2023

Is there an online/CLI tool that converts https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk to an RSS feed? Then we could each download the HTML from a logged-in profile and do the conversion in a second step.

Calling the syndication URL without being logged in to Twitter doesn't retrieve the most recent tweets. If I call this URL in Postman, I get 100 tweets from 10/19/2018 to 07/31/2023; no tweets from August...

@null-routed

Is there an online/CLI tool that converts https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk to an RSS feed? Then we could each download the HTML from a logged-in profile and do the conversion in a second step.

Calling the syndication URL without being logged in to Twitter doesn't retrieve the most recent tweets. If I call this URL in Postman, I get 100 tweets from 10/19/2018 to 07/31/2023; no tweets from August...

It retrieves the tweets with the highest like count from that user, which doesn't sound good if your goal is retrieving the most recent tweets, as there's no guarantee new tweets will make it into that user's top 100. And even if they did, it might take a considerable amount of time.
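To see how stale a popularity-ordered result is, one could re-sort the returned tweets by `created_at` and look at the newest one. A rough sketch, assuming each tweet dict carries `created_at` in the format shown in the JSON example earlier in the thread; the three entries below are hypothetical data (only the first date and like count come from that example).

```python
from datetime import datetime

CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def newest_first(tweets):
    """Return tweets sorted by their created_at timestamp, newest first."""
    return sorted(
        tweets,
        key=lambda tweet: datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT),
        reverse=True,
    )

# Hypothetical entries in like-count order, as the endpoint is described to return them.
by_likes = [
    {"id_str": "1", "created_at": "Thu Apr 28 00:56:58 +0000 2022", "favorite_count": 4600599},
    {"id_str": "2", "created_at": "Mon Jul 31 12:00:00 +0000 2023", "favorite_count": 700000},
    {"id_str": "3", "created_at": "Fri Oct 19 08:30:00 +0000 2018", "favorite_count": 680000},
]
ordered = newest_first(by_likes)
print([tweet["id_str"] for tweet in ordered])
```

This only reorders what the endpoint returned; it cannot recover recent tweets that never made the top 100.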

@yuv418

yuv418 commented Aug 15, 2023

I've noticed that for smaller accounts with fewer than 100 tweets, that syndication URL does not load any tweets.

@nerra0pos

Also, the syndication api for showReplies=true does not work anymore: https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=true but showReplies=false still works, showing the tweets ordered by like count... https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk?showReplies=false

Not really. showReplies=false shows years-old content when not logged in.

That's because in that specific example, those tweets are from years ago. Look again at the like count, notice anything?

No. That is the case for all big accounts. I am interested in the most recent Tweets and this approach will lead to nothing.

@kpopdev

kpopdev commented Aug 15, 2023

Indeed

Interesting, but this doesn't return RSS with 'item', 'pubDate' etc. tags. Maybe a script using https://github.com/lkiesow/python-feedgen would do the job?

Not sure I understand? It exposes far more information than needed, including the date.


I'm using this for my bot and it's working fine with cookies and headers.

@intuser

intuser commented Aug 15, 2023

Is there an online/CLI tool that converts https://syndication.twitter.com/srv/timeline-profile/screen-name/elonmusk to an RSS feed? Then we could each download the HTML from a logged-in profile and do the conversion in a second step.

Calling the syndication URL without being logged in to Twitter doesn't retrieve the most recent tweets. If I call this URL in Postman, I get 100 tweets from 10/19/2018 to 07/31/2023; no tweets from August...

Yes. And there is at least one tweet from August with more likes (>807K) than some older tweets which are included (e.g. <680K).

@Mr-Freewan

Is there any estimate for when this problem will be solved?

@ghost

ghost commented Aug 15, 2023

Looks like https://nitter.privacydev.net/ is working

@zedeus
Owner

zedeus commented Aug 15, 2023

That one is a fork which uses account credentials. See #830

@ghost

ghost commented Aug 15, 2023

I am aware, but couldn't Nitter implement a system like the one Aurora uses, with lots of accounts that rotate per user?

@ghost

ghost commented Aug 15, 2023

I am aware, but couldn't Nitter implement a system like the one Aurora uses, with lots of accounts that rotate per user?

That's hard to maintain and simple for twitter to ban by just filtering "if number of accounts per IP > SOME_CONSTANT: ban all of them"
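Spelled out, the kind of filter described above might look like the sketch below. This is only an illustration of the heuristic in the comment, not anything known about Twitter's actual systems; the threshold value, the `accounts_to_ban` name, and the sample data are all hypothetical.

```python
from collections import defaultdict

SOME_CONSTANT = 3  # hypothetical threshold for accounts seen on one IP

def accounts_to_ban(logins, limit=SOME_CONSTANT):
    """Given (ip, account) pairs, return every account on an IP that exceeds limit."""
    accounts_by_ip = defaultdict(set)
    for ip, account in logins:
        accounts_by_ip[ip].add(account)
    banned = set()
    for accounts in accounts_by_ip.values():
        if len(accounts) > limit:
            banned |= accounts  # ban all of them, as the comment puts it
    return banned

# Documentation-range IPs and made-up account names.
logins = [
    ("192.0.2.10", "acct1"), ("192.0.2.10", "acct2"),
    ("192.0.2.10", "acct3"), ("192.0.2.10", "acct4"),
    ("198.51.100.5", "solo_user"),
]
banned = accounts_to_ban(logins)
print(sorted(banned))
```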

@ghost

ghost commented Aug 15, 2023

Looks like https://nitter.privacydev.net/ is working

User feeds not working on this

@dawnerd

dawnerd commented Aug 15, 2023

I switched to the PrivacyDevel fork with credentials in, but it's still 404ing on the same endpoint upstream is having problems with.

@intuser

intuser commented Aug 15, 2023

Strange. privacydev (without credentials) works more or less for @ElonMusk, but not for other users like for instance @BarackObama.

@compuguy

compuguy commented Feb 20, 2024

there's no AAAA record so it has nothing to do with IPv6 as it is IPv4 only.

I'm honestly not sure why one would block an entire ISP's IPv6 block

one of the largest DDoS botnets runs on Hetzner and almost exclusively abuses their v6 ranges. thats a perfectly valid reason for doing such. but this is not something that has been done on this nitter instance.

(Apologies all for the slight segue here.)

My bad. I assumed your Nitter instance was dual stack (I've edited my previous comment to correct for that). I'm still not sure why you would restrict/block access to your instance from a block of IPv4 addresses from a major US ISP. I personally think it's a bit brute force for ISPs, but that's just me. Then again, it seems like your services attract quite a bit of hostile traffic (#983 (comment))....

@JapanYoshi

Pardon my inexperience, but considering the abundance of throwaway bot accounts on Twitter, could a Nitter instance without guest account tokens use a communal bot account to browse Twitter? It would confuse the trackers to see traffic from a bunch of different users pouring into one account.

@stopmotio

The problem is that, for whatever reason, Twitter is spending all its moderation efforts solely on stopping people from scraping their website — Those bots stay up for quite a while, but anyone that reads posts faster than humanly possible is banned.

@JapanYoshi

Why not rate-limit the Nitter server, or better yet, use a bunch of throwaway accounts, each browsing at an undetectable speed?
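As a rough illustration of the throwaway-accounts idea, a pool could hand out accounts so that each one rests a minimum time between uses. This is a sketch only: `AccountPool`, the 10-second interval, and the account names are made up, and nobody in this thread knows what request rate actually stays undetected.

```python
import time

class AccountPool:
    """Hands out accounts so each one rests at least min_interval seconds between uses."""

    def __init__(self, accounts, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock  # injectable so the pool can be exercised without sleeping
        self.last_used = {account: None for account in accounts}

    def acquire(self):
        """Return a rested account and mark it used, or None if all are cooling down."""
        now = self.clock()
        for account, last in self.last_used.items():
            if last is None or now - last >= self.min_interval:
                self.last_used[account] = now
                return account
        return None

# Usage with a fake clock; the account names are placeholders.
fake_now = [0.0]
pool = AccountPool(["throwaway_a", "throwaway_b"], min_interval=10, clock=lambda: fake_now[0])
first = pool.acquire()
second = pool.acquire()
third = pool.acquire()   # both accounts were just used
fake_now[0] = 10.0
fourth = pool.acquire()  # throwaway_a has rested long enough
print(first, second, third, fourth)
```

When `acquire()` returns None the instance would have to queue or reject the request, which is where the user experience suffers on a busy public instance.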

@compuguy

compuguy commented Feb 21, 2024

Why not rate-limit the Nitter server, or better yet, use a bunch of throwaway accounts, each browsing at an undetectable speed?

I'm sure that is possible.... though the end-user experience would probably be quite poor for a public instance with the traffic volume of, say, @animegrafmays' instance. That, and you're just playing a game of cat-and-mouse with Twitter/Xitter. It's not a sustainable long-term strategy. Private instances seem to be the future. For those, see the guide one person mentioned for deploying an instance to a Synology NAS: #983 (comment), PR #830, and the pre_guest_accounts branch of this Nitter fork (https://github.com/PrivacyDevel/nitter/tree/pre_guest_accounts).

@lukefromdc

A private instance may not be useful if the point is to avoid having a Twitter account. I will simply stop trying to read Twitter timelines at all if I cannot get them in chronological order without an account. Elon Musk can clean my toilet.

@anibalburdo

Twstalker still works, amazingly. Is it because very few people use it, so it flies under Musk's radar?

@telekinemon

Has anyone tried and had any success in automating user account creation? Seems like that could help public instances survive but the captcha seems like a big issue...

@ExperiencersInternational

@animegrafmays On the "guest_account" branch, do the videos work? What's the specific commit you are using?

using the guest accounts branch here, the latest commit. nothing modified. videos and multi only work with more than 5 accounts. as far as I can tell it's related maybe to how the data is fetched

here is an example: https://nitter.poast.org/NASAUniverse/status/1758594049991225404

our nitter currently has 74 account tokens from accounts aged older than 2016 as newer accounts have much worse limits

edit: just checked with a private instance I run for our fediverse bots that has two accounts and videos play so the limitation must just be 2

Not the best place to say it but thank you for keeping nitter.poast.org running, seems to be the last Nitter server now and we desperately need it (I run @splatoon@wetdry.world on fedi).

Still works really well, I'm not sure how we would be continuing on without your service (I assume bird.makeup isn't doing that well lol).

@animegrafmays
Author

I assume bird.makeup isn't doing that well

he was still scraping our instance up until a couple days ago. I asked him publicly why he was still doing it and it seems to have stopped

I run @splatoon@wetdry.world on fedi

if you reach out to me on there I will give you access to a private ip-whitelisted instance I run just for bots, that way you're not limited by the current measures I've put in place on the public one

@lukefromdc

I am still using nitter.net, with Firefox set never to override user choices in overriding the certificate error. Since I am not logging into anything and those are public posts I am reading, an MITM attack isn't worth much.

@somini
Contributor

somini commented Feb 21, 2024

Can anyone with a private account actually watch videos on nitter? If so, what's the exact commit you are running to perform that black magic?

@zedeus
Owner

zedeus commented Feb 21, 2024

Videos should work again with the latest guest_accounts commit.

@ExperiencersInternational

Can anyone with a private account actually watch videos on nitter? If so, what's the exact commit you are running to perform that black magic?

I tested earlier on nitter.poast.org with a video on NintendoUK which had the latest Splatoon song lol

@compuguy

I am still using nitter.net, with Firefox set never to override user choices in overriding the certificate error. Since I am not logging into anything and those are public posts I am reading, an MITM attack isn't worth much.

Not sure exactly how you're doing that when nitter.net has an expired cert + HTTP Strict Transport Security (HSTS) enabled.....😕

@animegrafmays
Author

Not sure exactly how you're doing that when nitter.net has an expired cert + HTTP Strict Transport Security (HSTS) enabled.....😕

he explained how in his post. im starting to question your username

@compuguy

Not sure exactly how you're doing that when nitter.net has an expired cert + HTTP Strict Transport Security (HSTS) enabled.....😕

he explained how in his post. im starting to question your username

You do understand that your GitHub issue has roughly 683 comments, right? I probably did read @lukefromdc's comment at some point. He or someone else is using another tool to scrape those instances.

he explained how in his post. im starting to question your username

Let's not go down that road....and stay somewhat civil?

@animegrafmays
Author

Let's not go down that road....and stay somewhat civil?

sure, you are replying to a public issue i created and every time you do that i get an email. in an effort to try to aid the issue i raised and to prevent getting multiple emails about the same issue or comment i try to be as descriptive as possible. getting an email at 3am from you that says "Let's not go down that road....and stay somewhat civil?"

@stopmotio

stopmotio commented Feb 22, 2024

At this point, seeing how long this thread has lasted, I expect I will continue to be getting emails about this thread until the cows come home. 😆

@ofifoto

ofifoto commented Feb 22, 2024

@animegrafmays et al.:

image

hope that helps.

@lukefromdc

Not sure exactly how you're doing that when nitter.net has an expired cert + HTTP Strict Transport Security (HSTS) enabled.....😕

he explained how in his post. im starting to question your username

You do understand that your GitHub issue has roughly 683 comments, right? I probably did read @lukefromdc's comment at some point. He or someone else is using another tool to scrape those instances.

he explained how in his post. im starting to question your username

Let's not go down that road....and stay somewhat civil?

I would not even know HOW to set up a scraper. Rather I use nitter instances to manually monitor remaining Twitter timelines from my community. This is because of Twitter's refusal to show chronological timelines (the only kind useful to me) without an account.

@lukefromdc

Again: to get past the certificate issue in Firefox (do NOT log in to anything showing a certificate error!):
1: open about:config
2: set network.stricttransportsecurity.preloadlist to false

@animegrafmays
Author

In my experience over the last week, it appears Twitter limits based on account+IP. Using the same list of guest accounts on two dedicated servers, the set of limited accounts is always different: right now there are 11 accounts limited on one server and 6 on the other. I wonder if we can use this to our advantage; perhaps a shared pool of accounts would actually work.
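If limits really are tracked per account+IP pair (an observation from two servers, not confirmed behavior), a shared pool could record limited pairs and still offer the same account to a server on a different IP. A minimal sketch; `LimitTracker`, the account names, and the documentation-range IPs are hypothetical.

```python
class LimitTracker:
    """Tracks rate-limited (account, ip) pairs instead of whole accounts."""

    def __init__(self):
        self.limited = set()

    def mark_limited(self, account, ip):
        """Record that this account hit a limit when used from this IP."""
        self.limited.add((account, ip))

    def usable_accounts(self, accounts, ip):
        """Return the accounts not currently marked limited for this IP."""
        return [account for account in accounts if (account, ip) not in self.limited]

shared_accounts = ["guest1", "guest2", "guest3"]
server_a, server_b = "192.0.2.1", "198.51.100.1"

tracker = LimitTracker()
tracker.mark_limited("guest1", server_a)
tracker.mark_limited("guest2", server_b)

usable_on_a = tracker.usable_accounts(shared_accounts, server_a)
usable_on_b = tracker.usable_accounts(shared_accounts, server_b)
print(usable_on_a, usable_on_b)
```

A real pool would also need limits to expire after some reset window, which is left out here because the thread gives no figure for it.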

@stopmotio

Easiest way I can think of to get a ton of different IPs to operate under is Tor, but it's probably not a wise move to use that because it is easy to block outright

@nukeop

nukeop commented Feb 23, 2024

There are tons of websites that list public proxies in residential areas. 90% don't work, but if you can sift through them they're very useful for scraping.

@Ame-chan-angel

@animegrafmays can i ask why your instance is shown as not healthy at https://status.d420.de/ ?

@animegrafmays
Author

@animegrafmays can i ask why your instance is shown as not healthy at https://status.d420.de/ ?

sure. being listed on that page is a death sentence. the warning on the top of the page begging people to not scrape instances is ignored. being scraped is bad. so the status page is blocked so we don't show up. i still also aggressively monitor logs for scraping attempts and have blocked entire AS ranges from major providers (google, aws, digitalocean)
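For readers wondering what blocking whole ranges can look like in code, here is a small sketch using Python's `ipaddress` module. It is not the actual setup on this instance (that is more likely done at the firewall or web server level), and the networks listed are documentation ranges standing in for real provider ranges.

```python
import ipaddress

# Placeholder networks; a real list would come from the providers' published ranges.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)

for candidate in ("192.0.2.55", "198.51.100.7", "2001:db8::1"):
    print(candidate, is_blocked(candidate))
```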

@sthalik

sthalik commented Feb 25, 2024

Twitter recently broke videos on Nitter. Can somebody please fix them?

@animegrafmays
Author

Twitter recently broke videos on Nitter. Can somebody please fix them?

try this: #1178
