Hello,
I made a simple script to scrape threads.net using Python and Selenium. The script is just a few lines long and easy to understand.
So what does this script do?
First it opens the Edge browser (you can change it to Firefox or Chrome). Then you enter your credentials to log in. Your browsing data and credentials are stored in the user_data folder, which you can move around.
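For anyone who wants to switch browsers, here is a rough sketch of how that could look. make_driver is a made-up helper name (not part of Selenium), and it only covers Edge and Chrome, since both are Chromium-based and accept the same --user-data-dir switch. As far as I know Firefox handles profiles differently, so it is left out here.

```python
def make_driver(browser="edge", profile_dir="user_data"):
    """Hypothetical helper: start Edge or Chrome with a persistent profile."""
    if browser not in ("edge", "chrome"):
        raise ValueError(f"unsupported browser: {browser}")

    # Imported here so the check above works even without Selenium installed
    from selenium import webdriver

    if browser == "edge":
        from selenium.webdriver.edge.options import Options
        driver_class = webdriver.Edge
    else:
        from selenium.webdriver.chrome.options import Options
        driver_class = webdriver.Chrome

    options = Options()
    # Same switch as in the script; the profile folder keeps you logged in
    options.add_argument(f"--user-data-dir={profile_dir}")
    return driver_class(options=options)
```

With this, driver = make_driver("chrome") would replace the three lines in the script that build the Edge driver.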
It scrolls through the Threads feed/hashtag/explore page and stores the src of every image it encounters, so at the end we have a links.txt file containing the links to all the images it has seen.
Now that we have links.txt, we can use the following command to download all the images listed in it:
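Not every src is something you can download: some img tags have no src at all, and some pages embed small images inline as data: URIs. A minimal sketch of a filter, assuming you only want regular web links (downloadable is just an illustrative name):

```python
def downloadable(srcs):
    """Keep only http(s) URLs; drop empty values and inline data: URIs."""
    return sorted(
        src for src in srcs
        if src and src.startswith(("http://", "https://"))
    )
```

You could run the collected set through this before writing links.txt.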
wget -i links.txt
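If you don't have wget (on Windows, for example), roughly the same thing can be done with the Python standard library. This is only a sketch: filename_for, download_all and the images folder are names I made up, and the numeric prefix is there because CDN links often end in long or repeated file names.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_for(url, index):
    """Build a local file name from the URL path, ignoring the query string."""
    name = os.path.basename(urlparse(url).path) or "image"
    return f"{index:04d}_{name}"


def download_all(list_path="links.txt", out_dir="images"):
    """Download every URL listed in list_path into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    with open(list_path) as f:
        urls = [line.strip() for line in f if line.strip()]
    for i, url in enumerate(urls):
        target = os.path.join(out_dir, filename_for(url, i))
        try:
            urllib.request.urlretrieve(url, target)
        except (OSError, ValueError) as err:
            # Network errors and unsupported URL types: skip and carry on
            print(f"skipped {url}: {err}")
```

Calling download_all() in the folder that contains links.txt would fill an images folder next to it.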
The script:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.edge.options import Options
import time

# Keep the browser profile (cookies, login) in ./user_data between runs
options = Options()
options.add_argument("--user-data-dir=user_data")
driver = webdriver.Edge(options=options)
driver.get('https://threads.net')

s = set()  # unique image URLs seen so far
# Log in and open the page you want to scrape, then come back to the terminal
input("Press Enter to continue...")

for i in range(30):
    try:
        elements = driver.find_elements(By.XPATH, "//img")
        for e in elements:
            src = e.get_attribute("src")
            if src:  # some <img> tags have no src
                s.add(src)
        driver.execute_script("window.scrollBy(0, 1000);")
        time.sleep(0.2)
    except Exception:
        # e.g. an element disappeared while the page was re-rendering
        print("oopsie")

with open("links.txt", 'w') as f:
    for l in s:
        f.write(l + "\n")

driver.quit()
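The fixed 30 scrolls are simple but a bit arbitrary. One possible variation (not what the script above does) is to keep scrolling until several scrolls in a row bring up no new images. A sketch, where scroll_until_stable, get_srcs, scroll and patience are all names invented for this example:

```python
import time


def scroll_until_stable(get_srcs, scroll, max_scrolls=200, patience=5, pause=0.2):
    """Collect srcs until `patience` consecutive passes find nothing new."""
    seen = set()
    idle = 0  # passes in a row that added no new src
    for _ in range(max_scrolls):
        before = len(seen)
        seen.update(src for src in get_srcs() if src)
        idle = idle + 1 if len(seen) == before else 0
        if idle >= patience:
            break
        scroll()
        time.sleep(pause)
    return seen


# With the driver from the script, the two callables could be:
# get_srcs = lambda: [e.get_attribute("src")
#                     for e in driver.find_elements(By.XPATH, "//img")]
# scroll = lambda: driver.execute_script("window.scrollBy(0, 1000);")
```

Passing the page access in as two functions keeps the loop independent of Selenium, so it can be tried out without opening a browser.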
I hope it was useful :D
Edit:
Here is a link to links.txt:
https://0x0.st/HGjx.txt
nice code, could you share the asm for all that?