home.social

#datascraping — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #datascraping, aggregated by home.social.

  1. Learn everything you need to know about Data Scraping via these 70 free HackerNoon blog posts. hackernoon.com/70-blog-posts-t #datascraping

  2. I'm slightly creeped out but not surprised. I was editing a music score on my laptop recently and I added an instruction to play the piece "robotic". The next time I logged into Indeed, the first job recommendation to come up was for Robotics Operator. Is Indeed scraping data from my recent documents for keywords?

    Always check your firewall.

    #scraping #datascraping

  3. The “17.5 million Instagram user data leak” making the rounds in 2026? Old news

    The data from 2022 was already leaked in 2023.

    We broke down all 3 dumps - same records

    Don’t fall for clickbait reports!

    Read: hackread.com/instagram-user-da

    #Instagram #DataLeak #Cybersecurity #Privacy #DataScraping

  4. Starting from a freelance project scraping Substack, one person turned a one-off solution into a self-service tool, opening up a market opportunity. A story of moving from contract work to building a product. #FreelanceTips #ProductBuilding #Substack #DataScraping #StartupViecles #TaoSanPham #KinhNghiemTuDo

    reddit.com/r/SideProject/comme

  5. **AI's public debt: How LLM training tools break the social contract of open source**
    AI learns from open source but does not fulfill its obligations, which causes problems for the community. LLM (Large Language Model) projects build on public data but neglect their responsibility for security, author attribution, and the long-term benefit of open software. A change of direction is needed for the technology to develop sustainably.

    #AI #Mãnguồnmở #Đàotạocôngnghệ #Bềnvững #ĐạođứcAI #OpenSource #SocialContract #TechEthics #AIdebt #DataScraping


  6. Hiring: a job to scrape 300,000 PDF book titles from AbeBooks and find the files via Wayback Machine/Anna's Archive. A total of 4TB of data will be stored on 128GB optical discs (Verbatim/Panasonic) to ensure it stays readable for 100 years. Budget: $700 (materials not included).

    #TuyểnDụng #Scraping #LưuTrữDữLiệu #PDF #AbeBooks
    #Hiring #DataScraping #DataArchiving #PDF

    reddit.com/r/programming/comme

  7. LinkedIn, the social media titan known for its riveting inspirational #quotes and unsolicited connections, is now channeling its inner #superhero, battling the dastardly villains of data scraping. 🦸‍♂️💼 Apparently, charging $15k for harvested data is a crime—unless you're #LinkedIn, of course. 🤑🔍
    therecord.media/linkedin-sues- #DataScraping #SocialMedia #Crime #HackerNews #ngated

  8. Google's crackdown on data scrapers triggered immediate disruptions across the marketing landscape, particularly for organizations whose business models depend on SEO. The move represents the latest evolution in the ongoing battle between major websites and data scrapers. Read more at @TechRadar. #Google #SEO #DataScraping #Tech #Technology flip.it/F5M7-d

  9. Many companies have already finished #datascraping everything on the internet, along with commercially available personal data from #Experian and similar databases. The only thing left was government databases. #ElonMusk put himself first in line.

  10. So first #TuneCore, now #YouTube. They are trying to scrape our art, creativity and potential income to make datasets for sale, without giving us a single penny. Please be smart: do not let these predatory and misleading companies take a single step over your rights. #DistroKid and #CDBaby will probably try to do the same over the next few years.

    @[email protected] @musicproduction @[email protected] #musicindustry #AI #IA #industriadelamusica #dataset #datascraping @radicalmusic

  15. ⚠️NOW AT 5 PM🇦🇷⚠️ ⚖CONFERENCE
    🔷️“#DataScraping with #AI and the #risk-based approach as a limit on the uses of #ArtificialIntelligence.”
    🔹️Dr. Johanna C Faliero, PhD
    📌CONNECTION LINK➡️usal-edu-ar.zoom.us/j/89116192

  16. Ever wondered what a #bot really is? 🤖 From chatbots helping with customer service to web crawlers indexing the internet, bots are everywhere! 🌐

    In our latest guide, we dive into:

    🤖 The definition and evolution of bots
    📜 Different types of bots
    🌍 Impact of good & bad bots on our daily lives
    🔍 Detecting and blocking malicious bots

    With bots accounting for almost 50% of all internet traffic, and two-thirds of that coming from bad bots, it’s crucial to understand their influence on our online activities.

    Discover more now! bit.ly/3KkgVtI

    #badbots #webcrawlers #maliciousbots #ddosattacks #datascraping #malware #botmanagement #managedservices #waap #apptrana #indusface