cm0002@lemmy.world to Programmer Humor@programming.dev · 7 days ago

APIs vs Web Scrapers

lemmy.ml

  • cross-posted to:
  • programmerhumor@lemmy.ml
  • Kojichan@lemmy.world
    2 upvotes · 4 days ago

    I just saw a Python scraper in my server logs earlier today. Strangest thing to see.

  • HappyFrog@lemmy.blahaj.zone
    3 upvotes · 7 days ago

    As long as the scrapers follow robots.txt

    • Jankatarch@lemmy.world
      2 upvotes · 7 days ago

      It’s equivalent to “the code.”

      • dejected_warp_core@lemmy.world
        2 upvotes · 7 hours ago

        It really should be “parlay.txt”.

      • kautau@lemmy.world
        1 upvote · 7 days ago

  • mspencer712@programming.dev
    2 upvotes · 7 days ago

    I feel like there should be a third box with Wall Street raider types, for scrapers that use Selenium browser automation.

    I don’t think it’s entirely unblockable (AdSense seems to know to serve only unmonetized PSA ads), but I think it’s very difficult to discriminate between “this is a real browser controlled by an end user” and “this is a real browser being controlled by automated test software”.

    • erytau@programming.dev
      2 upvotes · 7 days ago

      Fourth panel as well, with those bots collecting data for AI training that don’t respect your robots.txt, change user agents, and overload your servers.

      • dejected_warp_core@lemmy.world
        1 upvote · 4 hours ago

        War boys from Fury Road?

  • TropicalDingdong@lemmy.world
    1 upvote · 7 days ago

    beautiful soup
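
For anyone wondering what "following robots.txt" looks like on the scraper side, here is a rough sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, the `example-bot` user agent, and the example.com URLs are all made up for illustration. A real scraper would fetch the site's actual robots.txt instead of using an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines rather than one big string.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-bot" and the URLs below are placeholders, not real endpoints.
    for url in ("https://example.com/public/page",
                "https://example.com/private/data"):
        verdict = "fetch" if is_allowed(ROBOTS_TXT, "example-bot", url) else "skip"
        print(f"{verdict}: {url}")
```

Nothing forces a scraper to run this check, which is the point of the joke upthread: the file only works for bots that choose to respect it.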
