this post was submitted on 24 Jul 2024

Technology

[–] [email protected] 2 points 1 month ago (1 children)

Parts of the Internet are now only searchable on specific sites? What next - charging a monthly subscription to use Google?

This needs to be regulated before the Internet becomes like streaming TV.

[–] [email protected] 1 points 1 month ago* (last edited 1 month ago)

Robots.txt has been around for a long time, and all the major search engines will honor it. Not having a full index of the Web is the norm.

That isn't to say that the practice of signing agreements isn't potentially a concern. Not sure that I like the idea of search engines paying sites money to degrade search results of competitors.
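
For anyone who hasn't looked at one, here is a minimal sketch of how a crawler can check a robots.txt file, using Python's standard `urllib.robotparser`. The rules and the example.com URL below are made up for illustration and are not Reddit's actual file; they just show one crawler being allowed while every other one is turned away.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption, not any real site's file):
# one named crawler may fetch everything, all other crawlers nothing.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string, with no network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/r/technology/"  # placeholder URL

# A well-behaved crawler asks before fetching and skips the page on False.
for agent in ("Googlebot", "bingbot"):
    print(agent, parser.can_fetch(agent, url))
```

Note that this check runs on the crawler's side, so it only has an effect if the crawler chooses to run it.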

[–] [email protected] 0 points 1 month ago (1 children)

Is Google really permitted to prevent any other search engine from looking at Reddit?

[–] [email protected] 0 points 1 month ago (2 children)

I guess Reddit is permitted to only let Google index it

[–] [email protected] 1 points 1 month ago (1 children)
[–] [email protected] 0 points 1 month ago (1 children)

I don't know of any law that says that they can't.

[–] [email protected] 1 points 1 month ago* (last edited 1 month ago)

I also don’t know of a law that says search engines have to honor a robots.txt file. I guess we will see what happens if Bing or some other service decides to ignore it.

[–] [email protected] 0 points 1 month ago* (last edited 1 month ago) (1 children)

How can they do that, logistically?

Like I realize there's a flag they can raise that asks not to be indexed but that's not legally binding.

[–] [email protected] 1 points 1 month ago

I guess they can make it hard to index the site by scraping, through rate limiting or requiring a login to view content, etc., and only give Google API access that bypasses the restrictions.

There are probably a lot of ways to do it.
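
A rough sketch of the rate-limit-plus-bypass idea, assuming a simple fixed-window counter per client and a set of partner API keys that skip the limit. The class name `CrawlGate`, the key value, and the limits are all hypothetical; nothing here reflects how Reddit or Google actually do it.

```python
import time


class CrawlGate:
    """Fixed-window rate limiter with a bypass for trusted API keys.

    Illustrative only: the names and policy are assumptions.
    """

    def __init__(self, limit, window_seconds, trusted_keys, clock=time.monotonic):
        self.limit = limit                  # max requests per window per client
        self.window = window_seconds        # window length in seconds
        self.trusted_keys = set(trusted_keys)
        self.clock = clock                  # injectable so tests can fake time
        self._windows = {}                  # client_id -> (window_start, count)

    def allow(self, client_id, api_key=None):
        # Holders of a trusted key are never throttled.
        if api_key in self.trusted_keys:
            return True

        now = self.clock()
        start, count = self._windows.get(client_id, (now, 0))

        # Start a fresh window once the old one has expired.
        if now - start >= self.window:
            start, count = now, 0

        if count >= self.limit:
            self._windows[client_id] = (start, count)
            return False

        self._windows[client_id] = (start, count + 1)
        return True


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
gate = CrawlGate(limit=2, window_seconds=60, trusted_keys={"partner-key"},
                 clock=lambda: fake_now[0])

results = [gate.allow("some-crawler") for _ in range(3)]
partner = gate.allow("partner-crawler", api_key="partner-key")
print(results, partner)
```

An anonymous crawler gets throttled after a couple of requests per window, while the partner with the key is always let through.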

[–] [email protected] 0 points 1 month ago (1 children)

Just start a Brave Search query with site:reddit.com (for example, "site:reddit.com test") and it still works.

[–] [email protected] 1 points 1 month ago

Did you set the time limit to last week? Old posts are still indexed. I just tried "site:reddit.com df:w" on DDG and got no hits.