I kinda want to mirror this to the fediverse with a bot to 1. make more people see it and 2. keep a copy, so if it gets taken down it’s still distributed on here.
Should I do it? Or is that dumb?
wait is that website a fediverse instance actually?
Also, if you do mirror it, make sure to do it in an efficient way. For example, some websites, like Wikipedia, offer one large download to archive the whole site. That’s less strain on the server than scraping each page individually.
Do it.
DEFINITELY do it! It’s the opposite of dumb!
Yeah, I’m real torn. On one hand, I immediately want to scrape this site, but I also don’t want to beat the site up by tying up their bandwidth. There seems to be a parent site, db4p.org, that’s managing mirrors of this site, but I don’t see any sort of torrent or archive. If there’s something like that, I’d be very inclined to just archive the entire site/database.
Mmm… such a bot could run once every 24 hours, either “visiting the site” and reading the HTML contents, or using the DB directly if they have an API somewhere.
Either way it doesn’t cost them much.
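For what it’s worth, a once-a-day check can be made even cheaper with a conditional GET, so the server only sends the page when it has actually changed. This is just a rough sketch: the URL and the User-Agent string are placeholders I made up, and I’m assuming the site sends `ETag` or `Last-Modified` headers, which it may not.

```python
import urllib.error
import urllib.request

# Hypothetical URL: the thread doesn't say where the site's entries actually live.
SITE_URL = "https://example.org/entries"

# Made-up User-Agent so the site admins can tell the bot apart and contact its owner.
USER_AGENT = "mirror-bot-sketch/0.1 (contact: you@example.org)"


def build_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server answer 304 if nothing changed."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when the server says 304."""
    req = build_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return (
                resp.read(),
                resp.headers.get("ETag"),
                resp.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: keep the validators we already had.
            return None, etag, last_modified
        raise
```

You’d store the returned `etag` and `last_modified` between runs and pass them back in the next day, e.g. `body, etag, lm = fetch_if_changed(SITE_URL, etag, lm)` from a cron job. If the site ignores those headers, it just falls back to a normal download once a day.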
Just put the feed in its own community, then yeah. Go for it.
There’s a kernel of a good idea there but I’m not sure how the actual format of that would work out…
Mirror it how exactly?
Do you mean scraping the data and publishing a report to lemmy/piefed/mastodon?
Indeed, I was thinking of scraping (or using an API), and when a new entry is made, reposting it on here (in a separate community).
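The “only repost new entries” part could look roughly like the sketch below. It assumes each scraped entry ends up as a dict with a stable `id` field, which is my own assumption since nobody here knows yet what the site’s data looks like. The function names and the state file are hypothetical too, and the actual posting step is left out.

```python
import json
from pathlib import Path


def load_seen(path):
    """Load the set of entry ids that were already reposted (empty on first run)."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text(encoding="utf-8")))


def save_seen(path, seen):
    """Persist the ids so the next daily run doesn't repost them."""
    Path(path).write_text(json.dumps(sorted(seen)), encoding="utf-8")


def pick_new_entries(entries, seen):
    """Return entries whose 'id' is not in seen, in order, skipping repeats."""
    known = set(seen)
    new = []
    for entry in entries:
        if entry["id"] not in known:
            known.add(entry["id"])
            new.append(entry)
    return new
```

Each run would call `load_seen`, pass the freshly scraped entries through `pick_new_entries`, post whatever comes back to the dedicated community, then add those ids to the set and call `save_seen`.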
You may want to create a specific community for that to avoid flooding other communities, and post summaries on other communities every week/month if you do this.
Just a tip to avoid getting banned for spamming communities.
Yes, of course! It’s also against (at least lemmy.world’s) instance rules to post as a bot without express permission from the community mods.
Make a new community for sure. I would be interested.