@otsch Thanks for your contribution!
The most difficult part about crawling is #cloudflare.
The only way I've found to bypass it is #flaresolverr, though it doesn't work 100% of the time: https://github.com/FlareSolverr/FlareSolverr
Since you started supporting structured data, here is an idea for you. Start supporting #microformats https://microformats.org/
Twenty years ago this past February, Kevin Marks and I introduced #microformats in a conference presentation.
Full post: https://tantek.com/2024/044/t1/twenty-years-microformats
Aside: This is an even shorter summary of that post from ~200 days ago, which #Mastodon readers never got due to a Mastodon #federation bug (details in https://tantek.com/t5Yo1).
Since early 2023, here are the top three updates & interesting developments in microformats:
1. Growing rel=me adoption for distributed verification (in Mastodon etc.)
* Wikipedia, Threads, omg.lol
2. Proposal to merge #microformats2 h-review into h-entry, since in practice (e.g. on #indieweb) reviews are just entries with a bit more.
3. #metaformats adoptions, implementations, iteration
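For readers unfamiliar with item 1, here is a minimal sketch of the idea behind rel=me verification: two pages count as verified when each links to the other with rel="me". The function names and the trailing-slash normalization are assumptions made for illustration; real services apply their own matching rules (redirects, scheme handling, and so on).

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values of <a> and <link> elements whose rel contains "me"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated token list; attribute values may be None.
        rels = (attr.get("rel") or "").lower().split()
        if "me" in rels and attr.get("href"):
            self.links.append(attr["href"])


def rel_me_links(html):
    """Return all rel=me link targets found in an HTML string."""
    parser = RelMeParser()
    parser.feed(html)
    return parser.links


def normalize(url):
    # Assumption for this sketch: only a trailing slash is ignored.
    return url.rstrip("/")


def mutually_verified(url_a, html_a, url_b, html_b):
    """True if page A links to B with rel=me and B links back to A."""
    a_links = {normalize(u) for u in rel_me_links(html_a)}
    b_links = {normalize(u) for u in rel_me_links(html_b)}
    return normalize(url_b) in a_links and normalize(url_a) in b_links
```

Fetching the two pages is left out on purpose, so the check itself stays easy to test offline.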
Happy 12 years of https://indieweb.org/POSSE #POSSE and
19 years of https://microformats.org/ #microformats! (as of yesterday, the 20th)
A few highlights from the past year:
POSSE (Publish on your Own Site, Syndicate Elsewhere) has grown steadily as a common practice in the #IndieWeb community, personal sites, CMSs (like Withknown, which itself reached 10 years in May!), and services (like https://micro.blog and Bridgy) for over a decade.
In its 12th year, POSSE broke through to broader technology press and adoption beyond the community. For example:
* David Pierce’s (@pierce@mas.to) excellent article @TheVerge.com (@verge@mastodon.social): “The poster’s guide to the internet of the future” (https://www.theverge.com/2023/10/23/23928550/posse-posting-activitypub-standard-twitter-tumblr-mastodon):
“Your post appears natively on all of those platforms, typically with some kind of link back to your blog. And your blog becomes the hub for everything, your main home on the internet.
Done right, POSSE is the best of all posting worlds.”
* David also recorded a 29 minute podcast on POSSE with some great interviews: https://podcasts.apple.com/us/podcast/the-posters-guide-to-the-new-internet/id430333725?i=1000632256014
* Cory Doctorow (@craphound.com @doctorow@mamot.fr) declared in his Pluralistic blog (@pluralistic.net) post: “Vice surrenders” (https://pluralistic.net/2024/02/24/anti-posse/):
“This is the moment for POSSE (Post Own Site, Share Everywhere [sic]), a strategy that sees social media as a strategy for bringing readers to channels that you control”
* And none other than Molly White (@mollywhite.net @molly0xfff@hachyderm.io) of @web3isgoinggreat.com (@web3isgreat@indieweb.social) built, deployed, and started actively using her own POSSE setup as described in her post titled “POSSE” (https://www.mollywhite.net/micro/entry/202403091817) to:
"… write posts in the microblog and automatically crosspost them to Twitter/Mastodon/Bluesky, while keeping the original post on my site."
Congrats Molly and well done!
In its 19th year, the microformats formal #microformats2 syntax and popular vocabularies h-card, h-entry, and h-feed, kept growing across IndieWeb (micro)blogging services and software like CMSs & SSGs both for publishing, and richer peer-to-peer social web interactions via #Webmention.
Beyond the IndieWeb, the rel=me microformat, AKA #relMe, continues to be adopted by services to support #distributed #verification, such as these in the past year:
* Meta Platforms #Threads user profile "Link" field¹
* #Letterboxd user profile website field²
For both POSSE and microformats, there is always more we can do to improve their techniques, technologies, and tools to help people own their content and identities online, while staying connected to friends across the web.
Got suggestions for this coming year? Join us in chat:
* https://chat.indieweb.org/dev
* https://chat.indieweb.org/microformats
for discussions about POSSE and microformats, respectively.
Previously: https://tantek.com/2023/171/t1/anniversaries-microformats-posse
This is post 15 of #100PostsOfIndieWeb. #100Posts
← https://tantek.com/2024/151/t1/minimum-interesting-service-worker
→ https://tantek.com/2024/237/t1/people-over-protocols-platforms
Post glossary:
Bridgy
https://brid.gy/ and https://fed.brid.gy/ for direct federation instead of POSSE
CMS
https://indieweb.org/CMS
h-card
https://microformats.org/wiki/h-card
h-entry
https://microformats.org/wiki/h-entry
h-feed
https://microformats.org/wiki/h-feed
microformats2 syntax
https://microformats.org/wiki/microformats2-parsing
rel-me
https://microformats.org/wiki/rel-me
SSG
https://indieweb.org/SSG
Webmention
https://indieweb.org/Webmention
Withknown
https://indieweb.org/Known
References:
¹ https://tantek.com/2023/234/t1/threads-supports-indieweb-rel-me
² https://indieweb.org/rel-me#Letterboxd
I've started owl-blogs to explore the #IndieWeb space. At the moment I'm "only" using #microformats and #webmention.
I've tried to implement #micropub in the past, but found it provided too little value for me.
Anything I should try out next?
I did a write up of setting up Webmentions on my site! I had mentioned it earlier, but there was one stumbling block that took me longer to figure out.
Webmentions let me get notified when people share my posts, respond to my comments on other sites, etc., and let me use my site for a lot of the kinds of interactions I'd otherwise have to do on social media.
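For anyone curious what happens on the sending side, here is a rough sketch of the two steps a sender performs: find the target page's Webmention endpoint, then build the form-encoded body to POST to it. This is not the setup described in the write-up; the function names are made up, and the sketch only looks at HTML `<link>` and `<a>` elements (a full sender would also check the HTTP `Link` header first).

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class EndpointParser(HTMLParser):
    """Remember the first <link> or <a> element with rel="webmention"."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        # An empty href is allowed and resolves to the target page itself.
        if "webmention" in rels and attr.get("href") is not None:
            self.endpoint = attr["href"]


def discover_endpoint(target_url, html):
    """Return the absolute endpoint URL advertised in the HTML, or None."""
    parser = EndpointParser()
    parser.feed(html)
    if parser.endpoint is None:
        return None
    return urljoin(target_url, parser.endpoint)


def webmention_body(source_url, target_url):
    """Form-encoded body for the POST request to the endpoint."""
    return urlencode({"source": source_url, "target": target_url})
```

The body would be sent with `Content-Type: application/x-www-form-urlencoded`; the actual HTTP calls are omitted here.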
I like the concept of #aboutideasnow. It feels like #microformats, but for URLs. It's kinda weird how few conventions the #web has for URLs. Aside from index.html, robots.txt, and favicon.ico, there are very few standard URLs that many sites support.
So ... how *should* open platforms advertise their share intents? I made a #microformats proposal, but I'd love to talk about it. https://werd.io/2024/advertising-share-intents-with-microformats #indieweb #fediverse
I've recently added search, Webmentions and microformats to my website.
https://hamatti.org/posts/search-webmentions-and-microformats/
# h-anniversary
TL;DR
<time class="dt-duration" duration="P20Y">20 years of Microformats</time>
Long read
Twenty years ago, @KevinMarks and @tantek.com introduced microformats in a conference presentation.
Happy Birthday
I'm a bit late, I know, I know, but there is little that has accompanied me (online) for as long as this format and its community (apart from WordPress, perhaps), so I can't let this pass without comment!
Microformats is the glue that bridges web content with a richer online experience.
A bit of my history: unlike many others in the industry, I don't do my job simply because I enjoy programming for its own sake. As a child or teenager I never felt the urge to tinker with a computer or a C64. Instead, in the late 90s I fell for the Internet/web/blogging.
The web was:
But if you think of the years 1995-2005, you remember when the web was our social network: blogs, comments on blogs, feed readers, and services such as Flickr, Technorati, and BlogBridge to glue things together. Those were great years […]
Actually, the description of the IndieWeb fits as well:
It is a community of independent and personal websites connected by open standards and based on the principles of: owning your domain and using it as your primary online identity, publishing on your own site first (optionally elsewhere), and owning your content.
Back then I started building websites with Frontpage, changed the HTML code and watched how that affected the page, layered CSS “on top”, and “tinkered in” a bit of dynamic behavior with JavaScript… It was fun!
So I didn't end up on the web through the joy of programming; I learned to program through my fascination with the web.
Anyone in the German-speaking world who dealt more or less seriously with HTML back then stumbled upon the Webkrauts sooner or later, and it was through that circle that I first read about microformats in 2006.
If you have surfed the internet lately, you come across the term “Microformats” more and more often, or see the green icon on contact pages. But what exactly are microformats, and what are they good for?
I think it was @pixelgraphix's blog where I first discovered this “green icon”.
The idea impressed me deeply! A format “designed for humans first and machines second”! HTML as an API using “only” class and rel attributes, i.e. classic Plain Old Semantic HTML (PoSH)!
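To make “HTML as an API” concrete, here is a deliberately simplified sketch that reads microformats-style properties straight from class attributes: `p-*` classes yield the element's text, `u-*` classes yield its href. The class and function names are invented for this example, and it skips most of the real microformats2 parsing rules (nesting, implied properties, `dt-*` and `e-*` handling), so treat it as an illustration only.

```python
from html.parser import HTMLParser

# Elements without an end tag; they must not be pushed on the open-element stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class PropertySketch(HTMLParser):
    """Collect p-* text values and u-* href values from class attributes."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._open = []  # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        text_prop = None
        for cls in (attr.get("class") or "").split():
            if cls.startswith("u-") and attr.get("href"):
                self.props.setdefault(cls[2:], []).append(attr["href"])
            elif cls.startswith("p-"):
                text_prop = cls[2:]
                self.props.setdefault(text_prop, []).append("")
        if tag not in VOID:
            self._open.append(text_prop)

    def handle_endtag(self, tag):
        if tag not in VOID and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing element that declared a p-* property.
        for prop in self._open:
            if prop is not None:
                self.props[prop][-1] += data


def read_properties(html):
    """Return {property: [values]} with surrounding whitespace stripped."""
    parser = PropertySketch()
    parser.feed(html)
    return {k: [v.strip() for v in vals] for k, vals in parser.props.items()}
```

For real use, an existing microformats2 parser is the better choice; the point here is only that plain class and rel attributes are enough to carry the data.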
And somehow microformats have kept me busy to this day:
After the experience of the past few years, I now have a somewhat more nuanced opinion on “HTML as an API”, but that doesn't change my general fascination with web semantics.
The microformats community also opened up the world of the open web and open standards to me; after all, microformats directly or indirectly influenced initiatives such as DataPortability.org, DiSo, and the IndieWeb.
Thank you, microformats, and once again: Happy Birthday!
@evan Here’s my take, hope it helps?
https://github.com/benpate/sherlock
Sherlock is a #Golang library that assembles any data/metadata it can find on a URL (including #WebFinger, #RSS, #OpenGraph, and #IndieWeb #MicroFormats) and returns an #ActivityStream back to its caller. There are composable add-ons for caching and other custom rules.
Overall, mapping to ActivityStreams was pretty easy. Sherlock is the key component in #Emissary that helps it participate in many different social webs.
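As a rough illustration of what such a mapping can look like (this is not Sherlock's code, which is written in Go), here is a small Python sketch that turns a parsed microformats2 h-entry, in the canonical JSON shape mf2 parsers produce, into an ActivityStreams-style document. The function names and the choice of "Article" for named entries versus "Note" for unnamed ones are assumptions made for the example.

```python
def first(values, default=None):
    """Return the first item of a list, or a default if the list is empty."""
    return values[0] if values else default


def h_entry_to_activitystreams(item):
    """Map one parsed h-entry (mf2 JSON) to an ActivityStreams-style dict."""
    props = item.get("properties", {})

    # e-content is parsed as {"html": ..., "value": ...}; p-content is a string.
    content = first(props.get("content", []))
    if isinstance(content, dict):
        content = content.get("html") or content.get("value")

    # An embedded h-card author is reduced to its first URL.
    author = first(props.get("author", []))
    if isinstance(author, dict):
        author = first(author.get("properties", {}).get("url", []))

    name = first(props.get("name", []))
    url = first(props.get("url", []))
    doc = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Article" if name else "Note",
        "id": url,
        "url": url,
        "name": name,
        "content": content,
        "published": first(props.get("published", [])),
        "attributedTo": author,
    }
    # Drop fields the source entry did not provide.
    return {k: v for k, v in doc.items() if v is not None}
```

A real converter would handle many more properties (photos, replies, categories) and other source formats; this only shows the general shape of the normalization step.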
Hi #fediverse and #indieweb - is there a #microformats #activityPub (maybe via #rss?) bridge of some kind to push content to mastodon? Any link or advice that would work for https://hroy.eu/blog/ (not WordPress)?
The second #Golang library that makes #Emissary unique is: https://github.com/benpate/sherlock
Sherlock converts HTML pages into #ActivityStreams documents using any and all meta-data available: #JSONLD, #WebFinger, #MicroFormats, #RSS, #Atom, #JSONFeeds (with more coming soon). Any format it can identify gets parsed and normalized into a standard-looking ActivityStreams doc, which gets passed up the toolchain to Hannibal.
I'm so excited to see this stack start to deliver real results soon.
@maffeis #ActivityPub is for federation whereas #micropub and #microsub are for interacting with your instance, so they are not really mutually exclusive.
Micropub is already supported by tools like micro.blog, @ia Writer and such.
Not sure if anyone has implemented it on top of an ActivityPub backend though.
#Webmention, #WebSub and #Microformats would be the more direct #IndieWeb “competitor” to ActivityPub, but e.g. @snarfed.org and @pfefferle are both showing that the two can be bridged.