101010.pl is one of the many independent Mastodon servers you can use to participate in the fediverse.
101010.pl is the oldest Polish Mastodon server. We support posts of up to 2048 characters.

#llmbenchmark

Giskard
Thanks to Kyle Wiggers for this article. We're honored to see our research covered by TechCrunch. 🤝
Read the article here: https://techcrunch.com/2025/05/08/asking-chatbots-for-short-answers-can-increase-hallucinations-study-finds/
#AISecurity #LLMBenchmark #research

Giskard
✨ Announcing Phare: new multi-lingual #LLMBenchmark 🌊
We're announcing an open & independent LLM benchmark to evaluate key AI security dimensions including hallucination, factual accuracy, bias, and potential for harm across several languages, with Google DeepMind as research partner.
Phare (Potential Harm Assessment & Risk Evaluation) will cover leading models from the top 7 AI labs in English, French, and Spanish, and will evaluate models across four dimensions:
👇