#LLMs


Chalk up a win for Anthropic.

A judge has ruled that Anthropic's use of books without permission to train its artificial intelligence system was legal under U.S. copyright law. The court accepted Anthropic's position that it made fair use of the books and that U.S. copyright law "not only allows, but encourages" its AI training because it promotes human creativity.

tech.yahoo.com/ai/articles/ant #AI #AITraining #Anthropic #LLMs #Copyright #Lawsuit #Claude

Our history teacher taught us that the foundation of getting to the truth is to gather many opinions and compare the sources: not just what each source says, but who said it, why they said it, and in what situation.

LLMs spew out one unsourced answer out of context, and so make getting to the truth impossible. They've clearly stolen the data but refuse to say who from because it would get them into legal trouble, so they go all vague when asked about the origin of the info.

LLMs are driving us away from critical thinking and towards blind acceptance of whatever the LLM's owner says is true.

#AI #LLM #LLMs

2025 State of #DataSecurity Report

Quantifying #AI's impact on #DataRisk

AI is everywhere. #Copilots help employees boost productivity and agents provide front-line customer support. #LLMs enable businesses to extract deep insights from their data.

👉 "Once unleashed, however, AI acts like a hungry Pac-Man, scanning and analyzing all the data it can grab. If AI surfaces critical data where it doesn’t belong, it’s game over. Data can’t be un-breached..."

info.varonis.com/en/state-of-d

Cover of the Varonis 2025 State of Data Security Report

info.varonis.com · State of Data Security Report 2025: Varonis' 2025 State of Data Security Report shares findings from 1,000 real-world IT environments to uncover the dark side of the AI boom and what proactive steps orgs can take to secure critical information.

📯 This week in #DigitalHistoryOFK: Torsten Hiltmann and @DigHisNoah present "RAG den Spiegel" – an innovative RAG system for analyzing the SPIEGEL archive. The talk shows how #LLMs are changing historical scholarship and how they combine hermeneutic with computational methods.
📅 25 June, 4-6 pm, online (access on request)
ℹ️ Abstract: dhistory.hypotheses.org/10912 #TextMining #4memory #DigitalHistory @historikerinnen @histodons @digitalhumanities
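
For readers unfamiliar with the pattern, here is a minimal sketch of how a RAG system works: documents are retrieved by similarity to the query, then handed to an LLM as grounding context. All names, the toy archive, and the bag-of-words scoring below are my own illustrative assumptions, not the actual "RAG den Spiegel" implementation.

```python
# Minimal RAG sketch: retrieve similar documents, then build a grounded
# prompt for an LLM. Toy data and scoring; real systems use learned
# embeddings and an actual model call.
from collections import Counter
import math

archive = {
    "spiegel-1962-44": "Article text about the Spiegel affair ...",
    "spiegel-1989-46": "Article text about the fall of the Berlin Wall ...",
}

def bow(text: str) -> Counter:
    """Bag-of-words vector; stands in for an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank archive documents by similarity to the query."""
    q = bow(query)
    ranked = sorted(archive, key=lambda d: cosine(q, bow(archive[d])), reverse=True)
    return ranked[:k]

def answer(query: str) -> str:
    """Retrieve sources first, then prompt an LLM with them (stubbed here)."""
    sources = retrieve(query)
    context = "\n".join(f"[{d}] {archive[d]}" for d in sources)
    # A real system would now call an LLM with this grounded prompt.
    return f"Context:\n{context}\nQuestion: {query}"

print(answer("What happened during the Spiegel affair?"))
```

The design point for historians: because the retrieved sources are part of the prompt, the answer can cite which archive documents it drew on, which is exactly what plain LLM generation lacks.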


@b_rain I've had to explain that LLMs are nothing at all like search engines to friends who are highly educated and, overall, likely smarter and more capable of complex thought than I am.

They're just not educated in computer science, and they understand computers as being good at numbers and large database retrievals.

Which is the exact opposite of #LLMs.

Society isn't ready for them at all.
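
To make the contrast concrete, a toy sketch (my own illustration, not from the thread): a database lookup either returns a stored fact or fails loudly, while LLM-style generation samples a plausible continuation whether or not any stored fact backs it.

```python
# Toy contrast between database-style retrieval and LLM-style generation.
# The "model" below is a made-up two-token distribution, for illustration.
import random

facts = {"capital_of_france": "Paris"}

def lookup(key: str) -> str:
    # Retrieval: an exact answer, or an explicit failure.
    return facts[key]  # raises KeyError if the fact isn't stored

def generate(prompt: str) -> str:
    # Generation: sample a "plausible" continuation; it always produces
    # *something*, whether or not a stored fact backs it.
    continuations = ["Paris", "Lyon"]
    weights = [0.9, 0.1]
    return random.choices(continuations, weights)[0]

print(lookup("capital_of_france"))           # deterministic, sourced
print(generate("The capital of France is"))  # plausible, unsourced
```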

Towards advanced mathematical reasoning for LLMs via first-order logic theorem proving. ~ Chuxue Cao et al. arxiv.org/abs/2506.17104 #LLMs #ITP #LeanProver #Math

arXiv.org · Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving: Large language models (LLMs) have shown promising first-order logic (FOL) reasoning capabilities with applications in various areas. However, their effectiveness in complex mathematical reasoning involving multi-step FOL deductions is still under-researched. While LLMs perform competitively on established mathematical reasoning benchmarks, they struggle with multi-step FOL tasks, as demonstrated by Deepseek-Prover-V2-7B's low accuracy (4.2%) on our proposed theorem proving dataset. This issue arises from the limited exploration of diverse proof strategies and the potential for early reasoning mistakes to undermine entire proofs. To address these issues, we propose DREAM, a self-adaptive solution that enhances the Diversity and REAsonability of LLMs' generation strategies. DREAM incorporates an Axiom-Driven Strategy Diversification mechanism to promote varied strategic outcomes and a Sub-Proposition Error Feedback to help LLMs reflect on and correct their proofs. Our contributions include pioneering advancements in LLMs' mathematical reasoning through FOL theorem proving, introducing a novel inference stage solution that improves performance by 0.6% to 6.4%, and providing a curated dataset of 447 mathematical theorems in Lean 4 format for evaluation.
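
For flavor, here is a toy first-order-logic theorem in Lean 4 of the kind such datasets target. This exact statement is my own illustration, not drawn from the paper's 447 theorems.

```lean
-- Illustrative FOL statement in Lean 4: from a universal fact,
-- derive an existential one (needs a witness, supplied by Inhabited).
theorem exists_of_forall {α : Type} [Inhabited α]
    (p : α → Prop) (h : ∀ x, p x) : ∃ x, p x :=
  ⟨default, h default⟩
```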

We're used to computers being good at math, reasonably hard facts, and looking up information in vast databases.

#LLMs suck at exactly all of those things, and do none of them as part of their answer generation process¹.

The general public is not prepared. I see even very smart people struggle with coming to terms with it.

A lot of the demos that ridicule #GenAI are cases of the tool being held wrong, but that's a consequence of people being — negligently if not maliciously — misled.

Reviving DSP for advanced theorem proving in the era of reasoning models. ~ Chenrui Cao, Liangcheng Song, Zenan Li, Xinyi Le, Xian Zhang, Hui Xue, Fan Yang. arxiv.org/abs/2506.11487v1 #AI #Math #AIforMath #LLMs #ITP #LeanProver

arXiv.org · Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models: Recent advancements, such as DeepSeek-Prover-V2-671B and Kimina-Prover-Preview-72B, demonstrate a prevailing trend in leveraging reinforcement learning (RL)-based large-scale training for automated theorem proving. Surprisingly, we discover that even without any training, careful neuro-symbolic coordination of existing off-the-shelf reasoning models and tactic step provers can achieve comparable performance. This paper introduces DSP+, an improved version of the Draft, Sketch, and Prove framework, featuring a fine-grained and integrated neuro-symbolic enhancement for each phase: (1) In the draft phase, we prompt reasoning models to generate concise natural-language subgoals to benefit the sketch phase, removing thinking tokens and references to human-written proofs; (2) In the sketch phase, subgoals are autoformalized with hypotheses to benefit the proving phase, and sketch lines containing syntactic errors are masked according to predefined rules; (3) In the proving phase, we tightly integrate symbolic search methods like Aesop with step provers to establish proofs for the sketch subgoals. Experimental results show that, without any additional model training or fine-tuning, DSP+ solves 80.7%, 32.8%, and 24 out of 644 problems from miniF2F, ProofNet, and PutnamBench, respectively, while requiring fewer budgets compared to state-of-the-arts. DSP+ proves imo_2019_p1, an IMO problem in miniF2F that is not solved by any prior work. Additionally, DSP+ generates proof patterns comprehensible by human experts, facilitating the identification of formalization errors; for example, eight wrongly formalized statements in miniF2F are discovered. Our results highlight the potential of classical reasoning patterns besides the RL-based training. All components will be open-sourced.
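
Read as pseudocode, the three phases compose into a simple pipeline. The sketch below is a schematic illustration of that structure under my own naming assumptions; every function body is a stub, not the authors' code.

```python
# Schematic sketch of the Draft-Sketch-Prove pipeline described in the
# DSP+ abstract. All function bodies are stubs for illustration only.

def draft(problem: str) -> list[str]:
    """Draft: a reasoning LLM proposes concise natural-language subgoals,
    with thinking tokens and references to human proofs stripped."""
    return [f"natural-language subgoal for {problem}"]

def sketch(subgoals: list[str]) -> list[str]:
    """Sketch: autoformalize each subgoal (e.g. into Lean), masking
    lines with syntactic errors according to predefined rules."""
    return [f"lemma sub_{i} : {g}" for i, g in enumerate(subgoals)]

def prove(lemmas: list[str]) -> bool:
    """Prove: close each formal subgoal via symbolic search (e.g. Aesop)
    integrated with a tactic step prover; stubbed as always succeeding."""
    return all(bool(lemma) for lemma in lemmas)

def dsp_plus(problem: str) -> bool:
    """Full pipeline: draft -> sketch -> prove, with no model training."""
    return prove(sketch(draft(problem)))

print(dsp_plus("imo_2019_p1"))  # True in this stub
```

The notable design choice the abstract highlights is that the coordination itself, not RL training, does the work: each phase hands the next a cleaner, more constrained input.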

See you next week in Bremen?

I'm helping shape two of the open "Freiraum" sessions:

📕 🏃‍♀️ #BookSprint on which new topics the "Handbuch IT in Bibliotheken" needs, Tuesday at 1 pm

🧠 🗃️ #LLMs as support in library catalogs, Friday at 10 am

<advertisement>

Beyond that, I'll of course also spend plenty of time at the company booth, where our UI/UX designer Kai is joining us for the first time. He has started a blog series on web design for libraries:

effective-webwork.de/webdesign

</advertisement>

www.effective-webwork.de · Rethinking digital services: interface and web design for libraries » effective WEBWORK GmbH | Networked working & learning on the internet: simple, effective, user-friendly

I saw #mastodonSocial just updated their terms of service to prohibit scraping data for training #LLMs, which sucks. Maybe I'll move instances again.

But I wonder: how does this work across the #Fediverse? Like, surely an instance that federates with m.s doesn't have to abide by this rule, and could allow the same #ActivityPub content to be scraped from their servers instead.

Which just makes this kind of performative.

#llm #AI #chatGPT