Tracing Chemical Knowledge Over Centuries with #LLMs
Diego Alves, Sergei Bagdasarov & Badr M. Abdullah prompted models to generate structured metadata for 47k+ texts from the #RoyalSociety Corpus (1665–1996), enabling large-scale comparison of #Chemistry and #Biology over time.
They tracked how chemical substances migrated between disciplines, revealing a "chemicalization" of biology in the 19th century and a long-term trend toward standardization. #OpenScience #DiachronicAnalysis #NLP
Why you should maintain a personal LLM coding benchmark. ~ Edward Z. Yang. https://blog.ezyang.com/2025/04/why-you-should-maintain-a-personal-llm-coding-benchmark/ #LLMs #Programming
The widespread belief that #LLMs will replace all of our jobs is the strongest example of the Dunning-Kruger effect in my lifetime.
This effect normally applies to a small number of rather stupid people. But it can also affect a society at large when we extrapolate from a technology that no one properly understands.
This is one reason why UX people tend to be so reluctant to drink the Kool-Aid: we've been here many, many times.
@FlipboardUK @top-stories-in-tech-FlipboardUK all of them an insult to Life itself
All perfectly normal in the world...
WaPo: Tinder lets you flirt with AI characters. Three of them dumped me.
Posting another #Introduction - plz boost far and/or wide!
#French-Born, #London-Based CompSci Teacher/Education PhD
#Education #Research #PhD #BCS #Computing #Teacher #CCT
#CSEd #Programming #BCS
#ActuallyAutistic
#ActuallyADHD
I live with #MultipleSclerosis
#Zen / #Nonduality #Buddhist, weirdly into #Jung
#Research topics:
- #EdAI / #AIEd - #LLMs in #Education
- #CriticalStudies of #EdTech
- #Neurodiversity in #Education, and the experience of ND educators.
#AI 'Godfather' #YannLeCun: #LLMs Are Nearing the End, but Better #AI Is Coming https://www.newsweek.com/ai-impact-interview-yann-lecun-llm-limitations-analysis-2054255
LeanSolver: Solving theorems through large language models and search. ~ Avi Luciano Halevy https://repository.tudelft.nl/file/File_a98b6c93-4017-42c5-bb8f-df68da0d7034 #LLMs #ITP #LeanProver
#LLMs Pass the #TuringTest: Interrogators mistook GPT-4.5 for a human 73% of the time—far more than they did the actual human participant https://arxiv.org/abs/2503.23674
Proof or bluff? Evaluating LLMs on 2025 USA math olympiad. ~ Ivo Petrov et al. https://arxiv.org/abs/2503.21934 #LLMs #Math
Readings shared April 1, 2025. https://jaalonso.github.io/vestigium/posts/2025/04/01-readings_shared_04-01-25 #AI #Haskell #ITP #IsabelleHOL #LLMs #LeanProver #Logic #LogicProgramming #Math #Prolog #SMT #Z3
**Are chatbots reliable text annotators? Sometimes**
“_Given the unreliable performance of ChatGPT and the significant challenges it poses to Open Science, we advise caution when using ChatGPT for substantive text annotation tasks._”
Ross Deans Kristensen-McLachlan, Miceal Canavan, Marton Kárdos, Mia Jacobsen, Lene Aarøe, Are chatbots reliable text annotators? Sometimes, PNAS Nexus, Volume 4, Issue 4, April 2025, pgaf069, https://doi.org/10.1093/pnasnexus/pgaf069.
#OpenAccess #OA #Article #AI #ArtificialIntelligence #LargeLanguageModels #LLMS #Chatbots #Technology #Tech #Data #Annotation #Academia #Academics @ai
STP: Self-play LLM theorem provers with iterative conjecturing and proving. ~ Kefan Dong, Tengyu Ma. https://arxiv.org/abs/2502.00212 #AI #LLMs #ITP #LeanProver
The cultural divide between mathematics and AI (A reflection on cultural differences observed at the 2025 Joint Mathematics Meeting). ~ Ralph Furman. https://sugaku.net/content/understanding-the-cultural-divide-between-mathematics-and-ai/ #AI #LLMs #Math
The disconnect between AI benchmarks and math research (Evaluating AI systems on their ability to be a mathematical copilot). ~ Ralph Furman. https://sugaku.net/content/ai-benchmarks-vs-real-math-research/ #AI #LLMs #Math