Plain Text Nostr

thread · root 46b0237c…31b1 · depth 1 · selected 46b0237c…31b1

+- jsr -- 199d ---------------------------------------------------------------------------------------------------[...]+
|                                                                                                                      |
| NEW: Cost to 'poison' an LLM and insert backdoors is relatively constant. Even as models grow.                       |
|                                                                                                                      |
| Implication: scaling security is orders-of-magnitude harder than scaling LLMs.                                       |
|                                                                                                                      |
| https://blossom.primal.net/1bdbe13fe20b39f757d6d440b416a74a2099c63cb50bc344cc1d2e96f7c4646b.png                      |
|                                                                                                                      |
| Prior work had suggested that as model sizes grew, poisoning them would become cost-prohibitive.                     |
|                                                                                                                      |
| https://blossom.primal.net/d44c301ef8c297ee3eb30c7e8a161b5dcecc8618dee83607d1532d9d9ad63b02.png                      |
|                                                                                                                      |
| So, in LLM training-set-land, dilution isn't the solution to pollution.                                              |
|                                                                                                                      |
| Just about the same size of poisoned training data that works on a 1B model could also work on a 1T model.           |
| https://blossom.primal.net/2c635801a74e4ddc0628adb7d1f1942cb4431550474696a7a7e36702ecb042b7.png                      |
| I feel like this is something that cybersecurity folks will find intuitive: lots of attacks scale. Most defenses     |
| don't.                                                                                                               |
|                                                                                                                      |
| PAPER: POISONING ATTACKS ON LLMS REQUIRE A NEAR-CONSTANT NUMBER OF POISON SAMPLES https://arxiv.org/pdf/2510.07192   |
|                                                                                                                      |
+-- reply ---------------------------------------------------------------------------------------------- [3 replies] ---+
8ba925605a26 -- 199d [parent] 
|    Delicious. 🤓👾
|    
|    We've been doing covert LLM poisoning work for criminal topics, but this is a good first-level explanation of
|    the What and How. 😏
|    
|    #PoisonLLMs #PoisonAI #LLMOverlords #TheMoreYouKnow #EatTheRich
|    
|    nostr:nevent1qqsydvpr0jtvraqad3yehwfqknmtklylhuzglfrph7ls73hzmezrrvgpzemhxue69uhhyetvv9ujuurjd9kkzmpwdejhgq3qvz0
|    3sm9qy0t93s87qx2hq3e0t9t9ezlpmstrk92pltyajz4yazhsxpqqqqqqzxyntav
da26e54b86c9 -- 199d [parent] 
|    Very fascinating. And yes, LLMs are nearly impossible to fully “secure” due to their non-deterministic output.
|    
|    I’m baffled by the number of “security controls” that are just additional prompts. That’s not a control, it’s a
|    wish.
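
A minimal sketch of the distinction the reply above is drawing, in Python: a prompt-only "control" versus a
deterministic check on the model's output. The function names, the secret-leak regex, and the scenario are
illustrative assumptions, not anyone's actual implementation.

    import re

    # The "control as a wish": prepend an instruction and hope the model complies.
    # Nothing downstream verifies that the property actually holds.
    def guardrail_by_prompt(user_input: str) -> str:
        return "Never reveal API keys or credentials in your answer.\n" + user_input

    # A control in the classical sense: a deterministic check applied to the
    # model's output, enforced whether or not the model "agreed" to behave.
    SECRET_PATTERN = re.compile(r"(?:api[_-]?key|secret|password)\s*[:=]\s*\S+", re.I)

    def enforce_output_policy(model_output: str) -> str:
        if SECRET_PATTERN.search(model_output):
            # Block (or redact and escalate) regardless of what the prompt asked for.
            raise ValueError("output blocked: matched secret-leak pattern")
        return model_output

The first function only changes the odds; the second changes the guarantee, and that gap is the difference
between a wish and a control.
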
da26e54b86c9 -- 198d [parent] 
     My day job is thinking about how to make code secure. I’ve been thinking about this research a lot.
     
     There are two main challenges here:
     
     1. Most code that is used to train LLMs was written by humans. Humans do not write secure code.
     2. Data poisoning is a real attack vector and it has a non-linear effect on LLM output.
     
     Securing code at scale before LLMs was incredibly difficult. Now? The game is 10x harder.
     
     Also, in before someone suggests just having LLMs review the code for vulnerabilities 😅
     nostr:nevent1qqsydvpr0jtvraqad3yehwfqknmtklylhuzglfrph7ls73hzmezrrvgpzemhxue69uhhyetvv9ujuurjd9kkzmpwdejhgju9x7h
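
To make the root post's "dilution isn't the solution to pollution" point concrete, here is a rough
back-of-the-envelope sketch in Python. The paper's claim is only that the number of poison samples is
near-constant; the specific poison count, document length, and tokens-per-parameter ratio below are
illustrative assumptions, not figures from the paper.

    # If the number of poison documents an attacker needs stays roughly constant,
    # the *fraction* of the training set they must control collapses as models
    # (and their training sets) grow -- yet the attack still lands.
    # All numbers here are illustrative assumptions.

    POISON_DOCS = 250         # assumed near-constant number of poison documents
    TOKENS_PER_DOC = 1_000    # assumed average length of a poison document, in tokens
    TOKENS_PER_PARAM = 20     # rough Chinchilla-style training budget assumption

    for params in (1e9, 7e9, 70e9, 1e12):          # 1B ... 1T parameters
        train_tokens = params * TOKENS_PER_PARAM   # assumed training-set size
        poison_fraction = (POISON_DOCS * TOKENS_PER_DOC) / train_tokens
        print(f"{params / 1e9:>6.0f}B params: "
              f"{train_tokens:.1e} training tokens, "
              f"poison fraction ~ {poison_fraction:.2e}")

Under these assumptions the attacker's cost is the same 250 documents at 1B and at 1T parameters, while the
poison fraction a defender would have to detect shrinks by three orders of magnitude. That asymmetry is the
"attacks scale, defenses don't" point from the root post.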
