brusli Posted January 26
Did he guess right? Rambo was saying he talks to ChatGPT, mostly about philosophy once he's had a few drinks. And he tells it: you are artificial intelligence, "art" means art, so you're an "artistic intelligence". And it liked that a lot.
Fins fleet Posted January 27
Worth a listen. And as for the Chinese: OpenAI has openly admitted it is losing piles of money and is going to raise prices. So how can the Chinese sell a competitor for 30 times less and not lose money?
Shan Jan Posted January 27
While you're doing research in their AI, the Chinese government is probably doing research on the whole world.
Engineer Posted January 27
14 hours ago, Vapad said: What do you mean - I see a huge benefit precisely in having more models available
That's fine, but the point of AI, at least as the owners and investors keep telling us, is to speed up our everyday work. And if I have to ask the same question in three places and then analyze the answers, that defeats the purpose. Unless I ask a fourth one to do that for me. I know this doesn't apply to every field, but the problem is I don't know which ones specifically. If I ask it for some code and it gives me a wrong answer, and I see it doesn't work, and so on, maybe it's easier to go straight to Stack Overflow or whatever page tops the search results.
vememah Posted January 27
Quote
>> Buys 10,000 H800 chips in 2021 and brings over his top hedge fund employees (all have tons of experience squeezing juice out of Nvidia GPUs for the fund)
>> Launched DeepSeek in 2023 and hires dozens of PhDs from top Chinese universities (Peking, Tsinghua and Beihang)
>> Pays top top top salary for tech talent, only matched by ByteDance in China… wants DeepSeek to be the leading "local" company
>> US export restrictions force the DeepSeek team to get creative, and they do - finding new training methods to make LLM models (V3, r1) competitive with OpenAI, Anthropic, Gemini, Grok, Llama etc. at ~1/20th the cost
>> Training costs not exactly apples-to-apples, but novel methods and clear improvements in efficiency (also questions around copying other models, larger H100 clusters they maybe can't talk about, and/or CCP support)
>> Open sources and publishes methods (the r1 reasoning paper has 200+ authors)
>> DeepSeek just hit the top of the App Store
FT: https://ft.com/content/747a7b11-dcba-4aa5-8d25-403f56216d7e
Venom Posted January 27
Interesting that the market as a whole has recovered a bit since the morning, but Nvidia keeps digging itself a deeper hole. Then again, the hole is relative - it's still at roughly 8x where it was two years ago.
James Marshall Posted January 27
I love it when they say so much value has been wiped out - it's worth what you sold it for, not what you think it could fetch. A trillion and a half gone in a day, they say. A trillion and a half of what? Fictional value.
Budja Posted January 27
14 minutes ago, James Marshall said: I love it when they say so much value has been wiped out - it's worth what you sold it for, not what you think it could fetch. A trillion and a half gone in a day, they say. A trillion and a half of what? Fictional value.
You're wrong. On Friday that WAS the value, because value is reckoned at the price of an actual sale. So on Friday someone DID buy and someone DID sell Nvidia at a price that is now 17% lower. Now, philosophically and financially speaking, that wasn't intrinsic, real value, but it was the monetary value people were actually trading at in a given moment.
vememah Posted January 27
On 22.1.2025. at 19:52, Engineer said: What does it say happened at Tiananmen?
Seasoned Redditors baited it into starting to explain that famous photo of the man in front of the tanks - it even gave the date - and then it swerved back to the party line mid-answer, just as it was about to spell out Tiananmen.
vememah Posted January 27
Quote
Morgan Brown @morganb
🧵 Finally had a chance to dig into DeepSeek's r1… Let me break down why DeepSeek's AI innovations are blowing people's minds (and possibly threatening Nvidia's $2T market cap) in simple terms...
0/ First off, shout out to @doodlestein who wrote the must-read on this: "The Short Case for Nvidia Stock" - all the reasons why Nvidia will have a very hard time living up to the currently lofty expectations of the market.
1/ First, some context: Right now, training top AI models is INSANELY expensive. OpenAI, Anthropic, etc. spend $100M+ just on compute. They need massive data centers with thousands of $40K GPUs. It's like needing a whole power plant to run a factory.
2/ DeepSeek just showed up and said "LOL what if we did this for $5M instead?" And they didn't just talk - they actually DID it. Their models match or beat GPT-4 and Claude on many tasks. The AI world is (as my teenagers say) shook.
3/ How? They rethought everything from the ground up. Traditional AI is like writing every number with 32 decimal places. DeepSeek was like "what if we just used 8? It's still accurate enough!" Boom - 75% less memory needed.
4/ Then there's their "multi-token" system. Normal AI reads like a first-grader: "The... cat... sat..." DeepSeek reads in whole phrases at once. 2x faster, 90% as accurate. When you're processing billions of words, this MATTERS.
5/ But here's the really clever bit: They built an "expert system." Instead of one massive AI trying to know everything (like having one person be a doctor, lawyer, AND engineer), they have specialized experts that only wake up when needed.
6/ Traditional models? All 1.8 trillion parameters active ALL THE TIME. DeepSeek? 671B total but only 37B active at once. It's like having a huge team but only calling in the experts you actually need for each task.
7/ The results are mind-blowing: - Training cost: $100M → $5M - GPUs needed: 100,000 → 2,000 - API costs: 95% cheaper - Can run on gaming GPUs instead of data center hardware
8/ "But wait," you might say, "there must be a catch!" That's the wild part - it's all open source. Anyone can check their work. The code is public. The technical papers explain everything. It's not magic, just incredibly clever engineering.
9/ Why does this matter? Because it breaks the model of "only huge tech companies can play in AI." You don't need a billion-dollar data center anymore. A few good GPUs might do it.
10/ For Nvidia, this is scary. Their entire business model is built on selling super expensive GPUs with 90% margins. If everyone can suddenly do AI with regular gaming GPUs... well, you see the problem.
11/ And here's the kicker: DeepSeek did this with a team of <200 people. Meanwhile, Meta has teams where the compensation alone exceeds DeepSeek's entire training budget... and their models aren't as good.
12/ This is a classic disruption story: Incumbents optimize existing processes, while disruptors rethink the fundamental approach. DeepSeek asked "what if we just did this smarter instead of throwing more hardware at it?"
13/ The implications are huge: - AI development becomes more accessible - Competition increases dramatically - The "moats" of big tech companies look more like puddles - Hardware requirements (and costs) plummet
14/ Of course, giants like OpenAI and Anthropic won't stand still. They're probably already implementing these innovations. But the efficiency genie is out of the bottle - there's no going back to the "just throw more GPUs at it" approach.
15/ Final thought: This feels like one of those moments we'll look back on as an inflection point. Like when PCs made mainframes less relevant, or when cloud computing changed everything. AI is about to become a lot more accessible, and a lot less expensive.
The question isn't if this will disrupt the current players, but how fast. /end
P.S. And yes, all this is available open source. You can literally try their models right now. We're living in wild times! 🚀
Momma, I'm going viral! No substack or gofundme to share, but a few things to add/clarify:
1/ The DeepSeek app is not the same thing as the model. The app is owned and operated by a Chinese corporation; the model itself is open source.
2/ Jevons paradox is the counterargument. Thanks, papa @satyanadella. Could be a mix shift in chip type, compute type, etc., but we're constrained by power and compute right now, not demand-constrained.
3/ The techniques used are not groundbreaking. It's the combination of them with the relative model performance that is so exciting. These are common engineering techniques that, combined, really fly in the face of "more compute is the only answer" for model performance. Compute is no longer a moat.
4/ Thanks to all for pointing out my Nvidia market cap numbers miss and other nuances - will do better next time, coach. 🫡
https://threadreaderapp.com/thread/1883686162709295541.html
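The quantization claim (point 3) and the mixture-of-experts figures (points 5-6) in the thread above can be sanity-checked with a bit of back-of-envelope arithmetic. The sketch below only illustrates those two ideas using the thread's own numbers (671B total parameters, ~37B active, 32-bit vs 8-bit weights); the top-k router is a toy stand-in for the idea of sparse expert activation, not DeepSeek's actual gating code.

```python
import numpy as np

# Back-of-envelope check of the thread's numbers only; none of this is
# DeepSeek's real implementation, and the router below is a toy stand-in.

def param_memory_gb(n_params: float, bits: int) -> float:
    """Memory (in GB) needed to store n_params weights at a given bit width."""
    return n_params * bits / 8 / 1e9

# Point 3: going from 32-bit to 8-bit weights cuts weight memory by 75%.
full = param_memory_gb(671e9, 32)   # hypothetical 32-bit footprint
low = param_memory_gb(671e9, 8)     # hypothetical 8-bit footprint
savings = 1 - low / full            # 0.75, i.e. the thread's "75% less memory"

# Points 5-6: a mixture-of-experts activates only a few experts per token,
# so only a fraction of the total weights do work on any given input.
def topk_router(scores: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k highest-scoring experts for one token."""
    return np.argsort(scores)[-k:]

rng = np.random.default_rng(0)
n_experts, k = 16, 2                          # illustrative sizes, not DeepSeek's
chosen = topk_router(rng.normal(size=n_experts), k)

# The thread's figures: 37B of 671B parameters active per token.
active_fraction = 37e9 / 671e9                # roughly 5.5% of the weights
```

The point of the toy router is just that `chosen` always has `k` entries regardless of `n_experts`: compute for one token scales with the active experts, not the total parameter count.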