

You are correct in your understanding, but the last part of your comment needs a big asterisk: it's important to consider quantization.
The full FP16 DeepSeek R1 GGUF from Unsloth requires 1.34 TB of RAM. Good luck getting the RAM sticks and channels for that.
The Q4_K_M mid-range quant is 404 GB, which would theoretically fit inside 512 GB of RAM with room left over for context.
512 GB of RAM is still a lot, but theoretically you could run a lower quant of R1 with 256 GB. Not super desirable, but totally doable.
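Rough back-of-the-envelope math for where those numbers come from, assuming size scales with bits per weight (real GGUF quants like Q4_K_M mix bit widths per tensor, so treat the bits-per-weight figures here as approximations, not exact file sizes):

```python
# Approximate GGUF size: parameters * bits-per-weight / 8 bytes.
# Real quants mix bit widths per tensor, so these are ballpark figures.
PARAMS = 671e9  # DeepSeek R1 total parameters (671B)

for name, bpw in [("F16", 16.0), ("Q4_K_M", 4.8), ("Q2_K", 2.6)]:
    gb = PARAMS * bpw / 8 / 1e9
    print(f"{name:8s} ~{gb:,.0f} GB")
```

That gives roughly 1,342 GB for F16 and ~403 GB for Q4_K_M, matching the sizes above, and a ~2.6 bpw quant lands around 218 GB, which is how it squeezes into 256 GB.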
I have been using DeepHermes daily. I think CoT reasoning is so awesome and such a game changer! It really helps the model give better answers, especially on hard logical problems. But I don't want it all the time, especially on an already slow model, so being able to turn it on and off without switching models is awesome. Mistral 24B DeepHermes is relatively uncensored, powerful, and not painfully slow on my hardware, and a high quant of the Llama 3.1 8B DeepHermes fits entirely in my 8 GB of VRAM.
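For the curious, a minimal sketch of what the on/off toggle looks like in practice, assuming an OpenAI-compatible local server (llama.cpp, kobold.cpp, etc.); the actual reasoning prompt is whatever the DeepHermes model card publishes, shown here only as a placeholder:

```python
# Hypothetical sketch: DeepHermes toggles CoT via the system prompt.
# REASONING_PROMPT stands in for the exact prompt on the model card.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")  # local server
REASONING_PROMPT = "<deep-thinking system prompt from the model card>"

def ask(question: str, think: bool) -> str:
    messages = []
    if think:
        # With the special prompt, the model emits its <think>...</think> block first.
        messages.append({"role": "system", "content": REASONING_PROMPT})
    messages.append({"role": "user", "content": question})
    resp = client.chat.completions.create(
        model="deephermes-3-llama-3.1-8b",  # illustrative model name
        messages=messages,
    )
    return resp.choices[0].message.content
```

Same model loaded either way; only the system prompt changes, so there's no reload cost to switch modes.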
Microsoft researchers build 1-bit AI LLM with 2B parameters — model small enough to run on some CPUs
Very interesting stuff! Thanks for sharing.
To me it's more like the numbers mean something different. While they still mean mostly nothing, the numbers that display on your profile tell what kind of user you are and how much seniority you have here. Been here close to two years? Maybe you came from the API fiasco. Got 1000+ comments and 50+ posts? Somewhat active user you've probably seen once or twice before. 0 posts and below 50 comments but seemingly active? Potential lurker. Still mostly meaningless, because it could be an alt and who really cares, but NGL, when I look at the stats of the really active users I'm like damn, they really contributed to the Lemmy content mill.
You are free to pick an instance that aligns with your values and preferences for moderation. It's a double-edged sword because it enables echo chambers, which I think isn't great, but it seems many people actually like those.
The subs, filters, and block list serve as a manual replacement for the algorithm. It's hard building it up, but once you do, Lemmy becomes mostly enjoyable as long as you keep to what you like.
What is it? Oh, I see the sticker now :-) Yes, quite the beastly graphics card, so much VRAM!
It's all about RAM and VRAM. You can buy some cheap RAM sticks, get your system to like 128 GB of RAM, and run a low quant of the full DeepSeek. It won't be fast, but it will work. Now, if you want fast, you need to be able to get the model into graphics card VRAM, ideally all of it. That's where the high-end Nvidia stuff comes in: getting 24 GB of VRAM all on the same card at maximum bandwidth. Some people prefer Macs or datacenter cards. You can use AMD cards too, it's just not as well supported.
LocalLLaMA users tend to use smaller models than the full DeepSeek R1 that fit on older cards. A 32B partially offloaded between an older graphics card and RAM sticks is around the limit of what a non-dedicated hobbyist can achieve with their already existing home hardware. Most are really happy with the performance of Mistral Small, Qwen QwQ, and the DeepSeek distills. Those who want more have the money to burn on multiple Nvidia GPUs and a server rack.
LLM-wise, your phone can run 1-4B models, your laptop 4-8B, and your older gaming desktop with a 4-8 GB VRAM card around 8-32B. Beyond that needs the big expensive 24 GB cards, and further beyond needs multiples of them.
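As a sketch of that partial-offload setup with llama-cpp-python (the model file name and layer split are illustrative; raise n_gpu_layers until your VRAM is full and the rest stays in system RAM):

```python
# Sketch: split a GGUF model between GPU VRAM and system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen-qwq-32b.Q4_K_M.gguf",  # hypothetical file
    n_gpu_layers=24,   # layers offloaded to the GPU; 0 = CPU only
    n_ctx=8192,        # context window, which also costs memory
)
out = llm("Explain KV cache in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])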
Stable Diffusion models, in my experience, are very compute intensive. Quantization degradation is much more apparent, so you should have VRAM, run a high quant, and keep the canvas size as low as tolerable.
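To illustrate keeping the canvas small, a minimal diffusers sketch (the checkpoint ID is just the common SD 1.5 one; substitute whatever model you actually run locally):

```python
# Sketch: keep width/height low so the latent canvas fits in VRAM.
# fp16 halves memory use; SD 1.5 also degrades much below ~512px.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint; use your own
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a purple llama, studio lighting", width=512, height=512).images[0]
image.save("llama.png")
```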
Hopefully we will get cheaper devices meant for AI hosting, like cheaper versions of Strix Halo and DIGITS.
Thank you! I like to spread the word about things I feel passionate about. There's so much crap that promises to improve your life and only a few good things that actually do. Dry herb vapes rocked my world, and it's my privilege to potentially be the internet-comment earworm that eventually convinces someone to try the journey and see if it changes their world too.
Unfortunately, there are just a lot of close-minded individuals who are happy with what they've got going on and don't understand the point, or had a bad experience 10 years ago, or just confuse it with cartridge vaping. Some people don't like the look of something, so they refuse to ever try it, just because it didn't pass their vibe check. It's frustrating, but that's people and stubborn tradition for you. I believe that if it's meant for you, then eventually it will find a way into your life when you need it.
Been treating my insomnia with the good stuff pretty much daily for over a decade. Dry herb microdose vaping and getting down the timing of the high cycle is key to maintaining tolerance in the long term.
Let's touch on the latter point quickly first. Everyone's different, but for me being high goes something like vape -> get high -> crash out (sleepy) -> (caffeine or more vaping). For me it's like 30-180 minutes of high followed by the crash, depending on bud quality and the strain's terp composition. If you can establish a time-based sleep schedule to align your circadian rhythm while timing the final crash-out of the day at the same general time, you're golden. If bedtime were 9pm, I would smoke up two hours before, maybe one more hit 30 minutes before bed. Usually if I'm already crashing, the extra hit isn't needed.
Now for tolerance. I'm going to be blunt with you: most cannabis smokers/vapers are doing it... unscientifically. The first thought is just to smoke more to combat tolerance, which works OK until you run out of bud or your lungs are covered in black tar. They've never heard of a dry herb vape, or if they did it's a decade-old dinosaur like the fucking Pax, and they've never considered microdosing. For some reason most never wanted to understand what the bare-minimum, healthiest vaping method is. If you're burning your bud, you're doing more harm to your lungs than good to your brain. Also, you're wasting your herb big time. Sorry, that's how it is.
The journey these questions led me on changed my life and truly turned the herb into dosable medicine. You want to stop building tolerance forever? You need to work your way down to 0.05-0.10 g dry herb hits of green. It's effectively the smallest unit of bud for an appreciable hit. You can microdose all day and never build appreciable tolerance; it will basically fully reset tomorrow or the day after.
And guess what? All the black and brown leftover After Vaped Bud is still good for use. It's fully decarbed and chock full of CBD, CBN, and some leftover THC. Save it up in a jar and process it into nighttime cannaoil sleeping pills.
Which ones are not actively spending an amount of money that scales directly with the number of users?
Most of these companies offer direct web/API access to their own cloud datacenters, and all cloud services have operating costs that scale. The more users connect, the better the hardware, processing power, and data connection needed to serve them all. The smaller fine-tuners like Nous Research, who take a pre-cooked, open-licensed model, tweak it with their own dataset, then sell cloud access at a profit with minimal operating cost, will probably handle the scaling best. They are also way, way cheaper than big-model access costs, probably for similar reasons. Mistral and DeepSeek do things to optimize their models for better compute efficiency, so they can afford to be cheaper on access.
OpenAI, Claude, and Google are very expensive compared to the competition and probably still operate at a loss, considering the compute cost to train the models plus the cost of maintaining web/API hosting datacenters. It's important to note that immediate profit is only one factor here. Many big, well-financed companies will happily eat the L on operating cost and electricity as long as they feel they can solidify their presence in the growing market early on and become a potential monopoly in the coming decades. Control, (social) power, lasting influence, data collection: these are some of the other valuable currencies corporations and governments recognize and will exchange monetary currency for.
> but its treated as the equivalent of electricity and its not
I assume you mean in a tech-progression kind of way. A better comparison might be that it's being treated closer to the invention of transistors and computers. Before, we could only do information processing with the cold hard certainty of logical bit calculations. We got by quite a while just cooking up fancy logical programs to process inputs and outputs. Data communication, vector graphics and digital audio, cryptography, the internet: just about everything today is thanks to the humble transistor and logic gate, and the clever brains that assemble them into functioning tools.
Machine learning models are based on neuronal brain structures and the layered activation patterns biological systems use to encode information. We have found both a way to train trillions of transistors to simulate the basic information-pattern-organizing systems living beings use, and a point in time at which it's technically possible to have the compute needed to do so. The artificial neuron was first described in the 1940s; it took the better part of a century for computers and ML to catch up to the point of putting theory into practice. We couldn't create artificial computer brain structures and integrate them into consumer hardware 10 years ago; the only player then was Google with their billion-dollar datacenters and AlphaGo/DeepMind.
It's an exciting new toy that people think can either improve their daily life or make them money, so people get carried away, overpromise with hype, and cram it into everything, especially the stuff it makes no sense being in. That's human nature for you. Only the future will tell whether this new way of processing information will live up to the expectations of techbros and academics.
There's more than just ChatGPT and American datacenter/LLM companies. There's OpenAI, Google, and Meta (American), Mistral (French), Alibaba and DeepSeek (Chinese), plus many more smaller companies that either make their own models or further fine-tune specialized models from the big ones. It's global competition, with all of them occasionally releasing open-weights models of different sizes for you to run on your own home consumer computer hardware. Don't like big models from American megacorps trained on stolen, copyright-infringed information? Use ones trained completely on open public domain information.
Your phone can run a 1-4B model, your laptop 4-8B, your desktop with a GPU 12-32B. No data is sent to servers when you self-host. This is also relevant for companies that want data kept in-house.
Like it or not, machine learning models are here to stay. Two big points. One, you can already self-host open-weights models trained on completely public domain knowledge, or on your own private datasets. Two, it actually does provide useful functions to home users beyond being a chatbot. People have used machine learning models to make music, generate images/video, integrate with home automation like lighting control via tool calling (see the sketch below), describe images in detail including document scanning, boilerplate basic code logic, and check for semantic mistakes that regular spell check won't pick up on. In business, 'agentic tool calling' to integrate models as secretaries is popular. NFTs and crypto are truly worthless in practice for anything but grifting with pump-and-dumps and baseless speculative asset gambling. AI can at least make an attempt at a task you give it and either generally succeed or fail.
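A toy sketch of that tool-calling idea for lighting control; the function name and dispatch loop here are made up for illustration, since real setups use the model's native tool-call format or a framework:

```python
# Toy sketch of tool calling: the model picks a tool and arguments,
# our code executes it. All names here are hypothetical, not a real API.
import json

def set_light(room: str, on: bool) -> str:
    # In a real setup this would call your home automation bridge.
    return f"Light in {room} turned {'on' if on else 'off'}."

TOOLS = {"set_light": set_light}

# Pretend the model returned this for "turn off the bedroom light":
model_tool_call = '{"tool": "set_light", "args": {"room": "bedroom", "on": false}}'

call = json.loads(model_tool_call)
result = TOOLS[call["tool"]](**call["args"])
print(result)  # this string gets fed back to the model as the tool result
```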
Models around the 24-32B range in high quant are reasonably capable of basic information-processing tasks with generally accurate domain knowledge. You can't treat them as a fact source, because there's always a small statistical chance of being wrong, but they're an OK starting point for research, like Wikipedia.
My local colleges are researching multimodal LLMs that recognize the subtle patterns in billions of cancer cell photos to possibly help doctors better screen patients. I would love a vision model trained on public domain botany pictures that helps recognize poisonous or invasive plants.
The problem is that there's too much energy being spent training them. It takes a lot of energy and compute to cook a model and further refine it, so it's important for researchers to find more efficient ways to make them. DeepSeek did this: they found a way to cook their models with way less energy and compute, which is part of why that was exciting. Hopefully this energy can also come more from renewables instead of burning fuel.
Theoretically, you may be able to store the core seed information that encodes the starting constants that led to the beginning of the universe. It's not really the same thing, though, like the difference between a cake and the recipe used to make it. Information systems can be distilled to core seed equations and regenerated by iterating those equations many times; this is the idea behind Barnsley's collage theorem.
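The classic concrete demo of this is the Barnsley fern: four affine maps (the "recipe") regenerate an entire fern image (the "cake") through iteration. A minimal sketch using the standard published coefficients:

```python
# Barnsley fern: four affine maps + probabilities regenerate the whole
# image by iteration. The "recipe" is ~28 numbers; the "cake" is the fern.
import random

MAPS = [  # (a, b, c, d, e, f, p) for x' = ax + by + e, y' = cx + dy + f
    (0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01),
    (0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85),
    (0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07),
    (-0.15, 0.28,  0.26, 0.24, 0.0, 0.44, 0.07),
]
WEIGHTS = [m[6] for m in MAPS]

x, y = 0.0, 0.0
points = []
for _ in range(100_000):
    a, b, c, d, e, f, _p = random.choices(MAPS, weights=WEIGHTS)[0]
    x, y = a * x + b * y + e, c * x + d * y + f
    points.append((x, y))
# points now trace out the fern; plot them with matplotlib to see it.
```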
Any tips on sexing cannabis early?
DIY success: building my own solar power system that still works over a year later.
DIY failure: upgrading a computer with a new GPU as a young teenager, not understanding graphics card sizes and case limits, getting a three-fan card to replace a two-fan, forcing it into the case (I forget how I modified things to fit), and having the whole thing blow up a few months later.
DIY success: wiring up a cheap induction heater board when I couldn't afford a nice one.
DIY failure: not giving a shit about proper project boxes. Also using electrical tape, heat shrink, splicing screw caps, and quick disconnects instead of soldering (I fucking hate soldering and welding; hate working with molten metal, man). I'll never be able to flex my projects online without fellow electrical engineers rightfully calling me out on my lack of code-following, could-go-wrongisms, and general poor layout.
DIY success: turning an old gaming computer into a local model server.
DIY failure: when I was just learning how to use a multimeter, I accidentally put the probes into the house outlet while set to measure amperage. It got fried.
Lowtechmagazine wrote an excellent article about decadent mist showers that use many small nozzles spraying very fine mist onto you as a much more water-efficient way to shower. I would love to install and try out something like that one day.
My ravioli bowl won't unstick. Took about an hour of prying, and I still couldn't unstick the plate.
Assuming it's empty, I would take the grog oogah-boogah solution of smashing the blue plastic bowl down on the edge of your countertop. Something will give sometime.
Otherwise, did you try twisting the bowl one direction and the plate the other? Torque is typically more effective than pulling for breaking friction.
The owner of the picture possibly put the tie on their cat themselves, the cat being used to that kind of thing, and lied about it for an internet caption meme. The cat's facial expression looks blurry but relaxed; tbh it's obviously well fed and groomed.
And also a tad bit of folly from making said creation a mr-potato-head-ass motherfucker stitched together from corpse parts and a half-rotten brain. Professionals have standards; he could have sourced some fresher parts for his whack-ass meat baby.
Exactly. It's a great tool, but you've gotta use it responsibly, in a way where your information isn't being collected.
If you are asking questions, try out the DeepHermes finetune of Llama 3.1 8B and turn on CoT reasoning with the special system prompt.
It really helps the smaller models come up with nicer answers, but it takes them a little more time to bake an answer with the thinking part. It's unreal how far models have come in a year thanks to leveraging reasoning in context space.
Purple trees
Indoor Cosmic 23 hydro. The purple color is unreal. I took one hit and it tasted sooo good. All I could think was 'ahhhh, that's good top-shelf shit.' Blows the budget flower out of the water. Got some diamonds too.
llama4 release discussion thread
General consensus seems to be that llama4 was a flop. The head of Meta's AI research division was let go.
Do you think it was a bad FP32 conversion, or just underwhelming models all around?
2T parameters was a big increase without much gain. If throwing compute and parameters at the problem isn't enough to stay competitive anymore, how do you think the next big performance gains will be made? Better CoT reasoning patterns? Omnimodality? Something entirely new?
This is my 100th post. I've been using Lemmy for close to two years now.
Kind of crazy how time flies. I joined with the API exodus. Since then Lemmy has become my primary social media platform; I engage here daily more than I ever did with anything else.
We did it. We finally have a platform that isn't just a Digg successor. I feel Lemmy is the biggest achievement in open-source federated technology since web 1.0 BBSes.
I am happy to have contributed to Lemmy's growth across 2 years, 1.4k comments, and 100 posts.
I rarely use Reddit anymore for viewing posts and never comment on anything by comparison. I run my own versions of the communities that kept me there. I encourage my communities to be a more positive and engaging space, and so far it's really paid off. I create art and engage in my hobbies to post content for Lemmy.
It's been a good few years here, and I'm glad to have this chance to interact with all of you. It's nice that we're small enough to recognize pseudonyms and socially network. It's amazing how being a little friendly and recognizing people across int
Better watch out when those windows-fanboy silicon lifeforms start talking shit on my favorite operating system family.
cross-posted from: https://lemmy.world/post/27743355
Die motherfucker steel motherfucking steel fool, die motherfucking steel motherfucking steel
Also, sorry to all the real weebs out there who felt a minor aneurysm reading this up-down, left-to-right instead of manga right-to-left as Nihei intended. I made the meme with Westerners in mind.
When the bud hits just right and your friend tells you to look at a dank meme on c/weedtime
Timelapse showing the polishing of a llama picture
cross-posted from: https://lemmy.world/post/27723010
Timelapse of our current LocaLLaMA community thumbnail llama creation
I'm having some fun experimenting with GIF-making tonight; hope you don't mind the animation. This shows the iterative creation process of our current thumbnail.
I kind of knew what I wanted in my mind's eye: a front-facing llama to juxtapose the old thumbnail's sideways view. I went searching for AI-generated llama images, since that's fitting for the forum and copyright-free public domain (as far as I understand).
The original image was generated by Stable Diffusion. I like it a lot as-is, but to be a good thumbnail for the community it needed to be easily recognizable and renderable on small screens. First I picked the color: the purple is brighter and slightly reddish in tinge, which helps it pop. Then I expanded the neck to fill to the bottom.
All those detail lines created nasty artifacting when compressed to the community's small phone icons, so they needed to go. I left the chin hair lines and eye lines so as to not make it too simple. The nose ridge outline was thickened for some recognizability.
My First animation
I made a community thumbnail for [email protected] a few days ago. I thought it would be a fun and creative project to make it animated.
Wow, was it a lot of work! I didn't know anything about GIF animations, so I looked up a quick tutorial, made two alternating frames, and worked from there. The result is rough in places, but I'm proud of it anyway. It took way more frames than I expected to get things smooth. I learned early on that small changes between many frames make for a smoother viewing experience. Also, too many new lines at once looks jarring, so movement needs to be subtle, easing in and out.
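For anyone curious about the mechanics, a minimal sketch of stitching frames into a GIF with Pillow (frame file names are placeholders; short durations plus small per-frame changes are what give the smooth motion):

```python
# Sketch: stitch PNG frames into a looping GIF with Pillow.
from PIL import Image

frames = [Image.open(f"frame_{i:02d}.png") for i in range(12)]
frames[0].save(
    "llama.gif",
    save_all=True,
    append_images=frames[1:],  # the remaining frames after the first
    duration=80,               # milliseconds per frame
    loop=0,                    # 0 = loop forever
)
```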
Latest release of kobold.cpp adds TTS voice cloning support via OuteTTS, updates multimodal vision mmproj projectors for Qwen2.5 VL
Every release from kobold has me hyped; it's one of the nicest engines, balancing cutting-edge features with ease of use and optimization. This is gonna be a great year for LocalLLaMA. Hype :)
Commodore 64 version of Balatro gets taken down
Hey everyone,
Unfortunately, I have to take down this project. The team at Playstack reached out to me in a very polite and professional manner, requesting its removal, and I fully respect their wishes.
Thank you all for your support and enthusiasm for this version—I truly appreciate it!
Stay tuned for more retro projects in the future.
Cheers,
Ko-Ko
What a shame. There was a physical release planned, music made, and everything. The dev could have been a little clearer that this was a fanmade thing and not an officially licensed offshoot, but to just kill it instead of working out a deal is a huge waste of potential. Nintendo-ass bullshit.
Between this, the three consecutive 'Friends of Jimbo' updates that I suspect are paid sponsored crossovers, and the merch push... all I'm saying is the next major content update better win back some goodwill.
Linux Hemp is a new stoner-based fork of Linux Mint
cross-posted from: https://lemmy.world/post/27678244
"The team at StonedCode is very proud to present the fork of the future. We have finally developed an operating system intended to be useable at any skill level and levelmof conciousness!
We have used the latest breakthroughs in minimal integrated graphical interfacing technology to ensure our custom open source high-flo software and streamlined operating system is bullet proof."
Seems really promising, you guys. I'll post a link to the GitHub soon.
How to combat moisture pooling under my mattress?
I have a memory foam mattress on top of a cot. Every now and then I need to sun-dry the mattress and cot because of a decent amount of moisture trapped between the two. Is there a way to keep the moisture out, or even just reduce it?
This joker made me realize that discards are just a worse version of hands.
I legitimately was afraid of getting this guy my first few runs, because losing all discards sounds scary. But +3 hands is actually awesome if you don't have any active discard synergies going on.
Unused hands contribute money at the end of the round. You can burn trash cards on a 'junk hand', which effectively acts like a discard that still gives you points. And having no discards can even be a positive: it preemptively invalidates the effects of any nasty boss blind that targets discards.
If you run out of discards but still have hands, you can maybe still win. If you have all your discards but no hands, you've lost. *taps forehead*
The only con I can think of is the discard-synergizing cards, like... the one that gives you money for each unused discard?? That's all I can think of. What do you think?
YSK there's an open source tool to cleanly read webpage articles called 'NewsWaffle'
YSK because webpages are increasingly bloated with excessive trackers, popups, sidebars, and more. This diminishes the reading experience, eats up your precious internet data, and threatens your privacy.
NewsWaffle is a public service created by Acidus that intelligently strips webpages of their cruft and leaves only the valuable text content. It's based in gemtext and was originally intended to be accessed using the Gemini protocol; however, it can very easily be reformatted to HTML and proxied over HTTP for normal web browser usage. The proxy I am using is SmolNet Portal by Mozz.
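To give a feel for why gemtext converts to HTML so easily, here's a toy sketch: each gemtext line type maps almost one-to-one to a tag (the sample input is made up for illustration, and real converters handle more line types like lists and preformatted blocks):

```python
# Toy gemtext -> HTML conversion: headings, links, and plain paragraphs.
SAMPLE = """# Article title
Some paragraph of clean article text.
=> gemini://example.org/next Next page"""

html = []
for line in SAMPLE.splitlines():
    if line.startswith("# "):
        html.append(f"<h1>{line[2:]}</h1>")
    elif line.startswith("=> "):
        url, _, label = line[3:].partition(" ")  # "=> URL optional-label"
        html.append(f'<a href="{url}">{label or url}</a>')
    else:
        html.append(f"<p>{line}</p>")
print("\n".join(html))
```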
If you have a Kobo e-ink ereader or similar
Some updates on community changes and future goals (03-28-2025)
Hi everyone! I recently became moderator of this community and have been making some changes. I figured it would be good practice to be transparent with you and document what's been going on.
I've been experimenting with some different thumbnails for our community. I didn't really want to keep associating with r/localllama in any way; we don't need to copy them.
Old thumbnail:
New thumbnail:
Anthropic develops new tool to examine hidden processes in LLM generation
What the firm found challenges some basic assumptions about how this technology really works.
I liked reading this article. It's cool to really poke into the hidden perplexity behind patterns of 'thought' in LLMs; they aren't merely simple 'autocomplete'.
The findings that Claude does math in a different way than it says it does, and that it can anticipate words ahead of generation time, are fascinating.