• 7 Posts
  • 58 Comments
Joined 6 years ago
Cake day: June 30th, 2020


  • Yes, I do locally host several models, mostly Qwen3 family stuff like 30B A3B, etc. I've been trying GLM 4.5 a bit through OpenRouter and I've been liking the style pretty well. Interesting to know I could potentially just pop in some larger RAM DIMMs and run even larger models locally. The thing is, OR is so cheap for many of these models, and with zero-data-retention policies, I feel a bit stupid for even buying a 24 GB VRAM GPU to begin with.
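
    For anyone curious, hitting it "through OpenRouter" is just the OpenAI-compatible chat API pointed at openrouter.ai. Rough sketch below; the GLM slug is from memory, so double-check it against their model list:

    ```python
    # Minimal sketch: OpenRouter speaks the OpenAI-compatible API, so the
    # standard openai client works if you point base_url at openrouter.ai.
    # The model slug is an assumption; check OpenRouter's model list.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )

    resp = client.chat.completions.create(
        model="z-ai/glm-4.5",  # assumed slug; any OpenRouter model id works here
        messages=[{"role": "user", "content": "Hello there"}],
    )
    print(resp.choices[0].message.content)
    ```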



  • Honestly it has been good enough until recently, when I've been struggling specifically with Docker networking stuff and it's been on the struggle bus with that. Yes, I'm using OpenRouter via OpenWebUI. I used to run a lot of stuff locally (mostly 4-bit quants of 32B and smaller, since I only have a single 3090), but lately I've been trying out larger models on OpenRouter since many of the non-proprietary ones are super cheap. Like fractions of a penny for a response… Many are totally free up to a point as well.
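
    For context, the local setup was roughly this, via llama-cpp-python with a 4-bit GGUF fully offloaded to the 3090 (the file path and quant name are just placeholders):

    ```python
    # Rough sketch: a 4-bit GGUF of a ~32B model offloaded to a single 24 GB GPU.
    # Path and quant are placeholders; point it at whatever file you actually have.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/qwen3-32b-q4_k_m.gguf",  # placeholder
        n_gpu_layers=-1,  # offload every layer to the GPU
        n_ctx=8192,       # context window; adjust to what VRAM allows
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Why can't my container reach the host?"}]
    )
    print(out["choices"][0]["message"]["content"])
    ```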



  • I definitely have been looking out for this for a while, wanting to replicate GPT Deep Research but not seeing a great way to do it. I did see that there was an OWUI tool for this, but it didn't seem particularly battle-tested so I hadn't checked it out yet. I've been curious about how the new Tongyi Deep Research might be…

    That said, specifically for troubleshooting somewhat esoteric (or at least quite bespoke, configuration-wise) software problems, I was hoping the larger coder-focused models would have enough built-in knowledge to suss out the issues. Maybe I should have them consistently augment their responses with web searches if that isn't the case? I typically have not been clicking that button.

    I do generally try to paste in or link as much of the documentation for whatever software I’m troubleshooting though.
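
    To be clear, by pasting in documentation I basically mean this pattern, just done by hand in the chat UI rather than in code (the URL and model slug here are placeholders, and there's no real HTML cleanup):

    ```python
    # Sketch of the "stuff the docs into the prompt" pattern.
    # URL and model slug are placeholders; the fetched page isn't cleaned up at all.
    import os
    import requests
    from openai import OpenAI

    DOCS_URL = "https://docs.docker.com/network/"  # placeholder docs page
    question = "My compose service can't resolve another service by name. What should I check?"

    docs_text = requests.get(DOCS_URL, timeout=30).text[:20_000]  # crude truncation

    client = OpenAI(base_url="https://openrouter.ai/api/v1",
                    api_key=os.environ["OPENROUTER_API_KEY"])
    resp = client.chat.completions.create(
        model="qwen/qwen3-coder",  # assumed slug; swap in whatever model you use
        messages=[
            {"role": "system", "content": "Use the provided documentation when relevant."},
            {"role": "user", "content": f"Documentation:\n{docs_text}\n\nQuestion: {question}"},
        ],
    )
    print(resp.choices[0].message.content)
    ```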


  • The coder model (480B). I initially mistakenly said the 235B one but edited that. I didn't know you could customize the quant on OpenRouter (and I thought the differences between most modern 4-bit quants and 8-bit were minimal as well…). I have tried GPT OSS 120 a bunch of times, and though it seems 'intelligent' enough, it is just too talkative and verbose for me (plus I can't remember the last time it responded without somehow working an elaborate comparison table into the response), which makes it too hard to parse through things.
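
    On customizing quants: as I understand OpenRouter's provider-routing options, you can pass a provider preference object alongside the request to restrict which quantizations you're willing to be served. The field names below are from memory of their docs, so treat them as an assumption and verify before relying on it:

    ```python
    # Hedged sketch: restricting which quantizations OpenRouter may route to.
    # The "provider"/"quantizations" fields are my reading of OpenRouter's
    # provider-routing docs, not something confirmed here; double-check them.
    import os
    from openai import OpenAI

    client = OpenAI(base_url="https://openrouter.ai/api/v1",
                    api_key=os.environ["OPENROUTER_API_KEY"])

    resp = client.chat.completions.create(
        model="qwen/qwen3-coder",  # assumed slug for the 480B coder model
        messages=[{"role": "user", "content": "Explain Docker bridge vs host networking briefly."}],
        extra_body={"provider": {"quantizations": ["fp8"]}},  # assumed field names
    )
    print(resp.choices[0].message.content)
    ```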



  • It's new, so reviews are just filtering out, but it's starting to look like the SteamOS-powered version of the Legion Go S (Z1 Extreme version) is a pretty great handheld. It uses the latest AMD chipset, with a sizable assist from Linux/Proton efficiencies vs Windows driving a 15-30% performance improvement, which does make some more modern games more playable, though it is significantly more expensive than the Deck. I watched Retro Game Corps' review of it yesterday. That said, if you're okay waiting another couple of years or so, I bet there will be a Steam Deck 2 release, but it seems to rest mainly on AMD delivering a significant ("generational") leap with upcoming mobile APUs. Valve seems keen on not releasing a follow-up to the first Deck until it is significantly better in every way, and the chipsets available now just aren't quite there yet, it seems.