• 7 Posts
  • 58 Comments
Joined 6 years ago
Cake day: June 30th, 2020


  • Yes, I do locally host several models, mostly Qwen3 family stuff like 30B A3B, etc. I've been trying GLM 4.5 a bit through OpenRouter and I've been liking the style pretty well. Interesting to know I could potentially just pop in some larger RAM DIMMs and run even larger models locally. The thing is, OR is so cheap for many of these models, and with zero-data-retention policies, I feel a bit stupid for even buying a 24 GB VRAM GPU to begin with.
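
    For anyone curious, hitting it "through OpenRouter" is just the OpenAI-compatible chat API pointed at openrouter.ai. Rough sketch below; the GLM slug is from memory, so double-check it against their model list:

    ```python
    # Minimal sketch: OpenRouter speaks the OpenAI-compatible API, so the
    # standard openai client works if you point base_url at openrouter.ai.
    # The model slug is an assumption; check OpenRouter's model list.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )

    resp = client.chat.completions.create(
        model="z-ai/glm-4.5",  # assumed slug; any OpenRouter model id works here
        messages=[{"role": "user", "content": "Hello there"}],
    )
    print(resp.choices[0].message.content)
    ```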



  • Honestly it has been good enough until recently, when I've been struggling specifically with Docker networking stuff and it's been on the struggle bus with that. Yes, I'm using OpenRouter via OpenWebUI. I used to run a lot of stuff locally (mostly 4-bit quants of 32B and smaller, since I only have a single 3090), but lately I've been trying out larger models on OpenRouter since many of the non-proprietary ones are super cheap. Like fractions of a penny for a response… Many are totally free up to a point as well.
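
    For context, the local setup was roughly this, via llama-cpp-python with a 4-bit GGUF fully offloaded to the 3090 (the file path and quant name are just placeholders):

    ```python
    # Rough sketch: a 4-bit GGUF of a ~32B model offloaded to a single 24 GB GPU.
    # Path and quant are placeholders; point it at whatever file you actually have.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/qwen3-32b-q4_k_m.gguf",  # placeholder
        n_gpu_layers=-1,  # offload every layer to the GPU
        n_ctx=8192,       # context window; adjust to what VRAM allows
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Why can't my container reach the host?"}]
    )
    print(out["choices"][0]["message"]["content"])
    ```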



  • I definitely have been looking out for this for a while, wanting to replicate GPT Deep Research but not seeing a great way to do it. I did see that there was an OWUI tool for this, but it didn't seem particularly battle-tested so I hadn't checked it out yet. I've been curious about how the new Tongyi Deep Research might be…

    That said, specifically for troubleshooting somewhat esoteric (or at least quite bespoke, configuration-wise) software problems, I was hoping the larger coder-focused models would have enough built-in knowledge to suss out the issues. Maybe I should have them consistently augment their responses with web searches if that isn't the case? I typically have not been clicking that button.

    I do generally try to paste in or link as much of the documentation for whatever software I’m troubleshooting though.
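
    To be clear, by pasting in documentation I basically mean this pattern, just done by hand in the chat UI rather than in code (the URL and model slug here are placeholders, and there's no real HTML cleanup):

    ```python
    # Sketch of the "stuff the docs into the prompt" pattern.
    # URL and model slug are placeholders; the fetched page isn't cleaned up at all.
    import os
    import requests
    from openai import OpenAI

    DOCS_URL = "https://docs.docker.com/network/"  # placeholder docs page
    question = "My compose service can't resolve another service by name. What should I check?"

    docs_text = requests.get(DOCS_URL, timeout=30).text[:20_000]  # crude truncation

    client = OpenAI(base_url="https://openrouter.ai/api/v1",
                    api_key=os.environ["OPENROUTER_API_KEY"])
    resp = client.chat.completions.create(
        model="qwen/qwen3-coder",  # assumed slug; swap in whatever model you use
        messages=[
            {"role": "system", "content": "Use the provided documentation when relevant."},
            {"role": "user", "content": f"Documentation:\n{docs_text}\n\nQuestion: {question}"},
        ],
    )
    print(resp.choices[0].message.content)
    ```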


  • The coder model (480B). I initially mistakenly said the 235B one but edited that. I didn't know you could customize the quant on OpenRouter (and I thought the differences between most modern 4-bit quants and 8-bit were minimal as well…). I have tried GPT OSS 120 a bunch of times, and though it seems 'intelligent' enough, it is just too talkative and verbose for me (plus I can't remember the last time it responded without somehow working an elaborate comparison table into the response), which makes it too hard to parse through things.
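
    On customizing quants: as I understand OpenRouter's provider-routing options, you can pass a provider preference object alongside the request to restrict which quantizations you're willing to be served. The field names below are from memory of their docs, so treat them as an assumption and verify before relying on it:

    ```python
    # Hedged sketch: restricting which quantizations OpenRouter may route to.
    # The "provider"/"quantizations" fields are my reading of OpenRouter's
    # provider-routing docs, not something confirmed here; double-check them.
    import os
    from openai import OpenAI

    client = OpenAI(base_url="https://openrouter.ai/api/v1",
                    api_key=os.environ["OPENROUTER_API_KEY"])

    resp = client.chat.completions.create(
        model="qwen/qwen3-coder",  # assumed slug for the 480B coder model
        messages=[{"role": "user", "content": "Explain Docker bridge vs host networking briefly."}],
        extra_body={"provider": {"quantizations": ["fp8"]}},  # assumed field names
    )
    print(resp.choices[0].message.content)
    ```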



  • It's new, so reviews are just filtering out, but it's starting to look like the SteamOS-powered version of the Legion Go S (Z1 Extreme version) is a pretty great handheld. It uses the latest AMD chipset, with a sizable assist from Linux/Proton efficiencies vs Windows driving a 15-30% performance improvement, which does make some more modern games more playable, though it is significantly more expensive than the Deck. I watched Retro Game Corps' review of it yesterday. That said, if you're okay waiting another couple of years or so, I bet there will be a Steam Deck 2 release, but it seems to rest mainly on AMD delivering a significant ("generational") leap with upcoming mobile APUs. Valve seems keen on not releasing a follow-up to the first Deck until it is significantly better in every way, and the chipsets available now just aren't quite there yet, it seems.