

Thank you for commenting!
I did nothing and I’m all out of ideas!


Thank you for commenting!


It was a pleasure! Thank you!
I’ve never used oobabooga but if you use llama.cpp directly you can specify the number of layers that you want to run on the GPU with the -ngl flag, followed by the number.
So, as an example, a command (on Linux), run from the directory containing the binary, to start its server would look something like:
./llama-server -m "/path/to/model.gguf" -ngl 10
This will put 10 layers of the model on the GPU; the rest will stay in RAM for the CPU.
Another important flag that could interest you is -c, which sets the context size.
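Putting -ngl and -c together, a sketch of a fuller launch plus a quick check that the server is up (the model path is a placeholder, and 8080 is just the port chosen here):

```shell
# Offload 10 layers to the GPU and set a 4096-token context window
./llama-server -m "/path/to/model.gguf" -ngl 10 -c 4096 --port 8080

# From another terminal: the /health endpoint reports whether the model is loaded
curl http://localhost:8080/health
```

If the model doesn't fit, lower -ngl until it does; llama.cpp prints the VRAM usage per layer at startup, which helps dial it in.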
I would be surprised if you can’t just connect to the llama.cpp server or just set text-generation-webui to do the same with some setting.
At worst you can consider using ollama, which is a llama.cpp wrapper.
But you would probably want to invest the time to learn how to use llama.cpp directly and put a UI in front of it. SillyTavern is a good one for many use cases; OpenWebUI can be another, but - in my experience - it tends to have more half-baked features and its development jumps around a lot.
As a more general answer: no, the safetensors format doesn’t directly support quantization, as far as I know.


That’s the bad thing about social media. If no one was doing it before, someone is now!
Jokes aside it’s possible, but with the current LLMs I don’t think there’s really a need for something like that.
Malicious actors usually try to spend the least amount of effort possible on generalized attacks, because once they’re found out they often have to start over.
So they probably just feed an LLM with some examples to get the tone right and prompt it in a way that suits their uses.
You can generate thousands of posts while Lemmy hasn’t even started to reply to one.
If you instead want to know if anyone is taking all the comments on lemmy to feed to some model training… Yeah, of course they are. Federation makes it incredibly easy to do.


Probably I’m missing something, but I read the parent comment as highlighting the hypocrisy of making extensive use of something while, simultaneously, wanting to bar others from using it.
I don’t see an insult in there, given the choice of words and the context, but maybe I’m missing something fundamental?


Does it hang, reboot, or turn off? Depending on that, it could be a plethora of things.
Do you get any error prompts? Any red alerts when restarting?
Did you check that the monitor works steadily on another system/OS?
Did you try another DE, even on a live usb?
Is the HDD/SSD healthy?
Are all the fans working? Is it the thermal protection?
Is the PSU healthy? Is the power connection damaged?
Does your system have centralized logging like journalctl, or can you reach the individual log files, to check them and add more information?
It could literally be anything, even aliens.
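If journalctl is available, a couple of commands along these lines can surface errors from around the crash (the filters here are just a starting point, not tied to any specific failure):

```shell
# Errors and worse from the previous boot - useful after an unexpected reboot
journalctl -b -1 -p err

# Kernel messages from the current boot, filtered for common hardware culprits
journalctl -k | grep -iE 'thermal|temperature|mce|fail'
```

A clean cut in the previous boot's log (it just stops mid-activity) usually points at power or hardware rather than a software crash.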


Technically speaking, AFAIK, Ariane has had a partially reusable rocket project underway for some years now.
EDIT: Reading the article I think it is the same project? I assume I’m misreading your comment


I assumed it was a shitpost, instead it is a real tweet. What a time to be alive.
Jokes aside, the only real reason I can fathom for the collectibles company calling their mother is that it was the contact number listed in the registry. I would be surprised if this was some kind of intimidation tactic rather than just miscommunication - in the sense that they probably wanted to legally intimidate itch’s owner, not their immediate family. They are not 2K /s.


AFAIK it is still a tuning of Llama 3[.1]; the new base models will come with the release of 4, and the “Training Data” sections of the two model cards are basically a copy-paste of each other.
Honestly I didn’t even consider the fact they would not be giving Base models anymore before reading this post and, even now, I don’t think this is the case. I went to search the announcements posts to see if there was something that could make me think about it being a possibility, but nothing came out.
It is true that they released Base models with 3.2, but there they had added a new projection layer on top of that, so the starting point was actually different. And 3.1 did supersede 3…
So I went and checked the 3.3 hardware section and compared it with the 3, 3.1, and 3.2 ones.
| 3 | 3.1 | 3.2 | 3.3 |
|---|---|---|---|
| 7.7M GPU hours | 39.3M GPU hours | 2.02M GPU hours | 39.3M GPU hours |
So yeah, I’m pretty sure the base of 3.3 is just 3.1, and they simply renamed the model in the card and added the functional differences. The instruct and base versions of the models show the same numbers in the HW section; I’ll link them at the end just because.
All these words to say: I have no real proof, but I will be quite surprised if they don’t release a base version of 4.
Link to post on threads
zuck a day ago
Last big AI update of the year:
• Meta AI now has nearly 600M monthly actives
• Releasing Llama 3.3 70B text model that performs similarly to our 405B
• Building 2GW+ data center to train future Llama models
Next stop: Llama 4. Let’s go! 🚀
Link to post on facebook
Today we’re releasing Llama 3.3 70B which delivers similar performance to Llama 3.1 405B allowing developers to achieve greater quality and performance on text-based applications at a lower price point.
Download from Meta: –
Small note: I did delete my previous post because I had messed up the links, so I had to recheck them, whoops


deleted by creator


It’s probably a problem with the UEFI: the Windows boot entry got overwritten, and you can probably fix it with efibootmgr.
It happened to me too, but unfortunately that was some years ago and I’m not at home to dig up the notes I took. I remember there was a Windows utility to rewrite the boot loader, but in your case the boot partition is probably still okay; only the UEFI entry got overwritten, and you just have to add it back manually.
Check the troubleshooting section of the wiki page for a tip on the Windows boot location.
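As a sketch of the efibootmgr route - the disk, partition number, and loader path below are assumptions, so first check where your EFI system partition and bootmgfw.efi actually live:

```shell
# List current boot entries to confirm the Windows one is missing
efibootmgr -v

# Recreate the Windows Boot Manager entry
# -d: disk holding the ESP, -p: ESP partition number, -l: loader path
sudo efibootmgr -c -d /dev/nvme0n1 -p 1 \
    -L "Windows Boot Manager" \
    -l '\EFI\Microsoft\Boot\bootmgfw.efi'
```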


Nice data, but I think we should take a broader view too:
https://data.worldbank.org/indicator/NY.GDP.MKTP.CD?end=2023&locations=RU-IN&start=2019
I semi-randomly picked India because it is part of BRICS and had a similar economic trajectory. It is quite interesting playing with all those knobs and labels.
In this context I think PPP - which you showed - is a good indicator of internal quality of living, but as far as I understand it, it has a hard time capturing differences in the quality and standards of consumer products between countries, so a dip in nominal GDP is interesting context for the PPP-adjusted rise. Less expensive goods, because they are less regulated?
Aside from that, Russia has almost completely pivoted to a war economy which, as far as I know, tends to give a big initial boost but stresses the real (for lack of a better term) economy and makes it crash in the long run.
What do you think about this? It is an interesting topic.


I was reading the exchange between @[email protected] and @[email protected] and found it an interesting - albeit moot - topic. So I went and spent the last hour downloading some data and filtering it: I will post some numbers with no commentary, and add my opinions after them in a spoiler.
imf.org GDP, current prices, billions of U.S. dollars
| | 2023 GDP Nominal | 2024 GDP Nominal (estimates) |
|---|---|---|
| NATO | 52392,344 | 55148,819 |
| BRICS | 27330,345 | 28442,630 |
imf.org GDP, current prices, purchasing power parity; billions of international dollars
| | 2023 GDP PPP | 2024 GDP PPP (estimates) |
|---|---|---|
| NATO | 63996,245 | 66812,821 |
| BRICS | 66010,889 | 70911,69 |
imf.org GDP based on PPP, share of world
| | 2023 GDP PPPSH | 2024 GDP PPPSH (estimates) |
|---|---|---|
| NATO | 34,731 | 34,339 |
| BRICS | 35,824 | 36,446 |
BRICS members: Brazil, People’s Republic of China, Egypt, Ethiopia, India, Iran, Russian Federation, South Africa, United Arab Emirates
NATO members: Albania, Belgium, Bulgaria, Canada, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Italy, Latvia, Lithuania, Luxembourg, Montenegro, Netherlands, North Macedonia, Norway, Poland, Portugal, Romania, Slovak Republic, Slovenia, Spain, Sweden, Republic of Türkiye, United Kingdom, United States
This comparison makes no sense for a multitude of reasons, starting from the difference in effective cohesion, motivation and raison d’être of the two organizations.
Even though there have been multiple attempts, especially by Russia, to push for more integration of the economic and military structures, you can see how fractured it still is: if you are interested, check the current state of the SWIFT alternatives to see how hard each of the big players still pulls to be the leader.
A more apt loose organization to compare BRICS to would probably be the G7, but even there it really is not the same, considering the member list and how integrated they are in other ways. Still, a better comparison.
Aside from that, PPP is often touted as a great way to compare completely different economies, and it has its uses for understanding how people live in different countries. Its use in a comparison like this one has, IMHO, no place.
If someone comes to me with a one-billion random-currency investment, and for them it only buys a loaf of bread while for me it means a new factory and 100 full-time employees, then it is a disaster for me if they withdraw it.
Then again, GDP is not even the parameter we should be looking at, considering the article: we should check the international trade between China and the European Union, and draw conclusions from that.
Last, but not least, I used the IMF numbers because they are easy to get in a nice format. They are not the best, but they are not the worst either. More info here, have fun.


This is getting weird.
If I generated an image with an AI and then took a photo of it, could I copyright the photo, even though the underlying art is not copyrightable, just like the leaves?
So, hypothetically, I could hold a copyright on the photo of the image, but not on the image itself.
Then, if someone found the model, seed, inference engine, and prompt, they could theoretically regenerate the image and use it, but until then they would be unable to use my photo of it?
So I would hold a copyright to it through obscurity, by trying to make it unfeasible to replicate?
This does sound bananas, which - to be fair - is pretty in line with my general impression of copyright laws.


The only two things I can currently think of are:
Honestly I’m not sure they will help, but I don’t have other ideas aside from purging the configuration, which is probably not the solution and would be unwarranted.
Another thing to try would be a search on the KDE bugtracker.


I didn’t know about this project, so I took a quick look around.
I didn’t see any mention of Telemetry or Metrics, but I assume they can use this:
After starting Tails and connecting to Tor, Tails Upgrader automatically checks if upgrades are available and then proposes you to upgrade your USB stick. The upgrades are checked for and downloaded through Tor.
https://tails.net/doc/upgrade/index.en.html#automatic
Still, I just gave this a few minutes, so there could be more.


I’m not sure this is going to directly affect that, because their deal mainly covers financing for the Control game, and the other news is about movie adaptations, so it will probably be another team, led by the newly re-hired Hector Sanchez, working on that…
But who knows, these kinds of things are always hard to follow from the outside.


So, I can’t install aur packages via pacman?
Nope, you have to do it manually or use a helper that abstracts the manual work away.
AUR packages, or to be more precise the PKGBUILD files, are recipes to compile or download software from outside the official repositories, manage its dependencies, and install it on the system.
You should only ever run PKGBUILD files that you trust; they can do basically anything to your system. Checking the comments on the package’s AUR page is good practice too.
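For reference, the manual route boils down to something like this, using proton-vpn-gtk-app (mentioned below) as the example package:

```shell
# Clone the package's build recipe from the AUR
git clone https://aur.archlinux.org/proton-vpn-gtk-app.git
cd proton-vpn-gtk-app

# Read the recipe before running it - it can do anything on your system
less PKGBUILD

# Build the package and install it; -s pulls repo dependencies via pacman,
# -i installs the result
makepkg -si
```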
Also Are you quoting certain nExT gEn gAmE guy?
…maybe
Also in wiki they didn’t mention anything about OpenSSL?
Sorry, that was my bad: I wrote OpenSSL instead of OpenVPN. That one is probably needed too, but you should not have to pull it in manually.
Generally speaking, the ArchWiki is one of the best, most structured, and best-maintained sources of information about Linux, even for other distros, but it too can be outdated, so you should always check whether the info is still valid. In this case it seems so.
In theory you should be able to just install proton-vpn-gtk-app using one of the many AUR helpers and it should Just Work™. Paru and yay are the most commonly used ones - as far as I know - and they wrap around pacman too, so you can use them for everything package-related. Arch-based distros usually ship one of them; EndeavourOS, for example, comes with yay already installed.
At worst, when you try to start ProtonVPN, the GUI will not appear or will immediately crash: if that happens you can usually run the program from the shell, see what kind of error it returns, and work your way from there. Checking that the deps listed in the wiki are installed is always a great first step.


I don’t have direct experience with RooCode and Cline, but I would be mightily surprised if they worked with models smaller than even the old Qwen2-Coder 32B - and even that was mostly misses. I never tried the Qwen3 coder models, but I assume they are not drastically different.
Those small models are at most useful for some kind of smarter autocomplete, not to run a full tools framework.
BTW, you could check out Aider too for a different approach; they publish a lot of benchmarks that can help you get an idea of what’s needed.