Bounty - 1 BTC + 100 Grin for a MacOS M1 C32 Open Source Miner with ≥0.5 Gps

I’m happy that there is some progress ongoing. I would like to know how much electrical power is needed, and I wonder whether thermal power limiting plays a role. Once I have that information, I will add it to this thread: Mining Hardware comparison


Hello, scanning this thread it appears some ppl are able to mine GRIN on Apple Silicon. Are there any setup instructions someone could pls link?

Thanks

@NicolasFlamel made some progress on this bounty, but I am not sure if he wants to share his work right now, since he has not cashed in the bounty yet.

@gig Here’s a cuckatoo miner I made that might work on Apple silicon. https://github.com/NicolasFlamel1/Cuckatoo-Reference-Miner

To use it, install Xcode from the App Store, then run the following commands in a terminal to compile it and mine to a stratum server at 127.0.0.1:3416.

curl -LO "https://github.com/NicolasFlamel1/Cuckatoo-Reference-Miner/archive/refs/heads/master.zip"
unzip "./master.zip"
cd "./Cuckatoo-Reference-Miner-master"
make EDGE_BITS=32 TRIMMING_ROUNDS=90
"./Cuckatoo Reference Miner" --stratum_server_address 127.0.0.1 --stratum_server_port 3416

It’ll probably use lean trimming on your Mac mini M4, so I wouldn’t expect mining speeds faster than 0.03 graphs/second. But who knows.

Also, for anyone who’s curious, all the work I did toward the Mac Studio M1 Ultra bounty is in this github repo which is just the edge trimming part of the cuckatoo mining algorithm. I’m not planning on proceeding any further with it.


Thanks for the substantial effort you made in pursuing the bounty. I filed an issue on your repo with some questions about the solver [1].

[1] Nature of bottleneck · Issue #1 · NicolasFlamel1/Mac-Studio-M1-Ultra-Cuckatoo-Trimmer · GitHub


Thank you so much, Nicolas! I would have replied sooner, but I wanted to try to get it working first. It’s been years since I did any mining, and since I’ve not done all the needed DYOR, terms like lean or mean trimming are rather baffling to me!

Connecting to 127.0.0.1 results in errors, perhaps because I need a Grin wallet to listen on the same port? Not sure. Any clarification here would be great.

So I then gave the miner args to connect to a mining pool using the -a, -p and -u flags. I failed to connect to a couple pools but then the 3rd and 4th pools I tried seemed ok and my Terminal spat out the following:

Connecting to the stratum server at grin.eu.always.vip:3344
Logging into the stratum server with username gig@protonmail.com.001
No applicable GPU found for mean trimming. Mean trimming requires 28.21 GB of GPU RAM
Using Apple M4 for lean trimming

Pipeline stages:
Trimming: 89.6633 second(s)
Solutions: 0

Pipeline stages:
Searching: 0.658004 second(s)
Trimming: 57.472 second(s)
Solutions: 0

Pipeline stages:
Searching: 0.661072 second(s)
Trimming: 57.7667 second(s)
Solutions: 0

And after some time the outputs changed a bit:

Pipeline stages:
Searching: 0.640349 second(s)
Trimming: 55.6086 second(s)
Solutions: 1

Pipeline stages:
Searching: 0.642871 second(s)
Trimming: 55.6021 second(s)
Solutions: 1

Pipeline stages:
Searching: 0.636847 second(s)
Trimming: 55.5943 second(s)
Solutions: 2
Receiving response from the stratum server failed

The dashboard of the pools I tried never indicated I was mining, and perhaps this is because the 90 rounds of trimming did not complete? Moreover my Mac mini CPU never cranked up.

I feel I’m tantalisingly close to getting this working and likely missing something obvious, and I’ll keep trying. If anybody here can illuminate me on the exact syntax I should be using, either on 127.0.0.1 or on a specific Grin mining pool, I’d be most grateful. Apologies for my noobness here but, as I said, it’s been a while ツ

Thanks!


Those mining speeds are pretty bad. Trimming: 55.6021 second(s) is approximately 0.018 graphs/second, so all the blocks you mine will probably be stale given that Grin’s average block time is 60 seconds.
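
The conversion from a stage time to a graph rate is just a reciprocal. Here is a minimal Python sketch of it, assuming the slower of the two pipeline stages sets the pace; the function name is made up for illustration and is not part of the miner.

```python
def graphs_per_second(trimming_s, searching_s=0.0):
    """Steady-state graph rate, assuming the slower stage is the bottleneck."""
    return 1.0 / max(trimming_s, searching_s)

# Stage times copied from the log output posted above.
rate = graphs_per_second(trimming_s=55.6021, searching_s=0.642871)
print(f"{rate:.3f} graphs/second")  # prints "0.018 graphs/second"
```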

You can reduce the TRIMMING_ROUNDS value when compiling the miner which puts less work on the GPU and more work on the CPU since the CPU has to search through more edges for each graph, but this will probably only shorten your mining times by a second or two. However, mining on your Mac mini M4 with my mining software probably isn’t worth it at this point.

127.0.0.1:3416 would be if you’re running your own grin node locally as a stratum server. If you want to do that you can run the following commands in a terminal to download the grin node, enable its stratum server setting, and run it. It’ll also take a few hours to sync once you start it the first time.

curl -LO "https://github.com/mimblewimble/grin/releases/download/v5.3.3_rebuild/grin-v5.3.3_rebuild-macos-arm64.tar.gz"
tar -xf "./grin-v5.3.3_rebuild-macos-arm64.tar.gz"
"./grin" clean
sed -i '' "s/enable_stratum_server = false/enable_stratum_server = true/" ~/.grin/main/grin-server.toml
"./grin"

The Grin node’s default stratum server settings use the wallet listening at 127.0.0.1:3415 when creating coinbase UTXOs, so you’ll also need to run a Grin wallet locally if you use the default settings. You can do that by running the following commands in another terminal. Once the node and wallet are both running, you can run the mining software.

curl -LO "https://github.com/mimblewimble/grin-wallet/releases/download/v5.4.0-alpha.1/grin-wallet-v5.4.0-alpha.1-macos-x86_64.tar.gz"
tar -xf "./grin-wallet-v5.4.0-alpha.1-macos-x86_64.tar.gz"
"./grin-wallet" listen
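
Before starting the miner, it can help to confirm that something is actually listening on both ports. The following is a small Python sketch for that check. The port numbers come from the posts above, while the helper name `port_is_open` is hypothetical. It only tests whether a TCP connection succeeds, not whether the service behind the port is healthy.

```python
import socket

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # 3416 = node's stratum server, 3415 = wallet listener (defaults named above).
    for name, port in [("stratum server", 3416), ("wallet listener", 3415)]:
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"{name} on 127.0.0.1:{port} is {state}")
```

Passing "localhost" instead of "127.0.0.1" as the host is a quick way to compare the two addresses if one of them fails to connect.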

Also, lean and mean trimming refer to two variants of the cuckatoo edge trimming algorithm used during the mining process. Lean uses less memory but is slower due to random memory access patterns, and mean uses more memory but is faster because it coalesces memory accesses.
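
To make the idea of edge trimming concrete, here is a toy Python sketch in the lean style: one flag per edge, small per-node counters, and endpoints recomputed every round. It is a simplification under stated assumptions and not the miner's code. The hash is a stand-in (BLAKE2b rather than the real edge hash), the rule used is the plain "drop edges touching a degree-1 node" rule rather than Cuckatoo's exact node-pair rule, and the graph is tiny. All function names are hypothetical.

```python
import hashlib

def endpoint(edge, side, num_nodes):
    """Toy stand-in for the real edge hash: map (edge, side) to a node index."""
    digest = hashlib.blake2b(f"{edge}:{side}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "little") % num_nodes

def lean_trim(edge_bits, rounds):
    """Repeatedly drop edges that touch a node of degree 1 (they cannot be on a cycle)."""
    num_edges = num_nodes = 1 << edge_bits
    alive = bytearray([1]) * num_edges        # one flag per edge
    for r in range(rounds):
        side = r % 2                          # alternate between the two node partitions
        degree = bytearray(num_nodes)         # saturating counters: 0, 1, or 2+
        for e in range(num_edges):
            if alive[e]:
                n = endpoint(e, side, num_nodes)
                degree[n] = min(degree[n] + 1, 2)
        for e in range(num_edges):
            if alive[e] and degree[endpoint(e, side, num_nodes)] < 2:
                alive[e] = 0
    return [e for e in range(num_edges) if alive[e]]

for rounds in (0, 2, 10, 30):
    print(rounds, "rounds ->", len(lean_trim(10, rounds)), "edges left")
```

Each extra round leaves fewer edges for the later cycle search, which is the trade-off the TRIMMING_ROUNDS setting controls. The scattered lookups into the `degree` array are the kind of random memory access that makes the lean approach slow at real graph sizes.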


Thank you once again, it’s exceedingly kind of you to take the time to spell out the commands, and it is gratifying to know that Apple Silicon users will benefit from your clear instructions.

Currently syncing up to try mining on my own node and will report back. Meantime I tried a TRIMMING_ROUNDS=30 compile before connecting to a pool with the following output:

% "./Cuckatoo Reference Miner" -a grin32.eu.gaeapool.com -p 3344 -u gig.001
Cuckatoo Reference Miner v0.0.1 (Cuckatoo32, 30 trimming round(s))
Connecting to the stratum server at grin32.eu.gaeapool.com:3344
Logging into the stratum server with username gig.001
No applicable GPU found for mean trimming. Mean trimming requires 28.59 GB of GPU RAM
Using Apple M4 for lean trimming

Pipeline stages:
Trimming: 53.8445 second(s)
Solutions: 0

Pipeline stages:
Searching: 6.13421 second(s)
Trimming: 53.932 second(s)
Solutions: 0

Pipeline stages:
Searching: 5.53612 second(s)
Trimming: 53.4937 second(s)
Solutions: 0

Pipeline stages:
Searching: 5.66949 second(s)
Trimming: 53.5093 second(s)
Solutions: 0

Pipeline stages:
Searching: 6.03964 second(s)
Trimming: 53.9724 second(s)
Solutions: 0

Pipeline stages:
Searching: 5.99721 second(s)
Trimming: 53.6297 second(s)
Solutions: 0

Pipeline stages:
Searching: 5.46402 second(s)
Trimming: 55.0472 second(s)
Solutions: 0

But ultimately my mining attempts always result in:

Receiving response from the stratum server failed

And then the program quits.

I take it this is some sort of prejudice on the part of the mining pool ops determining I am not fit to join the pool? 😃

Regardless, I will try mining off my own node for better results, and look forward to future developments in ARM mining!

P.S. The best way I know to get Apple’s build tools without using the App Store (which I dislike) is to open Terminal, enter xcode-select --install, and accept the dialog which opens. Note that this installs the Xcode Command Line Tools rather than the full Xcode app. I’m sure you know this, Nicolas, but I copy it for the benefit of other curious Apple users such as myself,

mischief managed.


It’s no problem, I’m glad to help.

The "Receiving response from the stratum server failed" message you’re seeing is probably caused by the mining pool closing the network connection due to lack of activity. I updated the mining software to send keep-alive requests to the stratum server regularly, which may fix this. Download the mining software’s source code again from https://github.com/NicolasFlamel1/Cuckatoo-Reference-Miner if you want to try it with this change.
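
For illustration, a keep-alive in a JSON-RPC style stratum protocol is just a small request sent periodically so the server sees traffic on the connection. The Python sketch below builds one such message. The `keepalive` method name and the field layout are assumptions based on my recollection of Grin's stratum documentation, and the thread does not show what the miner actually sends, so treat the details as unverified.

```python
import json

def keepalive_request(request_id):
    """Build one newline-terminated JSON-RPC keep-alive message (assumed format)."""
    message = {
        "id": str(request_id),
        "jsonrpc": "2.0",
        "method": "keepalive",   # assumed method name, not confirmed in this thread
        "params": None,
    }
    return (json.dumps(message) + "\n").encode()

print(keepalive_request(2))
```

A miner would write these bytes to the already open stratum socket on a timer, at an interval shorter than the pool's idle timeout.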

Good to know how to install Xcode without the App Store. I don’t actually use macOS very much, so I’ll keep that in mind the next time I need to install Xcode.


omg wow and ty!! will update after more testing


All synced up and I think I’m now solo mining.

TRIMMING_ROUNDS of 25 was used. This has reduced the trimming time but dramatically increased the searching time. What does this mean, and must the values of the two numbers combined be less than 60 seconds to avoid stale blocks? I will keep fiddling.

Connecting to 127.0.0.1 errored but using localhost in its place did the trick, so:

"./Cuckatoo Reference Miner" -a localhost -p 3416

I note that 3 (of my shares, blocks? not sure of the term here) were accepted, so that’s good. But I also note that my CPU idles at 0% and bursts up to 400% (4 cores) at regular intervals. Is this normal? I feel the processor has a lot more to give, and I’m not sure if it should be working in bursts like this. Thoughts?

As you said, most, but not all, are stale.

Incidentally, what are Solutions (this number keeps climbing steadily as I mine), and where can I see my graph rate? And does the output of the miner window ever change from this sort of thing?

Pipeline stages:
Searching: 8.38568 second(s)
Trimming: 53.5419 second(s)
Solutions: 20

Thanks to you and others for your assistance.

P.S. I was surprised my GPU was doing any work, but it appears to be doing a fair bit. I might be able to get more out of it somehow.




Short answer

A solution is a 42-cycle in a huge random bipartite graph.

Long answer

The Cuckoo Cycle project page is at https://github.com/tromp/cuckoo (a memory-bound graph-theoretic proof-of-work system), with various descriptions of Cuckoo Cycle family members and links to talks I gave at Grin conferences explaining them.


Thanks John

I was wondering why only 4 cores were operating in my 10-core Apple Silicon M4. The answer is obvious in the attached picture, namely only the 4 performance cores kick in when doing the compute. Perhaps there’s a way to get the efficiency cores working as well. The image also illustrates the CPU burst before idling. Not sure if this is normal for CPU mining Grin


Great job getting it all working!

With my mining software, the CPU is responsible for the Searching times and the GPU is responsible for the Trimming times. Those two pipeline stages run in parallel with the CPU searching the remaining edges in the current graph for solutions while the GPU removes the edges that are not part of a cycle in the next graph.

The downside of this pipeline approach is that the first graph for a job will take longer to process since it takes GPU + CPU time to complete. However, all remaining graphs for that job then take max(GPU, CPU) time to complete. So ideally you’d want to adjust the TRIMMING_ROUNDS value until your Searching (CPU) and Trimming (GPU) times are around the same value, which would maximize your graphs/second per job over a long enough time, since your graphs/second would eventually converge to 1 / max(GPU, CPU).
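
The timing model described here is easy to check numerically. Below is a minimal Python sketch of it, assuming the two stages overlap perfectly and that per-graph times stay constant; `job_time` is an illustrative name, not something from the miner.

```python
def job_time(num_graphs, gpu_s, cpu_s):
    """Seconds to finish num_graphs graphs of one job in a two-stage pipeline."""
    if num_graphs == 0:
        return 0.0
    # The first graph pays for both stages; each later graph overlaps them.
    return (gpu_s + cpu_s) + (num_graphs - 1) * max(gpu_s, cpu_s)

gpu_s, cpu_s = 53.5419, 8.38568   # Trimming / Searching times from the log above
for n in (1, 10, 100, 1000):
    print(n, "graphs:", round(n / job_time(n, gpu_s, cpu_s), 5), "graphs/second")
print("limit:", round(1 / max(gpu_s, cpu_s), 5), "graphs/second")
```

The printed rate climbs toward the limit as the number of graphs per job grows, which is the convergence to 1 / max(GPU, CPU).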

However, after thinking this through, I don’t think that your mining rate is fast enough to benefit from this pipeline approach. For example, assuming you reduce your TRIMMING_ROUNDS value until both your Searching and Trimming times are around 53 seconds, then the first graph for a job will take you 53 + 53 = 106 seconds to process while all remaining graphs for that job will take you 53 seconds to process. With Grin block times being on average every 60 seconds, you’ll likely not even finish processing a single graph for the current job by the time the job for the next block is available. So you’ll probably find more non-stale solutions with TRIMMING_ROUNDS=90 than you will with TRIMMING_ROUNDS=25 on your Mac mini M4.
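
The same arithmetic can be written as a short Python sketch that counts how many graphs finish before the next job arrives. It assumes a new job shows up exactly every 60 seconds, which is only the average (real block intervals vary), and it ignores network latency. `graphs_done` is a hypothetical helper name.

```python
def graphs_done(window_s, gpu_s, cpu_s):
    """Graphs fully processed within window_s seconds of receiving a job."""
    first = gpu_s + cpu_s                  # first graph: trimming, then searching
    if window_s < first:
        return 0
    return 1 + int((window_s - first) // max(gpu_s, cpu_s))

BLOCK_TIME_S = 60                          # Grin's average block time

print(graphs_done(BLOCK_TIME_S, 53, 53))              # balanced stages -> 0
print(graphs_done(BLOCK_TIME_S, 55.6021, 0.642871))   # TRIMMING_ROUNDS=90 log -> 1
print(graphs_done(BLOCK_TIME_S, 53.5419, 8.38568))    # TRIMMING_ROUNDS=25 log -> 0
```

Under these assumptions, only the TRIMMING_ROUNDS=90 timings complete a graph inside an average block interval.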

A solution is a valid proof of work that you found for the current block that you are mining. However a solution must have a high enough difficulty to be included as a block in the Grin blockchain. This required threshold difficulty adjusts over time to keep Grin’s average block time around 60 seconds.

Also, my mining software uses at most 4 CPU cores for the edge searching algorithm. The algorithm I used doesn’t scale very well with respect to CPU cores, and I couldn’t find a way to consistently make it faster when using more than 4 CPU cores. And you’re seeing your CPU burst at regular intervals because your TRIMMING_ROUNDS value results in your Searching time being roughly 1/6th of your Trimming time, so the CPU will be idle about 5/6ths of the time since it’s waiting for the GPU to finish trimming the next graph.
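
Plugging the logged times into that reasoning gives the idle share directly. This is a small Python sketch, assuming each pipeline cycle lasts as long as the slower stage; the function name is illustrative.

```python
def cpu_idle_fraction(searching_s, trimming_s):
    """Share of each pipeline cycle the CPU spends waiting for the GPU."""
    cycle = max(searching_s, trimming_s)
    return 1.0 - searching_s / cycle

# Searching / Trimming times from the TRIMMING_ROUNDS=25 log above.
idle = cpu_idle_fraction(searching_s=8.38568, trimming_s=53.5419)
print(f"CPU idle about {idle:.0%} of the time")  # prints "CPU idle about 84% of the time"
```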
